CV

Education

ETH Zurich
Zurich, Switzerland
09/2021 - 01/2023
  • Thesis topic: Stream processing in a heterogeneous environment, Prof. Timothy Roscoe

  • Explored new ways to use the interconnects (PIO, DMA, cache-coherent) available on emerging heterogeneous FPGA-based platforms; co-designed the message-passing protocols on the Enzian platform and achieved RPC execution over the cache-coherent interconnect in under 1 microsecond for 128 B data transfers (Publication)

  • Worked on efficient execution of stream processors and on redesigning their architecture for FPGA-based platforms; authored a new design of the Timely Dataflow stream processor in Rust for heterogeneous computing; developed FPGA designs for stream processing over different data communication methods (DMA, PCIe, CC), reducing the latency of pointstamped progress tracking by up to 50% across different workloads

  • Analyzed two factors that influence FPGA offload decisions in stream processing, computation and data batch size, and was the first to create solutions for effective movement of small batches, expanding the possibilities for stream processing offloading

  • Applied static scheduling compiler techniques in the CIRCT project to isolate simple stream processing operators, reducing generated code by up to 40% and resource utilization by up to 50%

Moscow Institute of Physics and Technology
Moscow, Russia
09/2016 - 07/2020
  • Focus on Computer Security and Electrical Engineering, GPA: 8.9 / 10

  • Master's Thesis: Access management of the control functions to the KVM-based virtual infrastructure

Experience

Research Assistant
ETH Zurich
May 2019 – January 2025
  • Led three research projects with a focus on heterogeneous hardware, code generation, and stream processors

  • Supervised 7 Master's and Bachelor's students, in particular:

    • OpenCL Support for Enzian Research Platform

    • Asynchronous execution of Timely Dataflow on heterogeneous architecture

    • Persistence Layer for Lock-Free Data Structures on Enzian - used a cache-coherent node as a background persistence layer; automatically invalidated and wrote back write and RCAS requests on the CPU

    • High-throughput communication over coherent links - used prefetching and concurrent invalidates to achieve nearly the same performance for 2 KiB RPCs as for 128 B RPCs

  • Taught Computer Architecture and Big Data courses at ETH for 5 years

Research Intern
ETH Zurich
June 2018 – September 2018
  • Achieved predictable optimization results for stencil computations by creating a guide to pragma usage in the SDAccel tool and by experimenting with pragmas and programming patterns to analyze their impact on code generation (C/C++/OpenCL/FPGA/SystemVerilog/Vivado)

  • Enabled automatic application of SDAccel rules for stencil computations by implementing matchers to detect patterns using the Polly isl-based library (C)

Software Developer
NVIDIA Moscow
September 2017 – June 2018
  • Optimized binary translation algorithms by reorganizing the instructions and nodes on the "hot" profiled paths of the basic block graphs, accelerating performance-critical code paths (C/C++)

Research Intern
EPFL, Lausanne, CH
June 2017 – September 2017
  • Improved resource utilization in Dynamatic HLS by designing custom bitwidth optimizations to address inefficient type handling in hardware synthesis (C++/FPGA/SystemVerilog)

  • Identified the set of optimizations needed in Dynamatic HLS by classifying the optimizations in LegUp HLS and selecting those applicable to Dynamatic

Research Assistant
Laboratory of Applied Computational Geophysics at MIPT, Moscow, RU
December 2016 – May 2019
  • Developed an innovative way to model seismic response from facing fractures by using the Chimera grids approach to increase modeling accuracy (C/C++)

  • Reduced the computational latency of the hierarchical grids approach for seismic computations by parallelizing it with MPI, enabling fast online wave rendering in the OpenGL-based UI tool (C/C++/OpenGL/MPI)

Java Developer
SBDA Group, Moscow, RU
October 2016 – December 2016
  • Significantly enhanced recommendation system coverage for banks by applying fine-grained optimization strategies based on user purchase data, improving system applicability (Java)

Software Developer
Intel, Moscow, RU
August 2015 – May 2016
  • Expanded Intel VTune capabilities by integrating Go language profiling support, improving profiling coverage for new workloads (C/Go)

  • Improved the team’s Go garbage collection knowledge base by debugging garbage collection strategies, guiding the team’s future design decisions in Go compiler development

Skills

Concepts: Systems architecture, Systems design, Computer architecture, Heterogeneous architecture, Cache Coherency, Distributed computing, Linux, Device drivers, Embedded systems, Operating systems, Data processing, Stream processing, FPGA-based design, High Performance Computing, Compiler theory, Compiler optimizations, Algorithms and Data structures, Databases, Circuit design, RTL, FPGA design simulation, FPGA design verification, Synthesis, Timing Analysis, PCIe, AXI, DMA, CXL, Hardware logic design, ARM, x86

Technologies: Git / GitHub / GitLab, Qt (C++ GUI), Visual Studio Code, Vim, Emacs, CMake, GDB, LLDB, Clang, GCC, Docker, CI/CD, MongoDB, Spark, Valgrind, MPI, NumPy, OpenGL, OpenCL, Xilinx Vivado, ModelSim

Programming languages: C, C++, SystemVerilog, VHDL, Rust, Assembly, SQL, Python, Java

Additional courses during PhD: VLSI (SystemVerilog programming), Models of Computation, Academic writing, Advanced Systems Lab (queueing theory, load balancing)