Vikranth Srivatsa

WukLab, Sysnet, Sky/RiseLab

I'm a second-year PhD student at WukLab, UC San Diego, advised by Professor Yiying Zhang. My research focuses on ML systems and LLM inference, including load balancing, memory management, and efficient scheduling for large language models.

I completed my undergraduate and master's degrees in Electrical Engineering and Computer Science at UC Berkeley, where I worked in the RISELab with Professor Joseph Gonzalez. My research there included serverless execution across multi-cloud, 5G, and edge environments with Moustafa AbdelBaky, and explainability of machine learning under distribution shift with Yaoqing Yang and Yaodong Yu.

2nd-year PhD student at UCSD

News

Apr 05, 2025
🎉 Preble: Efficient Distributed Prompt Scheduling for LLM Serving was accepted to ICLR 2025!
Mar 05, 2025
🎉 Cognify: Supercharging Gen-AI Workflows With Hierarchical Autotuning was accepted to KDD 2025!
May 01, 2024
🎉 InferCept was accepted to ICML 2024!

Fun Projects

Smart Speaker

June 2023

Created a smart-speaker system to study passive listening devices. Built machine-learning NLP models to classify intents for passive listening and speech, set up the architecture and workflow to run the smart speaker with dynamic skills for experiments, and wrote noise-filtering and detection algorithms to improve intent recognition.
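
To give a flavor of the intent-classification piece, here is a deliberately minimal keyword-scoring sketch (the intents and keywords are made up for illustration; the project's actual classifiers were learned ML models, not rule-based):

```python
# Toy intent classifier: score each intent by keyword overlap with the utterance.
# Intents and keyword sets below are illustrative only.
INTENT_KEYWORDS = {
    "play_music": {"play", "song", "music"},
    "set_timer": {"timer", "minutes", "set"},
    "weather": {"weather", "rain", "temperature"},
}

def classify_intent(utterance: str) -> str:
    """Return the intent whose keyword set overlaps the utterance most."""
    tokens = set(utterance.lower().split())
    scores = {intent: len(tokens & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"
```

A learned model replaces the keyword sets with embeddings and a trained decision layer, but the interface, utterance in, intent label out, is the same.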

Machine Learning · NLP · Hardware
Cloudless: Serverless Execution

May 2023

An efficient serverless scheduling framework spanning clients, 5G networks, and the cloud. It uses heuristic and linear-programming algorithms for optimal code placement on compute nodes, optimizing memory and cost across serverless Kubernetes, AWS, GCP, and Azure.
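
As a rough sketch of what a heuristic placement pass can look like (the node names, costs, and greedy largest-first rule here are illustrative assumptions, not Cloudless's actual algorithm):

```python
from dataclasses import dataclass

@dataclass
class Node:
    """A compute node: an edge box, a 5G MEC site, or a cloud region."""
    name: str
    mem_free: int        # MB of memory still available
    cost_per_call: float # relative invocation cost

def place(functions: dict[str, int], nodes: list[Node]) -> dict[str, str]:
    """Greedy heuristic: place each function (largest memory demand first)
    on the cheapest node that still has room for it."""
    placement = {}
    for fn_name, mem_need in sorted(functions.items(), key=lambda kv: -kv[1]):
        candidates = [n for n in nodes if n.mem_free >= mem_need]
        if not candidates:
            raise RuntimeError(f"no node can host {fn_name}")
        best = min(candidates, key=lambda n: n.cost_per_call)
        best.mem_free -= mem_need
        placement[fn_name] = best.name
    return placement
```

A linear-programming formulation replaces this greedy loop with a solver over the same decision variables (function-to-node assignments subject to memory constraints, minimizing cost), trading speed for optimality.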

Distributed Systems · Cloud Computing · 5G · Edge Computing
Worst Group Performance

December 2021

Prior work has suggested that overparameterization can hurt test accuracy on rare subgroups. Motivated by the fact that subgroup information is often unknown, we investigate the effect of model size on worst-group generalization under empirical risk minimization (ERM). Our systematic evaluation reveals that increasing model size does not hurt, and may help, worst-group test error under ERM.

Machine Learning · Distribution Shift · Model Robustness
Fluid Simulation

May 2021

A 3D water simulation written in C++ using the Navier–Stokes equations. Water physics are accelerated with KD-trees for neighbor lookups and parallel processing.
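
The KD-tree speedup comes from radius queries: each particle only interacts with neighbors inside its smoothing radius, so a spatial tree avoids the O(n²) all-pairs check. An illustrative Python sketch of that neighbor search (not the project's C++ code):

```python
import math

class KDNode:
    __slots__ = ("point", "axis", "left", "right")
    def __init__(self, point, axis, left, right):
        self.point, self.axis, self.left, self.right = point, axis, left, right

def build(points, depth=0):
    """Build a 3D KD-tree by splitting on the median along a cycling axis."""
    if not points:
        return None
    axis = depth % 3
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return KDNode(points[mid], axis,
                  build(points[:mid], depth + 1),
                  build(points[mid + 1:], depth + 1))

def radius_query(node, center, r, found=None):
    """Collect all points within distance r of center (the smoothing radius)."""
    if found is None:
        found = []
    if node is None:
        return found
    if math.dist(node.point, center) <= r:
        found.append(node.point)
    diff = center[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff <= 0 else (node.right, node.left)
    radius_query(near, center, r, found)
    if abs(diff) <= r:  # splitting plane is within r, so check the far side too
        radius_query(far, center, r, found)
    return found
```

Each query descends the side of the splitting plane containing the query point and only visits the other side when the plane lies within the search radius, which is what makes per-particle neighbor lookups cheap on average.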

Graphics · Simulation · C++
By A Thread

December 2020

A 3D modeling and animation project in Maya about a mouse trying to reach the moon. The video was edited in After Effects and Premiere Pro.

Systems · Threading · Performance