Oct 14, 2024 - MLSys @ WukLab - Can Scheduling Overhead Dominate LLM Inference Performance? A Study of CPU Scheduling Overhead on Two Popular LLM Inference Systems
May 14, 2024 - MLSys @ WukLab - Preble: Efficient Prompt Scheduling for Augmented Large Language Models