CS336 Notes: Lecture 10 - Inference
LLM inference optimization: understanding the prefill vs decode split, KV cache management, speculative decoding, and why inference is fundamentally memory-bound.
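A quick back-of-envelope sketch of the memory-bound claim (illustrative numbers only, assuming bf16 weights, a 70B-parameter model, and H100-class specs; swap in your own hardware):

```python
# Why single-stream decode is memory-bound: compare the arithmetic intensity
# of one decode step against the hardware's compute/bandwidth ridge point.

params = 70e9                 # model parameters (assumed: a 70B model)
bytes_per_param = 2           # bf16 weights

flops_per_token = 2 * params                 # ~2 FLOPs per parameter per decoded token
bytes_per_token = bytes_per_param * params   # every weight is read once per step

intensity = flops_per_token / bytes_per_token      # FLOPs per byte moved
print(f"decode arithmetic intensity: {intensity:.1f} FLOP/byte")   # ~1.0

peak_flops = 989e12           # assumed: H100 dense bf16 throughput, FLOP/s
peak_bw = 3.35e12             # assumed: H100 HBM3 bandwidth, bytes/s
ridge = peak_flops / peak_bw
print(f"hardware ridge point: {ridge:.0f} FLOP/byte")              # ~295

# intensity << ridge, so decode latency is set by memory bandwidth, not compute:
step_time = bytes_per_token / peak_bw
print(f"bandwidth-limited decode step: {step_time * 1e3:.1f} ms/token")  # ~41.8 ms
```

With ~1 FLOP/byte against a ridge point near ~300, decode leaves the compute units idle almost all the time; prefill, which processes many tokens per weight read, sits far closer to the ridge. This gap is what KV caching, batching, and speculative decoding each try to exploit.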