Speculative Decoding
LLM inference will constitute an increasingly large proportion of compute cost. Unfortunately, for autoregressive LLMs, it is slow. Speculative decoding is a clever technique for speeding it up, described in two concurrent papers by Leviathan et al. 2022 and Chen et al. 2023 (somewhat amusingly, from Google Research and DeepMind respectively). I'll explain the technique, its derivation, and newer variants in this post.
Autoregressive sampling is typically memory-bandwidth bound: tokens are sampled one at a time, and each step must stream the full model weights from memory to produce a single token, leaving most of the accelerator's compute idle. Speculative decoding exploits this slack by having a small draft model propose several tokens cheaply, then verifying them all with a single forward pass of the large target model.
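To make the accept/reject scheme concrete, here's a minimal numpy sketch of one speculative decoding round. `target_probs` and `draft_probs` are hypothetical stand-ins for real model forward passes, and a real implementation would score all drafted positions with one batched target-model call rather than a Python loop:

```python
import numpy as np

rng = np.random.default_rng(0)

def speculative_step(target_probs, draft_probs, prefix, k):
    """One round of speculative decoding.

    target_probs(prefix) -> next-token distribution from the large model
    draft_probs(prefix)  -> next-token distribution from the small draft model
    Returns the tokens produced this round (always at least one).
    """
    # 1. Draft model proposes k tokens autoregressively (cheap).
    drafted, q = [], []
    for _ in range(k):
        dist = draft_probs(prefix + drafted)
        tok = rng.choice(len(dist), p=dist)
        drafted.append(tok)
        q.append(dist)

    # 2. Target model scores all k+1 positions; in a real system this is
    #    a single forward pass over the drafted sequence.
    p = [target_probs(prefix + drafted[:i]) for i in range(k + 1)]

    # 3. Accept drafted token i with probability min(1, p_i(x) / q_i(x));
    #    this rejection scheme makes the output exactly match the target
    #    model's distribution.
    out = []
    for i, tok in enumerate(drafted):
        if rng.random() < min(1.0, p[i][tok] / q[i][tok]):
            out.append(tok)
        else:
            # Rejected: resample from the residual max(0, p - q), normalized.
            residual = np.maximum(p[i] - q[i], 0.0)
            residual /= residual.sum()
            out.append(rng.choice(len(residual), p=residual))
            return out  # stop after the first rejection

    # All k drafts accepted: the target's scores at position k give us one
    # bonus token for free.
    out.append(rng.choice(len(p[k]), p=p[k]))
    return out
```

The payoff: when the draft model agrees with the target, each expensive target-model pass yields up to k+1 tokens instead of one, while the rejection rule guarantees the samples are still distributed exactly as the target model alone would produce.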
Stuff I've been reading #1
Nintil: Massive Input vs Spaced Repetition
Is deep learning a "formal subject"? Deep models and their properties are complex and empirical, like biology; it's one of the few branches of computer science that is experimental. Most hot new papers will be forgotten. So for learning deep learning quickly, I'd lean towards massive input.
Of course, some fundamentals are worth memorizing, particularly on the engineering side. Deeply understanding Efficiently Scaling Transformer Inference was very difficult without a firm grasp of tensor parallelism, so studying the Megatron paper first helped a lot.