Tags | Damek Davis’ Website

All tags

adaptive-data-analysis

The ladder mechanism for ml competitions 2025-05-04
Adaptive data analysis via subsampling 2025-05-05

ai-alignment

What is constitutional AI? 2025-05-03

ai-for-math

DeepSeek-Prover-V2 overview 2025-05-12
Is AlphaEvolve problem B.1 hard? 2025-05-15

autograd

Using min cut to determine activation recomputation strategy 2025-06-11
Basic idea behind flash attention (V1) 2025-06-12

benchmarks

The ladder mechanism for ml competitions 2025-05-04

cuda

Basic facts about GPUs 2025-06-18

distributed

Modded-NanoGPT Walkthrough II: Muon Optimizer, Model Architecture, and Parallelism 2025-06-02

fine-tuning

All roads lead to likelihood: the value of reinforcement learning in fine-tuning 2025-05-07

finite fields

Linear layouts, triton, and linear algebra over F_2 2025-06-09

gpu

Linear layouts, triton, and linear algebra over F_2 2025-06-09
Basic idea behind flash attention (V1) 2025-06-12
Basic facts about GPUs 2025-06-18

half-baked

Weak baselines 2025-05-04

lean

DeepSeek-Prover-V2 overview 2025-05-12

linear algebra

Linear layouts, triton, and linear algebra over F_2 2025-06-09

min cut

Using min cut to determine activation recomputation strategy 2025-06-11

optimization

policy-gradient

Basic facts about policy gradients 2025-05-06
Getting the hang of policy gradients by reframing optimization as RL 2025-05-08

pytorch

reinforcement-learning

Basic facts about policy gradients 2025-05-06
All roads lead to likelihood: the value of reinforcement learning in fine-tuning 2025-05-07
Getting the hang of policy gradients by reframing optimization as RL 2025-05-08
DeepSeek-Prover-V2 overview 2025-05-12

transformers

Multi head, multi query, and grouped query attention 2025-05-03
What is kv cache? 2025-05-03
Modded-NanoGPT Walkthrough I: initial setup, compiler config, and custom FP8 operations 2025-05-13
Modded-NanoGPT Walkthrough II: Muon Optimizer, Model Architecture, and Parallelism 2025-06-02
Basic idea behind flash attention (V1) 2025-06-12

triton

Linear layouts, triton, and linear algebra over F_2 2025-06-09