Damek Davis
All tags
adaptive-data-analysis
ai-alignment
ai-for-math
benchmarks
fine-tuning
half-baked
lean
optimization
policy-gradient
pytorch
reinforcement-learning
transformers
adaptive-data-analysis
The ladder mechanism for ml competitions
2025-05-04
Adaptive data analysis via subsampling
2025-05-05
ai-alignment
What is constitutional AI?
2025-05-03
ai-for-math
DeepSeek-Prover-V2 overview
2025-05-12
benchmarks
The ladder mechanism for ml competitions
2025-05-04
fine-tuning
All roads lead to likelihood: the value of reinforcement learning in fine-tuning
2025-05-07
half-baked
Weak baselines
2025-05-04
lean
DeepSeek-Prover-V2 overview
2025-05-12
optimization
Getting the hang of policy gradients by reframing optimization as RL
2025-05-08
Modded-NanoGPT Walkthrough I: initial setup, compiler config, and custom FP8 operations
2025-05-13
policy-gradient
Basic facts about policy gradients
2025-05-06
Getting the hang of policy gradients by reframing optimization as RL
2025-05-08
pytorch
Modded-NanoGPT Walkthrough I: initial setup, compiler config, and custom FP8 operations
2025-05-13
reinforcement-learning
Basic facts about policy gradients
2025-05-06
All roads lead to likelihood: the value of reinforcement learning in fine-tuning
2025-05-07
Getting the hang of policy gradients by reframing optimization as RL
2025-05-08
DeepSeek-Prover-V2 overview
2025-05-12
transformers
Multi head, multi query, and grouped query attention
2025-05-03
What is kv cache?
2025-05-03
Modded-NanoGPT Walkthrough I: initial setup, compiler config, and custom FP8 operations
2025-05-13