Policy Optimization and RL Algorithms
Modern RL algorithms are a chain of fixes where each one solves the most painful problem of the last. This article explores that story, with some math involved.
An intuitive, example-driven introduction to KL divergence, explaining its connection to surprise, entropy, log-likelihood, forward vs. reverse KL behavior, and practical KL estimation in RLHF.
Fantasy, science fiction, romantic comedies, some roasts and recommendations along the way.
A guide to understanding and categorizing the many flavors of reinforcement learning algorithms, from value iteration to PPO.
Notes on reinforcement learning from Steve Brunton's Data-Driven Science and Engineering book, covering core RL concepts, mathematical formalism, and key ideas.
Pipeline parallelism is a technique for training large ML models by partitioning model layers across devices, enabling efficient distributed training within per-device memory constraints.
Monte Carlo Tree Search (MCTS) is an AI algorithm that makes decisions by strategically sampling possible futures. It builds search trees incrementally, balancing exploration of new paths with exploitation of promising ones, and uses random simulations to tackle problems too complex for exhaustive analysis.
Tensor Parallelism is a technique for training large ML models by splitting individual tensors across multiple devices, enabling efficient distributed training of models too large to fit on a single accelerator.
Data Parallelism is a technique for training large ML models by distributing data across multiple devices, enabling parallel processing while maintaining model consistency through gradient synchronization.
A deep dive into Mixture of Experts (MoE) models, exploring how they work, their benefits and challenges, and their role in modern language models like Mixtral.