πŸ—‚οΈ A Taxonomy of Reinforcement Learning Algorithms

A guide to understanding and categorizing the many flavors of reinforcement learning algorithms, from value iteration to PPO.

October 2024 · 6 min · 1191 words · Arushi Somani

πŸ“ Introduction to RL

Notes on reinforcement learning from Steve Brunton’s Data-Driven Science and Engineering book, covering core RL concepts, mathematical formalism, and key ideas.

October 2024 · 12 min · 2450 words · Arushi Somani

πŸͺˆ ML at Scale: Pipeline Parallelism

Pipeline parallelism is a technique for training large ML models by partitioning model layers across devices. This post shows how to do so efficiently, optimizing distributed training while managing memory constraints.

July 2024 · 12 min · 2379 words · Arushi Somani, Anton Zabreyko

πŸ”Ž Monte Carlo Tree Search

Monte Carlo Tree Search (MCTS) is an AI algorithm that makes decisions by strategically sampling possible futures. It builds search trees incrementally, balancing exploration of new paths with exploitation of promising ones, and uses random simulations to tackle problems too complex for exhaustive analysis.

July 2024 · 15 min · 3102 words · Arushi Somani

πŸͺ† ML at Scale: Tensor Parallelism

Tensor Parallelism is a technique for training large ML models by splitting individual tensors across multiple devices, enabling efficient distributed training of models too large to fit on a single accelerator.

March 2024 · 8 min · 1555 words · Arushi Somani, Anton Zabreyko

πŸ’½ ML at Scale: Data Parallelism

Data Parallelism is a technique for training large ML models by distributing data across multiple devices, enabling parallel processing while maintaining model consistency through gradient synchronization.

March 2024 · 8 min · 1566 words · Arushi Somani, Anton Zabreyko

πŸŽ›οΈ How do Mixture of Expert Models Work?

A deep dive into Mixture of Experts (MoE) models, exploring how they work, their benefits and challenges, and their role in modern language models like Mixtral.

February 2024 · 6 min · 1202 words · Arushi Somani

πŸ“¦ Archive: The Daily Ink Paper Breakdowns

The Daily Ink is a discontinued newsletter featuring bi-weekly breakdowns of research papers in the domain of machine learning and ML systems. This is an archive of all the articles.

July 2023 · 2 min · 314 words · Arushi Somani

🎲 Probability and Random Processes Cheat Sheet

A comprehensive overview of key concepts and theorems in Probability and Random Processes, covering essential topics such as conditional probability, Bayes’ theorem, independence, and counting principles, serving as a valuable reference for students in EECS126.

May 2021 · 10 min · 1984 words · Arushi Somani