The Daily Ink is a discontinued newsletter that featured bi-weekly breakdowns of research papers in machine learning and ML systems. This is an archive of all its articles.
- Pipeline Parallelism as a Band-Aid on Memory Limitations
- Large Models as Engines of Computation
- Distilling Models Makes Them Feasible to Use
- Quadratic Complexity Holds Back the Legendary Transformer (Part 2)
- Quadratic Complexity Holds Back the Legendary Transformer (Part 1)
- Composing Models Together Makes Them More Powerful
- Multi-Modal Models Are the Future
- Deep Learning Solves a 20-Year-Long Unsolved Problem in Science (Part 2)
- Deep Learning Solves a 20-Year-Long Unsolved Problem in Science (Part 1)
- Models Can Do Calculus Better than You
- Is a Group of Expert Models Better Than One Very Smart Model?
- Winning the AI Lottery by Buying A Lot of Tickets
- Using Information Retrieval for Code Generation
- Meta’s New Model Is Small and Mighty
- Models Can Control Robots Just Like Humans
- Anthropic Makes AI That Teaches Itself Ethics
- Models Can Magically Learn New Skills at Scale
- Discovering a Better Optimization Algorithm with Evolution
- Talking to Models Requires Special Prompts that Make Them Think Sequentially
- Teaching LLMs to Use Tools and Not Suck at Math
- English Is Just Math in Prettier Clothing
- The Secret to Good Writing Is Editing
- Solving Context Length Constraints by Distillation
- A Large Language Model for SCIENCE
- Optimal Parallelism in ML Training is Possible, says ALPA
- Google Makes a Language Model for Music
- Google’s LaMDA Model Is Too Convincing, and a Researcher is Fired
- Teaching Computers to Think in Abstractions
- The Secret Sauce Behind ChatGPT
- FlashAttention Challenges ML Researchers to Think About Systems-Level Improvements
- Make Models Smarter, Not Larger, with Data Pruning
- DeepMind Attempts to Make AI That Can Do Anything
- Training Compute-Optimal Large Language Models
- Gradient Descent: The Ultimate Optimizer
- Cramming: Training a Language Model on a Single GPU in One Day
- A Neural Corpus Indexer for Document Retrieval