This links section is inspired by those of my favourite bloggers, such as gwern, guzey, or nintil. It is a semi-up-to-date list of my most interesting reads of the last few months.
February 2025
October 2023
- Phi-1.5 Model: A Case of Comparing Apples to Oranges?
- Flash-Decoding for long-context inference
- RingAttention: https://arxiv.org/abs/2310.01889
- The urge to go full Tri Dao et al. and port that thing from JAX to a CUDA/Triton kernel…
- This would not only let RingAttention scale the sequence length with the number of devices used during training, but potentially also achieve a higher Model FLOPs Utilization than FlashAttention-2 by computing the full transformer block in a blockwise manner in one kernel (see the sketch after this list)
- You could fine-tune a CodeLLaMA 7B to a 4-million-token context window with just 32x A100s, enough to literally fit every code repository in the context…
- It’s time to be a definite techno-optimist
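For intuition, here is a minimal sketch in JAX of the blockwise-attention primitive that RingAttention builds on. This is not the authors' code: the function name, block size, and shapes are all illustrative, and it only demonstrates the online-softmax trick that avoids ever materializing the full attention matrix.

```python
# Minimal sketch of blockwise attention with an online softmax.
# Illustrative only; assumes seq_len is divisible by block_size.
import jax
import jax.numpy as jnp

def blockwise_attention(q, k, v, block_size=128):
    """q, k, v: [seq_len, head_dim]. Scans over key/value blocks so the
    full [seq_len, seq_len] score matrix is never materialized."""
    seq_len, dim = q.shape
    scale = dim ** -0.5
    k_blocks = k.reshape(-1, block_size, dim)
    v_blocks = v.reshape(-1, block_size, dim)

    def step(carry, kv):
        acc, row_max, row_sum = carry
        kb, vb = kv
        scores = (q @ kb.T) * scale                   # [seq_len, block_size]
        new_max = jnp.maximum(row_max, scores.max(-1))
        correction = jnp.exp(row_max - new_max)       # rescale old statistics
        p = jnp.exp(scores - new_max[:, None])
        acc = acc * correction[:, None] + p @ vb
        row_sum = row_sum * correction + p.sum(-1)
        return (acc, new_max, row_sum), None

    init = (jnp.zeros_like(q),
            jnp.full((seq_len,), -jnp.inf),
            jnp.zeros((seq_len,)))
    (acc, _, row_sum), _ = jax.lax.scan(step, init, (k_blocks, v_blocks))
    return acc / row_sum[:, None]

# Usage: matches jax.nn.softmax(q @ k.T * scale) @ v up to numerics.
q = jax.random.normal(jax.random.PRNGKey(0), (1024, 64))
out = blockwise_attention(q, q, q)
```

RingAttention then places each key/value block on its own device and rotates the blocks around a ring while overlapping communication with compute, which is why the maximum sequence length scales with the number of devices.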
June 2023
- Large Language Models can Simulate Everything
- Large Language Models as Tool Makers
- Blockwise Parallel Transformer for Long Context Large Models
May 2023
April 2023
- Scaffolded LLMs are not just cool toys but actually the substrate of a new type of general-purpose natural language computer
March 2023
- Is ChatGPT 175 Billion Parameters? Technical Analysis
- A step towards self-improving LLMs
- Alexey Guzey’s Lifehacks: https://guzey.com/lifehacks/