DeepSeek Open-Sources DeepGEMM: A 300-Line CUDA Library Redefining FP8 Matrix Computation

In a move that sent shockwaves through the AI developer community, DeepSeek today unveiled DeepGEMM, an open-source FP8 matrix multiplication library that combines simplicity and raw computational power. Released under the MIT License as part of its “Open Source Week” initiative, this 300-line CUDA gem has already sparked comparisons to “compiler sorcery” 🧙‍♂️ among GPU engineers….