ForeverYoung
Notes on machine learning, CUDA, Python, and beyond. Driving into the distance.
Recent Posts
-
Low-Dimensional Representations: Projection, MRL, and Sparse Representations
In large-scale retrieval systems, embedding cost comes from model inference, vector storage, memory bandwidth, and KNN search. This post compares projection, MRL, and CSR-style sparse representations.
-
A Deep Dive into Sparse Matrices in PyTorch 2.12: COO, CSR, CSC, BSR, and BSC
A look at COO, CSR, CSC, BSR, and BSC in PyTorch 2.12: how sparse matrices are stored, how multiplication is routed, and what one CPU/GPU benchmark says about storage and speed ratios.
-
Think Before You Embed
Production search systems rewrite queries with LLMs before embedding them. Two ICLR 2026 papers ask what happens when elaboration and embedding share a model and a gradient.
-
Data Visualization with Hand-Drawn/Sketchy Style
A survey of tools for creating hand-drawn/sketchy style data visualizations: rough.js, draw.io, matplotlib xkcd, chart.xkcd, and cutecharts.
-
Model Size vs. Inference Speed in Deep Learning
An examination of how FLOPs, parameter count, memory access volume, and memory footprint affect inference speed, with practical network design recommendations for different hardware platforms.