Trent Nelson

Articles

PyTorch and Python Free-Threading

Unlocking multi-threaded parallel inference on PyTorch models
PyTorch
Python
Free-Threading
No-GIL
LLM
GPT2
This post examines multi-threaded parallel inference on PyTorch models using the new No-GIL, free-threaded version of Python. Using a simple 124M-parameter GPT2 model that we train from scratch, we explore the new territory unlocked by free-threaded Python: parallel PyTorch model inference, where multiple threads, unimpeded by the Python GIL, attempt to generate text from a transformer-based model in parallel.
Feb 13, 2025
Trent Nelson

The STRING_TABLE Structure

Is Prefix Of String In Table?

A Journey Into SIMD String Processing
AVX2
SIMD
C
Assembly
MASM
This article details an approach for efficiently determining whether a given string is prefix-matched by any of a set of known strings. That is, is any of the known strings a prefix of the given string? A custom data structure is employed, with successive implementations benchmarked to find the fastest possible solution.
May 4, 2018
Trent Nelson
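
To make the question posed by the string-table article concrete, here is a minimal Python sketch of the naive baseline: scan the known strings and report the first one that is a prefix of the input. This is not the article's SIMD implementation or its custom data structure. The function name `first_matching_prefix`, the `-1` "no match" return value, and the sample strings are all assumptions chosen for illustration.

```python
from typing import Sequence


def first_matching_prefix(table: Sequence[str], candidate: str) -> int:
    """Return the index of the first string in `table` that is a prefix
    of `candidate`, or -1 if none is.

    Hypothetical helper: a linear scan, shown only to pin down the
    problem statement, not the optimized approach the article describes.
    """
    for index, known in enumerate(table):
        # str.startswith checks whether `candidate` begins with `known`.
        if candidate.startswith(known):
            return index
    return -1


# Illustrative data (assumed, not taken from the article).
table = ["foo", "bar", "bazaar"]
match_index = first_matching_prefix(table, "barcode")
miss_index = first_matching_prefix(table, "ba")
```

Note the direction of the test: `"bar"` is a prefix of `"barcode"`, so that lookup matches, while `"ba"` is shorter than every table entry and matches nothing. A linear scan like this costs one comparison per table entry, which is the kind of baseline a faster implementation would be measured against.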