Trent Nelson
Articles
PyTorch and Python Free-Threading
Unlocking multi-threaded parallel inference on PyTorch models
PyTorch
Python
Free-Threading
No-GIL
LLM
GPT2
This post examines multi-threaded parallel inference on PyTorch models using the new No-GIL, free-threaded version of Python. Using a simple 124M-parameter GPT2 model that we train from scratch, we explore the new territory unlocked by free-threaded Python: parallel PyTorch model inference, where multiple threads, unimpeded by the Python GIL, attempt to generate text from a transformer-based model in parallel.
Feb 13, 2025
Trent Nelson
Is Prefix Of String In Table?
A Journey Into SIMD String Processing
AVX2
SIMD
C
Assembly
MASM
This article details an approach for efficiently determining if a given string prefix-matches a set of known strings. That is, do any of the known strings represent the prefix of a given string? A custom data structure is employed with successive implementations benchmarked to find the fastest possible solution.
May 4, 2018
Trent Nelson
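To make the question posed in the summary above concrete, here is a minimal Python sketch of the naive baseline: a linear scan that asks whether any known string is a prefix of the input. The function name, the table contents, and the return convention are all assumptions for illustration; this is not the custom data structure or the SIMD implementation the article develops.

```python
from typing import Optional, Sequence


def find_prefix_in_table(table: Sequence[str], candidate: str) -> Optional[str]:
    """Return the first known string that is a prefix of `candidate`, else None.

    Naive linear scan for illustration only (hypothetical helper, not the
    article's implementation).
    """
    for known in table:
        # A known string matches only if `candidate` begins with all of it.
        if candidate.startswith(known):
            return known
    return None


# Hypothetical table of known strings.
KNOWN_STRINGS = ("alpha", "beta", "gam")

match = find_prefix_in_table(KNOWN_STRINGS, "gamma ray")
miss = find_prefix_in_table(KNOWN_STRINGS, "delta")
too_short = find_prefix_in_table(KNOWN_STRINGS, "alp")
```

Note the direction of the test: "gam" is a prefix of "gamma ray", so that lookup matches, whereas "alp" is shorter than "alpha" and therefore does not.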