Trent Nelson
Articles
DRAFT: From PyParallel to Python Free-Threading
Optimally Exploiting Multiple Cores with Python
Python
PyParallel
Free-threading
No-GIL
PyTorch
asyncio
This article looks at the new no-GIL, free-threading functionality introduced in Python 3.13, and at how it compares with PyParallel’s attempt at exploiting multiple CPU cores over ten years ago. Using a demo project named Parallelopedia, we demonstrate the benefits afforded by the new functionality, whereby large data structures (tens to hundreds of gigabytes) are loaded in main memory and then accessed in parallel from Python, a feat not previously possible with the contemporary
multiprocessing
approach. Additionally, we investigate multi-threaded, parallel inference (generation) of transformer-based Deep Neural Networks via PyTorch, explore the limits of what can and can’t currently be done, and outline a roadmap for future PyTorch support.
Trent Nelson
Is Prefix Of String In Table?
A Journey Into SIMD String Processing
AVX2
SIMD
C
Assembly
MASM
This article details an approach for efficiently determining if a given string prefix-matches a set of known strings. That is, do any of the known strings represent the prefix of a given string? A custom data structure is employed with successive implementations benchmarked to find the fastest possible solution.
May 4, 2018
Trent Nelson