We Hit 100% GPU Utilization–and Then Made It 3× Faster by Not Using It

www.daft.ai
Embedding Millions of Text Documents With Qwen3

“we found a way to make the same workload 3× faster, and it didn’t involve maxing out GPU utilization at all. That story’s for another post, but first, here’s the recipe that got us to near-100%.”
In case anyone is here specifically for that part of the post title.
Saved me a click, highly appreciated!