Update to my text-generation-webui docker image - now with ExLlama support!
Took me some time to figure this one out, and unfortunately it requires a significantly larger image (it needs so much more of NVIDIA's toolkit D: I couldn't figure out a way around it).
If people prefer a smaller image, I can start maintaining two: one with ExLlama and one without. For now, 1.0 is identical minus ExLlama support (and built from an older commit), so you can use that one until there's actual new functionality :)
Also, do note that the model needs to be quantized with GPTQ-for-LLaMa, not AutoGPTQ.
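
For anyone who hasn't run a GPU container before, a typical Compose file would look something like this. This is only a sketch: the image name, tag, port, and paths are placeholders, so check the repo for the real ones.

```yaml
# Hypothetical compose file -- image name/tag and paths are placeholders.
services:
  text-generation-webui:
    image: yourname/text-generation-webui:latest  # placeholder image/tag
    ports:
      - "7860:7860"           # default Gradio port
    volumes:
      - ./models:/app/models  # mount your GPTQ-for-LLaMa quantized models here
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```

You'll need the NVIDIA Container Toolkit installed on the host for the GPU reservation to work.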