How does the text-to-image-plugin work and how can I run it locally? It's amazing!!!!

Original post here

I run ComfyUI locally on my painfully slow RTX 3060 12GB, and Perchance's text-to-image-plugin absolutely blows it out of the water. The detail is INSANE. The intricate fractals made several people I asked, and even ChatGPT, think it was Midjourney. I have not managed anything remotely close to this in ComfyUI.

So, what exactly is Perchance doing here? I don't want "I think it's xyz model" answers. I would like to see you all try to replicate this image as closely as possible. The exact settings I used are:

Positive prompt: A demonic god made of a swirling substance stands before a ruined city
Negative prompt is empty
No style
CFG: 7.0
Seed: 666
Image size: 512x768
I do not know what model this is. I have seen people claim it is Chroma, but I highly doubt it, since Perchance has used Stable Diffusion 1.5(?) for a while (512x768 is a common portrait resolution for that model, even though it was trained at 512x512).
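To make replication attempts comparable, here is a minimal sketch of how I would express the settings above as a ComfyUI API-format workflow. Only the prompt, CFG, seed, and image size come from my Perchance settings. The checkpoint filename, step count, sampler, and scheduler are my own placeholder assumptions, since I don't know what Perchance uses, so swap in whatever you are testing. Also keep in mind that the same seed will probably not give the same image on a different backend.

```python
import json

# Settings taken from the Perchance generation described above.
POSITIVE_PROMPT = "A demonic god made of a swirling substance stands before a ruined city"
NEGATIVE_PROMPT = ""
CFG = 7.0
SEED = 666
WIDTH, HEIGHT = 512, 768

# ASSUMPTIONS: none of these are known Perchance settings; they are placeholders.
CHECKPOINT = "v1-5-pruned-emaonly.safetensors"  # hypothetical local checkpoint filename
STEPS = 20
SAMPLER = "euler"
SCHEDULER = "normal"


def build_workflow():
    """Return a basic text-to-image graph in ComfyUI's API (node id -> node) format."""
    return {
        "1": {"class_type": "CheckpointLoaderSimple",
              "inputs": {"ckpt_name": CHECKPOINT}},
        # CheckpointLoaderSimple outputs: 0 = model, 1 = CLIP, 2 = VAE
        "2": {"class_type": "CLIPTextEncode",
              "inputs": {"text": POSITIVE_PROMPT, "clip": ["1", 1]}},
        "3": {"class_type": "CLIPTextEncode",
              "inputs": {"text": NEGATIVE_PROMPT, "clip": ["1", 1]}},
        "4": {"class_type": "EmptyLatentImage",
              "inputs": {"width": WIDTH, "height": HEIGHT, "batch_size": 1}},
        "5": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0],
                         "positive": ["2", 0],
                         "negative": ["3", 0],
                         "latent_image": ["4", 0],
                         "seed": SEED,
                         "steps": STEPS,
                         "cfg": CFG,
                         "sampler_name": SAMPLER,
                         "scheduler": SCHEDULER,
                         "denoise": 1.0}},
        "6": {"class_type": "VAEDecode",
              "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
        "7": {"class_type": "SaveImage",
              "inputs": {"images": ["6", 0], "filename_prefix": "perchance_repro"}},
    }


if __name__ == "__main__":
    # Print the JSON body that would be POSTed to a running ComfyUI server's /prompt endpoint.
    print(json.dumps({"prompt": build_workflow()}, indent=2))
```

If you post a result, please include the values you changed (checkpoint, steps, sampler, scheduler) so the attempts can be compared.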

@perchance@lemmy.world I'd like to hear from you how I can achieve this locally. You're the one who manages it, after all :)