Visual artists draw from visual references, not words, as they imagine their work. So when language is in the driver’s seat of making art, it erects a barrier between the artist and the canvas.
I’ve gotten into AI-assisted art in the past month. I would agree that a pure text-to-image approach hands a lot of creative control over to the AI tool. Sometimes that results in happy accidents, and sometimes it leads to very generic-looking generations.
There is a wealth of tools and techniques at an artist’s fingertips (free or cheap) that help constrain the generation to a visual thinker’s sketch or apply a style to the image. Most AI platforms incorporate image-to-image, and serviceable artworks can be generated from a very rough sketch of a composition. Text-to-image can also be constrained with extensions like ControlNet (for Stable Diffusion, e.g. in the AUTOMATIC1111 web UI), where you supply a reference image or a depth map (a black and white image of soft shapes indicating depth) and the generated image is tied to it, so you get very predictable compositions.
For pure text-to-image, I see the writer’s point. However, that’s really only scratching the surface of what can be done, and in my opinion it doesn’t fairly support the claim that “AI art isn’t suited for visual thinkers”. Taking an AI output and tossing it into Photoshop (or Krita) as a foundation to work on is also a valid path; you could then run image-to-image on the reworked image and see what you get. To me, it’s more of a collaboration with the AI as a tool than a wish made to an all-powerful genie. If I have a strong visual idea in my head, I sketch it, or even photograph myself doing it, and use that as a base for the AI to work with.