www.danlemire.com: September 2022

Saturday, September 17, 2022

AI Art exploration continued

Text-to-image AI art generation is making it possible to create things that previously could only be realized through someone else's creative expression or sharing.

The above picture is AI art, generated by Stable Diffusion, an open-source machine learning model. This text-to-image model was developed by Stability AI with the goal of generating images from natural language prompts. The open-source version of the model is estimated to have cost $600,000 to train on some of the latest available GPUs (NVIDIA A100).

After Stable Diffusion generated this image, I used RealESRGAN, another open-source tool, to make the image much larger. The Stable Diffusion model was trained on images that are typically 512x512 pixels in size, so to get a larger but consistent image, you can upscale it. I first learned about this tool on a YouTube channel that I have been subscribed to for several years. It's really amazing how you can go back in time and watch old videos that were made before you discovered a channel, and find something really interesting.
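To give a rough sense of what upscaling buys you, here is a minimal Python sketch of the arithmetic. It does not call RealESRGAN at all; the `upscaled_size` helper is a hypothetical name, and the 2x and 4x factors are assumptions for illustration, since the factor used for the image above isn't recorded here.

```python
def upscaled_size(width, height, scale=4):
    """Return the (width, height) of an image after upscaling by an integer factor.

    Illustrative helper only: this is plain arithmetic, not the RealESRGAN API.
    """
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    return width * scale, height * scale


# Stable Diffusion's typical 512x512 output, as mentioned above.
base = (512, 512)

# The factors below are assumed for illustration.
for factor in (2, 4):
    w, h = upscaled_size(*base, scale=factor)
    print(f"{factor}x: {w}x{h} pixels ({w * h:,} total)")
```

The thing to notice is that pixel count grows with the square of the factor, so a 4x upscale has sixteen times as many pixels to fill in as the original.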

If you look closely at the image above, you'll start to see places where things don't look right, and this is the really hard problem to be solved here. The AI gets a great deal right, and even the problems you see here can be resolved with iterations. In fact, in the few weeks since Stable Diffusion was released, there have been an increasing number of tools (also open source) built to help with the workflow needed to successfully create new art that is not noticeably created by AI.

Oddly enough, the real controversy here is that people are already having a hard time distinguishing human-generated art from AI-generated art.

I'm sure in the next few weeks we will start to see additional capabilities in their nascent stages. I've already seen previews of AI generation for videos, and no doubt 3D objects are not far behind. I'm especially excited about the workflows that will enable the creation of highly detailed and interesting metaverse experiences without requiring hundreds of hours of development effort.