There are many valid ethical concerns about the use of these prompt-driven generative AI models: sustainability and ecological impact, inherent bigotry and bias, exploitative labor practices, violations of copyright and content-use consent…the list goes on. This post is not meant to enumerate or discuss those, although they would be excellent topics to bring up with students in class. Instead, it is aimed at educators whose institutions are encouraging or requiring them to incorporate the technology into their teaching.
The first of several image iterations generated by my friend’s PC setup
I thought I was done with my (underwhelming) exploration into an Elphaba costume design using generative AI, but my blog crossposts to Facebook and I got into a fascinating discussion with an illustrator/cartoonist friend about running generative AI on a home computer over which some control can be exerted.
He began tinkering with setting up his own PC to run the open-source machine-learning model Stable Diffusion, with the aim of eventually loading his own art onto it so it could iterate in his own style with his own characters (a goal he has yet to achieve). He had a gaming PC with a good video card and plenty of hard drive space, which he repurposed for this experiment.
He loaded Stable Diffusion onto it, along with ComfyUI, a free, open-source user interface that lets him exert some control over the path the AI takes from input to output. He began playing around with the first of the Elphaba images Copilot had created for me.
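For anyone curious what a "workflow" actually controls, ComfyUI is essentially a node graph over the same stages any Stable Diffusion pipeline runs: encode the text prompt, denoise a latent image over a number of steps, then decode the result into pixels. Here is a minimal sketch of that pipeline in code, using the Hugging Face diffusers library rather than ComfyUI itself; the model checkpoint, prompt, and settings are stand-ins I chose for illustration, not my friend's actual setup.

```python
# Minimal local Stable Diffusion inference with the diffusers library --
# a rough code analogue of what a ComfyUI workflow wires together as nodes
# (text encoder -> step-by-step denoiser -> VAE decode).
# The checkpoint and prompt below are illustrative assumptions only.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",   # example checkpoint; any local SD model works
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # runs on the gaming PC's video card

image = pipe(
    prompt="costume design sketch of a green-skinned witch in a black gown",
    num_inference_steps=30,   # more steps = more denoising passes
    guidance_scale=7.5,       # how strongly the image follows the prompt
).images[0]

image.save("iteration_01.png")
```

In ComfyUI, each of those calls is a draggable node you can rewire or swap out, which is where that control over the path from input to output comes in.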
A refresher on what Copilot generated for me
How ComfyUI looks when you’re using it
His second iteration (each of these took under 30 seconds to generate)
These were a lot closer to what I envisioned, and he decided to run with the design concept and see whether, over several more iterations, he could tweak it toward what I had hoped my second Copilot iteration would be.
He also spent a fair amount of time patiently explaining how he had set up his machine and answering my questions about how the software works, how the user interface works, and how much power it draws.
Three more iterations
Apparently, his machine draws no more power doing this than it does running a graphics-heavy video game.
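For a rough sense of scale (using assumed numbers, not measurements from his machine): a gaming-class GPU under full load draws on the order of 300 watts, and each of these iterations took under 30 seconds, which works out to only a couple of watt-hours per image.

```python
# Back-of-envelope estimate; both figures are assumptions, not measurements.
gpu_draw_watts = 300        # rough full-load draw for a gaming-class video card
seconds_per_image = 30      # roughly what each iteration took on his machine
energy_wh = gpu_draw_watts * seconds_per_image / 3600
print(f"~{energy_wh:.1f} Wh per image")  # about 2.5 Wh, i.e. a few minutes of gaming
```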
As I understand it, what makes generative AI tools like ChatGPT and Copilot use so much power is a combination of the fact that they run off conversational prompt inputs and that they "learn" from every iteration generated. The machine my friend has set up, with the hope of someday generating iterations of his own artistic style, doesn't "learn" from each generated iteration, and it runs on ComfyUI workflows rather than conversational prompts.
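To make that distinction concrete, here is a hedged sketch (again using diffusers as a stand-in for a local setup like his) of why generating images doesn't change the model: the weights are only read during inference, so you can fingerprint them before and after a generation and get the same result. Actually "learning" a new style would require a separate, deliberate fine-tuning step.

```python
# Sketch: local image generation reads the model weights but never updates them.
# The checkpoint and prompt are illustrative placeholders.
import hashlib
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def fingerprint(module: torch.nn.Module) -> str:
    """Hash every parameter tensor so any change to the weights would show up."""
    h = hashlib.sha256()
    for p in module.parameters():
        h.update(p.detach().cpu().numpy().tobytes())
    return h.hexdigest()

before = fingerprint(pipe.unet)
pipe("costume sketch of a witch in an emerald gown", num_inference_steps=30)
after = fingerprint(pipe.unet)

assert before == after  # weights are identical: nothing was "learned" by generating
```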
My friend said he had learned how to set this machine up through a combination of YouTube videos (here's an introduction to Stable Diffusion, for example) and subreddits devoted to DIY AI art. He stressed the questionable legality of some of the material he'd come across in various forums, and also that much of the pioneering work in this area has been done by people with a prurient interest in generating cartoon pornography. So if you're interested in exploring further, don't do so unaware of those caveats.
Having learned even the most minimal overview of how AI image generation works when a UI workflow is involved, I find the whole process more exciting than all the hype around tools like ChatGPT, which are framed as magical artificial intelligence models far beyond our puny human comprehension.
I have a knee-jerk negative response to that framing, whereas this feels intriguing and filled with potential. I probably won’t dive deeper into it, because I don’t have a professional need or a personal interest in generating AI art, but if I did, this would be the path I would follow.