J.S. Cruz

AI Art

I saw an image some time ago that I liked, so I went to search for the source and saw that it was probably AI generated.

I really liked the image and was disappointed when I found that a person didn’t draw it, which I suspect happens to many other people. This is strange though, isn’t it? Is there a difference between a computer-generated image (C-image) and a human-drawn image (H-image)?

(You should keep my previous essay in the back of your mind.)

Let’s not consider the question as to whether an artwork should stand by itself, i.e., what should it matter that an image was drawn by a person or generated by an algorithm; let’s only open one can of worms at a time. Let’s assume that, sure, there is a difference (otherwise we[1] would react the same to both types of images). What is it? Is it an image quality difference? I don’t think so: C-images are often of better technical quality than H-images. Is it a thematic difference, i.e., the subject of the image? I don’t think so either: there’s no reason to believe that a sufficiently detailed prompt can’t produce an image that reflects upon some theme previously thought to be reserved just for H-images. Is it a style difference? Again, I don’t think that’s it: you can find countless examples of C-images in any style (e.g., photo-realistic, “in the style of John Singer Sargent”, impressionist, etc.), and it’s not obvious to me that C-images can’t be generated in a unique style through careful prompting.

What are some more possible differences? How about connection to the real world? Can C-images provide some sort of social commentary, or be something like Picasso’s Guernica? Again, I think prompting can take care of this.

Computer-generated image of a close-up of a man in a suit and tie wearing a helmet. Computer-generated image of three deformed people, two of whom appear to be adults and one a child, looking at what appears to be a white-red-blue flag.
C-images from the prompt "Something that serves as a piece of commentary regarding the current situation in France.". The left image shows a man wearing a helmet with a suit and tie — perhaps a politician? The right image is a failed generation. Generated by DreamStudio.

How about capacity to represent abstract concepts, like “love” or “hate”? Simple prompting goes a long way, and I’d say the representation is clearer than in many H-images.

Computer-generated image of two people touching heads at a sunset. Computer-generated image of two people kissing on a pier. Behind them are two red streaks forming a heart. Computer-generated image of unidentified objects in front of clouds.
C-images from the prompt "The abstract concept of love.". The first two images nicely evoke love, the third not so much. Generated by DreamStudio.

You may argue about how well C-images respect/fulfill some criterion (in the last images of the above examples, it’s obvious they don’t fulfill any criteria), but I’d rebut that you can argue the exact same point for H-images and, if you’ve seen modern art, you are well prepared to do so.

Computer-generated image of a woman with a deformed face squatting in a small, open wooden structure. Computer-generated image of a woman looking down a railway track. Computer-generated black and white image of a deformed woman squatting on a pier looking at the sea.
C-images from the prompt "Homesickness, in whichever way you deem fittest to show.". Minus the face deformity, I can see the feeling of homesickness in the first image. The second image is generic enough to be either "melancholy" or "ambition". The third image doesn't evoke homesickness at all, I think. Generated by DreamStudio.

Is it just the use of technology in the creative process? Most art nowadays is digital (even excluding Photoshop), but even the art that isn’t uses technology. What is the difference between different brush sizes in Krita and different brush sizes in actual brushes? There’s even the concept of progress, with different paint materials and different brush hair types. People use technology to create H-images and are considered real artists who produce real art, so its presence in the process doesn’t seem to be a problem[2].

Given all of the above, the only real difference I can identify is the existence of a human artist.

Ok, so the artist must add something unique to an H-image, something that a generative model doesn’t add to a C-image. What is it?

I’m going to wager a guess that, given what we’ve seen above, that something isn’t anything in the image itself, whether lower-level decisions (line here, blur there, this colour over there) or higher-level decisions (what to paint, how to paint it), but rather something outside of it.

The thing that readily pops into my mind is “soul”: humans imbue H-images with soul, while computers don’t. Well, what is soul? I can try a definition: soul is the part of the artist that gets left behind in a work of art.

I admit this is a somewhat bad definition (“humans add soul to an image, where soul is the thing that is added to an image by humans”), but it nicely overlaps with the criticism that C-images are soulless: of course they are! There’s no human to leave anything behind.

By this definition, soul is an intrinsically human thing, i.e., it relates art to people in a way only accessible to humans. But is it really? What qualifies an artist to “leave” something behind when he creates an image, but doesn’t qualify a computer to do the same?

Is it the human technique? Time spent training? Accumulated or passed-down knowledge? For any human-applicable criterion I can think of, I can think of a computer-applicable analogue.

The only human-exclusive artist-artwork relation I can find is something that I’ll call “myth”: artistic expression is part of this millennia-old tradition that humans have, and adding one more artwork, one more image, one more song is contributing to an arc we’ve been collectively developing for thousands of years. Seeing an artwork that does not come from a human mind being added to this arc just “feels wrong”, feels like an “invasion”, since we properly consider as ours (or as human-defining) the things we’ve been doing on cave walls or in sand since forever[3].

So, C-images evoke a negative feeling in us because they don’t have soul, and they don’t have soul because they’re not part of the conversation we’ve been having throughout history. But is it really true that a C-image (or, generically, a computer-generated artwork) can’t add something to the human mythology/arc of history?

I’d say it isn’t: C-images are objects of a very human domain. Instead of adding to our development in aesthetics, they add to our development in mathematics. If H-images are expressions of our creative ability, developed over thousands of years, C-images are expressions of our analytical ability, equally developed over thousands of years, from the Pythagorean theorem to algebra to differentials to dot products to neural networks, with countless steps by countless minds in between.

Here, the historical conversation doesn’t happen through colours on canvases, but rather in scrolls, books, and articles.

If you’ll allow the flowery language, if an H-image is a triumph of human expression, isn’t a C-image a triumph of human knowledge?

So I’ll go along enjoying C-images disappointment-free. The image I really liked was a filigree theme with black and gold colours, both things which I adore and which are very hard to find H-images of!

[1] I say "we", but it might well be the case that this is just me and that other people don't have these concerns. In that case take this essay as therapeutic.

[2] What about degree? If you argue that degree, not presence, matters, you'll end up in the heap paradox, and you'll have to explain why a computer-aided drawn image is an H-image and why a computer-generated image is a C-image.

[3] I also think this is equally applicable to language generation by a computer: everyone's reacting very strongly to ChatGPT because we think of language as an exclusively human domain. I would say the difference here is that language is completely human-intrinsic and art creation has an extrinsic dimension, even if only in the mechanical sense: the direct object of language creation is another person, and you can do this unassisted; the direct object of artistic creation is a canvas (from cave walls to A4 paper), and you need some tool to do it (from charcoal to resin paint).

Tags: #ai #philosophy