Preliminary text / Text not final
Café in Frankfurt
When AI image generation1 came around, I wondered what images I should ask for. The idea made me smile to ask software that sits right in front of me inside my computer, and that has never sat in a café in Frankfurt, about exactly that part of my immediate surroundings. So I asked it for images of street cafés in Frankfurt. To my surprise, the generated images actually looked like photos of my neighborhood, even though none of these cafés existed; they simply looked the way cafés in Frankfurt look. Next, I wanted to have some fun and had it generate images of Batman drinking coffee in one of these freely "invented" cafés. Later, Batman met his friends Luke Skywalker and the former chancellor of Germany, Angela Merkel, for coffee. Looking back, the feeling reminds me of playing with toys and using my imagination.
Taking the Toy Apart – Randbevorzugung als Primärvorgang2
The following phase falls very much in line with the image of a kid playing with a toy: taking it apart, stretching it to its limits, misusing it, breaking it multiple times, and walking along the margins to figure out what is possible and where things fall apart.
Initially, I used a command-line tool to generate the images, so every image starts with a command that includes many parameters to play around with. This is what it looks like:
swift run StableDiffusionSample --step-count 15 --save-every 0 --seed 29 --image-count 1 --compute-units cpuAndGPU --disable-safety --resource-path ../models/coreml-stable-diffusion-v1-5_original_compiled --guidance-scale 7.5 "Instagram selfie of Batman drinking coffee with the city of Frankfurt am Main in the background" --negative-prompt "painting" --output-path ../images/image-015
For example, every image is generated with a certain number of steps. Typically, it is recommended to use between 20 and 50 steps. I was curious what happens when I go below that, and I also wanted to see what would happen visually if I didn't stop at 50 but kept going all the way to 512 steps. So I ran the command to create an image with 1 step, then 2 steps, then 3 steps, and so on up to 512 steps (a sweep like that can be scripted, as sketched below). It took my computer two days to create all the images, and here is the result:
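One way to run such a sweep is a small shell loop around the command above. The sketch below only varies --step-count and gives each run its own numbered output folder, following the image-015 naming from the example; treat it as an illustration rather than the exact script I used.

# Sweep the step count from 1 to 512, one run per value.
for steps in $(seq 1 512); do
  swift run StableDiffusionSample \
    --step-count "$steps" \
    --save-every 0 \
    --seed 29 \
    --image-count 1 \
    --compute-units cpuAndGPU \
    --disable-safety \
    --resource-path ../models/coreml-stable-diffusion-v1-5_original_compiled \
    --guidance-scale 7.5 \
    "Instagram selfie of Batman drinking coffee with the city of Frankfurt am Main in the background" \
    --negative-prompt "painting" \
    --output-path "../images/image-$(printf '%03d' "$steps")"
done

Keeping the seed and the prompt fixed means that the only thing changing between runs is the number of denoising steps, so the resulting images stay directly comparable.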
- Stable Diffusion
- Randbevorzugung als Primärvorgang, Imre Hermann, Budapest 1923