AI Art Generation Handbook/Promptcrafting

From Wikibooks, open books for an open world

What is promptcrafting?

Image: DALL·E, "Monkey coding the apps to rush for the project deadlines"

Promptcrafting, also known as promptcraft, is a combination of two words: prompt and craft.

Based on the definitions in Wiktionary:

Prompt: a sequence of characters that is displayed to indicate that a computer is ready to receive input.

Craft: to construct or develop something (like a skilled craftsman).

Promptcrafting is akin to a human user trying to describe to a robot (the AI image model) the general idea they have for the final image. Sometimes the AI image model gets the idea, and sometimes it does not.

What is a prompt?


In the context of AI art generation, a prompt is a set of instructions, given as text input, that the AI art generation model processes to generate images. Although current AI art generation models keep improving, the instructions should still be as descriptive as possible to get approximately what we want.

For successful text-to-image generation, a good prompt usually addresses the following points:

  1. What kind of art medium will it be? (The medium is usually divided into two categories: paintings and photography.)
  2. Which artist's style will you follow?
  3. Which framing techniques will you use?
  4. Which lighting techniques will you use? (This matters mostly for photographic media.)

To learn more about prompts, see the chapter Prompting in Stable Diffusion Style, which covers the basics of a prompt.

Note that the prompts discussed here were tested on a popular text-to-image generation model: Bing Image Creator.

[ART MEDIUM] + of [MAIN SUBJECT], [PERSPECTIVE], by [ARTIST], in the style of [STYLE], [MOOD], [OTHER DETAILS], [BOOSTERS]
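The template above can be filled in mechanically. The following is a minimal sketch, not part of any image generator's API: the function name `build_prompt`, its parameter names, and the sample values are all hypothetical and only mirror the bracketed slots of the template. Empty slots are simply skipped.

```python
# Hypothetical helper: the parameters mirror the bracketed slots of the template.
def build_prompt(art_medium, main_subject, perspective="", artist="",
                 style="", mood="", other_details="", boosters=""):
    # "[ART MEDIUM] of [MAIN SUBJECT]" is the only mandatory part.
    parts = [f"{art_medium} of {main_subject}"]
    if perspective:
        parts.append(perspective)
    if artist:
        parts.append(f"by {artist}")
    if style:
        parts.append(f"in the style of {style}")
    # The remaining slots are appended as plain comma-separated phrases.
    for extra in (mood, other_details, boosters):
        if extra:
            parts.append(extra)
    return ", ".join(parts)


# Sample values are illustrative assumptions, not recommendations.
example_prompt = build_prompt(
    "Oil painting",
    "a Javan rhinoceros wearing a business suit",
    perspective="close-up shot",
    mood="dramatic",
)
print(example_prompt)
```

The resulting string can be pasted into any text-to-image tool; the helper only assembles text and does not call a model.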

Word Ordering


As in normal English sentence structure, the "subject" should be at the front of the sentence so that the text encoder gives it higher priority during image generation. This gives the AI image model a higher chance of generating images that match your requirements.

In the first example, we want the rhinoceros to be part of the design of a dollar note, following the example of Indonesian currency.

The "subject" in this case is therefore the dollar note. In the left images, the rhinoceros is generated without being part of the design on a dollar note, because the "subject" is placed at the back of the sentence. In the right images, the "subject" comes first and the rhinoceros appears as part of the note's design.

Prompts in DALL-E 2:

Left: Javan rhinoceros wearing a business suit screaming aloud with hands on the cheek while seeing the stock price crash as design on dollar note

Right: Dollar note showing Javan rhinoceros wearing a business suit screaming aloud with hands on the cheek while seeing the stock price crash

Images: DALL·E generations for the left prompt and the right prompt.

In the second example, we want the rhinoceros to touch up the painting "Girl with Pearl Earring". In the left images, the word "rhinoceros" is at the front of the prompt, so the "rhinoceros concept" bleeds into the "Girl with Pearl Earring" painting. Placing the word "rhinoceros" at the back of the prompt instead makes the AI image model generate the scene as intended.

Prompts in DALL-E 3:

Left: Anthropomorphic rhinoceros wearing business suit touching up oil painting "Girl with Pearl Earring" with brushes

Right: Oil painting of "Girl with Pearl Earring" is being touched up by anthropomorphic rhinoceros wearing business suit with brushes
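The two DALL-E 2 prompts from the first example differ only in where the phrase "dollar note" sits. A minimal sketch of that reordering as plain string building follows. The function names `subject_first` and `subject_last` are hypothetical, and the code only assembles text. It does not call any image model or measure how a text encoder weights the words.

```python
# Hypothetical helpers that place the key "subject" at the front or the back.
def subject_first(subject, connector, rest):
    return f"{subject} {connector} {rest}"


def subject_last(subject, connector, rest):
    return f"{rest} {connector} {subject}"


scene = ("Javan rhinoceros wearing a business suit screaming aloud with "
         "hands on the cheek while seeing the stock price crash")

# Left prompt: "dollar note" ends up at the back of the sentence.
left_prompt = subject_last("dollar note", "as design on", scene)
# Right prompt: "dollar note" is the first thing the model reads.
right_prompt = subject_first("Dollar note", "showing", scene)

for label, text in (("left", left_prompt), ("right", right_prompt)):
    position = text.lower().find("dollar note")
    print(f"{label}: 'dollar note' starts at character {position}")
```

Comparing the character positions is only a rough check that the intended "subject" really leads the prompt before it is submitted.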

Modifier


Modifiers are, in a sense, the language of AI art generation models. They tweak the generated images toward a different aesthetic, according to what you are looking for.

Modifiers usually include the following:

(a) Art Medium

(b) Artist Style

(c) Lighting Techniques

(d) Framing Techniques

(e) Camera Types

One or more modifiers may be added to create a unique image generation, and the word ordering may change according to your needs.
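Adding modifiers can be sketched as appending phrases to a base prompt. In the sketch below, the function name `add_modifiers`, the dictionary keys, and the sample modifier phrases are illustrative assumptions, not a fixed vocabulary of any model. Empty entries are skipped, and the order of the dictionary decides the word ordering.

```python
# Hypothetical helper: appends the chosen modifiers to a base prompt.
def add_modifiers(base_prompt, modifiers):
    # Keep non-empty modifier phrases in the order they were given.
    chosen = [value.strip() for value in modifiers.values()
              if value and value.strip()]
    return ", ".join([base_prompt] + chosen)


# Sample values are assumptions chosen only to illustrate each modifier type.
modifiers = {
    "art_medium": "watercolor painting",
    "lighting": "golden hour lighting",
    "framing": "wide-angle shot",
    "camera": "",  # left empty, so it is skipped
}

modified_prompt = add_modifiers(
    "Javan rhinoceros wearing a business suit", modifiers
)
print(modified_prompt)
```

Reordering the dictionary entries is an easy way to experiment with which modifier comes first in the prompt.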
