AI Art Generation Handbook/Training/Dataset
Types of training
Before starting training, first consider the concept you want to use.
As far as the limited studies go, it seems that Dreambooth is able to perform four types of training:
(i) Introducing totally new concepts to the model
Although the current version of the Stable Diffusion model is able to generate a wide variety of images, there are quite a few things it cannot generate properly, such as eyeballs.
(ii) Adding a dataset to an existing concept under a separate "token"
This is the more usual route: the model already knows a concept such as "man" / "woman", but you add a dataset of photos of your own face under a new token so that the generated images more closely resemble you.
(iii) Finetuning an existing concept
A concept may already exist in the model but, due to limitations of CLIP or a limited image dataset, it may not be generated properly.
For example, the concept of a centaur (which is basically half man and half horse).
(iv) Forcing an existing concept to learn a different concept
For example, you may force the existing keyword "bank", which is strongly associated with the bank where you deposit your money, to refer instead to a river bank. This practice is strongly discouraged, as many other concepts are tied to that keyword and retraining it may cause "model collapse".
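In practice, these four cases mostly differ in how the "instance prompt" (describing what your training images show) and the "class prompt" (used for prior-preservation images) are chosen. The sketch below is purely illustrative: it assumes a DreamBooth-style trainer along the lines of the Hugging Face diffusers example script, and "sks" is simply an example of a rare placeholder token; your trainer's options and naming may differ.

# Illustrative instance/class prompt pairs for the four training intents above.
# "sks" is a placeholder rare token; all prompts here are examples, not fixed rules.
PROMPTS = {
    "new_concept":      {"instance": "a photo of sks eyeball",
                         "class": None},                       # (i) nothing similar exists yet
    "new_token":        {"instance": "a photo of sks person",
                         "class": "a photo of a person"},      # (ii) your face as a new token
    "finetune_concept": {"instance": "a photo of a centaur",
                         "class": "a photo of a centaur"},     # (iii) strengthen a weak concept
    "override_concept": {"instance": "a photo of a bank",      # (iv) river-bank images
                         "class": "a photo of a bank"},        # discouraged: overwrites the prior
}

for case, p in PROMPTS.items():
    print(f"{case:18s} instance={p['instance']!r} class={p['class']!r}")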
Source of Images
There is a wide range of free photos to choose from when you want to gather images to train a model.
Here is a list of free stock photo sites that you can use:
Wikimedia Commons (https://commons.wikimedia.org/wiki/Main_Page)*
Unsplash (https://unsplash.com/)
Pexels (https://www.pexels.com/)
Pixabay (https://pixabay.com/)
Flickr Creative Commons ( https://www.flickr.com/creativecommons/ )
FreeImages ( https://www.freeimages.com/ )
Public Domain Pictures (https://www.publicdomainpictures.net/)
Game Art for Glitch ( https://www.glitchthegame.com/public-domain-game-art/)
Josh Game Asset ( https://gameassets.joshmoody.org/)
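Once you have collected direct image URLs from these sites (and checked each image's licence), a small script can download them and fail early on broken files before you start curating. This is only a rough sketch: the URLs, folder name and file names are placeholders, and it assumes the `requests` and `Pillow` packages are installed.

import os
from io import BytesIO

import requests
from PIL import Image

# Placeholder list of direct image URLs gathered from the sites above.
urls = [
    "https://upload.wikimedia.org/wikipedia/commons/example_rhino_1.jpg",
    "https://images.unsplash.com/example_rhino_2.jpg",
]

os.makedirs("dataset", exist_ok=True)

for i, url in enumerate(urls):
    resp = requests.get(url, timeout=30)
    resp.raise_for_status()
    img = Image.open(BytesIO(resp.content)).convert("RGB")  # raises on broken/non-image files
    img.save(os.path.join("dataset", f"rhino_{i:03d}.jpg"), quality=95)
    print(f"saved rhino_{i:03d}.jpg ({img.width}x{img.height})")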
Quality of Images
You may have heard many Dreambooth tutorials mention that the dataset must be of good quality. The quality of the output images produced by the AI art generative model is directly related to the quality of the input images used to train it. If the input images are of low quality, contain noise or artifacts, or are poorly composed, the resulting output images will have similar issues.
Images should have the following attributes:
(a) Diverse but consistent
Below is an example of how to build a diverse dataset for training an object/style while making sure the subject (in this case, a rhino) is always the centre of the training images.
(Note: this is for reference only; your specific use case may differ from what is stated here.)
Framing:
- Closeup
- High angle shot
- Rear shot
- Low angle shot
Activity:
- Grazing grass
- Swimming in water
- Drinking
- Mating
- Sleeping
- Lying down
Lighting:
- Night time
- Under shade
- Silhouette
- Morning daylight
- Afternoon light
Type of medium:
- Black and white film stills
- Cave wall painting
- Stamp of a rhinoceros
- Wall art mural
- Fabric patterns
- Lantern
- Art statue
- Crayon drawings
Note: Whenever possible, do not include images in the training set if they have the following characteristics:
(i) More than one subject in the same picture (even if they are the same kind of subject)
(ii) Inconsistent distinctive features (if you train on a subject with one horn, ideally all dataset images should show one horn)
(b) Free of noise, compression artefacts and the like
(c) Sharp (not blurred) and of sufficient resolution
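A quick automated pass can help enforce points (b) and (c) before training. The sketch below is one possible approach, not a fixed rule: it flags images that are smaller than a chosen minimum resolution or whose variance of the Laplacian (a common blurriness heuristic) falls below an arbitrary threshold. It assumes OpenCV (`cv2`) is installed; the folder name and thresholds are placeholders you would tune for your own dataset.

import os

import cv2

DATASET_DIR = "dataset"      # placeholder folder of candidate training images
MIN_SIDE = 512               # reject images smaller than this on either side
BLUR_THRESHOLD = 100.0       # variance-of-Laplacian below this is "probably blurry"

for name in sorted(os.listdir(DATASET_DIR)):
    path = os.path.join(DATASET_DIR, name)
    img = cv2.imread(path)
    if img is None:
        print(f"REJECT {name}: not a readable image")
        continue
    h, w = img.shape[:2]
    if min(h, w) < MIN_SIDE:
        print(f"REJECT {name}: too small ({w}x{h})")
        continue
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    sharpness = cv2.Laplacian(gray, cv2.CV_64F).var()
    if sharpness < BLUR_THRESHOLD:
        print(f"REJECT {name}: likely blurred (sharpness {sharpness:.1f})")
    else:
        print(f"KEEP   {name} ({w}x{h}, sharpness {sharpness:.1f})")

Images flagged by such a script are only candidates for removal; it is still worth reviewing them by eye, since an intentionally soft-focus or night-time shot may be exactly the kind of diversity described above.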