Summary
- Use clear, conversational language and concrete nouns in AI prompts for better image results.
- Longer prompts may work well with specific AI tools, but always aim to be as clear as possible.
- Add descriptive context, consider composition, and be mindful of AI’s creative limitations.
Mastering clear, detailed prompts is the key to generating AI images with ChatGPT. But how do prompts work, and how can we make them better?
The Basics
AI tools in general work most effectively with clear, structured prompts. For AI images, you always want to specify the subject or main focus of the image, along with context and details, and provide some information on style and aesthetic (such as artistic style). However, there’s more to it, which we’ll explore in this guide, along with prompts you can modify yourself.
I’ll be using ChatGPT (DALL-E) to generate my AI images, but you can use other tools such as Midjourney or Stable Diffusion, to name a few. They are all different! You need to be a ChatGPT Plus subscriber to do any real tinkering with AI image generation. However, as a free user, you get two image generations a day. AI image generation takes up significant resources, which is why free usage is limited.
Use Natural, Conversational Language
Overall, it’s best to use conversational and plain language when crafting your prompts. The more specific and clear you are with the prompt, the better your results will be. Since ChatGPT is a chatbot, it was trained on how humans speak in conversation and can take context into account.
All prompt-based AI image generators understand natural language, but not all of them do it equally well, so your results may vary depending on the sophistication of the tool.
Is a Longer Prompt Always Better?
It’s important to note that some AI image generators work well with longer prompts (50+ words), while others work better with short 10-20 word prompts. It’s a good idea to test out different lengths to find out what works best for each tool, based on your unique requirements. I’ve found that ChatGPT works better with more direct and detailed prompts. The more specific and clear you are, the better.
Overly complex or verbose prompts can sometimes confuse the AI. It’s about balance: prompts that are detailed yet clear work most effectively.
It’s also best to avoid combining terms that pull in conflicting directions, as they may confuse the AI generator. For example, using “bright” and “night” in the same prompt can send mixed signals about the lighting you want.
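As a rough sketch of the two checks described above, the snippet below counts the words in a prompt and flags word pairs that might clash. The thresholds come from the ranges mentioned in this guide (10-20 and 50+ words). The function names and the list of conflicting pairs are hypothetical and only for illustration; no AI tool exposes or requires them.

```python
import re

# Word-count thresholds based on the ranges mentioned in this guide.
SHORT_MAX = 20
LONG_MIN = 50

# Hypothetical, hand-picked pairs of terms that may pull an image in
# opposite directions. Extend this list to suit your own prompts.
CONFLICTING_PAIRS = [("bright", "night"), ("minimalist", "maximalist")]


def words(prompt):
    """Return the lowercase words in a prompt, ignoring punctuation."""
    return re.findall(r"[a-z0-9'-]+", prompt.lower())


def length_bucket(prompt):
    """Classify a prompt as 'short', 'medium', or 'long' by word count."""
    count = len(words(prompt))
    if count <= SHORT_MAX:
        return "short"
    if count >= LONG_MIN:
        return "long"
    return "medium"


def conflicts(prompt):
    """Return the conflicting pairs whose two terms both appear in the prompt."""
    present = set(words(prompt))
    return [pair for pair in CONFLICTING_PAIRS
            if pair[0] in present and pair[1] in present]


if __name__ == "__main__":
    prompt = "A bright city street at night with neon signs."
    print(length_bucket(prompt), conflicts(prompt))
```

A flagged pair is not necessarily a mistake (a brightly lit night scene is a perfectly valid image); it is only a reminder to spell out what you mean.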
Short prompt example:
Generate an image of a playful gray cat in a park during daylight.
Here’s the generated image:
Long prompt example:
A playful dark gray cat with soft, fluffy fur frolics in a green park under the warm daylight. The cat chases a fluttering butterfly near a patch of wildflowers. Sunlight filters gently through the trees, casting dappled shadows on the ground. The mood is cheerful and lively, with a realistic, hyperreal, lifelike style, and long-shot view.
Here’s the generated image:
Stick To Concrete Nouns for the Main Subject
When it comes to writing about the main subject, stick to concrete nouns: living things, physical objects, or places. That covers not just things you can see, but anything that can be perceived by the five senses.
More abstract concepts like “happiness,” “enlightenment,” and “fear” do allow for more creative expression, but the output image will be less likely to match the vision you have in mind. So it’s best to combine these terms with more specific vocabulary.
Use Descriptive Language
The next step is to add context and details to not just your subject but also how you want the background to look. The easiest way to add context and details is to consider three key elements:
- What is happening or what the subject is doing.
- How it is happening—the manner or style.
- Where it is taking place—the setting or environment.
You can add adjectives to give your images more depth. It is best to add just a couple that match the visuals you are looking for, without overcrowding the prompt.
Use the examples provided for each category as a guide to help you create your prompt:
- Mood: Serene, energetic, somber, dreamy, vintage, suspenseful, cheerful, humorous, eerie.
- Lighting: Bright, muted, backlit, natural, golden hour, neon, intimate, moonlit, high-contrast, cool.
- Setting: Urban, natural, countryside, fantasy, historical, underwater, small town, professional.
- Personality: Commanding, reserved, adventurous, elusive, cheeky, sophisticated, rugged, nurturing.
- Color: Monochrome, muted tones, striking, earthy, fluorescent, metallic, cool tones, warm tones.
- Style: Realistic, abstract, minimalist, maximalist, cartoonish, vintage, contemporary, avant-garde.
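If you write many prompts, one way to keep them consistent is to assemble them from the elements above: what is happening, where, and a few descriptors from the categories. The following is a minimal sketch under that assumption. The `build_prompt` function and its parameter names are hypothetical, and the sentence templates are just one possible wording.

```python
def build_prompt(subject, action, setting,
                 lighting=None, color=None, mood=None, style=None):
    """Assemble a plain-language image prompt from its parts.

    subject, action, and setting form the opening sentence; each
    optional descriptor adds one short sentence if it is provided.
    """
    sentences = [f"{subject} {action} {setting}."]
    if lighting:
        sentences.append(f"The lighting is {lighting}.")
    if color:
        sentences.append(f"The colors are {color}.")
    if mood:
        sentences.append(f"The mood is {mood}.")
    if style:
        sentences.append(f"The style is {style}.")
    return " ".join(sentences)


if __name__ == "__main__":
    print(build_prompt(
        subject="A bicyclist",
        action="rides leisurely",
        setting="down a quaint small-town street",
        lighting="golden hour",
        mood="cheerful",
        style="realistic",
    ))
```

Leaving a descriptor out simply omits its sentence, which makes it easy to add one adjective at a time and see how the image changes.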
Simple prompt:
Generate an image of a street with a bicyclist riding down it.
Here’s the generated image:
Enhanced prompt with adjectives:
A quaint cobblestone street in a European village, lined with pastel-colored houses adorned with flower boxes. The early morning sunlight casts soft shadows, and a bicyclist rides leisurely down the street. A café with outdoor seating sits at the corner, inviting passersby. The mood is cheerful, inviting, and nostalgically warm.
Here’s the generated image:
Don’t Forget About Composition
Framing is a key part of making AI-generated images visually striking because it shapes how the elements are arranged and how the subject comes across to the viewer. For example, you can use “close up,” “medium shot,” “wide shot,” or “point-of-view,” to specify angle and distance. By providing framing details, the image is more likely to match your vision.
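To compare how framing changes the result, you could generate one prompt per framing term and try each in turn. The sketch below does that for the four terms mentioned above. The `framing_variants` helper and the "Framing: description" layout are assumptions made for illustration, not a format any particular tool requires.

```python
# The framing terms mentioned in this guide.
FRAMINGS = ("close up", "medium shot", "wide shot", "point-of-view")


def framing_variants(description, framings=FRAMINGS):
    """Return one prompt per framing term, each prefixed with that term."""
    return [f"{framing.capitalize()}: {description}" for framing in framings]


if __name__ == "__main__":
    for prompt in framing_variants("a regal lion resting atop a sunlit rock."):
        print(prompt)
```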
Simple prompt:
“Generate an image of a lion in the wild.”
Here’s the generated image:
Enhanced prompt:
“Image of a wide shot of a regal lion resting atop a sunlit rock in the wild, surrounded by golden savanna grass. The scene captures the soft glow of sunset, with the sky painted in hues of orange and pink. In the background, acacia trees dot the horizon, and a gentle breeze stirs the tall grass. The mood is peaceful yet powerful.”
Here’s the generated image:
Understanding Creative Limitations
Just as AI text generation has its limits, AI image generation and creativity have limits as well. AI lacks the human perspective as well as personal real-life experience and emotion. Since it is trained on data (mostly online data) and training rules, AI is limited by the quality, human depth, and diversity of that data.
In my opinion, AI has no actual originality, unlike humans, who are able to express themselves in out-of-the-box ways. But it can be a good tool that works alongside human creativity.
AI image generation isn’t magic, but with thoughtful, precise prompts, it feels close. Dive in, experiment, and let these tools fuel your creativity—not replace it.