App Design
iOS

Prompt Design for AI Creativity

Duration: June 2023
Role: UI/UX Designer
Responsibility: Prompt Design
Company: PhotoGrid

Overview

We leveraged pre-installed models to craft prompts, creating various styles for users to apply. This approach allows users to experiment with diverse aesthetic expressions, enriching their experience with the technology.

01 Objective

This project aims to enhance user creativity in collage creation through the integration of AI technology, while simultaneously accumulating the team's experience in AI feature development.

02 Role & Deliverables

I contributed to the prompt and model parameter settings.

03 Challenge

Learning the interface and basic operations of Stable Diffusion, and understanding the principles of prompt engineering.

04 Outcome & Impact

The conversion rate for AI Creativity is 0.16%. The most popular styles among users (by download count) are Butterfly Fairy, Chibi Comic Characters, Barbie, and Goddess. The download data matches users' established preferences for materials such as butterfly stickers and brushes.

Background

This feature, located in the second layer of Grid and the toolbar of Edit, operates in img2img mode. It uses Stable Diffusion and LoRA models together with prompts to generate a variety of styles. Users can use their own photos or selfies to generate alternate versions of themselves in a parallel universe.

Located in the toolbar of Edit and the second layer of Grid

My responsibility was to fine-tune the img2img prompts for the pre-installed open-source model from Civitai, so that the generated images retain the original features of the photos users upload. I also tuned the process to limit the influence of the model's own training data and avoid generating NSFW images.

Stable Diffusion web UI screenshot

For basic settings, I typically set Steps to 30 for richer imagery. The CFG Scale, which determines how strongly the prompt influences the result, is set between 7 and 20. The Denoising Strength determines how far the output departs from the original image: lower values retain more of the original photo, while values close to 1 let the model largely replace it. I usually set it between 0.5 and 1.
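The three settings above can be written down as a small configuration sketch. This is an illustrative helper, not the tooling used in the project: the class name `Img2ImgSettings` and its range checks are assumptions that simply mirror the ranges quoted in the text. The `effective_steps` estimate assumes the common img2img behavior in which only about steps × denoising strength of the schedule is actually run on the source photo; some UIs can be configured differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Img2ImgSettings:
    """Illustrative container for the three settings discussed above."""

    steps: int = 30                   # sampling steps
    cfg_scale: float = 7.0            # how strongly the prompt steers the result
    denoising_strength: float = 0.5   # low keeps the photo, 1 largely replaces it

    def __post_init__(self) -> None:
        if self.steps < 1:
            raise ValueError("steps must be at least 1")
        # These ranges mirror the ones quoted in the text; they are not tool limits.
        if not 7 <= self.cfg_scale <= 20:
            raise ValueError("cfg_scale expected between 7 and 20")
        if not 0.5 <= self.denoising_strength <= 1.0:
            raise ValueError("denoising_strength expected between 0.5 and 1")

    def effective_steps(self) -> int:
        """Rough count of steps applied to the source photo (an assumption)."""
        return min(int(self.steps * self.denoising_strength), self.steps)


settings = Img2ImgSettings(steps=30, cfg_scale=7, denoising_strength=0.5)
print(settings.effective_steps())
```

Under that assumption, a lower Denoising Strength means fewer steps are spent reworking the photo, which is one way to see why more of the original survives.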

Prompt Writing

Initially, when practicing, I tended to use natural language, because my first exposure to generative AI art tools was Midjourney. I had already become accustomed to the writing style of Midjourney prompts, so my early attempts all followed Midjourney's playbook.

According to its playbook, the recommended usage is:

✍️ [Artist] [medium] of a [subject] in the style of [artist/style], [artist/style] and [artist/style]

Another method I've experimented with is: Subject + Description + Style/Aesthetic, as suggested in the article 'Master the Art of Writing Image Generation Prompts for AI in 5 Min'. There are also Prompt Templates available for reference.
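Both patterns above are plain string templates, so they can be sketched as two small helpers. The function names and the example values below are hypothetical placeholders chosen for illustration; they are not prompts used in the project.

```python
def midjourney_style_prompt(artist: str, medium: str, subject: str, styles: list[str]) -> str:
    """Fill '[Artist] [medium] of a [subject] in the style of ...'."""
    if not styles:
        raise ValueError("at least one style is required")
    # Join styles as 'a', 'a and b', or 'a, b and c', following the template.
    if len(styles) == 1:
        style_text = styles[0]
    else:
        style_text = ", ".join(styles[:-1]) + " and " + styles[-1]
    return f"{artist} {medium} of a {subject} in the style of {style_text}"


def subject_description_style(subject: str, description: str, style: str) -> str:
    """Compose the Subject + Description + Style/Aesthetic pattern."""
    parts = (subject, description, style)
    return ", ".join(part.strip() for part in parts if part.strip())


# Placeholder values, for illustration only.
print(midjourney_style_prompt(
    "Art Nouveau", "poster", "butterfly fairy",
    ["watercolor", "stained glass", "pastel tones"],
))
print(subject_description_style("butterfly fairy", "glowing wings in a forest", "fantasy illustration"))
```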

Image Resource: Medium

Take the Christmas AI filter 'Xmas Feeling' as an example. This filter uses the ProFantasy model, which has a more fantastical feel. To enhance the Christmas atmosphere, a LoRA is added to the prompt to fine-tune the generation results.

Generated works by ProFantasy model / Resources: Me & Civitai

Prompt Structure

This is how I organize prompts: in a structured, logical order.

✍️ Full Prompt
<lora:Christmas girl-1 (3):0.7>,Christmas professional photography, Christmas angel, snow, snowflakes, best quality, masterpiece, illustration, realistic, photo-realistic, amazing, finely detail, incredibly absurdres, huge filesize, ultra-detailed, highres, extremely detailed CG unity 8k wallpaper, ray tracing, close-up, upper body, A beautiful young girl resembling a popular actress, celebrating Christmas alone in a cozy living room filled with Christmas decorations. She is wearing Christmas outfit, a very long red velvet skirt with white fur trim, matching red velvet gloves and hat. Her makeup is perfectly done with bold red lipstick, smoky eyeshadow, rosy blush and long false eyelashes. She looks elegant next to the Christmas tree. There is a magical indoor snowfall atmosphere inside the room. The room is illuminated by the lights of the Christmas tree and candles creating a cozy Christmas mood, a few bokeh
  1. Model Identifier and Modifiers: <lora:Christmas girl-1 (3):0.7> - Using Christmas-themed LoRA to add a festive atmosphere.
  2. General Theme and Setting: "Christmas professional photography, Christmas angel, snow, snowflakes"
  3. Quality and Style Descriptors: "best quality, masterpiece, illustration, realistic, photo-realistic, amazing, finely detail" - These words instruct the AI to focus on high-quality, realistic, and detailed rendering.
  4. Technical Specifications: "incredibly absurdres, huge filesize, ultra-detailed, highres, extremely detailed CG unity 8k wallpaper, ray tracing" - This part specifies the desired resolution, level of detail, and graphic rendering techniques (like ray tracing).
  5. Framing and Perspective: "close-up, upper body" - This indicates the desired framing of the subject in the image, focusing on a close-up of the upper body.
  6. Subject Description: "A beautiful young girl resembling a popular actress, celebrating Christmas alone in a cozy living room filled with Christmas decorations"
  7. Apparel and Appearance: "wearing Christmas outfit, a very long red velvet skirt with white fur trim, matching red velvet gloves and hat" - This part details the subject’s attire, adding to the Christmas theme.
  8. Makeup and Styling: "Her makeup is perfectly done with bold red lipstick, smoky eyeshadow, rosy blush and long false eyelashes"
  9. Atmosphere and Lighting: "magical indoor snowfall atmosphere, room illuminated by the lights of the Christmas tree and candles" - It sets the mood and lighting of the scene, emphasizing a magical and cozy Christmas atmosphere.
  10. Specific Scene Elements: "next to the Christmas tree", "a few bokeh" - These are specific details about the scene's composition, like the proximity to the Christmas tree and the inclusion of bokeh effects.
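The ordering in the list above can be sketched as a small prompt builder. This is a minimal illustration under assumptions: the section keys and the helper names `lora_tag` and `build_prompt` are hypothetical, and the example uses only a shortened subset of the full prompt. The `<lora:name:weight>` form is copied from the prompt shown above.

```python
def lora_tag(name: str, weight: float) -> str:
    """Format a LoRA reference in the <lora:name:weight> form used above."""
    return f"<lora:{name}:{weight:g}>"


def build_prompt(sections: dict[str, list[str]], order: list[str]) -> str:
    """Join the non-empty terms of each section in a fixed order."""
    parts: list[str] = []
    for key in order:
        parts.extend(term.strip() for term in sections.get(key, []) if term.strip())
    return ", ".join(parts)


# Hypothetical section keys; terms are a shortened subset of the full prompt.
ORDER = ["lora", "theme", "quality", "framing", "subject"]
sections = {
    "lora": [lora_tag("Christmas girl-1 (3)", 0.7)],
    "theme": ["Christmas professional photography", "Christmas angel", "snow"],
    "quality": ["best quality", "masterpiece"],
    "framing": ["close-up", "upper body"],
    "subject": ["A beautiful young girl celebrating Christmas in a cozy living room"],
}
print(build_prompt(sections, ORDER))
```

Keeping each section in its own list makes it easy to swap the theme or the LoRA for another style while leaving the quality and framing terms unchanged.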
Negative Prompt Structure

I suspect that the training data of some models contains NSFW content, which makes it necessary to adjust prompt weights and raise the CFG Scale. After trying various methods to prevent nudity from being generated and collecting numerous negative prompts from the internet, I settled on the combination that worked best.

🚫 Negative Prompt
(worst quality:2), (low quality:2), (normal quality:2), lowres, ((monochrome)), ((grayscale)), skin spots, acnes, age spots, extra fingers, fewer fingers, strange fingers, bad hand, ((((bad anatomy)))), bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, jpeg artifacts, signature, watermark, username, sunburn, ((simple background)), hermaphrodite, long neck, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, bad proportions, malformed limbs, extra limbs, cloned face, disfigured, gross proportions, (((missing arms))), (((missing legs))), (((extra arms))), (((extra legs))), plump, bad legs, error legs, bad feet, kid face,((misaligned nails)),((misaligned fingers)), Asian-Less-Neg

Here's the main structure. Because this is a negative prompt, every term names something the model should steer away from, not something to generate:

  1. Quality Indicators: "(worst quality:2), (low quality:2), (normal quality:2), lowres" - These terms tell the AI to avoid low-quality, low-resolution output; the weight of 2 strengthens the avoidance.
  2. Color and Tone Settings: "((monochrome)), ((grayscale))" - Discourages monochrome or grayscale output so that the image stays in color.
  3. Physical Imperfections: "skin spots, acnes, age spots" - Suppresses skin imperfections.
  4. Anatomical Irregularities: "extra fingers, fewer fingers, strange fingers, bad hand, ((((bad anatomy)))), bad hands, missing fingers, extra digit, fewer digits, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, bad proportions, malformed limbs, extra limbs, disfigured, gross proportions, (((missing arms))), (((missing legs))), (((extra arms))), (((extra legs))), bad legs, error legs, bad feet, ((misaligned nails)), ((misaligned fingers))" - A detailed list of anatomical errors and mutations to avoid, with extra emphasis on the terms in nested parentheses.
  5. Image Errors and Artifacts: "cropped, jpeg artifacts, signature, watermark, username" - Excludes common image flaws and unwanted additions like watermarks and signatures.
  6. Background and Composition: "((simple background))" - Discourages plain, empty backgrounds.
  7. Unusual Physical Traits: "hermaphrodite, long neck, kid face" - Excludes specific unwanted physical traits.
  8. Negative Embedding: "Asian-Less-Neg" - This appears to be the name of a negative embedding (textual inversion) shared on Civitai, not a plain-text instruction. Placed in the negative prompt, it is intended to offset a model's bias toward a single facial type.
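The parentheses in the negative prompt are emphasis markers. The sketch below estimates the weight of a single term, assuming the attention syntax of the AUTOMATIC1111 Stable Diffusion web UI, where each pair of parentheses multiplies the weight by 1.1 and `(term:2)` sets it explicitly. The function name `effective_weight` is hypothetical, and the parser handles only single terms, not the web UI's full grammar.

```python
import re

# Matches an explicit weight such as "(worst quality:2)".
_EXPLICIT = re.compile(r"^\((?P<text>.+):(?P<weight>\d+(?:\.\d+)?)\)$")


def effective_weight(token: str) -> tuple[str, float]:
    """Return (term, weight) for one prompt term under the assumed syntax."""
    token = token.strip()
    match = _EXPLICIT.match(token)
    if match:
        return match.group("text").strip(), float(match.group("weight"))
    depth = 0
    # Each enclosing pair of parentheses multiplies the weight by 1.1.
    while token.startswith("(") and token.endswith(")"):
        token = token[1:-1].strip()
        depth += 1
    return token, round(1.1 ** depth, 4)


for term in ["(worst quality:2)", "((monochrome))", "((((bad anatomy))))", "lowres"]:
    print(effective_weight(term))
```

Read this way, "((((bad anatomy))))" carries a weight of roughly 1.46, while "(worst quality:2)" is pushed away twice as strongly as an unweighted term such as "lowres".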
Generation Results

Below are the effects generated from different photos.

All styles that I generated:

The Butterfly Fairy style that I created is the most popular among users.

Prompt Records

Refer to the Airtable document to view various styles and the generation results.

Outcome

The conversion rate for AI Creativity is 0.16%. The most popular styles among users (by download count) are Butterfly Fairy, Chibi Comic Characters, Barbie, and Goddess. The download data matches users' established preferences for materials such as butterfly stickers and brushes. I suggest continuing to track user preferences, creating variants with similar elements, and testing different styles and festive atmospheres in prompts to find the next big hit. I also recommend monitoring the download volumes of popular open-source models.

The Best-selling AI Filter
Reflection

In this project, I was responsible for writing and tuning prompts. I found that Stable Diffusion offers much more flexibility than Midjourney, since different LoRA models produce different generation effects. Generating the desired images with prompts that are close to natural language was quite fulfilling.

Outro: Operating an AI art account on Xiaohongshu, 'Little Red Book' (China's Instagram)

To be honest, I never thought I would open a Xiaohongshu account (download the app to view all notes) to showcase my prompts and the generation results, but I heard from friends that gaining traffic there is easy. My company also has its own Xiaohongshu account, and with generative AI being popular in the first half of 2023, I decided to post my daily prompts on Xiaohongshu as a trial. I started out by simply following social media trends, beginning with astrology, then moving to anime characters, and then creating images I wanted to make, like K-pop idols.

Beginning with the topic of astrology, then moving to anime characters, and then K-pop idols.

Later, I began to share tutorials and prompts, though the increase in followers was modest. It wasn't until the Disney Princess series, featuring familiar princesses like Elsa, Belle, Ariel, Aurora, Cinderella, Jasmine, and Rapunzel, that the account really gained traction.

I began to share tutorials and prompts, though the increase in followers was modest.
It wasn't until the Disney Princess series that the account really gained traction.

Within three months, my account reached 2,000 followers. I used analytics to explore fan preferences and found myself caught up in the rush of generating content for traffic. The interaction and love from the fans were addictive. However, after being diagnosed with COVID-19 and taking a three-week break in mid-2023, I took some time to seriously contemplate the meaning of life. I realized that chasing traffic was just a temporary dopamine rush and didn't bring me real happiness, so I paused updates.

Still, I'm happy to receive messages and comments from fans who like the images I generated. Frankly speaking, I do not consider myself an artist; I just wanted to share interesting prompts and beautiful images through AI's eyes without owning or profiting from them.

Looking back, I was able to produce 5-6 notes a week, and I could see both Midjourney's progress and the development of my own prompt writing structure. Exchanging ideas on prompt writing and community management with Chinese designers was an intriguing experience. I'll continue to explore text-to-image and the rest of the generative AI world.