From Ghibli dreams to AI magic: OpenAI’s image generator stuns the internet; but free users will have to wait

Since its launch, social media has been flooded with AI-generated images, including a wave of Studio Ghibli-inspired visuals.

Ever since OpenAI rolled out its image generator, users have put it to use in countless ways, so much so that CEO Sam Altman admitted in a post on Wednesday that the tool’s popularity had exceeded his expectations, delaying its rollout to the free tier.

What sets the ChatGPT image generator apart?

On Tuesday, OpenAI introduced image generation to ChatGPT, enabling users to create visuals directly within the app using its multimodal model, GPT-4o.

Since its launch, social media has been flooded with AI-generated images, including a wave of Studio Ghibli-inspired visuals. Users have dubbed it a "Ghibli fest," referencing the renowned animation studio behind Academy Award-winning films like Spirited Away and The Boy and the Heron.

“At OpenAI, we have long believed that image generation should be a core capability of our language models,” the company stated in a blog post on Tuesday.

What makes GPT-4o’s image generation stand out—beyond being free to users (though, as Altman noted, with some delays)—is its enhanced accuracy and control. The model excels at rendering text correctly, precisely following prompts, and leveraging its built-in knowledge and chat context.

Users can transform uploaded images or use them as visual inspiration, making it easier to create exactly what they envision. Prompts can include images, text, or nearly anything else one would use to generate a text response.

While other AI models struggle to manage around 5–8 distinct objects in an image, OpenAI claims GPT-4o can handle up to 10–20, offering better control over object placement, traits, and relationships.

For instance, if you ask an AI model to generate an image of a cat sitting on a red couch next to a woman reading a book, with a lamp, a window, and a potted plant in the background, older models might misplace or distort some elements if the number of requested objects exceeds their capabilities.

GPT-4o, according to OpenAI, handles 10–20 objects more effectively and keeps the relationships between multiple elements intact, so that, for example, the book stays in the woman’s hands rather than floating in mid-air.

This tighter control over how objects relate to one another makes AI-generated visuals not only more detailed but also more practical and reliable as a creative tool.

OpenAI says it trained its models using a vast collection of online images paired with corresponding text, allowing them to learn not just how images connect with language but also how different visual elements relate to one another.

With extensive post-training refinements, the model has achieved a high level of visual fluency, enabling it to generate images that are not only aesthetically appealing but also contextually accurate and practical for real-world use.

In essence, OpenAI is positioning GPT-4o’s improved image accuracy as its key differentiator.

Additionally, in a move to enhance transparency, OpenAI stated that "all generated images come with C2PA metadata, which will identify an image as originating from GPT‑4o."

For now, OpenAI appears to be riding yet another wave of popularity with its image-generation tool, making generative AI more accessible than ever. However, its long-term practicality—especially for free users—remains to be seen.
