OpenAI has enhanced ChatGPT’s image editing and generation capabilities, enabling users to refine visuals through conversational prompts and create detailed graphics with legible text. This update broadens the chatbot’s potential for both business and personal applications.
During a recent livestream, OpenAI demonstrated how users can ask ChatGPT to modify images, for example by altering backgrounds or adding elements.
However, OpenAI acknowledges that image generation may produce inaccuracies, including fabricated text, particularly when prompts lack detail. The system also struggles with small text and non-Latin alphabets, and rendering a complex image can take up to a minute because of the added detail.
These features are now available via the GPT-4o model for both free and paid users, with a phased rollout for API developers planned over the coming weeks.
ChatGPT’s improved text rendering within images facilitates the creation of professional-grade diagrams, infographics, and logos. Users can now generate photorealistic menus or maps, and provide more intricate composition instructions.
OpenAI is positioning ChatGPT as a versatile platform that integrates search, voice assistance, and video generation. The latest image enhancements aim to solidify ChatGPT’s utility across various domains and maintain its competitive edge against rivals such as xAI that have also incorporated image generation.