Dr Marcel Scharth
Earlier this week OpenAI launched GPT-4o ("o" for "omni"), a new version of the artificial intelligence (AI) system powering the popular ChatGPT chatbot. GPT-4o is promoted as a step towards more natural engagement with AI. According to OpenAI, it can have voice conversations with users in near real-time, exhibiting human-like personality and behaviour.
This emphasis on personality is likely to be deliberate. In OpenAI's demos, GPT-4o sounds friendly, empathetic and engaging. It tells "spontaneous" jokes, giggles, flirts and even sings. The AI system also shows it can respond to users' body language and emotional tone.
Launched with a streamlined interface, OpenAI鈥檚 new version of the ChatGPT chatbot appears designed to increase user engagement and facilitate the creation of new apps based on its text, image and audio capabilities.
GPT-4o is another leap forward for AI development. However, the focus on engagement and personality raises important questions about whether it will truly serve users' interests, and about the ethical implications of creating AI that can simulate human emotions and behaviours.
OpenAI envisions GPT-4o as a more enjoyable and engaging conversational AI. In principle, this could make interactions more effective and increase user satisfaction.
Studies show users are more likely to trust and cooperate with chatbots exhibiting social intelligence and personality traits. This could prove relevant in fields such as education, where studies have shown AI chatbots can boost learning outcomes and motivation.
However, some commentators worry users may become overly attached to AI systems with human-like personalities, or misled by the one-way nature of human-computer interaction.
GPT-4o immediately inspired comparisons – including from OpenAI boss Sam Altman – to the 2013 science-fiction movie Her, which paints a vivid picture of the potential pitfalls of human-AI interaction.
In the movie, the protagonist, Theodore, becomes deeply fascinated by, and attached to, Samantha, an AI system with a sophisticated and witty personality. Their bond blurs the lines between the real and the virtual, raising questions about the nature of love and intimacy, and the value of human-AI connection.
While we should not seriously compare GPT-4o to Samantha, it raises similar concerns. AI companions are already here. As AI becomes more adept at mimicking human emotions and behaviours, the risk of users forming deep emotional attachments increases. This could lead to over-reliance, manipulation and even harm.
While OpenAI demonstrates concern with ensuring its AI tools behave safely and are deployed in a responsible way, we have yet to learn the broader implications of unleashing charismatic AIs onto the world. Current AI systems are not explicitly designed to meet human psychological needs – a goal that is hard to define and measure.
GPT-4o's impressive capabilities show how important it is that we have some system or framework for ensuring AI tools are developed and used in ways that are aligned with public values and priorities.
GPT-4o can also work with video (of the user and their surrounds, via a device camera, or pre-recorded videos), and respond conversationally. In OpenAI's demonstrations, GPT-4o comments on a user's environment and clothes, recognises objects, animals and text, and reacts to facial expressions.
Google's Astra AI assistant, unveiled just one day after GPT-4o, displays similar capabilities. It also appears to have visual memory: in one of Google's promotional videos, it helps a user find her glasses in a busy office, even though they are not currently visible to the AI.
GPT-4o and Astra continue the trend towards more "multimodal" models that can work with text, images, audio and video. GPT-4o's predecessor, GPT-4 Turbo, can process text and images together, but not audio and video. The original version of ChatGPT, released less than two years ago, was based only on text.
GPT-4o is also significantly faster than its predecessor.
The ability to work across audio, vision and text in real time is considered crucial to develop advanced AI systems that can understand the world and effectively achieve complex and meaningful goals.
But note that GPT-4o's text capabilities are only incrementally better than those of GPT-4 Turbo and competitors such as Google's Gemini Ultra and Anthropic's Claude 3 Opus.
Will major AI labs be able to sustain the recent rapid pace of improvement by continuing to build bigger and more sophisticated models? This is a hot topic of debate among experts, and the outcome will determine the impact of the technology over the coming years.
A notable aspect of GPT-4o's launch is that, unlike its GPT-4 family precursors, the new AI system is available to all users in the free version of ChatGPT, subject to usage limits.
This means millions of users worldwide just got an upgrade from GPT-3.5 to a more powerful AI system with more features. GPT-4o is significantly more useful than GPT-3.5 for various purposes, such as work and education. The impact of this development will become more apparent over time.
OpenAI's unveiling of GPT-4o disappointed enthusiasts for ever more powerful AI systems, who had hoped GPT-5's arrival was imminent, more than a year after GPT-4's launch.
Instead, this week's unveiling of GPT-4o and Google's latest AI announcements emphasise the features being incorporated into their products. These new developments point to possibilities such as more sophisticated virtual assistants capable of performing complex tasks on behalf of users, involving richer interaction and planning.
Marcel Scharth is a Lecturer in Business Analytics at the University of Sydney Business School. This article originally appeared in The Conversation.