Google introduced its latest artificial intelligence model, Gemini, which the search giant believes is its most powerful and capable general-purpose model to date. Gemini comes in three versions: Gemini Ultra, the largest and most advanced; Gemini Pro, the mid-tier option in power and capability; and Gemini Nano, a compact version for use in certain products like smartphones.
Gemini is built to understand and work with not just text but also images, video, and audio. This makes it “multimodal,” in contrast to most other AI language models, which are primarily text-based. Google claims Gemini Pro outscores OpenAI’s GPT-3.5 model and that Gemini Ultra bests even GPT-4 on multiple standard benchmarks of AI capability.
During the launch, the company highlighted the potential of Gemini’s visual capabilities with demos in which the model interpreted and discussed images and videos, offered suggestions in response to someone drawing pictures, and answered questions about research papers containing graphs and math equations.
Gemini Pro starts rolling out today to enhance Bard, Google’s conversational AI chatbot, which is similar to OpenAI’s popular ChatGPT product. Google says this upgrade represents Bard’s biggest leap in quality and reasoning ability since it first launched. The full Gemini Ultra model will come to Bard and other products starting next year, after extensive testing.
Gemini shows Google responding urgently to the splash OpenAI has made in AI with products like ChatGPT, and to the recent hype around competitor Microsoft potentially integrating similar AI into its search engine, Bing. With Gemini representing hundreds of millions in development costs, the launch is critical for Google to reassert its leadership in AI and capture future revenue opportunities from AI cloud services.