Google unveils its next-Gen AI model, Gemini

Google on Wednesday unveiled its next-generation and multimodal AI (artificial intelligence) model, Gemini, to rival its rival Microsoft-backed OpenAI’s GPT-4. The company describes Gemini as “the most capable, flexible, and general AI model to date.”

Developed by a team of researchers from Google’s now merged AI divisions DeepMind and Google Brain, this new large language model (LLM) can generalize and intuitively understand different types of information, including text, code, audio, image, operated Can and combine. , and videos.

Gemini 1.0, the company’s first release, comes in three different sizes: Pro, Ultra, and Nano. While the Gemini Ultra is designed “for highly complex tasks”, the Gemini Pro provides scaling “across a wide range of tasks” and the Gemini Nano is the company’s most efficient model “for on-device tasks”.

Google CEO Sundar Pichai said in a statement, “We’re taking the next step on our journey (as an AI-first company) with Gemini, our most capable and generalizable model yet, across multiple key benchmarks. Has state-of-the-art performance.” Introduction to the blog post about the announcement.

“Our first release, Gemini 1.0, is optimized for different sizes: ultra, pro and nano. These are the first models of the Gemini era and the first realization of the vision we had when we founded Google DeepMind earlier this year.

According to Google, Gemini Ultra’s performance outperformed current “state-of-the-art” models, including ChatGPT’s most powerful model, GPT-4, on 30 of 32 widely used academic benchmarks used in LLM research and development. Did.

With a score of 90.0% on massive multitask language understanding (MMLU), Gemini Ultra is the first model to outperform human experts (89.8%) as well as GPT-4 (86.4%), which uses a combination of 57 topics Mathematics, physics, history, law, medicine and ethics to test both world knowledge and problem-solving abilities, the company said.

Additionally, Gemini can “understand, understand, and generate high-quality code in the world’s most popular programming languages, such as Python, Java, C++, and Go.” Its ability to work in different languages and reason about complex information makes it one of the leading base models for coding in the world.

Starting Wednesday, Gemini Pro is now available for free to anyone with a Google Account through the Google Bar service. It is available in English in over 170 countries, including the US, with services expected to expand to different modalities and support new languages and locations in the near future.

Additionally, Gemini Nano is now available on the Pixel 8 Pro smartphone and is likely to be available in other Pixel models soon. Finally, the most powerful model, Ultra, which is being tested outdoors, is not expected to be released to the public before early 2024.