Extra! Extra! Gemini Exp 1114: The best large model ever! Beats o1-Preview + Claude 3.5 Sonnet!

Nov 16, 2024

Google DeepMind's latest model, Gemini Exp 1114, has made a strong showing on Chatbot Arena: with over 6,000 community votes, it has risen to the top of the overall leaderboard and performs well across multiple categories.

First, some background on LLM Arena (also known as Chatbot Arena). It is a platform for community-driven evaluation of LLMs, where rankings come from user votes rather than fixed benchmarks, and it is one of the most prestigious evaluation platforms.

https://lmarena.ai/

On the overall leaderboard, Google's new model Gemini (Exp 1114) jumped by more than 40 points to a score of 1344, while the latest version of ChatGPT-4o scored 1340. This appears to be the first time a Google model has achieved such a result.
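To put the gap between the two scores quoted above (1344 and 1340) in perspective, here is a minimal sketch. It assumes Arena scores can be read like Elo ratings on the conventional 400-point, base-10 scale; that scale, the function name, and the variable names are illustrative assumptions, not something stated in this post.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Elo-style probability that model A is preferred over model B.

    Assumes the standard logistic Elo formula with base 10 and ties ignored.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# Scores quoted in the post; the names are only labels for this sketch.
top_score = 1344        # Gemini (Exp 1114)
runner_up_score = 1340  # the model it is compared against

p = expected_win_rate(top_score, runner_up_score)
print(f"Expected head-to-head preference rate: {p:.1%}")
```

Under these assumptions, a 4-point lead corresponds to an expected preference rate of roughly 50.6%, so the two models are close to a coin flip in direct comparison.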

Gemini-Exp-1114 is also tied for first place in the math arena, performing on par with o1.

Gemini-Exp-1114 is currently available to chat with in Google AI Studio:

https://aistudio.google.com/

The Terminator is coming.