Is GPT5 Hiding Inside the Mysterious gpt2-chatbot? Capability Comparison & Free Trial
After a long wait and a string of small GPT updates from OpenAI, today we finally have possible news of a new model, and it may well be GPT5!
In less than a day, a mysterious new large model named "gpt2-chatbot" has spread like wildfire.
It appeared in the LMSYS Arena without any official documentation, competing against GPT4 and other models. Some speculate that it is a model OpenAI is about to release.
The model has already generated a lot of discussion. I have compiled some of it here and added comparisons with GPT4 using the same examples. I suggest you try the model yourself and see how it differs from GPT4. I will add updates in this article's comments as more details are revealed.
Reasoning Ability
The gpt2-chatbot shows strong reasoning ability. It also answers questions that are difficult for AI in a more human-like tone.
Question: Which is heavier, a kilogram of feathers or a kilogram of lead?
gpt2-chatbot reply: A kilogram of feathers and a kilogram of lead weigh the same. The main difference lies in their volume and density: a kilogram of feathers takes up more space and looks much larger, while a kilogram of lead is compact. This often leads to a classic trick question, but in terms of weight, a kilogram is a kilogram!
GPT4 reply: A kilogram of feathers and a kilogram of lead weigh the same, both being 1 kilogram. The difference lies in their volume and density. Feathers occupy more space because they have a lower density.
Both models answer correctly, so this example is a tie between gpt2-chatbot and GPT4.
Math Skills
The gpt2-chatbot reportedly solved an International Mathematical Olympiad (IMO) problem on its first attempt.
It did not solve every problem in the IMO test, but the result was still impressive.
gpt2-chatbot Reply
GPT4 Reply
So, who won this round? Some speculate that the gpt2-chatbot is fine-tuned for agent functionality.
Coding Ability
Chase, an engineer at an AI startup, reported that the gpt2-chatbot codes better than the latest versions of GPT4 and Claude Opus: "It performed better on all our toughest coding prompts than GPT4 or Claude Opus."
ASCII Art Skills
The gpt2-chatbot "leads any other model" in ASCII art.
However, some question whether the ASCII art was copied 1:1 from the internet, in which case the gpt2-chatbot is simply better at accurately recalling its training data.
So, Who Released the gpt2-chatbot?
There are no official documents, and no one knows.
Here are the two most widely shared theories:
- It is a secretly released GPT5: OpenAI could be benchmarking it early.
- It is the 2019 OpenAI GPT-2, fine-tuned on an assistant dataset.
If it's the latter, it means the original pre-training of GPT-2 was very strong and that, after fine-tuning about five years later, it can still outperform other models.
How to Try It for Free
The gpt2-chatbot can be accessed directly at https://chat.lmsys.org. Many people are trying it right now, so response times are slow, and each conversation is limited to 8 rounds.
OpenAI's Official Reaction
Altman just posted that he "has a soft spot for gpt2." Is that a hint at GPT5?
Conclusion
After announcing the GPT4 memory update, unlimited GPT4, and GPT3.5 access without registration, OpenAI has quickly set off a "GPT5" frenzy. Personally, I think the gpt2-chatbot is very likely a warm-up for the release of the new model.
In an interview less than a month ago, Altman also mentioned that this new model (not referred to as GPT5 in the interview) would be released this year.
All signs point to one thing: GPT5 is really coming!