GPT5 Wrapped in GPT2-Chatbot's Shell Mysteriously Launched? - Comparative Abilities & Free Trial

After a long wait and a string of small GPT updates from OpenAI, there are finally signs of a new model today, and it may well be GPT5!

In less than a day, a mysterious new large model named "gpt2-chatbot" has spread like wildfire.

It appeared in the LMSYS Arena without any official documentation, competing against GPT4 and other models. Some speculate it might be a model OpenAI is about to release.

There is already a lot of discussion about this model. I have compiled some of it here and added comparisons with GPT4 using the same examples. I suggest you try the model yourself and see how it differs from GPT4. I will add updates in this article's comments as more information emerges.

Reasoning Ability

The gpt2-chatbot shows strong reasoning abilities. It also answers questions that commonly trip up AI models in a more human-like tone.

Question: Which is heavier, a kilogram of feathers or a kilogram of lead?

gpt2-chatbot reply: A kilogram of feathers and a kilogram of lead weigh the same. The main difference lies in their volume and density: a kilogram of feathers takes up more space and looks much larger, while a kilogram of lead is compact. This often leads to a classic trick question, but in terms of weight, a kilogram is a kilogram!

GPT4 reply: A kilogram of feathers and a kilogram of lead weigh the same, both being 1 kilogram. The difference lies in their volume and density. Feathers occupy more space because they have a lower density.

It seems like this example results in a tie between gpt2-chatbot and GPT4.

Feathers and Lead
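The physics both answers rely on can be checked with a quick calculation: equal masses occupy very different volumes when their densities differ, since V = m / ρ. The sketch below is only an illustration. The lead density is the standard handbook figure, while the feather bulk density is an assumed value for loosely packed feathers and does not come from either model's reply.

```python
# Illustrative densities in kg/m^3.
LEAD_DENSITY = 11_340.0       # standard handbook value for lead
FEATHER_BULK_DENSITY = 2.5    # ASSUMPTION: loosely packed feathers, for illustration only


def volume_m3(mass_kg: float, density_kg_m3: float) -> float:
    """Return the volume occupied by a mass, using V = m / rho."""
    if density_kg_m3 <= 0:
        raise ValueError("density must be positive")
    return mass_kg / density_kg_m3


mass_kg = 1.0  # both piles have the same mass, so they weigh the same
lead_volume = volume_m3(mass_kg, LEAD_DENSITY)
feather_volume = volume_m3(mass_kg, FEATHER_BULK_DENSITY)

print(f"1 kg of lead occupies about {lead_volume * 1000:.3f} litres")
print(f"1 kg of feathers occupies about {feather_volume * 1000:.0f} litres")
print(f"Volume ratio (feathers / lead): {feather_volume / lead_volume:.0f}x")
```

Under these assumed numbers the weight is identical by construction and only the volume changes, which is the distinction both models draw.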

Math Skills

The gpt2-chatbot reportedly solved an International Mathematical Olympiad (IMO) problem on its first attempt.

Although it didn't solve every problem on the IMO test, the result was still impressive.

gpt2-chatbot Reply

Math Skills

GPT4 Reply

Math Skills

So, who won this round? Some speculate that the gpt2-chatbot is fine-tuned for agent functionality.

Coding Ability

Chase, an engineer at an AI startup, reported that the gpt2-chatbot's coding ability is better than that of the latest versions of GPT-4 or Claude Opus: "It performed better on all our toughest coding prompts than GPT4 or Claude Opus."

Coding Ability

ASCII Art Skills

The gpt2-chatbot "leads any other model" in ASCII art.

ASCII Art

However, some suspect the ASCII art was copied 1:1 from the internet, and that the gpt2-chatbot is simply better at accurately recalling its training data.

So, Who Released the gpt2-chatbot?

There is no official documentation, and no one knows for sure.

Here are the two most widely shared theories:

  • A secretly released GPT-5: OpenAI could be benchmarking an early version.
  • It is the 2019 OpenAI GPT-2, fine-tuned with an assistant dataset.

If it's the latter, it means GPT-2's original pre-training was very strong and that, with fine-tuning five years later, the model can still outperform others.

GPT5 Speculations

How to Try It for Free

The gpt2-chatbot can be accessed directly at https://chat.lmsys.org. Many people are trying it right now, so response times are slow, and each conversation is limited to 8 rounds.

Free Trial

OpenAI's Official Reaction

Altman just posted that he "has a soft spot for gpt2." Is he hinting at GPT5?

OpenAI Reaction

Conclusion

After announcing the GPT4 memory function update, unlimited GPT4, and registration-free access to GPT3.5, OpenAI has quickly set off a "GPT5" frenzy. Personally, I think the gpt2-chatbot is very likely a warm-up for the release of the new model.

In an interview less than a month ago, Altman also mentioned that a new model (not referred to as GPT5 in the interview) would be released this year.

All signs point to one conclusion: GPT5 is really coming!