5 key differences between GPT-4 and its predecessors

Short description

OpenAI has launched GPT-4, a new artificial intelligence model that can understand and analyze images, making it a “multimodal” tool. The new model has also been better trained than its predecessors to resist being tricked or manipulated, has a longer memory for earlier parts of a conversation, and shows improved proficiency in multiple languages. Another notable difference is that the new model can take on different “personalities,” which can be tweaked depending on the nature of the interaction. The new model is already being used in applications ranging from virtual volunteering to language-learning bots on Duolingo.

OpenAI’s new GPT-4 artificial intelligence model has made its big debut and is already being used in everything from a virtual volunteer for the visually impaired to an improved language learning bot on Duolingo. But what makes GPT-4 different from previous versions like ChatGPT and GPT-3.5? Here are five of the most significant differences between these popular systems.

First of all, what does the name itself mean? Although ChatGPT was originally described as GPT-3.5 (and thus a few iterations beyond GPT-3), it is not itself a version of OpenAI’s large language model, but rather a chat-based interface to whichever model powers it. The ChatGPT system that has become popular over the past few months was a way to interact with GPT-3.5, and is now a way to interact with GPT-4.

So, let’s get into the differences between the familiar chatbot and its new, improved successor.

1. GPT-4 can see and understand images

The most notable change in this general-purpose machine learning system is that it is “multimodal,” meaning it can work with more than one “modality” of information. ChatGPT and GPT-3 were limited to text: they could read and write, but that was pretty much it (although that was enough for many applications).

However, GPT-4 can analyze images and pull relevant information out of them. You can ask it to describe what is in a picture, but more importantly, its understanding goes beyond that. In one example provided by OpenAI, GPT-4 explains the joke in a cartoon image of a comically oversized iPhone connector, but even more revealing is its partnership with Be My Eyes, an app used by blind and visually impaired people that lets volunteers describe what a user’s phone sees.

Image credits: Be My Eyes

In the Be My Eyes video, GPT-4 describes the pattern on a dress, identifies a plant, explains how to get to a particular exercise machine at the gym, translates a label (and suggests a recipe), reads a map, and performs a number of other tasks, showing that it really grasps what is in an image, if you ask the right questions. It knows what the dress looks like, but it may not know whether it is suitable for an interview.

2. GPT-4 is more difficult to cheat

Although today’s chatbots often give the right answers, it’s easy to throw them off. A little coaxing can convince them that they are merely explaining what a “bad AI” would do, or some other fiction that lets the model talk about anything and everything, sometimes in strange and even disturbing ways. People even collaborate on “jailbreak” prompts that quickly push ChatGPT and other models off their rails.

GPT-4, on the other hand, has been trained on the many malicious prompts that users have kindly supplied to OpenAI over the past year or two. With this data, the new model is much better than its predecessors at “factuality, steerability, and refusing to go outside of guardrails.”

As OpenAI describes it, GPT-3.5 (which powered ChatGPT) was a “test run” of the new training architecture, and the lessons learned were applied to the new version, making it “unprecedentedly stable.” They were also better able to predict its capabilities, which led to fewer surprises.

3. GPT-4 has a longer memory

Large language models are trained on millions of web pages, books, and other textual data, but when they actually converse with a user, there is a limit to how much they can “keep in their heads,” so to speak. For GPT-3.5 and the older version of ChatGPT, that limit was 4,096 “tokens” – roughly 3,000 words, or about six pages of a book. Beyond that, the model lost track of things once they passed far enough “back” in its attention window.

GPT-4 has a maximum token count of 32,768 – that’s 2^15, if you’re wondering why the number looks familiar. This equates to roughly 25,000 words, or about 50 pages of text – enough for an entire play or short story.

This means that during a conversation or while generating text, the model can keep up to about 50 pages in mind. It will remember what you talked about 20 pages ago, and when writing a story or essay it can refer back to events from 35 pages earlier. This is a very rough description of how the attention mechanism and token counting work, but the general idea is expanded memory and the capabilities that come with it.
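The idea of a fixed token budget can be sketched in a few lines of Python. This is only an illustration: the 4-characters-per-token ratio is a rough rule of thumb, not OpenAI’s actual tokenizer, and the `trim_history` helper is an invented example of how an application might keep a conversation inside the context window.

```python
# Illustrative sketch: keep a chat history within a model's context window.
# The ~4-characters-per-token ratio is a rough heuristic, not a real tokenizer.

def approx_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def trim_history(messages: list[str], max_tokens: int) -> list[str]:
    """Drop the oldest messages until the conversation fits the budget."""
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):  # newest messages matter most
        cost = approx_tokens(msg)
        if total + cost > max_tokens:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))

history = ["hello " * 3000, "short question", "short answer"]
print(len(trim_history(history, 4096)))   # → 2: the long old message is dropped
print(len(trim_history(history, 32768)))  # → 3: GPT-4's larger window keeps everything
```

With a 4,096-token window the oldest, longest message falls out of “memory,” while GPT-4’s 32,768-token (2^15) window retains the whole exchange.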

4. GPT-4 is more multilingual

The world of AI is dominated by native English speakers, and everything from training data to testing to research papers is in that language. However, the capabilities of large language models are applicable to, and should be available in, any written language.

GPT-4 takes a step in this direction by demonstrating that it can answer thousands of multiple-choice questions with high accuracy across 26 languages, from Italian to Ukrainian to Korean. It does best with Romance and Germanic languages, but generalizes well to others too.

This initial test of language capabilities is promising, but falls well short of full multilingual support: the benchmark questions were translated from English to begin with, and multiple-choice questions are not a complete representation of everyday language. Still, GPT-4 performed remarkably well on a task it was not specifically trained for, suggesting it may be much friendlier to non-English speakers.

5. GPT-4 has different “personalities”

“Steerability” is an interesting concept in AI: it refers to the ability to change a model’s behavior on demand. This can be useful, for example when the model plays the role of a sympathetic listener, or dangerous, as when people convince the model that it is angry or depressed.

GPT-4 integrates steerability more natively than GPT-3.5, and users will be able to change “the classic ChatGPT personality with a fixed verbosity, tone, and style” to something better suited to their needs. “Within bounds,” the team is quick to point out, noting that this is also the easiest way to get the model to break character.

You could already do something like this by priming the chatbot with prompts like “Pretend you’re a DM in a tabletop RPG” or “Answer as if you were a person being interviewed for cable news.” But really, you were just making suggestions to the “default” GPT-3.5 personality. Now developers will be able to set a point of view, conversational style, tone or method of interaction from the outset.
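In practice, setting a personality up front is done with a “system” message in the chat-message format popularized by OpenAI’s Chat Completions API. The sketch below only builds the message list, makes no API call, and the persona text is invented for illustration.

```python
# Sketch: pin a "personality" before the conversation starts, using the
# system/user message roles from OpenAI's chat format. No API call is made;
# the persona and prompt below are made-up examples.

def make_conversation(persona: str, user_prompt: str) -> list[dict]:
    """Build a message list whose first entry fixes the model's role."""
    return [
        {"role": "system", "content": persona},
        {"role": "user", "content": user_prompt},
    ]

messages = make_conversation(
    "You are a dungeon master narrating a tabletop RPG session.",
    "I open the creaky door. What do I see?",
)
print(messages[0]["role"])  # → system: the persona sits ahead of all user turns
```

The difference from the old approach is that the persona lives in a dedicated system slot rather than being a plain suggestion mixed into the user’s own messages.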

The article was translated using GPT-4 without corrections.

