OpenAI today shipped its long-awaited text-generating AI model, GPT-4.
GPT-4 improves on its predecessor, GPT-3, in important ways. It gives more factually accurate answers, for example, and lets developers more easily prescribe its style and behavior. It’s also multimodal in the sense that it can understand images, captioning photos and describing in detail what they depict.
However, GPT-4 has serious drawbacks. Like GPT-3, this model “hallucinates” facts and makes basic reasoning errors. In one example from OpenAI’s own blog, GPT-4 describes Elvis Presley as “the actor’s son.” (Neither of his parents were actors.)
To better understand the GPT-4 development cycle, its capabilities, and its limitations, TechCrunch spoke with OpenAI co-founder and president Greg Brockman in a video call on Tuesday.
When asked to compare GPT-4 and GPT-3, Brockman replied with one word: “different.”
“It’s just different,” he told TechCrunch. “There’s still a lot of problems and mistakes [with the model] … but you can really see the jump in skill in things like calculus or law, where it went from being really bad in certain domains to actually quite good relative to humans.”
Test results back his claims. On the AP Calculus BC exam, GPT-4 scores a 4 out of 5 while GPT-3 scores a 1. (GPT-3.5, the intermediate model between GPT-3 and GPT-4, also scores a 4.) And GPT-4 passes a simulated bar exam with a score around the top 10% of test takers; GPT-3.5’s score hovered around the bottom 10%.
One of GPT-4’s more interesting aspects is the aforementioned multimodality. While GPT-3 and GPT-3.5 could only accept text prompts (e.g., “Write an essay about giraffes”), GPT-4 can take both images and text as prompts and act on both (e.g., an image of giraffes in the Serengeti with the prompt “How many giraffes do you see here?”).
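To make the distinction concrete, here is a minimal sketch of how an image-plus-text prompt could be packaged for a multimodal model. The exact field names (`type`, `image_url`, `text`) and the example URL are illustrative assumptions, not a published OpenAI schema; the point is simply that one user turn can carry both an image reference and a text question.

```python
# Hypothetical sketch: bundling an image reference and a text question
# into a single multimodal prompt. Field names are assumptions for
# illustration, not a documented API contract. Nothing is sent anywhere.

def build_multimodal_prompt(image_url: str, question: str) -> dict:
    """Assemble one user message containing an image part and a text part."""
    return {
        "model": "gpt-4",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
    }

prompt = build_multimodal_prompt(
    "https://example.com/giraffes-serengeti.jpg",  # placeholder URL
    "How many giraffes do you see here?",
)
print(len(prompt["messages"][0]["content"]))  # 2: one image part, one text part
```

A text-only model would accept only the `text` part; the structural change with GPT-4 is that the image part becomes a first-class piece of the prompt.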
That’s because GPT-4 was trained on image and text data, while its predecessors were trained only on text. According to OpenAI, the training data “was obtained from a variety of licensed, created, and publicly available data sources, which may include publicly available personal information.” (Training data has previously landed OpenAI in legal trouble.)
GPT-4’s image-understanding abilities are quite impressive. For example, fed the prompt “What is funny about this image? Describe it panel by panel” along with a three-panel image showing a fake VGA cable being plugged into an iPhone, GPT-4 gives a breakdown of each panel and correctly explains the joke (“The humor in this image comes from the absurdity of plugging a large, outdated VGA connector into a small, modern smartphone charging port”).
At present, only one launch partner has access to GPT-4’s image analysis capabilities: Be My Eyes, an assistive app for the visually impaired. Brockman said the wider rollout, whenever it happens, will be “slow and intentional” as OpenAI weighs the risks and benefits.
“There are policy issues, like facial recognition and how to treat images of people, that we need to address and work through,” Brockman said. “We need to figure out where the danger zones are, where the red lines are, and then clarify that over time.”
OpenAI navigated a similar ethical dilemma with DALL-E 2, its text-to-image system. After initially disabling the capability, OpenAI allowed customers to upload and edit human faces with the AI-powered image generator. At the time, OpenAI claimed that upgrades to its safety systems made the face-editing feature possible by “minimizing the potential for harm” from deepfakes and attempts to create sexual, political, and violent content.
Another perennial problem is preventing GPT-4 from being used in unintended ways that could inflict psychological, monetary, or other harm. Hours after the model’s release, Israeli cybersecurity startup Adversa AI published a blog post demonstrating methods for bypassing OpenAI’s content filters and getting GPT-4 to generate phishing emails, offensive descriptions of gay people, and other highly objectionable text.
It’s not a new phenomenon in the language model domain. Meta’s BlenderBot and OpenAI’s ChatGPT have also been prompted to say wildly offensive things, even revealing sensitive details about their inner workings. Many had hoped that GPT-4 would deliver significant improvements on the moderation front.
When asked about GPT-4’s robustness, Brockman stressed that the model has been through six months of safety training, and that in internal tests it was 82% less likely to respond to requests for content disallowed by OpenAI’s usage policy and 40% more likely to produce “factual” responses than GPT-3.5.
“We spent a lot of time trying to understand what GPT-4 is capable of,” Brockman said. “Getting it out into the world is how we learn. We’re constantly making updates, including a bunch of improvements, so that the model is much more scalable to whatever personality or sort of mode you want it to be in.”
Frankly, the early real-world results aren’t that promising. Beyond the Adversa AI tests, Bing Chat, Microsoft’s chatbot powered by GPT-4, has been shown to be highly susceptible to jailbreaking. Using carefully tailored inputs, users have been able to get the bot to profess love, threaten harm, defend the Holocaust, and invent conspiracy theories.
Brockman didn’t deny that GPT-4 falls short here. But he emphasized the model’s new mitigation steerability tools, including an API-level capability called “system” messages. System messages are essentially instructions that set the tone, and establish boundaries, for GPT-4’s interactions. A system message might read: “You never give your students answers, but always ask the right questions to help them learn to think for themselves.”
The idea is that the system message acts as a guardrail, preventing GPT-4 from veering off course.
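In practice, a system message is just the first entry in the list of chat messages sent to the model. The sketch below assembles such a request body without sending it anywhere; the role/content message shape follows OpenAI’s chat format, while the helper function and the student question are illustrative.

```python
# Minimal sketch of a chat request that uses a "system" message as a
# guardrail. The payload is only constructed here, never sent; the
# build_request helper is illustrative, not part of any library.

def build_request(system_message: str, user_message: str) -> dict:
    """Assemble a chat-completion request body with a system guardrail."""
    return {
        "model": "gpt-4",
        "messages": [
            # The system message sets tone and boundaries up front...
            {"role": "system", "content": system_message},
            # ...and user turns are then interpreted within those boundaries.
            {"role": "user", "content": user_message},
        ],
    }

request = build_request(
    "You never give your students answers, but always ask the right "
    "questions to help them learn to think for themselves.",
    "What is the derivative of x**2?",
)
print(request["messages"][0]["role"])  # system
```

Because the system message rides along with every request, a developer can change GPT-4’s persona or constraints per application without retraining anything.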
“Really understanding GPT-4’s tone, style, and substance has been a big focus for us,” Brockman said. “I think we’re starting to understand a little bit more about how to do the engineering, how to have a repeatable process that gets you predictable results that are really useful to people.”
My conversation with Brockman also touched on GPT-4’s context window, which refers to the text the model can consider before generating additional text. OpenAI is testing a version of GPT-4 that can “remember” roughly 50 pages of content, five times as much as the vanilla GPT-4 can hold in its “memory” and eight times as much as GPT-3.
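A back-of-envelope sketch shows why “50 pages” is a meaningful jump. The words-per-page and tokens-per-word figures below are rough rules of thumb, not OpenAI specifications, and the 32,768-token budget is used purely as an illustrative window size.

```python
# Back-of-envelope sketch relating pages of text to a model's context
# window. WORDS_PER_PAGE and TOKENS_PER_WORD are rough assumptions
# (common rules of thumb for English text), not published figures.

WORDS_PER_PAGE = 500   # assumed average for a typical document page
TOKENS_PER_WORD = 1.3  # rough ratio of tokens to English words

def pages_to_tokens(pages: int) -> int:
    """Estimate how many tokens a document of the given length occupies."""
    return round(pages * WORDS_PER_PAGE * TOKENS_PER_WORD)

def fits_in_context(pages: int, context_tokens: int) -> bool:
    """Check whether a document fits entirely inside the context window."""
    return pages_to_tokens(pages) <= context_tokens

# ~50 pages lands in the low tens of thousands of tokens.
print(pages_to_tokens(50))          # 32500 under these assumptions
print(fits_in_context(50, 32768))   # True
print(fits_in_context(100, 32768))  # False: the document must be chunked
```

Anything past the window has to be summarized, chunked, or retrieved separately, which is why a larger window opens up document-scale applications.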
Brockman believes the enlarged context window will lead to new, previously unexplored applications, particularly in the enterprise. He envisions an AI chatbot built for a company that draws on context and knowledge from different sources, including employees across departments, to answer questions in an informed, conversational way.
It’s not a new concept. But Brockman argues that GPT-4’s answers will be far more useful than those from today’s chatbots and search engines.
“Previously, the model didn’t have any knowledge of who you are, what you’re interested in, and so on,” Brockman said. “Having that kind of history [with the larger context window] is definitely going to make it more capable … It’ll turbocharge what people can do.”