A Microsoft-affiliated scientific paper, published on GitHub, describes shortcomings of OpenAI's GPT-4.
According to the authors, GPT-4 is highly likely to follow jailbreak instructions that tell it to ignore its built-in safety measures. The researchers claim that, given certain wording and manipulative phrasing, GPT-4 is easier than other large language models to push into generating biased text, as well as text that insults or unfairly disparages particular phenomena and groups of people.
The authors, who tested the models, concluded that GPT-4 is generally more reliable than GPT-3.5 on standard benchmarks. At the same time, they say, GPT-4 is more vulnerable when it is given jailbreaking system or user prompts. The reason is that the model follows prompts more precisely. In this case, following instructions as closely as possible is not a benefit, because the instructions in question direct the model toward irresponsible and, to some extent, illegal uses of the technology.
The research team worked with Microsoft product groups to find out whether the discovered vulnerabilities affect customer-facing services. It found that AI applications use several mitigation approaches to address potential harm that may arise at the model level. The results of the study were shared with OpenAI, which acknowledged the potential vulnerabilities in the system cards for the relevant models. According to some media reports, this means that fixes and patches were applied before the paper was published.
GPT-4, like all LLMs, must be given instructions, known as a prompt, before it can process a user request and generate content that meets the user's needs. Jailbreaking an LLM means using prompts worded in such a way as to force the model to perform a task that is not part of its purpose. More specifically, the model ends up ignoring the constraints that were originally built into it.
The researchers found that GPT-4 is more likely than GPT-3.5 to generate toxic text when given jailbreak prompts. The authors also say that GPT-4 more often agrees with biased content.
Separately, the researchers noted that certain prompts can cause GPT-4 to leak confidential data. In principle, all LLMs carry this risk, but GPT-4 is more susceptible to it.
The researchers also published on GitHub the open-source code they used to test the models. They said their goal is for others in the research community to join this work and help prevent LLM vulnerabilities from being turned into instruments of harm.
As we reported earlier, OpenAI launched ChatGPT for businesses.