ChatGPT-Beyond the Hype and Fright

2023-06-14
12 min read
Featured Image

ChatGPT: Beyond the Hype and Fright.

Over the past few months ChatGPT has been the talk of the town. In case you have been living under a rock, ChatGPT is an artificial intelligence (AI) chatbot developed by OpenAI. So-called AI was already widely used in industry and research, but when the chatbot by OpenAI was made available for free to the public, it abruptly shifted from a gatekept technology only experts can handle to a tool everyone can use. To date, ChatGPT4 is the fastest-growing consumer application in history, reaching 100 million monthly active just two months after its launch.

Everyone was quickly impressed by the outputs provided by the chatbot. In fact, the experience is made very intuitive to the user and probably, its most impactful feature is that it mimics human language to the point you might feel it actually understands what you are asking. But what started as plain astonishment quickly veered towards controversy.

We don’t need to tell you about the amount of hype there has been around ChatGPT since its launch. But just as a quick recap to show the extent to which the debates over ChatGPT unfolded, let’s take a look at some illustrative (not exhaustive) examples.

ChatGPT has rekindled debates around the power of AI. To begin with, can an artificial intelligence think as humans? Will it ever? This debate has a long history in the field of philosophy of mind. And it comes closer to the general public from time to time, whenever a new tech development appears and can seemingly do something impressive enough for humans. In fact, this is exactly what happened when the first computers appeared in the late 1940s and looked something like this:

Today nobody would even wonder if a machine of this sort is intelligent like a human. What is considered to be intelligence is a forward-moving goalpost that follows along the course of technological developments. And this is also what happened with the launch of ChatGPT.

Some specialists in human and artificial cognition claim that new language models like the one used by ChatGPT have developed abilities that were up to now considered unique to humans, like theory of mind (ToM). Meaning, the ability to assign unobservable mental states to others. However, when we read beyond the clickbaity titles, research articles in these lines describe limited definitions of ToM or very preliminary results to their analysis.

Some researchers, like the authors of this paper at Microsoft, claimed that GPT-4 shows traits of what is called artificial general intelligence (AGI). In short, AGI systems are those that can solve any cognitive or human task in ways that are not limited to how they are trained. It is worth noting that conveniently, Microsoft invested $10 billion in OpenAI in January, adding on previous rounds of investment in 2019 and 2021.

Other experts show a very different perspective, though. According to Michael Wooldridge, a professor of computer science at the University of Oxford who has being doing research in the field of AI for the last 30 years, insists:

The results are impressively realistic, but the “basic statistics” are the same. “There is no sentience, there’s no self-contemplation, there’s no self-awareness.

And there was not only hype. On the flipside, there was also fright. If this technology could do all these things it was repeatedly advertised to do, then there were also good reasons to worry. And many of them are actually fair sources of concern. Like the appropriation of creative work as data for training without recognition or remuneration to creators or the potential bias of these AI systems that will impact our daily lives.

Some others may have taken it a bit too far:

But is there grounds for all the hype and fright created around it?

Both lines of reasoning lead us to believe that ChatGPT is so powerful it could radically change the world as we know it in a flash, for good or for evil. Much of the hype and fright around ChatGPT is tied to a narrative that mystifies how AI works. A poor understanding of AI actually makes it seem like magic. But there is no magic behind this piece of software. Contrary to what its name leads us to believe, artificial intelligence is not intelligent in the way humans are. The fact that AI can perform behaviors that we consider human-like doesn’t mean it can think like humans, or even think at all, feel, and act out of self-interest like humans. This doesn’t imply AIs are harmless: they are developed by humans, and so reflect the intentions, limitations and involuntary biases of those involved in the development or implementation of this technology.

So, let’s try and untangle the mess by taking a look at how this model works. Of course, we will only provide a simplified explanation, but it should be enough to get the picture and make evident what technologies like GPT-4 are made of.

How does it work?

As Stephen Wolfram puts it, ChatGPT is always fundamentally trying to produce a “reasonable continuation” of whatever text it’s got so far. And how does it do it? In a very schematic way of explaining it, GPT produces a ranked list of words along its matching probability of occurring next and then chooses one from that list using a weight. But GPT, nor any other large language model, for that matter, works directly with text. AIs are mathematical machines, and they only use numbers. Text, audio, images, or video are all converted to numbers; the trick is how. Old chat systems used to assign a unique list of numbers to each word. For example, “cat” can be represented with 5 numbers: (7, 3, 15, 28, 10). The numbers associated with each word are chosen automatically so that if the words mean similar things, they are close to each other. Words that are semantically very different use very distant numbers. When ChatGPT receives text as an input, it needs to translate natural language (meaning human language) to machine-readable language, or what is the same, to numbers.

In order to understand how ChatGPT works we first need to cover the very basics on the idea of embeddings. An embedding is basically a way of representing something with a list of numbers. Word embeddings are numeric vectors representing words, of course. And they have the property that words that are closer in meaning are represented by numbers that are closer to each other. ChatGPT and modern systems can represent full documents (like 8000 words) with a long list of numbers. In fact, GPT-4 uses 1536 numbers for each document. Each soup recipe on the Internet, each piece of news article, and each blog post is linked to its own list of 1536 numbers, and again, two texts that are “similar” are assigned very close numbers.

ChatGPT uses embeddings to represent a very large number of dimensions, but the same intuition can be reduced to a 2-dimensional space.

So, what the model behind it does is to:

first convert the input text into a list of numbers that represents that chunk of text, or an embedding. It assigns a number to each word. The next step is to find the probabilities of different words (or tokens) that could come next. It operates on the original embedding and produces a new one. That is, a new array of numbers that represent the answer for the words that are more likely to happen, one after the other. Finally, through a piece of its neural net architecture called a “transformer”, it assigns weights to transform the original embeddings to the final ones in the output. And so, it produces a list of probabilities for what token should come next iteratively.

Of course, the engineering behind it is complex and would require some technical background to fully understand it. However, the intuition behind it is not as obscure as some have made it seem. Again, quoting Stephen Wolfram, “the ultimate elements involved are remarkably simple.“

Does it really work?

GPT-4 is a large multimodal model capable of processing image and text inputs and producing text outputs. According to its creators, one of the main goals of this kind of model is to “improve their ability to understand and generate natural language text, particularly in more complex and nuanced scenarios.And we admit: It does a very good job. It is certainly amazing that we can compress almost everything we write into a list of numbers (long, yes, but not impressively long), and that by just looking at probabilities we can create highly realistic and accurate text. But this is not intelligence, it’s mostly clever book-keeping and number counting. And let’s not forget: we are talking about a statistical model. As such, it has strengths and limitations. And those limitations are clear to those who created the model. It is not fully reliable and often invents facts and makes reasoning mistakes. In fact, if you read the technical report, they warn you:

Great care should be taken when using language model outputs, particularly in high-stakes contexts, with the exact protocol (such as human review, grounding with additional context, or avoiding high-stakes uses altogether) matching the needs of specific applications. See our System Card for details.

Does it work for research?

Well, it often does provide useful information if you ask the right questions. The problem is, just like with the search engines we have used so far, papers or books, you can’t really take the outputs at face value. ChatGPT is designed to give you an answer, ideally one that you are happy with. In other words, it is trained to give answers that appear legit by the human eye. But from time to time it prioritizes making the user happy with the answer over providing an accurate “I don’t know” answer. This is what happened with the made up articles it showed a journalist researching a profile on Lex Fridman:

Is it safe to use?

There’s also concerns about security. On one hand, there’s the obvious concerns about data breaches. As we have already explained, ChatGPT stores insane amounts of data, even data coming from prompts fed by users. Let’s imagine an employee of a company uses the bot to get code to analyze a private dataset with confidential information. Technically, the employee is not openly sharing the information with anyone else. And the company’s private dataset is not explicitly published. However, this data might be used to respond to the queries of other users, potentially making sensitive information available. This means the user loses control over how the information they share in conversation with ChatGPT. Because of this, some companies are restricting its use. And some entire nations even.

On the other hand, there are worries about users that could trick ChatGPT into helping them with malicious goals, like exploiting the privacy vulnerabilities mentioned right above, spreading disinformation or outright scamming people. In the jargon, this is called prompt-injection. And it is used to pin down the attempt to manipulate the behavior of language models by crafting the input prompts. It’s easier to pull off, as it requires less technical skills than other cybersecurity attacks.

To date, security researchers are unsure about how to mitigate indirect prompt-injection attacks. There are patch fixes to particular problems, but no silver bullet. Just like any other new tech development (or spaceship), ChatGPT has security vulnerabilities we might still not be aware of.

Is it ethically built?

Going beyond the question of accuracy of its outputs and whether it’s safe to use, what about the process through which it was built?

In part, the richness of ChatGPT’s answers come from the insane amount of text scraped from the internet it was trained on. But, as you can imagine, this huge training dataset had a bit of everything you can find on the Internet. Quite literally. So, ChatGPT needed to build a moderation system to filter out toxic language and harmful instructions before it ever reached the user. And so OpenAI looked for a fast and cheap solution: sending thousands of snippets of text to an outsourcing firm in Kenya to do this. Spoiler: it was fast, it was cheap and it was ethically irresponsible to say the least.

According to an investigation by Time Magazine, data labelers were paid a take-home wage of between around $1.32 and $2 per hour. During their nine-hour shifts they had to read and label between 150 and 250 passages of text describing situations in graphic detail like child sexual abuse, bestiality, murder, suicide, torture, self harm, and incest. The employees interviewed by TIME explained they were mentally scarred by the work.

Just a couple of months away from its launch, ChatGPT has already stirred things up on the legal front. Lawyers didn’t want to miss out on the perks of using ChatGPT to ease their workload. For example, there’s the case of the lawyer who used the chatbot to help prepare a court filing. If you remember what we talked about when addressing the tool’s limitations for research, then you might have guessed how that went down. A hint: it didn’t go well. But this might just seem like a funny story compared to the legal issues that OpenAI is facing around the World.

The legal challenges the company has to face range from bans of the tool at the national level, like in the case of Italy to a mayor in Australia who wants to sue OpenAI for defamation provided by ChatGPT. There’s also ongoing lawsuits from other companies, like the case of a class action lawsuit accusing Microsoft, GitHub, and OpenAI of scraping licensed code to build GitHub’s AI-powered Copilot tool.

(Provisional) Conclusions

ChatGPT has rekindled debates around the power of artificial intelligence (AI), its strengths and its limitations. An open letter recently signed by more than 1000 AI experts and industry executives asks: Should we risk loss of control of our civilization? As researchers, we think it is always worth questioning what we know and being open to new answers. But we should also question the questions and the people behind them. There are interests at stake defining the direction in which the production of knowledge moves. Of course, the question over the risks of AI taking over the world as we know it is interesting to pose. However, there are more immediate issues to discuss around the uses of AI. The focus on future apocalyptic scenarios where evil tech takes over the World distracts us from the harms of AI that are currently stemming from the deployment of AI systems.

It’s nonsense to reject the advancement of tech developments, but it is also irresponsible to take them at face value. We propose engaging with an explainability approach. From our perspective, adopting an explainable AI approach means committing to help people understand the reasoning behind decisions made with AI or by AI. This is why we created EXPLAIN.chat.in.a.box, a data visualization project we developed to show how ChatGPT sees data patterns from within. We will present it at Sónar+D festival this week.

But we’ll tell you more about EXPLAIN.chat.in.a.box another time. Stick around if you’d like to see what ChatGPT sees.