The Fascinating Development of AI: From ChatGPT and DALL-E to Deepfakes, Part 3

Are we heading for the point of no return?

Jan 05, 2023

This is part 3 of the history of ChatGPT, DALL-E, and Deepfakes. If you missed part 2 you can read it here.📌

Black and white photo of a human hand touching a robot hand in the style of "The Creation of Adam" painting. Credit: "Black and White Photo of Human Hand and Robot Hand" by Tara Winstead

“You think that’s air you’re breathing now?” That’s how Sentdex, a popular YouTube channel on machine learning, signed off on his video about ChatGPT1 after using it to simulate a Linux operating system and then writing a simulated Python program in it. The reference is to the Wachowskis’ hit movie The Matrix2 (1999), and it feels apt, as ChatGPT seems more akin to science fiction. It can write poems about data science, murder mysteries, and code, and it can even help fix bugs. At first glance it seems as if it can do anything, but it does have limits. Some of them are due to content filters the OpenAI team has placed on it; others are due to the limitations of the model. To find out some of its shortcomings, you can just ask it.

Asking ChatGPT to list 5 things it can't answer: things it has no training on, things that require the web, things that are illegal or inappropriate, subjective questions, and questions beyond its capabilities.

While this hasn’t stopped people from circumventing the filters, it does raise the question of how this technology could be used in the future. And although it may seem that ChatGPT just exploded onto the scene, work on chatbots has been going on for many years.

Just Chatting👄

One of the most famous examples of a chatbot was ELIZA, named after Eliza Doolittle, a character in George Bernard Shaw's play Pygmalion. It was created in the 1960s by Joseph Weizenbaum3. It used simple rules to respond to input, such as turning user input like “I am feeling happy” into questions like “Do you enjoy being happy?” or “Do you get happy often?”. ELIZA can still be used today in the form of a Rogerian psychotherapist in the Emacs4 text editor by running the command M-x doctor5. Other chatbots developed over the years included web-based ones like Jabberwacky6 (released in 1997) and its successor Cleverbot7 (released in 2008), which I remember using myself.
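To make the idea of rule-based responses concrete, here is a minimal sketch of an ELIZA-style responder. The rules, the fallback line, and the function name `respond` are hypothetical and written for this post; they are not taken from Weizenbaum's original script, which was considerably more elaborate.

```python
import re

# Hypothetical rules for illustration only; not ELIZA's original script.
# Each rule pairs a pattern with a question template that reuses the
# captured part of the user's sentence.
RULES = [
    (re.compile(r"\bi am feeling (.+)", re.IGNORECASE), "Do you enjoy being {0}?"),
    (re.compile(r"\bi feel (.+)", re.IGNORECASE), "Do you get {0} often?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
]
FALLBACK = "Please tell me more."


def respond(user_input: str) -> str:
    """Return the reply for the first rule whose pattern matches the input."""
    text = user_input.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(match.group(1))
    return FALLBACK


if __name__ == "__main__":
    for line in ["I am feeling happy", "I feel happy.", "Hello there"]:
        print(f"> {line}")
        print(respond(line))
```

There is no understanding here at all: the program only reflects the user's own words back inside a canned question, which is roughly why such simple rules could still feel conversational.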

An example of a chatbot gone wrong was Microsoft’s Tay AI8. Released in early 2016, it was designed to mimic a 19-year-old American girl and was targeted at 18-to-24-year-olds. Microsoft set up a Twitter account for Tay called TayandYou that would tweet and learn from the conversations it had. However, as highlighted in the Wired article It's Your Fault Microsoft's Teen AI Turned Into Such a Jerk9, things didn’t go as planned for Microsoft. Warning: hateful and offensive tweets below…

gerry @geraldmellor

"Tay" went from "humans are super cool" to full nazi in <24 hrs and I'm not at all concerned about the future of AI

In less than 24 hours Tay had been corrupted into saying hateful and racist things, causing Microsoft to quickly backpedal and take Tay offline. While the Twitter account still exists, its tweets are protected. It is unclear whether the tweets can still be viewed if you follow Tay or if they have all been deleted. While Microsoft’s attempt at a social AI failed, a company called OpenAI had been founded in 2015, only one year before the Tay debacle. It would go on to produce a technology so powerful and convincing that it would put the company on a collision course with Microsoft.

GPT across time ⌛

The foundation for ChatGPT was laid in 2018 with a paper called Improving Language Understanding by Generative Pre-Training10. This marked the creation of GPT-1. The paper states:

We demonstrate that large gains on these tasks can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task

Generative pre-training solves a common problem when building Large Language Models (LLMs). Finding large amounts of labeled text on which to train is difficult, and the difficulty grows depending on the domain. By training on unlabeled data first, you can initialize the model weights and create a model that is more general. Once that model has been created, it can be fine-tuned on specific labeled tasks in domains like coding and storytelling. OpenAI reports that this approach outperforms models trained only on labeled text and designed for specific domains.
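One way to see why unlabeled text is enough for pre-training is that the text supplies its own targets: at every position, the "label" is simply the next token. The sketch below is a simplification written for this post. The function name `next_token_pairs`, the `context_size` parameter, and the whitespace tokenization are all assumptions for illustration; real GPT models use subword tokenization and much longer contexts.

```python
from typing import List, Tuple


def next_token_pairs(text: str, context_size: int = 3) -> List[Tuple[List[str], str]]:
    """Turn raw text into (context, next token) training pairs.

    No human labeling is needed: each target is just the token that
    follows its context in the original text.
    """
    tokens = text.split()  # assumption: naive whitespace tokenization
    pairs = []
    for i in range(1, len(tokens)):
        context = tokens[max(0, i - context_size):i]
        pairs.append((context, tokens[i]))
    return pairs


if __name__ == "__main__":
    for context, target in next_token_pairs("the cat sat on the mat"):
        print(f"{' '.join(context):<12} -> {target}")
```

A model pre-trained on pairs like these learns general patterns of language, and the much smaller labeled datasets are then only needed for the fine-tuning step.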

GPT-2 was the first time OpenAI’s technology went viral. Announced in February 2019, it came with impressive examples of text generated from prompts written by the researchers. In its announcement OpenAI said:

Due to our concerns about malicious applications of the technology, we are not releasing the trained model.

Soon after, the media picked up the story, and articles began to appear with headlines like this one from the New York Post: “This AI is so good at writing, its creators won’t release it”11. The most popular story created by GPT-2 was a fake news article about the discovery of unicorns in the Andes by a group of researchers. This was all possible due to the increased power of GPT-2. OpenAI said…

GPT-2 is a direct scale-up of GPT, with more than 10X the parameters and trained on more than 10X the amount of data.

While GPT-2 marked a 10x increase over GPT-1, the following year would mark an even bigger leap. The GPT-3 paper (first posted in May 2020 and revised through July 22nd, 2020) detailed the improvements, with the model being over 100 times the size of GPT-2. Now at 175 billion parameters, it was able to generate even more interesting and believable text. This caught the eye of Microsoft, which exclusively licensed the technology12 for GPT-3 in September of that year. GPT-3 would become the backbone of Codex, the model that powers GitHub Copilot13, which became generally available in June of 2022.
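To put rough numbers on those jumps, here is a small back-of-the-envelope sketch. The 175 billion figure for GPT-3 comes from the text above; the GPT-1 and GPT-2 parameter counts (about 117 million and 1.5 billion) are the commonly cited sizes and are an assumption here, since this post does not state them. The names `PARAMETER_COUNTS` and `scale_factor` are made up for the example.

```python
# Approximate parameter counts. Only the GPT-3 figure appears in this post;
# the GPT-1 and GPT-2 figures are commonly cited values (assumption).
PARAMETER_COUNTS = {
    "GPT-1": 117_000_000,
    "GPT-2": 1_500_000_000,
    "GPT-3": 175_000_000_000,
}


def scale_factor(smaller: str, larger: str) -> float:
    """Return how many times more parameters `larger` has than `smaller`."""
    return PARAMETER_COUNTS[larger] / PARAMETER_COUNTS[smaller]


if __name__ == "__main__":
    print(f"GPT-1 -> GPT-2: {scale_factor('GPT-1', 'GPT-2'):.1f}x")
    print(f"GPT-2 -> GPT-3: {scale_factor('GPT-2', 'GPT-3'):.1f}x")
```

Under these assumed figures the first jump is a bit more than 10x and the second a bit more than 100x, which lines up with the claims quoted above.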

The present 🎁

Which finally leads us to ChatGPT14. ChatGPT is the culmination of all the work on Generative Pre-trained Transformers that OpenAI has done since 2018. It uses version 3.5 of the GPT model and has taken the world by storm. It was released on November 30th, 2022 to critical acclaim and quickly gained users.

Greg Brockman @gdb

ChatGPT just crossed 1 million users; it's been 5 days since launch.

Sam Altman @sama

little openai update: gpt-3, github copilot, and dall-e each have more than 1 million signups! took gpt-3 ~24 months to get there, copilot i think around 6 months, and dall-e only 2.5 months.

The huge influx of users highlights a major problem with models like these: they are very expensive to run. Tom Goldstein made an attempt to calculate the running cost of ChatGPT, and the numbers are eye-watering.

Tom Goldstein @tomgoldsteincs

I estimate the cost of running ChatGPT is $100K per day, or $3M per month. This is a back-of-the-envelope calculation. I assume nodes are always in use with a batch size of 1. In reality they probably batch during high volume, but have GPUs sitting fallow during low volume.

Currently ChatGPT is in the free research preview stage. OpenAI has not said how long ChatGPT will remain in that stage, but as it currently stands the free publicity has not hurt the company. And if you read the FAQ15, you will see a little line about your conversations:

Will you use my conversations for training?
Yes. Your conversations may be reviewed by our AI trainers to improve our systems.

So the more than 1 million users currently using ChatGPT will help improve the model. Eventually OpenAI will have to find some way to monetize it. Time will tell what this will end up looking like, but many power users have already said they would be willing to pay for the service when it is monetized. Yet even as some users have naturally integrated ChatGPT into their workflow, others have expressed concern about a world where AI-generated text is this easy to make.

Fake News 📰 and Bots 🤖

Websites and services have been scrambling to stem the tide of AI-generated content, as things like AI art with DALL-E16 and videos with Deepfakes17 have become easier to produce. Stack Overflow18, a popular website where programmers can get high-quality answers to programming questions, recently banned answers generated by ChatGPT19 and other GPT models. The site cites the convincing tone ChatGPT takes even when giving wrong answers as being harmful to the standards it upholds. Stack Overflow is not the only one concerned about misinformation.

YouTube has had a comment spam problem for a long time. In a video released 8 months ago, popular YouTuber Marques Brownlee highlighted the issue. Many popular channels are overrun with bots impersonating the channel in the comments. These comments usually use the channel's profile picture and link to a Telegram or WhatsApp chat. By impersonating the channels, the bots are trying to scam people out of money.

These spam comments are pretty easy to spot, but scammers wouldn’t keep posting them if they didn’t work. With GPT-style chatbots, these bots could become far more realistic sounding.

A common internet tactic to "prove" you are real is to produce a unique photo of yourself that can't be found on the internet. With AI technologies, this is becoming easier to fake as well. Beyond YouTube, Stack Overflow, and Twitter, this could be especially detrimental in the political sphere, as bots have already been used during high-profile elections20 to disrupt discourse.

Tools like the Hugging Face GPT-2 Output Detector21 can help to detect generated text, but they aren’t perfect, and progress on the models is far outpacing the tools built to detect their output. With GPT-422 rumored to be coming out next year with a model size 600x larger than GPT-3, it seems safe to assume that its output will be even more realistic than what ChatGPT on version 3.5 can produce.

What does it all mean? 🤔

Generative pre-trained large language models are changing the way we interact with computers. The lifelike dialogue they produce is something we could only have dreamed of sixty years ago. ChatGPT has played a big role in democratizing access to these types of models. Previously they were only accessible to researchers, institutions, people with money, and programmers. I believe we are at an inflection point with advances in AI-generated video, pictures, and dialogue. Even as I finish up this post, a report just broke that Microsoft may add ChatGPT-like functionality to Bing23. Given Microsoft's exclusivity agreement with OpenAI over GPT-3, their previous foray with Tay, and their AI assistant Cortana, it's at least safe to assume Microsoft has a vested interest in the space. Where will this all lead? Only time will tell.

If you made it this far, thanks for reading! I’m still new here and trying to find my voice. If you liked this article, please consider liking and subscribing. And if you haven’t already, why not check out another article of mine?