KINOMOTO.MAG

AI Basics: Lesson 06

Generative AI & LLMs

Generative AI is like a special type of smart assistant that learns from tons of information humans have created. Imagine it as a virtual brain that studies everything it’s fed, like books, articles, and conversations, to learn how language works.

These brainy models, called large language models (LLMs), are like super-smart students that have read billions of words. They don’t just understand language; they can also solve problems and work through complex tasks.

Think of LLMs as big containers of memory. The more memory they have, in the form of parameters, the more capable they become and the more they can do.

You talk to them using prompts, which are like questions or instructions written in everyday language. By fine-tuning a model, you can customize its responses to fit different needs without starting from scratch.

Unlike regular computer code, where you write instructions in a formal language, with LLMs, you can talk to them just like you talk to a friend. When you ask them something, they generate a response, just like having a conversation. The response they give is called a completion, and the process is called inference.
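
To make those three words concrete, here’s a rough sketch in Python of what talking to an LLM looks like from a program. The endpoint URL, model name, and response fields are made-up placeholders, not a real service, but real LLM APIs follow a very similar request-and-response shape.

```python
import requests

# A hypothetical inference endpoint -- real LLM providers expose
# similarly shaped HTTP APIs, but this URL and its fields are made up.
API_URL = "https://example.com/v1/complete"

def get_completion(prompt: str) -> str:
    """Send a natural-language prompt and return the model's completion."""
    response = requests.post(
        API_URL,
        json={"model": "example-llm", "prompt": prompt},
        timeout=30,
    )
    response.raise_for_status()
    # The generated text is the "completion";
    # producing it is called "inference".
    return response.json()["completion"]

print(get_completion("Explain photosynthesis in one sentence."))
```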

What are LLMs?

Large Language Models (LLMs) are like super-smart computer programs that have been trained to understand and generate human language. Imagine them as virtual brains that have read and learned from huge amounts of text, like books, articles, and conversations. They’ve become so good at understanding language that they can do things like answer questions, write stories, and even translate languages.

These models have been trained on massive datasets containing billions of words, over several weeks or months, using powerful computers to process all that information. Broadly speaking, the more text they’re trained on, the more capable they become.

Think of LLMs as digital language experts that you can interact with by giving them prompts, which are like instructions or questions in natural language. They use their vast knowledge to generate responses that are often very accurate and human-like.

LLM use cases and tasks

Imagine that LLMs and generative AI are versatile tools that can do a lot more than just chat. While chatbots are popular and get a lot of attention, the technique behind them, called next-word prediction, can power many other text-generation tasks.

For instance, you can ask an LLM to write an essay on a topic you give it or to summarize a conversation you provide. It can even translate languages or turn natural language into computer code. For example, if you need a piece of Python code to calculate the mean of the columns in a dataset, you can ask the model to generate it for you, as in the sketch below.
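
As an illustration, here’s the kind of snippet a model might hand back for that request. It uses the pandas library; the file name "data.csv" is just a placeholder.

```python
import pandas as pd

# Load a dataset -- "data.csv" is a hypothetical file name.
df = pd.read_csv("data.csv")

# Compute the mean of every numeric column.
column_means = df.mean(numeric_only=True)
print(column_means)
```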

LLMs can also do smaller tasks like finding specific information in a text, such as identifying people or places mentioned in a news article. This is called named entity recognition, where the model uses its understanding of language to find and categorize words.
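
One simple way to do this with an LLM is to just ask. Here’s a sketch that reuses the same made-up inference endpoint from the earlier example; the prompt wording and the sample answer are illustrative, not guaranteed output.

```python
import requests

# Same hypothetical inference endpoint as the earlier sketch.
API_URL = "https://example.com/v1/complete"

article = "Ada Lovelace met Charles Babbage in London in 1833."
prompt = (
    "List the people and places mentioned in the following text, "
    "labeled PERSON or PLACE:\n\n" + article
)

response = requests.post(
    API_URL, json={"model": "example-llm", "prompt": prompt}, timeout=30
)
response.raise_for_status()
print(response.json()["completion"])
# A capable model might answer:
#   PERSON: Ada Lovelace, Charles Babbage
#   PLACE: London
```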

Another exciting area is connecting LLMs to external data sources or other computer programs through APIs. This allows the model to access information it doesn’t already know and interact with real-world systems.

As LLMs become larger and more complex, their understanding of language grows too. This understanding is stored within the model’s parameters, which are like its brain. Even smaller models can be trained to perform specific tasks really well.

The incredible progress we’ve seen in LLMs in recent years is thanks to the architecture that powers them.

What are APIs?

Imagine you’re in a restaurant. You have a menu in front of you, and you use that menu to order food from the kitchen. In this scenario, the menu is like an API.

Now, think of a software application as the kitchen. It has all the ingredients and tools needed to prepare the food, or in software terms, to perform tasks and provide services. But just like you can’t walk into the kitchen and start cooking without knowing what ingredients are available or how to use the equipment, other software programs can’t interact with the application without some guidance.

That’s where the menu, or API, comes in. It’s like a list of options that tells other software programs how they can communicate with the application. It defines what requests they can make and what responses they can expect. So, instead of having to understand all the inner workings of the kitchen (the software application), other programs can simply use the menu (API) to order what they need.

For example, let’s say you have a weather app on your phone. That app gets its weather data from a weather service’s API. When you open the app, it sends a request to the weather service’s API, asking for the current weather in your area. The API then processes that request and sends back the relevant weather information, which the app displays to you.
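
In code, that exchange might look roughly like this. The URL, query parameters, and response fields below are invented for illustration; a real weather service spells out its own in its API documentation.

```python
import requests

# Hypothetical weather API -- the URL, parameters, and response
# fields here are placeholders, not a real service.
response = requests.get(
    "https://api.example-weather.com/current",
    params={"city": "Tokyo", "units": "metric"},
    timeout=10,
)
response.raise_for_status()

data = response.json()
# The app only needs to understand the "menu" (the documented
# request and response format), not the service's inner workings.
print(f"{data['city']}: {data['temperature']}°C, {data['conditions']}")
```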

Text generation before transformers

While generative algorithms are not a new concept, the technology behind them has evolved over time. Previous generations of language models relied on recurrent neural networks (RNNs). These were powerful for their time but had limitations, particularly in the amount of computing power and memory they needed to perform well at generative tasks.

Imagine you have an RNN trying to predict the next word in a sentence. With only one previous word to work with, its prediction might not be very accurate. As you try to make the RNN look at more preceding words, you need to allocate significantly more resources, but even then, the model might still struggle to make accurate predictions.
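
For the curious, here’s a minimal sketch of an RNN next-word predictor using PyTorch. The vocabulary size and layer dimensions are arbitrary choices; the point to notice is that everything the model has read gets squeezed into one fixed-size hidden state, which is exactly the bottleneck described above.

```python
import torch
import torch.nn as nn

VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 10_000, 64, 128  # arbitrary sizes

class NextWordRNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, EMBED_DIM)
        self.rnn = nn.RNN(EMBED_DIM, HIDDEN_DIM, batch_first=True)
        self.out = nn.Linear(HIDDEN_DIM, VOCAB_SIZE)

    def forward(self, token_ids):
        embedded = self.embed(token_ids)      # (batch, seq, embed)
        outputs, hidden = self.rnn(embedded)  # reads one token at a time
        # Predict the next word from the final hidden state only:
        # everything the model saw is compressed into this single
        # fixed-size vector, which is the RNN's bottleneck.
        return self.out(hidden[-1])           # (batch, vocab) logits

model = NextWordRNN()
sentence = torch.randint(0, VOCAB_SIZE, (1, 5))  # 5 random token ids
next_word_logits = model(sentence)
print(next_word_logits.shape)  # torch.Size([1, 10000])
```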

The challenge lies in the complexity of language. Words can have multiple meanings, and understanding them often requires looking at the whole context of the sentence or even the entire document. For example, the word “bank” could refer to a financial institution or the side of a river, and its meaning depends on the context.

This complexity led to a breakthrough in 2017 with the introduction of the transformer architecture, famously described in the paper “Attention Is All You Need” by researchers at Google and the University of Toronto. This new approach changed the game for generative AI. The transformer architecture is efficient, capable of processing large datasets in parallel, and most importantly, it learns to pay attention to the meaning of words as it processes them.
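
The core trick is called scaled dot-product attention. Here’s a tiny NumPy sketch of that computation with made-up sizes; a real transformer runs many of these attention “heads” in parallel over learned projections of the input.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Core attention computation: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # relevance of each word to each other word
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the sequence
    return weights @ V  # blend word representations by relevance

# Toy example: 4 words, each represented by an 8-dimensional vector.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
output = scaled_dot_product_attention(x, x, x)  # self-attention: Q = K = V
print(output.shape)  # (4, 8)
```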

In essence, attention is all you need, as the title of the paper suggests. This shift in architecture paved the way for the advancements we see in generative AI today.