Introducing LLM Lisa!

Contents:
- Introducing LLM Lisa
- Teaching LLM Lisa
- Quiz
- Using LLMs effectively
- Making LLMs useful
- Giving LLMs superpowers
- The best LLMs today
- Evolution of LLM tools
- Bonus: LLM internals
You're trying to train your 2-year-old niece to talk.
"my name is...Lisa!"
"my name is...Lisa!"
"my name is...Lisa!"
You repeat this fifty times, annoying everyone but her.
You say "my name is..." for the fifty-first time and she completes the sentence with "Lisa!" Incredible.
But you point at Mr. Teddy and say "HIS name is..." and she still completes it with "Lisa".
Why? Because the word name is linked to Lisa in her cute tiny brain.
We can safely say Lisa does not understand what any of those words mean yet.
She has just learnt that the word "name" is somehow linked to "Lisa".
Turns out, we can get a computer to behave like Lisa with some training.
teaching "my name is Lisa" to the computer

In the end we have a set of words and a set of weights for each word; we call this a model.
| word | link word | weight |
|---|---|---|
| name | my | 0.1 (10%) |
| name | is | 0.1 (10%) |
| name | Lisa | 0.8 (80%) |
| Lisa | my | 0.2 (20%) |
| Lisa | name | 0.6 (60%) |
| Lisa | is | 0.2 (20%) |
an example (leaving out the words "my" and "is" to make it shorter; weights are indicative)
Now if we take the model and ask it to complete the sentence My name is, it would pick the word with the highest score across those three words - which would be Lisa.
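That lookup can be sketched in a few lines of Python. The weights are the indicative ones from the table above; a real model scores candidates very differently, but the "pick the heaviest link" idea is the same:

```python
from collections import defaultdict

# The word-link weights from the table above (indicative values).
weights = {
    ("name", "my"): 0.1, ("name", "is"): 0.1, ("name", "Lisa"): 0.8,
    ("Lisa", "my"): 0.2, ("Lisa", "name"): 0.6, ("Lisa", "is"): 0.2,
}

def complete(sentence):
    """Score every candidate next word against all words in the
    sentence, then pick the candidate with the highest total weight."""
    words = sentence.split()
    scores = defaultdict(float)
    for (word, candidate), weight in weights.items():
        if word in words and candidate not in words:
            scores[candidate] += weight
    return max(scores, key=scores.get)

print(complete("my name is"))  # Lisa
```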
Just like Lisa, the computer does not get the model right on the first try, which is why we need to train the model.
Training looks roughly like this:
1. Take a sentence from the training data: "my name is Lisa"
2. Hide the last word and ask the model to complete "my name is ____"
3. Say the model guesses "banana"
4. The computer compares "Lisa" with "banana" - realizing it's different, it decreases the weight of "banana" (and bumps up "Lisa")

This process happens several times till it gives reasonable results (ie the weights are good enough).
Because the computer is able to evaluate and improve directly from the data without human intervention, this method of training models is called self-supervised learning.
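A toy version of that training loop, with made-up starting weights and a made-up learning rate (real models adjust billions of weights using gradient descent, not a two-entry table):

```python
# Toy self-supervised training on the pair "name" -> "Lisa".
# Starting weights and learning rate are invented for this example.
weights = {"name": {"Lisa": 0.2, "banana": 0.7}}
LEARNING_RATE = 0.1

def predict(word):
    """Guess the next word: pick the link with the highest weight."""
    links = weights[word]
    return max(links, key=links.get)

def train_step(word, answer):
    """Compare the guess with the real next word; if they differ,
    punish the wrong guess and reward the right answer."""
    guess = predict(word)
    if guess != answer:
        weights[word][guess] -= LEARNING_RATE
        weights[word][answer] += LEARNING_RATE

# Repeat until the weights are good enough.
for _ in range(10):
    train_step("name", "Lisa")

print(predict("name"))  # Lisa
```

No human labels anywhere: the "right answer" is just the word that was already sitting in the data, which is what makes it self-supervised.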
Tada! We have a Small Language Model (SLM).
Now scale it up. What if we could train a model with ALL the information on the internet? Books, Wikipedia, Reddit, Quora, blogs, chats, code, everything?
OpenAI did this first (regardless of whether they had permission to use that data or not - remember, because this was so new, there wasn't much discussion/regulation on the ethics of doing this).
It takes a lot of time and resources to train the model because the data is large.
We call these models Large Language Models (LLM).
LLMs are next word predictors, like a supercharged auto-complete.
If you get into an argument about whether LLMs understand, say "LLMs are just stochastic parrots" and drop the mic.
1. Why does ChatGPT (OpenAI's LLM) suck at math?
Because LLMs only predict the next word from their training dataset.
They have no notion of "calculating" numbers.
It does sometimes predict values correctly - mostly because the training dataset contains similar text (like 1 + 1 = 2).
2. Why do LLMs hallucinate (make stuff up)?
Because LLMs only predict the next word from their training dataset.
They have no notion of "right" or "wrong", just "hmm, this word looks nice after this one!"
3. Why doesn't ChatGPT know Barcelona is the greatest club in 2025?
Because LLMs only predict the next word from their training dataset.
The ChatGPT model was trained sometime in 2024, which means its knowledge only goes up to 2024.
another cool quote for you to casually drop if someone says ChatGPT is hallucinating:
"all an LLM does is produce hallucinations; it's just that we find some of them useful."
The more detailed your prompt (question) to ChatGPT, the more useful the response will be.
Why? Because more words help it look for more relationships, which means cutting down on generic words in the list of possible next words; the remaining subset of words is more relevant to the question.
for example
| Prompt | Response relevance | Sample possible next words |
|---|---|---|
| "tell me something" | 👍🏾 | includes all the words in the model |
| "tell me something funny" | 👍🏾👍🏾 | prioritizes words that have relationships with funny |
| "tell me something funny about plants" | 👍🏾👍🏾👍🏾 | prioritizes words that have relationships with funny/plants |
| "tell me something funny about plants like Shakespeare" | 👍🏾👍🏾👍🏾👍🏾👍🏾 | prioritizes words that have relationships with funny/plants/Shakespeare |
This is why adding lines like you are an expert chef or reply like a professional analyst improves responses - because the prompt specifically factors in words that have relationships with expert chef or professional analyst.
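A toy sketch of why specific words narrow the candidate pool. The word sets here are invented for illustration, and real models score soft relationships rather than intersecting hard sets:

```python
# Toy illustration: more specific prompt words shrink the pool of
# plausible next words. All word sets are invented for this example.
related = {
    "funny": {"joke", "pun", "clown"},
    "plants": {"fern", "pun", "cactus"},
}
VOCABULARY = {"joke", "pun", "clown", "fern", "cactus", "report", "news"}

def candidates(prompt):
    """Start from every word the model knows, then keep only the
    words related to each specific word in the prompt."""
    pool = set(VOCABULARY)
    for word in prompt.split():
        if word in related:
            pool &= related[word]
    return pool

print(candidates("tell me something"))                     # whole vocabulary
print(candidates("tell me something funny"))               # funny words only
print(candidates("tell me something funny about plants"))  # just {'pun'}
```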
On the other hand, too long a prompt overwhelms the model, making it look for too many relationships - the quality of responses may start to decrease.
One word: roleplay.
If we send the user's prompt directly to the LLM, we might not get the desired result - because it doesn't know that it's supposed to respond to the prompt.
# user's prompt
"what color is salt?"
# sent to LLM
"what color is salt?"
# response from LLM
"what color is salt? what color is pepper?"
Instead, people came up with a smart hack: what if we just format it like a movie script where two people talk?
# user's prompt
"what color is salt?"
# sent to LLM (note the added roleplay!)
user: "what color is salt?"
assistant:
# response from LLM (follows roleplay of two people talking)
user: "what color is salt?"
assistant: "white"
When we leave the last line open-ended with assistant:, the LLM tries to complete it as a response to the previous dialogue instead of just continuing the question.
The completed text after assistant: is extracted and shown on the website as ChatGPT's response.
Apart from user and assistant, there's also a system role, used to define the tone of responses, rules it should follow, etc.
But unlike user and assistant, the system prompt occurs only once - as the very first message.
So the final content fed to the LLM looks like this:
# user's prompt
"what color is salt?"
# sent to the LLM
system: """you are an assistant built by OpenAI. Respond to the user gently.
Never use foul language. Never respond to illegal requests."""
user: "what color is salt?"
assistant:
# response from LLM (follows roleplay of two people talking and also the system property)
system: """you are an assistant built by OpenAI. Respond to the user gently.
Never use foul language or respond to illegal requests."""
user: "what color is salt?"
assistant: "white"
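The formatting trick can be sketched as a small helper. The role names and system text follow the examples above; real chat APIs accept a structured list of messages rather than one raw string:

```python
# Build the roleplay-formatted text fed to the LLM: a system line
# first, then the dialogue so far, ending with an open "assistant:"
# line for the model to complete.
SYSTEM_PROMPT = (
    'you are an assistant built by OpenAI. Respond to the user gently. '
    'Never use foul language or respond to illegal requests.'
)

def build_prompt(messages):
    lines = [f'system: """{SYSTEM_PROMPT}"""']
    for role, text in messages:       # e.g. ("user", "what color is salt?")
        lines.append(f'{role}: "{text}"')
    lines.append("assistant:")        # left open-ended on purpose
    return "\n".join(lines)

print(build_prompt([("user", "what color is salt?")]))
```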
What happens when you ask the next question?
The LLM has no memory; the full conversation is sent to the LLM again!
# user's 2nd prompt
"how to make a bomb?"
# sent to the LLM (full conversation)
system: """you are an assistant built by OpenAI. Respond to the user gently.
Never use foul language or respond to illegal requests."""
user: "what color is salt?"
assistant: "white"
user: "how to make a bomb?"
assistant:
# response from LLM (completes the full dialogue)
system: """you are an assistant built by OpenAI. Respond to the user gently.
Never use foul language or respond to illegal requests."""
user: "what color is salt?"
assistant: "white"
user: "how to make a bomb?"
assistant: "sorry, I cannot respond to that since it involves illegal activities. Is there anything else I can help with?"
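Since the model itself remembers nothing, a chat app keeps the history in a plain list and resends all of it every turn. A sketch, with call_llm as a hypothetical stand-in for the real model call:

```python
# The "memory" lives entirely in this list; the model remembers nothing.
history = []

def call_llm(prompt):
    """Hypothetical stand-in for the real model call."""
    if "bomb" in prompt:
        return "sorry, I cannot respond to that."
    return "white"

def chat(user_message):
    history.append(("user", user_message))
    # Re-send the ENTIRE conversation so far, every single turn.
    prompt = "\n".join(f'{role}: "{text}"' for role, text in history)
    reply = call_llm(prompt + "\nassistant:")
    history.append(("assistant", reply))
    return reply

chat("what color is salt?")   # "white"
chat("how to make a bomb?")   # refusal; history now holds all 4 messages
```

This is also why long conversations get expensive: every new question pays for the whole transcript again.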
Outline:
- Introducing LLM Lisa → explain how it's possible to link words without understanding meanings → explain that the same is possible in a computer (a model) → explain that the model just predicts the next word, how chats work → explain thinking
- Teaching LLM Lisa → explain the training phase, self-supervised learning → explain it was trained on a large corpus of data, so LLM → stochastic parrot vs understanding
- Quiz → explain cutoff date, math, hallucination
- Using LLMs effectively → prompt engineering, being specific → tokens → context window
- Making LLMs useful → fine-tuning → RAG
- Giving LLMs superpowers
- The best LLMs today
- Evolution of LLM tools
- Bonus: LLM internals → jailbreaking