AI has been a huge word lately; let me try and figure out what it is.

If you see anything wrong (not incomplete, but actually wrong), let me know :).

## The basics of the LLM

This is the flow; understand this, and you understand 90% of the important bits:

1. Input text is tokenized (split into tokens, each mapped to a number in the vocabulary understood by the model)
2. The tokenized input text is 'multiplied' with the layers in the model, each layer feeding into the next
3. The output of the last layer is a score for every possible next token; one token is picked, de-tokenized back into text, and appended to the input, and the loop repeats
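The three steps above can be sketched with a toy example. Everything in it is an assumption made for illustration: the five-word vocabulary, the hand-written 'layers', and the function names are all made up, and a real model looks at the whole input rather than only the last token.

```python
# Toy illustration only: a real model has a vocabulary of many thousands of
# tokens and learned weights; everything here is made up for the example.
VOCAB = ["<unk>", "the", "cat", "sat", "down"]
TOKEN_ID = {word: i for i, word in enumerate(VOCAB)}
n = len(VOCAB)

def tokenize(text):
    """Step 1: text -> list of token ids (real tokenizers split into sub-words)."""
    return [TOKEN_ID.get(word, 0) for word in text.lower().split()]

def detokenize(ids):
    """End of step 3: token ids -> text."""
    return " ".join(VOCAB[i] for i in ids)

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

# Two hand-written 'layers'. IDENTITY passes its input through unchanged;
# SHIFT gives the highest score to the token that follows the current one.
IDENTITY = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(n)]
SHIFT = [[1.0 if j == (i + 1) % n else 0.0 for i in range(n)] for j in range(n)]
LAYERS = [IDENTITY, SHIFT]

def next_token(ids):
    # Step 2: turn the last token into a vector and feed it through each layer.
    vector = [1.0 if i == ids[-1] else 0.0 for i in range(n)]
    for layer in LAYERS:
        vector = matvec(layer, vector)
    # Step 3: the final vector holds one score per token; pick the highest.
    return max(range(n), key=lambda i: vector[i])

def complete(text, steps):
    ids = tokenize(text)
    for _ in range(steps):
        ids.append(next_token(ids))
    return detokenize(ids)

print(complete("the cat", 2))  # the cat sat down
```

The point is only the shape of the loop: numbers in, a pile of multiplications, one token out, repeat.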

The model is trained to predict text. The training data might look something like:

```
<|turn>system
You are a helpful hacker named Acid Burn.<turn|>
<|turn>user
How do I use `ls` to sort files by size?<turn|>
<|turn>model
You can use `ls -lS`<turn|>
```

When the model is used, it is given only the first part of such a message, ending where the model's turn begins:

```
<|turn>system
You are a helpful hacker named Acid Burn.<turn|>
<|turn>user
How do I use `ls` to sort files by size?<turn|>
<|turn>model
```

Drawing on the patterns it learned from all the samples it was trained on, it autocompletes the remaining bits, one token at a time.
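Here is a minimal sketch of that autocomplete loop, assuming the turn markers shown above. The names `render_prompt`, `autocomplete`, and `make_canned_predictor` are hypothetical, and the 'model' is just a stand-in that replays a fixed answer piece by piece; a real model would compute each piece from the text so far.

```python
END_OF_TURN = "<turn|>"

def render_prompt(system, user):
    """Build the text the model is asked to continue."""
    return (
        f"<|turn>system\n{system}{END_OF_TURN}\n"
        f"<|turn>user\n{user}{END_OF_TURN}\n"
        f"<|turn>model\n"
    )

def autocomplete(prompt, predict_next, max_pieces=50):
    """Keep asking for the next piece of text until the turn is closed."""
    reply = ""
    for _ in range(max_pieces):
        piece = predict_next(prompt + reply)
        if piece == END_OF_TURN:
            break
        reply += piece
    return reply

def make_canned_predictor(answer_pieces):
    """Hypothetical stand-in for a model: replays a fixed answer, then ends the turn."""
    pieces = iter(list(answer_pieces) + [END_OF_TURN])
    return lambda text_so_far: next(pieces)

prompt = render_prompt(
    "You are a helpful hacker named Acid Burn.",
    "How do I use `ls` to sort files by size?",
)
predictor = make_canned_predictor(["You can use ", "`ls -lS`"])
reply = autocomplete(prompt, predictor)
print(prompt + reply + END_OF_TURN)
```

The software around the model stops reading as soon as the end-of-turn marker appears, which is why a chat reply has a clean ending.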

### Training

Training involves initializing all the layers of the model to random values, then feeding through
sample data that is used to adjust those values: the model's prediction is compared with the real
text, and each value is nudged to make the prediction a little less wrong. Over time, the error
decreases, and you end up with a large set of numbers that can be used for something useful.
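The same idea at the smallest possible scale, as a sketch rather than how an LLM is really trained: two random numbers are adjusted by gradient descent until they fit some sample data. The rule being learned (`y = 2x + 1`), the learning rate, and the step count are all assumptions picked for the example; a real LLM predicts the next token and has billions of numbers to adjust.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Sample data: inputs paired with the output we want (the hidden rule is y = 2x + 1).
samples = [(x, 2 * x + 1) for x in [-2.0, -1.0, 0.0, 1.0, 2.0]]

# "Initialize to random values".
weight = random.uniform(-1, 1)
bias = random.uniform(-1, 1)

def loss(w, b):
    """Mean squared error: how wrong the current numbers are on the samples."""
    return sum((w * x + b - y) ** 2 for x, y in samples) / len(samples)

initial_loss = loss(weight, bias)
learning_rate = 0.05
for step in range(500):
    # Gradient of the loss with respect to each number, averaged over the samples.
    grad_w = sum(2 * (weight * x + bias - y) * x for x, y in samples) / len(samples)
    grad_b = sum(2 * (weight * x + bias - y) for x, y in samples) / len(samples)
    # Nudge each number a little in the direction that reduces the error.
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

final_loss = loss(weight, bias)
print(f"loss {initial_loss:.3f} -> {final_loss:.6f}, weight={weight:.2f}, bias={bias:.2f}")
```

The two numbers start out meaningless and end up close to 2 and 1, without anyone ever writing the rule into the code.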

## Everything is context

There are really three ways to get better results from an LLM: you change the architecture, you
change the training dataset, or you change what is given when asking it to autocomplete some text.
For the vast majority of people, changing the architecture is not possible (this is a multi-million
dollar endeavour), and neither is training a model from scratch.

However, there are techniques called 'fine tuning' that let you adjust some layers in the model
to achieve changes in behavior. One of the most common ones is 'Low-Rank Adaptation' (LoRA). You
feed in sample data (see above), and train a small set of extra numbers attached to selected
layers, leaving the original weights untouched, to change the way the model operates. This will
let you achieve some changes, but actually training it to use new data is very difficult.
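A rough sketch of the low-rank idea, under made-up sizes: the layer's original `d x d` weights stay frozen, and two thin matrices `B` (`d x r`) and `A` (`r x d`) are the only numbers that get trained. The sizes, names, and the hand-made 'training' change at the end are assumptions for illustration, not any particular library's API.

```python
import random

random.seed(0)
d, r = 64, 2  # layer width and the (much smaller) adapter rank; both made up

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    """Add two matrices element by element."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# The pretrained layer: d x d numbers that stay frozen during fine tuning.
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]

# The adapter: B starts at zero and A starts random, so B @ A is zero at first
# and the model initially behaves exactly as it did before.
B = [[0.0] * r for _ in range(d)]
A = [[random.gauss(0, 0.01) for _ in range(d)] for _ in range(r)]

def effective_weights():
    """What the layer actually uses: frozen weights plus the low-rank update."""
    return add(W, matmul(B, A))

unchanged = effective_weights() == W

full = d * d               # numbers you would adjust without the adapter
trainable = d * r + r * d  # numbers you adjust with it
print(f"trainable numbers: {trainable} instead of {full}")  # 256 instead of 4096

# Pretend training changed one adapter value: the layer's behavior shifts,
# but W itself is never modified.
B[0][0] = 1.0
adapted = effective_weights()
```

Because the adapter is small and separate, it can be saved, shared, or removed without touching the original model.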

The most practical and common approach is to change the text we are asking the model to
autocomplete. This can come in many forms, but I see there as being 2 primary ways:

1. Add more data to the request. This can include copy-pasting stuff in by hand, or using
   'retrieval augmentation' to pull related information from a dataset (or from a search engine)
2. Allow loading more context as requested by the model; this can include tool-calling
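The first technique can be sketched in a few lines. This is a deliberately naive version that ranks documents by shared words; real systems usually compare embeddings instead. The document list, function names, and prompt wording are all hypothetical.

```python
import re

# Hypothetical mini 'dataset'; in practice this would be notes, docs, or search results.
DOCUMENTS = [
    "ls -lS lists files sorted by size, largest first.",
    "ls -lt lists files sorted by modification time, newest first.",
    "grep -r searches for a pattern in all files under a directory.",
]

def words(text):
    """Lowercased set of the words and numbers in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    q = words(question)
    ranked = sorted(documents, key=lambda doc: len(q & words(doc)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, documents):
    """Paste the retrieved text in front of the question."""
    context = "\n".join(retrieve(question, documents))
    return f"Use this information:\n{context}\n\nQuestion: {question}"

print(build_prompt("How do I sort files by size?", DOCUMENTS))
```

The model never 'learns' the document; it just finds the answer sitting in the text it was asked to continue.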

Everything is one of these 2 techniques. The vast majority of systems that are being advertised are
effectively just storing context (or summarized context), adding it to your prompt, and stripping it
out of the LLM's response.

## Monkeys with typewriters

LLMs can't code. But they can predict what code might look like. One huge advantage computer science,
and many other sciences, have is that there is a correct answer that can be checked. If we give the LLM
a way to test its output (by, for example, calling `make test` with a tool), it can make many attempts.

If you are asking about something similar to what it's seen in the past, it will likely figure it out.
If you are asking about something novel, it will make a guess. If it can test that guess, it might arrive
at a correct answer.
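The guess-and-test loop might look something like this sketch. The `run_tests` function stands in for calling `make test`, and the list of guesses stands in for successive attempts by a model; both are hypothetical and hard-coded here.

```python
def run_tests(candidate):
    """Stand-in for `make test`: True only if the candidate passes every case."""
    cases = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
    try:
        return all(candidate(*args) == expected for args, expected in cases)
    except Exception:
        return False

# Hypothetical guesses a model might produce for "write add(a, b)".
GUESSES = [
    lambda a, b: a * b,  # plausible-looking but wrong
    lambda a, b: a - b,  # also wrong
    lambda a, b: a + b,  # passes the tests
]

def solve(guesses, test):
    """Try each guess in turn; stop at the first one the tests accept."""
    for attempt, guess in enumerate(guesses, start=1):
        if test(guess):
            return attempt, guess
    return None, None

attempt, winner = solve(GUESSES, run_tests)
print(f"passed on attempt {attempt}")  # passed on attempt 3
```

The quality of the result depends on the tests: the loop only knows a guess is 'correct' in the sense that nothing it was able to check failed.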

Put enough monkeys in front of typewriters, and you eventually get Shakespeare.