AGI | Artificial General Intelligence; the point at which AI matches or exceeds human intelligence. |
Generative AI | AI systems that create new content rather than just analyzing existing data. |
Foundation Models | Large pre-trained models that serve as the base for various applications. |
Architecture | Structural design of the model. Most modern LLMs use Transformer architectures with attention mechanisms. |
Attention Mechanisms | Components allowing models to weigh importance of different words when generating text. |
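To make the attention idea concrete, here is a toy sketch of scaled dot-product attention for a single query vector, written in plain Python. The vectors and function names are illustrative only; real Transformer layers operate on large batched matrices with learned projections.

```python
import math

def softmax(xs):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Scaled dot-product attention for one query: score each key against
    # the query, softmax the scores into weights, then return the
    # weighted sum of the value vectors.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(dim)]

# The query points in the same direction as the first key, so the first
# value vector receives more weight in the output.
out = attention([1.0, 0.0],
                [[1.0, 0.0], [0.0, 1.0]],
                [[10.0, 0.0], [0.0, 10.0]])
```

Because the weights come from a softmax, they always sum to 1, so the output is a blend of the value vectors rather than a hard selection.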
Tokens | Basic units LLMs process; can be words, parts of words, or characters. |
Tokenization | The process of breaking text into tokens. |
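A toy illustration of tokenization, assuming a simple word-and-punctuation splitter. Real LLM tokenizers use learned subword schemes such as BPE, so a word like "tokenization" may split into pieces like "token" and "ization"; this sketch only conveys the general idea.

```python
import re

def toy_tokenize(text):
    # Toy tokenizer: splits text into runs of word characters and
    # individual punctuation marks. Not how production tokenizers work,
    # but it shows that models see token sequences, not raw strings.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = toy_tokenize("LLMs don't read words, they read tokens!")
```

Note that even the apostrophe in "don't" becomes its own token here; subword tokenizers make similar, if less naive, splitting decisions.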
Parameters | Learnable weights in the neural network that determine model capabilities. More parameters (measured in billions) generally mean more knowledge and abilities. |
Context Window | Maximum amount of text (measured in tokens) an LLM can consider at once. |
.safetensors | Secure file format for storing model weights that prevents arbitrary code execution during loading. |
Completion/Response | Text generated by the LLM in response to a prompt. |
Temperature | Setting that controls randomness in responses; higher values produce more varied, creative outputs, lower values more deterministic ones. |
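Temperature works by scaling the model's raw scores (logits) before they are turned into a probability distribution. A minimal sketch, with made-up logits:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature before softmax. T < 1 sharpens
    # the distribution (near-greedy sampling); T > 1 flattens it
    # (more random sampling).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # top token dominates
hot = softmax_with_temperature(logits, 2.0)   # closer to uniform
```

At low temperature the top-scoring token takes nearly all the probability mass; at high temperature the alternatives become competitive, which is where the "creativity" comes from.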
Prompt | Input text given to an LLM to elicit a response. |
Prompt Engineering | Skill of crafting effective prompts to get desired results from LLMs. |
Few-shot Learning | Providing examples within a prompt to guide the model toward specific response formats. |
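A few-shot prompt is just carefully arranged text. This sketch (the format and helper name are illustrative, not any API's convention) shows examples paired with outputs so the model imitates the pattern for the final query:

```python
def build_few_shot_prompt(examples, query):
    # Each example pairs an input with its desired output; the final
    # line leaves "Output:" blank for the model to complete.
    lines = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

prompt = build_few_shot_prompt(
    [("great movie!", "positive"),
     ("total waste of time", "negative")],
    "surprisingly good",
)
```

The model, seeing two labeled examples, is strongly biased toward answering the third in the same one-word sentiment format.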
Instruction Tuning | Training models to follow specific instructions rather than just predicting next words. |
Hallucination | When the model generates false, misleading, or non-factual information that sounds plausible but is incorrect. |
Embeddings | Vector representations of words/text that capture semantic meaning and relationships. |
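Semantic similarity between embeddings is typically measured with cosine similarity. The 3-dimensional vectors below are invented for illustration; real embedding models produce hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: near 1.0 means similar
    # direction (similar meaning), near 0.0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings: related words end up near each other.
king = [0.90, 0.80, 0.10]
queen = [0.88, 0.82, 0.12]
banana = [0.10, 0.05, 0.95]
```

Here `cosine_similarity(king, queen)` comes out far higher than `cosine_similarity(king, banana)`, which is exactly the property RAG systems and semantic search rely on.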
RAG (Retrieval-Augmented Generation) | Enhancing LLM responses by retrieving relevant information from external sources. |
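A minimal sketch of the RAG pattern: retrieve the most relevant document, then place it in the prompt as context. The retriever here ranks by simple word overlap purely for illustration; production systems use embedding similarity over a vector store.

```python
def retrieve(query_terms, documents, k=1):
    # Toy retriever: score each document by how many query words it
    # shares, and return the top k. Stands in for a vector search.
    def score(doc):
        return len(set(query_terms) & set(doc.lower().split()))
    return sorted(documents, key=score, reverse=True)[:k]

docs = [
    "the eiffel tower is in paris",
    "transformers use attention mechanisms",
]
query = "where is the eiffel tower"
context = retrieve(query.split(), docs)[0]

# The retrieved passage is injected into the prompt so the model can
# ground its answer in it instead of relying on parametric memory.
prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"
```

Grounding the answer in retrieved text is also one of the main practical mitigations for hallucination.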
Training | The process of teaching an AI model by feeding it data and adjusting its parameters. |
Inference | Process of generating text from the model (as opposed to training). |
Fine-tuning | Process of adapting pre-trained models to specific tasks using additional training data. |
RLHF (Reinforcement Learning from Human Feedback) | Training technique to align LLMs with human preferences and improve safety. |
Epoch | One complete pass through the full training data set. E.g. a model in its 5th epoch has seen the same data set five times. |
float16 | Half precision, 16-bit floating-point format. |
float32 | Full precision, 32-bit floating-point format. |
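The precision difference between the two formats can be seen directly with Python's standard `struct` module, whose `'e'` and `'f'` codes pack a value as 16-bit and 32-bit floats respectively:

```python
import struct

def round_trip(fmt, value):
    # Pack a Python float into the given binary width and unpack it,
    # exposing the rounding error introduced at that precision
    # ('e' = 16-bit half, 'f' = 32-bit single).
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

half = round_trip("e", 0.1)    # float16 keeps ~3 decimal digits
single = round_trip("f", 0.1)  # float32 keeps ~7 decimal digits
```

The half-precision round trip lands noticeably further from 0.1 than the single-precision one; that small per-weight error, traded for halved memory use, is why float16 (and related formats) dominate inference while training often mixes precisions.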