
RAG vs. Fine-Tuning: Choosing the Right Approach for Your LLM Applications

In the rapidly evolving landscape of artificial intelligence, large language models (LLMs) have become powerful tools for a wide range of applications. However, these models come with inherent limitations that need to be addressed for optimal performance. Two methods stand out for enhancing LLM capabilities: Retrieval Augmented Generation (RAG) and Fine-Tuning. But which approach is right for your specific use case? Let's break down the differences, strengths, and ideal applications for each. ...

September 3, 2024 · 3 min · Da Zhang

Run Large Language Models Locally on Your Mac: A Comprehensive Guide

Running Large Language Models Locally on Your Mac

The world of AI is rapidly evolving, and now you can run powerful large language models right from your MacBook. Gone are the days when you needed massive cloud infrastructure to experiment with AI. In this guide, I'll walk you through several methods to run LLMs locally, with a deep dive into Ollama, the most user-friendly option.

Local LLM Methods for Mac

Comparison of Local LLM Platforms:

| Platform | Ease of Use | Model Variety | Resource Requirements | GPU Support |
|---|---|---|---|---|
| Ollama | Very High | Good | Low-Medium | Optional |
| LM Studio | High | Moderate | Medium | Yes |
| Hugging Face Transformers | Low | Extensive | High | Yes |

I will focus on Ollama in this blog since it provides APIs for building LLM applications and a command-line interface for terminal enthusiasts. ...
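As a taste of the Ollama API mentioned above, here is a minimal sketch against Ollama's local REST server (assumed to be running on its default port 11434, with a model such as `llama3` already pulled; the model name and prompt are illustrative):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON response instead of a token stream
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    """POST the prompt to the local Ollama server and return the response text."""
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With `ollama serve` running and a model pulled (e.g. `ollama pull llama3`), a call like `generate("llama3", "Why is the sky blue?")` returns the model's answer as a string.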

July 4, 2024 · 4 min · Da Zhang

Advanced Prompt Engineering: Unlocking the Full Potential of LLMs

Advanced Prompt Engineering: Unlocking the Full Potential of LLMs

After spending countless hours working with large language models (LLMs), I've discovered that the difference between mediocre and exceptional results often comes down to how you frame your requests. While basic prompting can get you decent outputs, advanced prompt engineering techniques can transform these AI systems into powerful collaborators that deliver precisely what you need.

Beyond the Basics: Strategic Prompting Techniques

Role and Context Framing

One of the most powerful techniques is establishing a specific role and context for the LLM: ...
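Role and context framing can be sketched as a small helper that prepends a system message before the user's task. The `framed_prompt` function and its arguments are hypothetical illustrations, not from the original post:

```python
def framed_prompt(role: str, context: str, task: str) -> list:
    """Compose a chat-style message list that frames the LLM with a role and
    context (via the system message) before stating the user's task."""
    system = f"You are {role}. {context}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": task},
    ]


messages = framed_prompt(
    role="a senior Python code reviewer",
    context="You give concise, actionable feedback and cite PEP 8 where relevant.",
    task="Review this function for readability:\n\ndef f(x):return [i*i for i in x]",
)
```

The same message list can then be passed to any chat-style LLM API; the role and context steer tone and expertise without changing the task itself.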

July 10, 2023 · 4 min · Da Zhang

Build Your Own Applications with the OpenAI API

OpenAI Python Library

Install the OpenAI Library

```
# install from PyPI
pip install --upgrade openai
```

Import the relevant Python Libraries and Load the OpenAI API Key

If you don't have an API Key, get your API Key here.

```
import openai
import os
from dotenv import load_dotenv, find_dotenv

_ = load_dotenv(find_dotenv())  # read local .env file
openai.api_key = os.getenv('OPENAI_API_KEY')  # more secure
```

Here, we use a .env file to store our OpenAI API Key. ...
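Under the hood, the `openai` library calls OpenAI's Chat Completions REST endpoint. As a stdlib-only sketch of that flow (the endpoint URL is real; the model name, helper names, and prompt are illustrative, and a valid `OPENAI_API_KEY` in the environment is assumed for the actual call):

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(prompt: str, model: str = "gpt-3.5-turbo") -> dict:
    """Build the JSON payload for the Chat Completions endpoint."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def chat(prompt: str) -> str:
    """POST the prompt with the API key from the environment (e.g. loaded via .env)
    and return the model's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.getenv('OPENAI_API_KEY')}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

The library version is the same idea with the plumbing hidden: you hand it a `messages` list and read the reply out of `choices[0]`.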

July 3, 2023 · 4 min · Da Zhang

Essential LLM Concepts

| Concept | Description |
|---|---|
| AGI | Artificial General Intelligence (AGI), the point at which AI matches or exceeds the intelligence of humans. |
| Generative AI | AI systems that create new content rather than just analyzing existing data. |
| Foundation Models | Large pre-trained models that serve as the base for various applications. |
| Architecture | Structural design of the model. Most modern LLMs use Transformer architectures with attention mechanisms. |
| Attention Mechanisms | Components allowing models to weigh the importance of different words when generating text. |
| Tokens | Basic units LLMs process; can be words, parts of words, or characters. |
| Tokenization | The process of breaking text into tokens. |
| Parameters | Learnable weights in the neural network that determine model capabilities. More parameters (measured in billions) generally mean more knowledge and abilities. |
| Context Window | Maximum amount of text (measured in tokens) an LLM can consider at once. |
| .safetensors | Secure file format for storing model weights that prevents arbitrary code execution during loading. |
| Completion/Response | Text generated by the LLM in response to a prompt. |
| Temperature | Setting that controls randomness in responses; higher values produce more creative outputs. |
| Prompt | Input text given to an LLM to elicit a response. |
| Prompt Engineering | Skill of crafting effective prompts to get desired results from LLMs. |
| Few-shot Learning | Providing examples within a prompt to guide the model toward specific response formats. |
| Instruction Tuning | Training models to follow specific instructions rather than just predicting next words. |
| Hallucination | When an LLM generates false, misleading, or non-factual information that sounds plausible but is incorrect. |
| Embeddings | Vector representations of words/text that capture semantic meaning and relationships. |
| RAG (Retrieval-Augmented Generation) | Enhancing LLM responses by retrieving relevant information from external sources. |
| Training | The process of teaching an AI model by feeding it data and adjusting its parameters. |
| Inference | Process of generating text from the model (as opposed to training). |
| Fine-tuning | Process of adapting pre-trained models to specific tasks using additional training data. |
| RLHF (Reinforcement Learning from Human Feedback) | Training technique to align LLMs with human preferences and improve safety. |
| Epoch | The number of passes a training run makes through the full training data set. E.g., at its 5th epoch a model has seen the same data set five times. |
| float16 | Half precision, 16-bit floating point. |
| float32 | Full precision, 32-bit floating point. |
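To make the Embeddings entry concrete: semantic similarity between two embedding vectors is commonly measured with cosine similarity. A minimal sketch with toy 3-dimensional vectors (illustrative values only; real embeddings have hundreds or thousands of dimensions):

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors:
    close to 1.0 = similar direction (similar meaning), near 0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Toy "embeddings" (made-up numbers, not from a real model)
cat = [0.9, 0.1, 0.2]
kitten = [0.85, 0.15, 0.25]
car = [0.1, 0.9, 0.3]

print(cosine_similarity(cat, kitten))  # close to 1.0: semantically similar
print(cosine_similarity(cat, car))     # noticeably lower: unrelated concepts
```

RAG systems use exactly this comparison: the query is embedded, then the stored chunks whose embeddings score highest are retrieved and fed to the LLM.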

February 3, 2023 · 2 min · Da Zhang