AI Development

TODO

Fine-tuning is taking a pre-existing model and training it further with your own data set.
You might want to fine-tune instead of using RAG.

resources

AI Tools concepts

AI Agents concepts

top-level goals and challenges

Governance & Guardrails

Prompts are reviewed for bias, hallucination risk, safety, and regulatory compliance.
Enterprises often apply structured fallback prompts if the primary one fails.
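The fallback idea above can be sketched as a tiny wrapper: try the primary prompt, validate the response against a guardrail check, and fall back to a more constrained prompt if it fails. `call_llm` and the validation rule are hypothetical stand-ins, not any particular provider's API.

```python
# Sketch of a structured-fallback pattern, assuming a hypothetical
# `call_llm` client. The validation rule here is just an example guardrail.

def call_llm(prompt: str) -> str:
    # Stub: a real implementation would call your LLM provider here.
    # We simulate a failed (empty) response to exercise the fallback path.
    return ""

def is_valid(response: str) -> bool:
    # Example guardrail: non-empty and under a length cap.
    return bool(response.strip()) and len(response) < 2000

def answer_with_fallback(question: str) -> str:
    primary = f"Answer precisely, citing sources: {question}"
    fallback = f"Answer in one short, factual sentence: {question}"
    response = call_llm(primary)
    if is_valid(response):
        return response
    # Structured fallback: simpler, more constrained prompt.
    return call_llm(fallback) or "Unable to answer safely."
```

In practice the validation step is where bias, hallucination, and compliance checks would plug in.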

Prompt Chaining

Multi-step workflows where outputs from one prompt are fed into the next (e.g., extract → classify → summarize).
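The extract → classify → summarize flow can be sketched as plain function composition, where each step's output is interpolated into the next prompt. The `call_llm` stub below just echoes the text after the first colon so the chain is runnable without a model; it is not a real client.

```python
# Minimal prompt-chaining sketch (extract -> classify -> summarize).
# `call_llm` is a placeholder: it echoes the prompt's payload for demo purposes.

def call_llm(prompt: str) -> str:
    # Stub: in practice this calls your model; here we return the text
    # after the first colon in the prompt.
    return prompt.split(":", 1)[1].strip()

def chain(document: str) -> str:
    extracted = call_llm(f"Extract the key facts from this text: {document}")
    category = call_llm(f"Classify the topic of these facts: {extracted}")
    summary = call_llm(f"Summarize: {extracted} (topic: {category})")
    return summary
```

The point is the data flow, not the stub: each prompt consumes the previous step's output.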

tools

LangChain / LlamaIndex: Useful for chaining prompts, managing context, building apps on top of LLMs.

Embedding and Vector DB

Embedding

An embedding is a numerical representation of data (e.g., text) in vector form, usually a list of floating-point numbers.
The goal: capture the meaning or semantic similarity of content.
Two similar texts will have embeddings that are close to each other in vector space.

šŸ” Example:
"Hello world" → [0.12, -0.44, 0.88, ...] # A 1536-dimension vector
"Hi there" → [0.10, -0.40, 0.85, ...] # Very similar vector

🧠 Use Case:

A vector DB stores and indexes embeddings, so you can efficiently search for the vectors most similar to a query.
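"Close in vector space" is usually measured with cosine similarity. The 4-dimension vectors below are made up for illustration (real embeddings have hundreds or thousands of dimensions, like the 1536-dimension example above).

```python
# Toy illustration of "similar texts -> nearby vectors" using cosine
# similarity. These 4-d vectors are invented, not real model outputs.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

hello_world = [0.12, -0.44, 0.88, 0.10]   # pretend embedding of "Hello world"
hi_there    = [0.10, -0.40, 0.85, 0.12]   # pretend embedding of "Hi there"
unrelated   = [-0.70, 0.50, -0.20, 0.90]  # pretend embedding of unrelated text

print(cosine_similarity(hello_world, hi_there))  # close to 1.0
print(cosine_similarity(hello_world, unrelated)) # much lower
```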

🧠 How LLMs Use Embeddings + Vector DB

Typical RAG pipeline:

  1. Chunk + embed all documents
  2. Store vectors in a vector DB
  3. At query time:
    • Embed the query
    • Search for similar vectors
    • Return top results
  4. Inject retrieved text into prompt → LLM
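Steps 1-3 above can be sketched with a toy in-memory "vector DB". The `embed` function here is a fake stand-in that hashes words into a bag-of-words vector; a real pipeline would call an embedding model, and a real vector DB would use an approximate-nearest-neighbor index instead of a full sort.

```python
# Toy RAG retrieval: chunk + embed documents, store vectors, search at query
# time. `embed` is a fake bag-of-words hasher, not a real embedding model.
import math

DIM = 32

def embed(text: str) -> list[float]:
    # Hash each word into one of DIM buckets (stand-in for a real model).
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[hash(word) % DIM] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

# Steps 1-2: chunk + embed all documents, store vectors in the "DB".
chunks = [
    "Register for sales tax through the provincial portal.",
    "Our refund policy allows returns within 30 days.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# Step 3: embed the query, search for similar vectors, return top-k results.
def search(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

Step 4 (injecting the retrieved text into the prompt) is just string formatting on `search(...)`'s output.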

Retrieval-Augmented Generation (RAG)

RAG, or Retrieval-Augmented Generation, is an AI framework that enhances the accuracy and relevance of large language model (LLM) responses by integrating information retrieval from external knowledge sources before generating text.

A REST API that fetches user data from a DB and inserts it into a prompt that is then passed to an LLM can be considered a basic or simplistic form of RAG. But it lacks the sophistication of semantic search and relies on exact-match/structured queries rather than similarity-based retrieval.

RAG is a hybrid approach that combines retrieval (search over external knowledge sources) with generation (the LLM's response).

RAG Pipeline (Simplified)

  1. Preprocess Data
    • Split documents into chunks (e.g., 500 words)
    • Generate embeddings for each chunk
    • Store embeddings in a vector database
  2. At Query Time
    • Embed the user’s question
    • Use vector search to retrieve top-k similar chunks
    • Inject those into the prompt: Based on the following documents: [doc1], [doc2], ... Answer: "How do I register for sales tax in Quebec?"
  3. Send to LLM and get answer
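The prompt-injection step (2) is plain string assembly: paste the retrieved chunks ahead of the user's question. The chunk text below is illustrative, not real retrieved data.

```python
# Build the final prompt from retrieved chunks + the user's question,
# mirroring the "Based on the following documents: [doc1], [doc2], ..."
# template above. The example chunk content is made up.

def build_rag_prompt(question: str, retrieved_chunks: list[str]) -> str:
    docs = "\n".join(f"[doc{i + 1}] {c}" for i, c in enumerate(retrieved_chunks))
    return (
        "Based on the following documents:\n"
        f"{docs}\n\n"
        f"Answer: {question}"
    )

prompt = build_rag_prompt(
    "How do I register for sales tax in Quebec?",
    ["Example chunk: provincial sales tax registration instructions."],
)
print(prompt)
```

This string is what step 3 sends to the LLM.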

✅ Why RAG is Useful

Problem LLMs Have        | How RAG Helps
------------------------ | --------------------------------------
Hallucinations           | Grounds answers in real facts
Limited context window   | Retrieves only relevant info
No access to custom data | Injects private/company data
Outdated model knowledge | Real-time retrieval from fresh sources

MAKER Framework

AI Theory - MAKER framework

ways to make chatbot agentic

This is a problem we are tackling at Snapshot. As of 2026-01-14, we are trying to build a solution with the 3rd option; let's see if it succeeds.

Here are the options to consider:

  1. RAG search: should be the last option because it's not accurate by nature; it is a similarity search.

  2. Defined interfaces using the backend server.

  3. A read replica of the DB with a read-only user, giving the agent free access to query this DB on the fly.

Storing data generated by AI

We ask LLMs to generate JSON data (some LLMs have a parameter to set the output format to JSON) and store the output in a SQL relational DB.

Some AI-generated data/summaries we save in a single column of type JSON. This is useful for experimental outputs we need to iterate on a lot.

For the rest of the data, where we know the schema and expect it not to change much in the future, we keep using a regular relational structure.
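The hybrid layout above can be sketched with SQLite: stable fields get regular columns, and the experimental LLM output goes into one JSON blob column. A TEXT column plus `json.dumps` stands in for a native JSON column type; the table name and fields are illustrative.

```python
# Sketch of the hybrid storage approach: relational columns for stable
# fields, one JSON column for experimental LLM output.
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    """CREATE TABLE ai_summaries (
           id INTEGER PRIMARY KEY,
           created_at TEXT,   -- stable, relational field
           payload TEXT       -- experimental JSON blob from the LLM
       )"""
)

# Pretend this came back from an LLM asked to emit JSON.
llm_output = {"summary": "Q3 sales rose", "confidence": 0.82}
db.execute(
    "INSERT INTO ai_summaries (created_at, payload) VALUES (?, ?)",
    ("2026-01-14", json.dumps(llm_output)),
)

row = db.execute("SELECT payload FROM ai_summaries WHERE id = 1").fetchone()
restored = json.loads(row[0])
```

If a payload shape stabilizes, its fields can later be promoted to real columns.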