AI Development
TODO
- https://www.youtube.com/watch?v=0Zr3NwcvpA0
- similarity search as an alternative to vector embeddings
Fine-tuning is taking a pre-existing model and training it further with your own data set.
You might want to fine-tune instead of using RAG when:
- You want consistent tone/format (e.g., legal summaries, internal policies)
- You have lots of repetitive tasks and want more consistency
- Prompt engineering isn't enough
- You want to reduce token usage (less prompt length)
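To make the fine-tuning idea concrete, here is a minimal sketch of preparing a training file in the chat-style JSONL format that several providers accept (one example per line). The exact schema depends on the provider, and the file path and examples here are made up:

```python
import json
import os
import tempfile

# Hypothetical training examples in the common "messages" chat format.
# Each line of the JSONL file is one complete training example.
examples = [
    {"messages": [
        {"role": "system", "content": "You summarize legal text tersely."},
        {"role": "user", "content": "Summarize: The lessee shall maintain the premises ..."},
        {"role": "assistant", "content": "Tenant must keep the property maintained."},
    ]},
]

path = os.path.join(tempfile.gettempdir(), "train.jsonl")
with open(path, "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

A file like this is what you would upload to the provider's fine-tuning endpoint; check the provider's docs for the exact field names they require.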
resources
- Andrej Karpathy Youtube and following him in general
- Ex-director of AI @ Tesla. His presentation on computer vision helped me a lot with my thesis
- Deep Dive into LLMs like ChatGPT (Andrej Karpathy)
AI Tools concepts
top-level goals and challenges
- reduce hallucinations
Governance & Guardrails
Prompts are reviewed for bias, hallucination risk, safety, and regulatory compliance.
Enterprises often apply structured fallback prompts if the primary one fails.
Prompt Chaining
Multi-step workflows where outputs from one prompt are fed into the next (e.g., extract → classify → summarize).
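The extract/classify/summarize chain can be sketched as plain function composition. `call_llm` here is a stand-in stub, not a real API; a real chain would send each prompt to an LLM:

```python
def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call an LLM API here.
    return f"[LLM output for: {prompt[:40]}...]"

def extract(doc: str) -> str:
    return call_llm(f"Extract the key facts from:\n{doc}")

def classify(facts: str) -> str:
    return call_llm(f"Classify these facts by topic:\n{facts}")

def summarize(labeled: str) -> str:
    return call_llm(f"Summarize:\n{labeled}")

# Each prompt's output becomes part of the next prompt (extract → classify → summarize).
result = summarize(classify(extract("Some long document text ...")))
```

Frameworks like LangChain mainly formalize this pattern: managing the intermediate outputs, context, and error handling between steps.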
tools
LangChain / LlamaIndex: Useful for chaining prompts, managing context, building apps on top of LLMs.
Embedding and Vector DB
Embedding
is a numerical representation of data (e.g., text) in vector form, usually a list of floating-point numbers.
The goal: capture the meaning or semantic similarity of content.
Two similar texts will have embeddings that are close to each other in vector space.
Example:
"Hello world" → [0.12, -0.44, 0.88, ...] # A 1536-dimension vector
"Hi there" → [0.10, -0.40, 0.85, ...] # Very similar vector
Use cases:
- Search
- Recommendation systems
- Semantic similarity
- Retrieval-augmented generation (RAG)
A vector DB
stores and indexes embeddings.
- Enables fast similarity search: "Find the top-5 documents most similar to this query."
- Often used in RAG systems to retrieve context before prompting an LLM.
How LLMs Use Embeddings + Vector DB
Typical RAG pipeline:
- Chunk + embed all documents
- Store vectors in a vector DB
- At query time:
- Embed the query
- Search for similar vectors
- Return top results
- Inject retrieved text into prompt ā LLM
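The steps above can be sketched end to end. This uses a toy bag-of-words `embed` function and a plain list as the "vector DB"; a real pipeline would call an embedding model and a proper vector store instead:

```python
import math
from collections import Counter

# Toy embedding: word counts over a fixed vocabulary.
# A real pipeline would call an embedding model here.
VOCAB = ["refund", "policy", "shipping", "password", "reset", "days"]

def embed(text: str) -> list[float]:
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Steps 1-2: chunk + embed all documents, store vectors (here: a plain list).
docs = [
    "our refund policy allows returns within 30 days",
    "shipping takes 3 to 5 business days",
    "to reset your password click the reset link",
]
index = [(embed(d), d) for d in docs]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Step 3: embed the query, search for similar vectors, return top results.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [doc for _, doc in ranked[:k]]

# Step 4: inject the retrieved text into the prompt before calling the LLM.
context = retrieve("how do I reset my password")[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: how do I reset my password"
```

The point is the shape of the pipeline, not the toy embedding: swap `embed` for a real model and the list for a vector DB and the structure stays the same.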
Retrieval-Augmented Generation (RAG)
MAKER Framework
ways to make chatbot agentic
This is a problem we are tackling at Snapshot. As of 2026-01-14 we are trying to build a solution with the 3rd option; let's see if it succeeds.
Here are the options to consider:
- RAG search: should be the last option because it's not accurate by nature; it is a similarity search.
- Defined interfaces using the backend server
  - high maintenance: for each new interface you need to go to the backend, define schemas, run migrations, and perhaps add more code in ai-manager as well
- Read replica of the DB with a read-only user, giving the agent free access to query this DB on the fly
  - low maintenance
  - uncontrolled and highly experimental
Storing data generated by AI
We ask LLMs to generate JSON data (some LLMs have a parameter to set the output format to JSON) and store the output in a relational SQL DB.
Some data/summaries generated by AI we save in a single column of type JSON. This is useful for experimental outputs we need to iterate over a lot.
For the rest of the data, where we know the schema and don't expect it to change much, we keep using a regular relational structure.
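The hybrid approach can be sketched with SQLite: stable fields get real columns, while the experimental LLM output goes into a single JSON (stored as TEXT) column. Table and column names here are hypothetical:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ai_summaries (
        id INTEGER PRIMARY KEY,
        document_id INTEGER NOT NULL,  -- stable field: regular relational column
        model TEXT NOT NULL,           -- stable field: regular relational column
        payload TEXT NOT NULL          -- experimental LLM output, stored as JSON text
    )
""")

# Hypothetical LLM output; the shape of this dict can change freely
# without a schema migration, since it lives in the JSON column.
llm_output = {"summary": "Quarterly revenue grew 12%.", "sentiment": "positive"}
conn.execute(
    "INSERT INTO ai_summaries (document_id, model, payload) VALUES (?, ?, ?)",
    (42, "example-model", json.dumps(llm_output)),
)

row = conn.execute(
    "SELECT payload FROM ai_summaries WHERE document_id = 42"
).fetchone()
restored = json.loads(row[0])
```

Once an experimental field stabilizes, it can be promoted from the JSON blob to its own column with a normal migration.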