AI Development
TODO
- https://www.youtube.com/watch?v=0Zr3NwcvpA0
- similarity search as an alternative to vector embeddings
Fine-tuning is taking a pre-existing model and training it further with your own data set.
You might want to fine-tune instead of using RAG when:
- You want consistent tone/format (e.g., legal summaries, internal policies)
- You have lots of repetitive tasks and want more consistency
- Prompt engineering isn't enough
- You want to reduce token usage (less prompt length)
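To make the fine-tuning idea concrete, here is a minimal sketch of preparing a training file in the chat-style JSONL format that several providers accept (one example per line). The exact schema depends on the provider, and the file path and examples here are made up:

```python
import json
import os
import tempfile

# Hypothetical training examples in the common "messages" chat format.
# Each line of the JSONL file is one complete training example.
examples = [
    {"messages": [
        {"role": "system", "content": "You summarize legal text tersely."},
        {"role": "user", "content": "Summarize: The lessee shall maintain the premises ..."},
        {"role": "assistant", "content": "Tenant must keep the property maintained."},
    ]},
]

path = os.path.join(tempfile.gettempdir(), "train.jsonl")
with open(path, "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

A file like this is what you would upload to the provider's fine-tuning endpoint; check the provider's docs for the exact field names they require.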
resources
- Andrej Karpathy Youtube and following him in general
- Ex-director of AI @ Tesla. His presentation on computer vision helped me a lot with my thesis
- Deep Dive into LLMs like ChatGPT (Andrej Karpathy)
AI Tools concepts
top-level goals and challenges
- reduce hallucinations
Governance & Guardrails
Prompts are reviewed for bias, hallucination risk, safety, and regulatory compliance.
Enterprises often apply structured fallback prompts if the primary one fails.
Prompt Chaining
Multi-step workflows where outputs from one prompt are fed into the next (e.g., extract → classify → summarize).
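The extract/classify/summarize chain can be sketched as plain function composition. `call_llm` here is a stand-in stub, not a real API; a real chain would send each prompt to an LLM:

```python
def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call an LLM API here.
    return f"[LLM output for: {prompt[:40]}...]"

def extract(doc: str) -> str:
    return call_llm(f"Extract the key facts from:\n{doc}")

def classify(facts: str) -> str:
    return call_llm(f"Classify these facts by topic:\n{facts}")

def summarize(labeled: str) -> str:
    return call_llm(f"Summarize:\n{labeled}")

# Each prompt's output becomes part of the next prompt (extract → classify → summarize).
result = summarize(classify(extract("Some long document text ...")))
```

Frameworks like LangChain mainly formalize this pattern: managing the intermediate outputs, context, and error handling between steps.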
tools
LangChain / LlamaIndex: Useful for chaining prompts, managing context, building apps on top of LLMs.
Embedding and Vector DB
Embedding
is a numerical representation of data (e.g., text) in vector form, usually a list of floating-point numbers.
The goal: capture the meaning or semantic similarity of content.
Two similar texts will have embeddings that are close to each other in vector space.
Example:
"Hello world" → [0.12, -0.44, 0.88, ...] # A 1536-dimension vector
"Hi there" → [0.10, -0.40, 0.85, ...] # Very similar vector
Use cases:
- Search
- Recommendation systems
- Semantic similarity
- Retrieval-augmented generation (RAG)
A vector DB
stores and indexes embeddings.
- Enables fast similarity search: "Find the top-5 documents most similar to this query."
- Often used in RAG systems to retrieve context before prompting an LLM.
How LLMs Use Embeddings + Vector DB
Typical RAG pipeline:
- Chunk + embed all documents
- Store vectors in a vector DB
- At query time:
- Embed the query
- Search for similar vectors
- Return top results
- Inject retrieved text into prompt ā LLM
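The steps above can be sketched end to end. This uses a toy bag-of-words `embed` function and a plain list as the "vector DB"; a real pipeline would call an embedding model and a proper vector store instead:

```python
import math
from collections import Counter

# Toy embedding: word counts over a fixed vocabulary.
# A real pipeline would call an embedding model here.
VOCAB = ["refund", "policy", "shipping", "password", "reset", "days"]

def embed(text: str) -> list[float]:
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Steps 1-2: chunk + embed all documents, store vectors (here: a plain list).
docs = [
    "our refund policy allows returns within 30 days",
    "shipping takes 3 to 5 business days",
    "to reset your password click the reset link",
]
index = [(embed(d), d) for d in docs]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Step 3: embed the query, search for similar vectors, return top results.
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[0]), reverse=True)
    return [doc for _, doc in ranked[:k]]

# Step 4: inject the retrieved text into the prompt before calling the LLM.
context = retrieve("how do I reset my password")[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: how do I reset my password"
```

The point is the shape of the pipeline, not the toy embedding: swap `embed` for a real model and the list for a vector DB and the structure stays the same.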
Retrieval-Augmented Generation (RAG)
MAKER Framework
ways to make chatbot agentic
This is a problem we are tackling at Snapshot. As of 2026-01-14 we are trying to build a solution with the 3rd option; let's see if it succeeds.
Here are the options to consider:
- RAG search: should be the last option because it's not accurate by nature; it is a similarity search.
- Defined interfaces using the backend server
  - high maintenance: for each new interface you need to go to the backend, define schemas, run migrations, and perhaps add more code in ai-manager as well
- Read replica of the DB with a read-only user, giving the agent free access to query this DB on the fly
  - low maintenance
  - uncontrolled and highly experimental
Storing data generated by AI
We ask LLMs to generate JSON data (some LLMs have a parameter to set the output format to JSON) and store the output in a relational SQL DB.
Some data/summaries generated by AI we save in a single column of type JSON. This is useful for experimental outputs we need to iterate over a lot.
For the rest of the data, where we know the schema and don't expect it to change much, we keep using a regular relational structure.
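The hybrid approach can be sketched with SQLite: stable fields get real columns, while the experimental LLM output goes into a single JSON (stored as TEXT) column. Table and column names here are hypothetical:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ai_summaries (
        id INTEGER PRIMARY KEY,
        document_id INTEGER NOT NULL,  -- stable field: regular relational column
        model TEXT NOT NULL,           -- stable field: regular relational column
        payload TEXT NOT NULL          -- experimental LLM output, stored as JSON text
    )
""")

# Hypothetical LLM output; the shape of this dict can change freely
# without a schema migration, since it lives in the JSON column.
llm_output = {"summary": "Quarterly revenue grew 12%.", "sentiment": "positive"}
conn.execute(
    "INSERT INTO ai_summaries (document_id, model, payload) VALUES (?, ?, ?)",
    (42, "example-model", json.dumps(llm_output)),
)

row = conn.execute(
    "SELECT payload FROM ai_summaries WHERE document_id = 42"
).fetchone()
restored = json.loads(row[0])
```

Once an experimental field stabilizes, it can be promoted from the JSON blob to its own column with a normal migration.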