"Understanding Retrieval-Augmented Generation (RAG) and Vector Databases for Not-Quite Dummies" on the Pure AI Web Site

I contributed some technical content and opinions in an article titled “Understanding Retrieval-Augmented Generation (RAG) and Vector Databases for Not-Quite Dummies” on the Pure AI web site. See https://pureai.com/Articles/2025/03/03/Understanding-RAG.aspx.

In a nutshell, RAG adds custom content to a natural language system so that the responses to a user prompt are more detailed. The custom content is stored in a special kind of database called a vector database.

Suppose you work for the Acme Company, which creates products and services for the hospital industry. Acme currently uses a generative system based on ChatGPT. And suppose a potential new customer asks your AI system, “Why should I use the Acme blood analysis machine?”

Because your chat system is based on the GPT-x large language model, your system has a good grasp of English grammar, and has a good deal of general knowledge available from online sources such as Wikipedia. Your system can give a general answer related to the advantages of automated blood analysis versus manual analysis. But your system can’t give detailed information to potential customers.

The solution is to store all kinds of Acme technical and marketing documents in a vector database. A RAG system accepts a user prompt, sends the prompt to the vector database, gets relevant information from the database, adds the information to the original prompt, and then uses the GPT-x functionality to generate a detailed response.
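The retrieve-then-augment flow can be sketched in a few lines of Python. This is a toy illustration, not how a production RAG system works: the three documents are invented placeholder text, the function names are hypothetical, and simple word counts with cosine similarity stand in for real embeddings and a real vector database. The final step, sending the augmented prompt to the language model, is left out.

```python
import math
from collections import Counter

# Hypothetical placeholder documents, invented for illustration only.
DOCUMENTS = [
    "The Acme blood analysis machine placeholder description text.",
    "Acme hospital beds placeholder description text.",
    "Acme billing software placeholder description text.",
]

def bag_of_words(text):
    """Crude stand-in for an embedding: lowercase word counts."""
    words = [w.strip(".,?!").lower() for w in text.split()]
    return Counter(w for w in words if w)

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(prompt, documents, top_k=1):
    """Return the top_k documents most similar to the prompt."""
    query = bag_of_words(prompt)
    ranked = sorted(documents,
                    key=lambda d: cosine(query, bag_of_words(d)),
                    reverse=True)
    return ranked[:top_k]

def build_augmented_prompt(prompt, documents, top_k=1):
    """Prepend the retrieved text to the original prompt."""
    context = "\n".join(retrieve(prompt, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {prompt}"

question = "Why should I use the Acme blood analysis machine?"
augmented = build_augmented_prompt(question, DOCUMENTS)
print(augmented)
```

In a real system the augmented prompt would then be passed to the large language model, which generates the final response.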

The key technology that enables RAG is a vector database. Standard SQL databases store highly structured information but are not well suited to storing documents and searching them by meaning. The RAG process starts with a document of custom information in ordinary text form.

Documents are broken down into tokens (words or word fragments). Each token is mapped to an integer. Each integer is mapped to a vector of floating-point values, called a word embedding. The vectors are stored in a vector database.
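Here is a minimal sketch of the token-to-integer-to-vector mapping. It makes several simplifying assumptions: tokens are whole words split on whitespace (real tokenizers also produce word fragments), the vocabulary is built from the input text itself, and the embedding values are random numbers rather than values learned by a trained model. All function names are hypothetical.

```python
import random

def tokenize(text):
    # Simplification: split on whitespace only.
    return text.lower().split()

def build_vocab(tokens):
    # Assign each distinct token the next unused integer ID.
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def make_embedding_table(vocab_size, dim, seed=0):
    # One row of dim floats per token ID. Random values here;
    # a real system uses values learned during model training.
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(vocab_size)]

text = "acme blood analysis machine analysis"
tokens = tokenize(text)
vocab = build_vocab(tokens)
token_ids = [vocab[t] for t in tokens]
table = make_embedding_table(len(vocab), dim=4)
vectors = [table[i] for i in token_ids]

print(token_ids)   # [0, 1, 2, 3, 2]
print(vectors[0])  # the 4-value vector for "acme"
```

Notice that the repeated token "analysis" gets the same integer ID both times, and therefore the same vector.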

There are several vector database search algorithms. Three common techniques are hierarchical navigable small world (HNSW), approximate nearest neighbors oh yeah (ANNOY), and locality-sensitive hashing (LSH).
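To give a feel for one of these techniques, here is a bare-bones sketch of locality-sensitive hashing using random hyperplanes. This is one common LSH variant, chosen for simplicity, and not necessarily what any particular vector database uses. Each vector is hashed to a tuple of bits according to which side of each random hyperplane it falls on, so vectors that point in similar directions tend to land in the same bucket. The function names and the tiny example vectors are made up for illustration.

```python
import random
from collections import defaultdict

def make_hyperplanes(num_planes, dim, seed=0):
    # Each hyperplane is represented by a random normal vector.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)]
            for _ in range(num_planes)]

def lsh_signature(vector, hyperplanes):
    # One bit per hyperplane: 1 if the vector is on the
    # non-negative side, 0 otherwise.
    bits = []
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)

def build_index(vectors, hyperplanes):
    # Group item IDs into buckets keyed by signature.
    index = defaultdict(list)
    for item_id, vec in vectors.items():
        index[lsh_signature(vec, hyperplanes)].append(item_id)
    return index

def query(index, vector, hyperplanes):
    # Return candidate items that share the query's bucket.
    return index.get(lsh_signature(vector, hyperplanes), [])

planes = make_hyperplanes(num_planes=8, dim=3)
stored = {"a": [1.0, 2.0, 3.0], "b": [-1.0, -2.0, -3.0]}
index = build_index(stored, planes)

# Same direction as "a", so it hashes to the same bucket.
candidates = query(index, [2.0, 4.0, 6.0], planes)
print(candidates)
```

The search only examines the items in one bucket instead of every stored vector, which is where the speedup comes from. The tradeoff is that a true nearest neighbor can occasionally land in a different bucket and be missed, which is why these techniques are called approximate.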

RAG and vector databases are standard features of most current natural language AI systems.

In the 1970s and 1980s, before digital photography, a standard feature of wedding photos was artistic double exposure. Here are three examples that didn’t turn out as well as the photographer had hoped for. My favorite is the one in the center with the man-bouquet, but the photo on the right, with creepy parents of the bride hovering menacingly over the ceremony, is disturbingly awesome too.