A plain vanilla OpenAI natural language AI application (ChatGPT) can answer general questions about topics such as the ones you can find on Wikipedia, because the underlying LLM was trained using many sources, including Wikipedia. But to answer questions about information in proprietary documents, such as a company handbook, you need to feed the proprietary documents to the system. One simple but crude way to do this is to include the text from a document directly in the query/context/question. But if you are working with large documents, or if you have many documents, you probably need to use RAG (retrieval-augmented generation).
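The crude approach of pasting document text into the question can be sketched in a few lines. This is only an illustration: the build_prompt() function, the delimiter strings, and the max_chars limit are all made-up names and values, not part of any OpenAI API.

```python
def build_prompt(document_text, question, max_chars=8000):
    # Truncate crudely so the prompt stays within a size budget.
    # max_chars is an arbitrary illustrative limit, not an API constant.
    snippet = document_text[:max_chars]
    return (
        "Answer the question using only the document below.\n\n"
        "--- DOCUMENT ---\n" + snippet + "\n--- END DOCUMENT ---\n\n"
        "Question: " + question
    )

if __name__ == "__main__":
    # Hypothetical one-line "handbook" standing in for a real document
    handbook = "Employees accrue 1.5 vacation days per month."
    prompt = build_prompt(handbook, "How many vacation days per month?")
    print(prompt)
```

The truncation step shows why the approach breaks down: once the documents are larger than the budget, most of their text never reaches the model.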
Until recently, creating a RAG system was extremely difficult and time-consuming. For a set of PDF documents, you had to use OCR to extract the text, chunk the text into blocks, convert the blocks into vector embeddings, store the embeddings in an external vector database (such as Chroma), write tricky code to query the vector database, and write code to integrate the vector query results into the question/context. The whole process was a huge challenge.
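To make the chunk / embed / query / integrate steps concrete, here is a toy, standard-library-only sketch. It is not how a production system works: the embed() function is a simple word-count stand-in for a real embedding model, a Python list stands in for the vector database, and all function names are hypothetical.

```python
import math
from collections import Counter

def chunk_text(text, chunk_size=40):
    # Split text into fixed-size blocks; chunk_size counts words
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def embed(text):
    # Toy stand-in for an embedding model: lowercase word counts
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(chunks, question, k=1):
    # Rank chunks by similarity to the question; return the top k
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(embed(c), q_vec),
                    reverse=True)
    return ranked[:k]

if __name__ == "__main__":
    doc = ("vacation days accrue monthly for all staff . "
           "expense reports are due every friday .")
    chunks = chunk_text(doc, chunk_size=8)
    question = "when are expense reports due"
    best = retrieve(chunks, question, k=1)
    # Integrate the retrieved chunk into the question/context
    print("Context: " + best[0] + "\nQuestion: " + question)
```

Even in toy form, every step is separate code that has to be written and maintained, which is the burden a hosted vector store removes.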
A few months ago, however, the OpenAI Responses API dramatically simplified creating RAG systems by doing most of the work for you. To use the Responses API for RAG, you must first create a special vector store on the OpenAI servers and upload your specialized documents into it.
I decided to investigate. My first step, in this blog post, was to create a vector store. And I wanted to make sure I could also delete the vector store so I wouldn’t be charged storage fees for all eternity.
The ideas are best explained by examining the output of my demo and the program code. I create a vector store and then immediately delete it:
Begin OpenAI vector store demo
Creating vector store
Done
{'id': 'vs_68b9a8e212b48191b8375930e6c3ed41',
'name': 'dummy_vector_store', 'created_at': 1756997858,
'file_count': 0}
Deleting the vector store
Done
VectorStoreDeleted(id='vs_68b9a8e212b48191b8375930e6c3ed41',
deleted=True,
object='vector_store.deleted')
End demo
The demo program code is essentially some OpenAI documentation from platform.openai.com/docs/api-reference/vector-stores that I refactored a bit.
# create_openai_vector_store.py

from openai import OpenAI

def create_vector_store(store_name):
  # assumes a global-scope client object
  try:
    v_store = client.vector_stores.create(name=store_name)
    store_info = {
      "id": v_store.id,
      "name": v_store.name,
      "created_at": v_store.created_at,
      "file_count": v_store.file_counts.completed
    }
    return store_info
  except Exception as e:
    print("FATAL in create_vector_store() " + str(e))

def delete_vector_store(store_id):
  # assumes a global-scope client object
  try:
    info = client.vector_stores.delete(vector_store_id=store_id)
    return info
  except Exception as e:
    print("FATAL in delete_vector_store() " + str(e))

# -----------------------------------------------------------

try:
  print("\nBegin OpenAI vector store demo ")

  key = "sk-proj-_AX7bGTXUwg-qojh2T5Z2CVXrox" + \
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" + \
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" + \
    "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  client = OpenAI(api_key=key)

  print("\nCreating vector store ")
  store_info = create_vector_store("dummy_vector_store")
  print("Done ")
  print(store_info)

  store_id = store_info["id"]  # need to delete store

  print("\nDeleting the vector store ")
  delete_info = delete_vector_store(store_id)
  print("Done ")
  print(delete_info)

  print("\nEnd demo ")

except Exception as e:
  print("Fatal error in program: " + str(e))
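One possible hardening of the demo, given the worry about lingering storage fees, is to wrap the work in try/finally so the store is deleted even if something fails in between. The sketch below uses only the two client calls that appear in the demo (vector_stores.create and vector_stores.delete). The with_temp_vector_store() and make_fake_client() names are hypothetical, and the fake client exists only so the sketch runs offline without an API key.

```python
from types import SimpleNamespace

def with_temp_vector_store(client, store_name, work):
    # Create a store, run work(store_id), and always delete the store,
    # even if work() raises, so no store is left behind.
    v_store = client.vector_stores.create(name=store_name)
    try:
        return work(v_store.id)
    finally:
        client.vector_stores.delete(vector_store_id=v_store.id)

def make_fake_client(log):
    # Hypothetical offline stand-in; it mimics only the two calls
    # used in the demo program and records them in log.
    def create(name):
        log.append("create " + name)
        return SimpleNamespace(id="vs_fake", name=name)
    def delete(vector_store_id):
        log.append("delete " + vector_store_id)
        return SimpleNamespace(id=vector_store_id, deleted=True)
    return SimpleNamespace(
        vector_stores=SimpleNamespace(create=create, delete=delete))

if __name__ == "__main__":
    # For real use: from openai import OpenAI; client = OpenAI()
    # With no api_key argument, the client reads the OPENAI_API_KEY
    # environment variable, which avoids a key in source code.
    log = []
    client = make_fake_client(log)
    result = with_temp_vector_store(
        client, "dummy_vector_store",
        lambda store_id: "used " + store_id)
    print(result)
    print(log)
```

Passing the client as a parameter, rather than relying on a global-scope object as the demo does, is a design choice that makes the helper easy to exercise with a stand-in.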
OK, that was an interesting investigation. Next, I need to explore how to use a vector store together with the Responses API to implement a RAG system.
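As a rough starting point for that next step, a Responses API request can reference a vector store through the file_search tool. The sketch below only builds the request arguments and has not been run against a live account. The build_rag_request() and ask() names are hypothetical, the model name is an assumption, and the store is assumed to already contain uploaded documents.

```python
def build_rag_request(store_id, question, model="gpt-4o-mini"):
    # The model name is an assumption for illustration; substitute
    # any model that supports the file_search tool.
    return {
        "model": model,
        "input": question,
        "tools": [{"type": "file_search",
                   "vector_store_ids": [store_id]}],
    }

def ask(client, store_id, question):
    # client is assumed to be an openai.OpenAI instance, and the
    # store is assumed to already hold the uploaded documents.
    request = build_rag_request(store_id, question)
    response = client.responses.create(**request)
    return response.output_text

if __name__ == "__main__":
    # "vs_123" is a placeholder ID, not a real vector store
    print(build_rag_request("vs_123", "What is the vacation policy?"))
```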

In most AI systems, you send a request over the Internet and a response is sent back to you. In the late 1930s, there were fascinating telephone-based systems to request and get music. These were the days before jukeboxes.
Machines were located in bars and diner restaurants. The machines had dedicated phone lines. Customers dropped a dime into the machine, which activated the telephone, and then they’d talk to a woman in an operations center and request something like, “Play song #107 for me.” On the other end, operators listened to the request, physically put the requested record on a player, and played the song to the customer through the telephone.
Quite remarkable! I learned about these devices while I was watching the old 1945 movie “The Shanghai Cobra” featuring Chinese detective Charlie Chan. In that movie, one of these music-by-phone devices in a diner had a built-in poisoned needle that was used to murder people. The operations center was secretly located in the basement of the diner, so the evil villain could see who was requesting the song and trigger the poisoned needle when the victim dropped a coin in the machine.
The rise of jukeboxes in the late 1940s quickly put these music-by-phone systems out of business.
Left: Here is an operations center for the Shyvers remote music network that was popular in the Seattle area in the 1930s and 1940s. Women in the operations center received song requests by phone and played the associated record.
Center: This is a “Shyvers Multiphone”, one of the remote music devices.
Right: This is a “Choice by Voice” machine, used by a different network system.
