Bottom line: I revisited extractive question-answering using the Hugging Face code libraries. Compared to my previous exploration, the current version of Hugging Face was much, much easier to use.
Several months ago, I put together a demo of an AI natural language processing QA (question-answering) task. In QA, you set up a “context”, which is source information, such as a Wikipedia article or paragraph about the planet Venus. Then you issue a “question” such as, “What is the diameter of Venus?” QA gives you an answer.
There are two kinds of QA. The first kind is called extractive. The QA system scans the context, finds a start word/token and an end word/token, and extracts the text between them. Typically, multiple candidate start-end spans are scored and the best one is returned as the answer, along with a confidence score.
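To make the span-selection idea concrete, here is a minimal sketch in plain Python. It assumes one common scoring scheme: a softmax over the per-token start scores, a softmax over the per-token end scores, and a candidate score equal to the product of the two probabilities. The tokens, the score values, and the names `best_spans`, `max_len` and `top_k` are made up for illustration; a real model produces the scores, and a real library may rank candidates differently.

```python
# span_selection_sketch.py
# toy illustration of extractive span selection (not a real model)
import math

def softmax(xs):
  # numerically stable softmax over a list of floats
  m = max(xs)
  exps = [math.exp(x - m) for x in xs]
  total = sum(exps)
  return [e / total for e in exps]

def best_spans(tokens, start_logits, end_logits, max_len=5, top_k=3):
  # score every (start, end) pair with start <= end and at most
  # max_len tokens, then return the top_k highest-scoring spans
  p_start = softmax(start_logits)
  p_end = softmax(end_logits)
  candidates = []
  for i, ps in enumerate(p_start):
    for j in range(i, min(i + max_len, len(tokens))):
      score = ps * p_end[j]
      text = " ".join(tokens[i:j + 1])
      candidates.append((score, i, j, text))
  candidates.sort(key=lambda c: c[0], reverse=True)
  return candidates[:top_k]

# hypothetical tokens and scores, chosen by hand
tokens = ["It", "has", "a", "diameter", "of", "12,103.6", "km"]
start_logits = [0.1, 0.0, 0.2, 1.0, 0.1, 4.0, 0.3]
end_logits = [0.0, 0.1, 0.0, 0.5, 0.2, 1.5, 4.2]

for score, i, j, text in best_spans(tokens, start_logits, end_logits):
  print("%0.4f  tokens %d-%d  %s" % (score, i, j, text))
```

The highest-scoring candidate plays the role of the answer, and its score plays the role of the confidence value.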
The second kind of question-answering is called generative. It is much, much more complicated. Generative QA works along the lines of ChatGPT. The entire context is analyzed, then a response is built up word-by-word. This approach can produce complex answers that combine different parts of the context. Generative QA is useful when a question starts with “why”.
Anyway, for a project I’m working on (being able to answer questions about a legal contract), extractive QA is good enough. I put together a demo that’s remarkably short (see below). I set up a context that’s a paragraph from the Wikipedia entry on the planet Venus:
context = """Venus is one of the four terrestrial planets in the Solar System, meaning that it is a rocky body like Earth. It is similar to Earth in size and mass and is often described as Earth's "sister" or "twin".[31] Venus is very close to spherical due to its slow rotation.[32] It has a diameter of 12,103.6 km (7,520.8 mi)—only 638.4 km (396.7 mi) less than Earth's—and its mass is 81.5% of Earth's, making it the third-smallest planet in the Solar System. Conditions on the surface of Venus differ radically from those on Earth because its dense atmosphere is 96.5% carbon dioxide, causing an intense greenhouse effect, with most of the remaining 3.5% being nitrogen.[33] The surface pressure is 9.3 megapascals (93 bars), and the average surface temperature is 737 K (464 °C; 867 °F), above the critical points of both major constituents and making the surface atmosphere a supercritical fluid of mainly supercritical carbon dioxide and some supercritical nitrogen."""
The output of the demo program is:
The question:
What is the diameter of Venus?

QA extraction results:
confidence: 0.6748
start idx: 297
end idx: 308
answer: 12,103.6 km

End QA demo
The confidence score (0.6748) is very high for extractive QA, and in fact, the answer (12,103.6 km) is correct. The start and end indices are character offsets into the context (308 - 297 = 11, the length of "12,103.6 km"). They aren’t particularly useful for this example.
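The indices could matter more when an answer has to be shown in place, for example by pulling out the sentence of a contract that contains it. The following is a minimal sketch of that idea. The shortened context, the hand-built `result` dictionary (a stand-in for a pipeline result) and the helper `surrounding_sentence` are all hypothetical, and the sentence splitting is deliberately naive.

```python
# offsets_sketch.py
# using character offsets to recover the sentence around an answer

context = ("Venus is very close to spherical due to its slow rotation. "
  "It has a diameter of 12,103.6 km. "
  "Its mass is 81.5% of Earth's.")

# hypothetical stand-in for a QA result; offsets are computed here,
# not taken from the demo run
answer = "12,103.6 km"
start = context.find(answer)
result = {"answer": answer, "start": start, "end": start + len(answer)}

def surrounding_sentence(text, start, end):
  # naive: a sentence boundary is a period followed by a space
  left = text.rfind(". ", 0, start)
  left = 0 if left == -1 else left + 2
  right = text.find(". ", end)
  right = len(text) if right == -1 else right + 1
  return text[left:right]

# slicing the context with the offsets gives back the answer text
print(context[result["start"]:result["end"]])
print(surrounding_sentence(context, result["start"], result["end"]))
```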
The most difficult part of the demo was finding compatible versions of PyTorch (version 2.7.1) and the transformers module (version 4.53.1). I relied on error messages to finally get versions that work together, and to find the revision number (626af31) for the DistilBERT SQuAD (Stanford Question Answering Dataset) model.
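When chasing version problems like this, it can help to print what is actually installed before trying anything else. Here is a small sketch that uses the standard library `importlib.metadata` module. The helper name `installed_versions` is made up for illustration, and the package names are the pip distribution names, which I am assuming match the ones used in the demo environment.

```python
# check_versions_sketch.py
# report installed versions of the packages the demo depends on
from importlib.metadata import version, PackageNotFoundError

def installed_versions(packages):
  # map each package name to its version string, or None if missing
  found = {}
  for name in packages:
    try:
      found[name] = version(name)
    except PackageNotFoundError:
      found[name] = None
  return found

versions = installed_versions(["torch", "transformers", "tokenizers"])
for name, ver in versions.items():
  print("%-14s %s" % (name, ver if ver else "not installed"))
```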
One problem with the whole extractive question-answering idea is that I suspect the technique may no longer be useful. I’ve done some experiments with the OpenAI (ChatGPT) programmatic API to extract an answer from a text file. Although it’s too early for me to state a strong opinion, the OpenAI API technique seems to be much simpler to implement, and to give better results. Of course, the Hugging Face technique is free and the OpenAI API technique requires payment.
Anyway, an interesting experiment.
Question-Answering is now a standard AI technique.
Here are three memorable (to me) old songs that feature a singer asking a question.
Do You Wanna Hold Me? Bow Wow Wow (Annabella Lwin)
River Deep Mountain High (Do I Love You?) Tina Turner
Do You Wanna Dance? The Mamas and the Papas (Michelle Phillips)
Demo program:
# qa_extraction_demo_3.py
# ultra-simplified
# Anaconda 2023.09-0 Python 3.11.5 PyTorch 2.7.1 CPU
# Hugging Face tokenizers-0.21.2 transformers-4.53.1
from transformers import pipeline
# pin the model to a specific revision
qa_model = pipeline("question-answering",
  model="distilbert/distilbert-base-cased-distilled-squad",
  revision="626af31")
context = """Venus is one of the four terrestrial planets
. . . (see above) . . .
and some supercritical nitrogen."""
question = "What is the diameter of Venus?"
# results is a dict with keys 'score', 'start', 'end', 'answer'
results = qa_model(question=question, context=context)
print("\nThe question: ")
print(question)
print("\nQA extraction results: ")
print("confidence: %0.4f" % results['score'])
print("start idx: %d" % results['start'])
print("end idx: %d" % results['end'])
print("answer: " + str(results['answer']))
print("\nEnd QA demo ")



