The Problem That Started It (A Supervisor’s Question)
It started with a simple moment in the lab.
My supervisor, YunSuen Pai, came to me with a problem that every HCI researcher quietly suffers from:
“We have thousands of CHI papers… how do we actually use them when we need to write, design a study, or find gaps—without spending weeks doing Ctrl+F across PDFs?”
The obvious answer is “use an LLM.”
The real HCI answer is: you don’t just need answers—you need trust.
Because in research, it’s not enough for a system to be fluent. It needs to be:
- grounded (show sources)
- auditable (prove where the answer came from)
- privacy-preserving (papers stay local)
- fast (so it fits into real workflows)
That’s how HCI‑LLM (a.k.a. HCI Research Assistant) was born: a fully local RAG-based research assistant for exploring 8,000+ CHI conference papers.

What I Built (In One Sentence)
A local system that ingests CHI PDFs → builds a vector database → answers questions with citations and confidence—without sending anything to the cloud.
At a high level:
PDFs → Text/Metadata → Chunking → Embeddings → ChromaDB
Query → Retrieval → Context → Local LLM (LMStudio) → Answer + Sources + Confidence
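To make that pipeline concrete, here is a minimal sketch of the ingestion stage using pypdf and ChromaDB. The function names, chunk sizes, and directory layout are illustrative assumptions, not the repo's actual modules:

```python
# Minimal ingestion sketch: PDF -> text -> overlapping chunks -> persistent ChromaDB.
# Names, chunk sizes, and paths are illustrative, not the project's actual code.
from pathlib import Path

import chromadb
from pypdf import PdfReader

client = chromadb.PersistentClient(path="./chroma_db")   # vector DB persists on disk
papers = client.get_or_create_collection("chi_papers")

def chunk(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Fixed-size character chunks with overlap so ideas aren't cut mid-thought."""
    return [text[i : i + size] for i in range(0, len(text), size - overlap)]

def ingest_pdf(pdf_path: Path) -> None:
    reader = PdfReader(str(pdf_path))
    text = "\n".join(page.extract_text() or "" for page in reader.pages)
    chunks = chunk(text)
    if not chunks:
        return
    papers.add(
        ids=[f"{pdf_path.stem}-{i}" for i in range(len(chunks))],
        documents=chunks,
        metadatas=[{"source": pdf_path.name, "chunk": i} for i in range(len(chunks))],
    )  # Chroma embeds documents with its default embedding model unless one is configured

for pdf in Path("papers").glob("*.pdf"):
    ingest_pdf(pdf)
```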
Why “Local” Matters (More Than You Think)
If you’re doing HCI research, your PDFs can include:
- copyrighted proceedings
- unpublished drafts
- sensitive notes and ideas
- early research directions
Uploading that to a hosted API is often a non-starter.
So I designed the system to be 100% local:
- PDFs stay on disk
- vector DB persists locally
- LLM runs via LMStudio (OpenAI-compatible local server)
That choice shaped everything else: performance, UX, and reliability.
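Concretely, "local LLM" just means pointing a standard OpenAI-compatible client at LMStudio's server instead of a hosted API. A minimal sketch, assuming LMStudio's default port (1234) and whatever model you happen to have loaded:

```python
# Talking to the local LLM through LMStudio's OpenAI-compatible endpoint.
# Port 1234 is LMStudio's default; the model name is whatever is loaded locally.
from openai import OpenAI

llm = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")  # key is ignored locally

reply = llm.chat.completions.create(
    model="local-model",   # placeholder for the model loaded in LMStudio
    messages=[{"role": "user", "content": "Summarize this excerpt: ..."}],
    temperature=0.1,       # low temperature for factual, citation-backed answers
)
print(reply.choices[0].message.content)
```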
Core Features That Make It Useful (Not Just “Cool”)
HCI‑LLM isn’t a single chat box. It’s a workflow tool:
1) Semantic Search Across Thousands of Papers
Instead of keyword search, you can ask:
- “What are common evaluation methods for accessibility tools?”
- “How do papers measure cognitive load in XR?”
- “Summarize approaches to participatory design for older adults.”
And get a response backed by relevant paper chunks + citations.
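Under the hood this is a vector similarity query, not keyword matching. A sketch of the retrieval step, reusing the `papers` collection from the ingestion sketch above:

```python
# Retrieval sketch: embed the question, pull the nearest chunks, keep their provenance.
results = papers.query(
    query_texts=["How do papers measure cognitive load in XR?"],
    n_results=5,
)
for doc, meta, dist in zip(
    results["documents"][0], results["metadatas"][0], results["distances"][0]
):
    print(f"[{meta['source']} · chunk {meta['chunk']} · distance {dist:.2f}]")
    print(doc[:200], "…")
```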
2) Specialized Research “Skills”
The system supports structured modes like:
- Literature review
- Methodology analysis
- Gap analysis
- Comparative analysis
- Brainstorming research ideas (with scoring)
These aren’t gimmicks—they are prompts + scaffolds aligned with real research tasks.
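In practice, each skill is a task-specific system prompt plus a scaffold for how evidence must be used. A simplified sketch of the idea (the actual templates in the repo are richer than this):

```python
# Skills as prompt scaffolds; the wording here is a simplified illustration.
SKILLS = {
    "literature_review": (
        "You are assisting with an HCI literature review. Using ONLY the provided "
        "paper excerpts, group findings by theme and cite every claim as [source, chunk]."
    ),
    "gap_analysis": (
        "Identify questions the provided excerpts raise but do not answer. "
        "For each gap, cite the excerpts that motivate it."
    ),
    "brainstorm": (
        "Propose research ideas grounded in the excerpts, scoring each 1-5 for "
        "novelty, feasibility, and fit, and explain the scores."
    ),
}

def build_prompt(skill: str, question: str, context: str) -> list[dict]:
    """Assemble the chat messages for a given skill, question, and retrieved context."""
    return [
        {"role": "system", "content": SKILLS[skill]},
        {"role": "user", "content": f"Excerpts:\n{context}\n\nTask: {question}"},
    ]
```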
3) Analytics
Because discovery isn’t only Q&A:
- papers by year
- topic distributions
- trends
- (eventually) author networks + citation context
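Most of this analytics layer falls out of the stored metadata. For example, papers-by-year can be computed straight from the collection, assuming ingestion also records a year field (which the sketch above omits):

```python
# Papers-by-year sketch; assumes chunk metadata includes "source" and "year".
from collections import Counter

metas = papers.get(include=["metadatas"])["metadatas"]
unique_papers = {(m["source"], m["year"]) for m in metas if m.get("year")}
papers_by_year = Counter(year for _, year in unique_papers)
for year, count in sorted(papers_by_year.items()):
    print(year, count)
```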
The Most Important Part: Anti‑Hallucination (Trust Design)
If an LLM confidently makes things up, it’s worse than useless.
So the RAG pipeline is designed around citation-backed responses:
- retrieval filters by similarity threshold
- confidence scoring considers evidence + sources
- explicit “I don’t know” when confidence is low
- low temperature for factual modes
In other words: trust is a feature, not an afterthought.
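To show how those pieces fit together, here is a sketch of the trust wiring, reusing `papers`, `llm`, and `build_prompt` from the earlier sketches. The threshold and the confidence formula are illustrative; real values depend on the embedding model and distance metric, not on these numbers:

```python
# Trust-wiring sketch: filter by similarity, score the evidence, refuse when it's thin.
SIMILARITY_THRESHOLD = 0.35   # max acceptable distance (smaller = more similar); needs tuning
MIN_CONFIDENCE = 0.5

def retrieve_with_confidence(question: str, k: int = 5):
    res = papers.query(query_texts=[question], n_results=k)
    hits = [
        (doc, meta)
        for doc, meta, dist in zip(
            res["documents"][0], res["metadatas"][0], res["distances"][0]
        )
        if dist <= SIMILARITY_THRESHOLD
    ]
    confidence = len(hits) / k    # crude score: how much of the top-k is real evidence
    return hits, confidence

def answer(question: str) -> str:
    hits, confidence = retrieve_with_confidence(question)
    if confidence < MIN_CONFIDENCE:
        # An explicit refusal beats a fluent guess.
        return "I don't have enough evidence in the indexed papers to answer that."
    context = "\n\n".join(f"[{meta['source']}] {doc}" for doc, meta in hits)
    reply = llm.chat.completions.create(
        model="local-model",
        messages=build_prompt("literature_review", question, context),
        temperature=0.1,          # low temperature for factual modes
    )
    sources = sorted({meta["source"] for _, meta in hits})
    return (
        f"{reply.choices[0].message.content}\n\n"
        f"Sources: {', '.join(sources)} (confidence {confidence:.2f})"
    )
```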
Scaling Up: 8,000+ PDFs Without Re‑Ingesting Forever
One underrated challenge is operational:
ingestion takes time, and re-ingesting a 5,000–8,000 PDF library is painful.
So HCI‑LLM is designed for:
- persistent vector DB (ChromaDB on disk)
- incremental ingestion (only new files are processed)
- checkpointing (resume if interrupted)
- parallel processing for speed
That matters because real research libraries grow weekly.
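One simple way to get incremental, resumable ingestion is a small on-disk manifest of file hashes, checkpointed after every file. A sketch under that assumption (the manifest format is illustrative, not the repo's actual bookkeeping):

```python
# Incremental ingestion sketch: hash each PDF, skip unchanged ones, checkpoint as you go.
import hashlib
import json
from pathlib import Path

MANIFEST = Path("chroma_db/ingested.json")   # illustrative location for the checkpoint file

def file_hash(path: Path) -> str:
    return hashlib.sha256(path.read_bytes()).hexdigest()

def ingest_incrementally(paper_dir: Path) -> None:
    done = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}
    for pdf in sorted(paper_dir.glob("*.pdf")):
        digest = file_hash(pdf)
        if done.get(pdf.name) == digest:
            continue                           # already ingested and unchanged: skip
        ingest_pdf(pdf)                        # from the ingestion sketch above
        done[pdf.name] = digest
        MANIFEST.write_text(json.dumps(done, indent=2))   # checkpoint: safe to interrupt
```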
Quick Start (If You Want to Try It)
If you have the repo, the workflow is:
cd HCI-Agent/HCI_LLM
./setup.sh
./start.sh
Then:
- Streamlit UI: http://localhost:8501
- API Docs: http://localhost:8000/docs
To ingest papers:
python scripts/ingest.py --max-files 10
python scripts/ingest.py --parallel --workers 8
What I Learned (HCI Lens)
Building HCI‑LLM taught me that “LLM UX” isn’t just prompts—it’s:
- provenance UI (sources must be legible)
- error UX (“no answer” should be graceful, not failure)
- workflow fit (what do researchers do before and after the answer?)
- performance as UX (latency changes trust)
The system is a research tool, but also a design experiment:
How do we build LLM interfaces that earn trust in high-stakes knowledge work?
What’s Next
I’m actively iterating on:
- better citation UI + chunk highlighting inside PDFs
- deeper analytics (author networks, method clustering)
- evaluation with real research workflows (time saved, quality of related work, confidence)
- better “study design” assistance with constraints and templates
If you’re curious, the project lives here:
- GitHub: https://github.com/GTamilSelvan07/HCI-Agent
