
RAG for Business Websites: Build a No-Hallucination Knowledge-Base Chatbot

Stop generic chatbots from guessing. Use Retrieval-Augmented Generation (RAG) so your bot answers strictly from your docs—accurate, cited, and trustworthy.

Jorge Mena
AI, RAG, chatbot, SME, knowledge-base


Most website chatbots still "guess." Retrieval-Augmented Generation (RAG) flips the script: your bot retrieves relevant passages from your knowledge (docs, FAQs, contracts) and then generates an answer grounded in those sources—with citations.

The Limitation of Vanilla Chatbots

Non-RAG chatbots (or poorly configured ones) tend to hallucinate when they lack context or stray off-topic. They ignore source truth because they don't retrieve documents before answering. They go stale without a recrawl/re-embed schedule. And they break trust by hiding where answers came from.

The RAG Advantage

RAG, done right, gives you:

1. Grounded Answers with Citations

Answers include source links/snippets so users (and your team) can verify the facts.

2. Scope Control

You can hard-limit the bot to approved corpora (policies, pricing, SLAs) and block out-of-scope topics.

3. Continuous Freshness

A recrawl + re-embed cadence keeps knowledge updated without manual rework.

4. Lower Risk + Better UX

Clear fallbacks ("I don't have that info yet") beat confident nonsense. Trust rises, tickets drop.

Real-World Examples

We've used RAG patterns to ship outcomes like:

Support deflection (30–50%): the bot answers from FAQs, past tickets, and guides, with handoff when unsure.

Sales enablement: pricing and feature comparisons grounded in your product sheet; lead quality improves.

Ops clarity: internal playbooks searchable in chat; fewer "where is the doc?" interruptions.

The Andy Approach

Minimal, fast, measurable:

Stack: Next.js widget (web), Andy as the agent layer, Convex for vector store and real-time database, Firecrawl for website ingestion, and OpenAI for embeddings and generation.

Pipelines: Upload/Crawl → Chunk → Embed → Store in Convex RAG → Retrieve → Answer (+ cite) → Log → Evaluate (sketched below).

Guardrails: Scope filters, cost caps, confidence thresholds with human handoff, and plan-based limits for usage control.
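Here's that pipeline as a minimal TypeScript sketch. The helper signatures (`crawlSite`, `embedTexts`, `storeVectors`, and friends) are hypothetical stand-ins for the Firecrawl, OpenAI, and Convex calls, not the actual APIs:

```typescript
// Hypothetical helpers standing in for Firecrawl, OpenAI, and Convex.
declare function crawlSite(url: string): Promise<{ url: string; markdown: string }[]>;
declare function chunkPage(page: { url: string; markdown: string }): { id: string; text: string }[];
declare function embedTexts(texts: string[]): Promise<number[][]>;
declare function storeVectors(chunks: { id: string; text: string }[], vectors: number[][]): Promise<void>;
declare function retrieve(query: string, topK: number): Promise<{ id: string; text: string; score: number }[]>;
declare function generateAnswer(query: string, sources: { id: string; text: string }[]): Promise<string>;
declare function logTurn(query: string, answer: string, sourceIds: string[]): Promise<void>;

// Upload/Crawl -> Chunk -> Embed -> Store
async function ingest(url: string): Promise<void> {
  const pages = await crawlSite(url);
  const chunks = pages.flatMap(chunkPage);
  const vectors = await embedTexts(chunks.map((c) => c.text));
  await storeVectors(chunks, vectors);
}

// Retrieve -> Answer (+ cite) -> Log
async function answer(query: string): Promise<string> {
  const sources = await retrieve(query, 6);
  const reply = await generateAnswer(query, sources);
  await logTurn(query, reply, sources.map((s) => s.id)); // logged turns feed evaluation
  return reply;
}
```

And a rough sketch of plan-based guardrails; the plan names and limits are illustrative, not Andy's actual tiers:

```typescript
// Illustrative per-plan limits: answer length, spend, and allowed corpora.
type Plan = { maxTokensPerAnswer: number; monthlyBudgetUsd: number; allowedCorpora: string[] };

const PLANS: Record<string, Plan> = {
  starter: { maxTokensPerAnswer: 300, monthlyBudgetUsd: 20, allowedCorpora: ["faq"] },
  pro: { maxTokensPerAnswer: 600, monthlyBudgetUsd: 100, allowedCorpora: ["faq", "docs", "pricing"] },
};

// Returns a refusal message when a guardrail trips, or null to proceed.
function checkGuardrails(plan: Plan, spentUsd: number, corpus: string): string | null {
  if (spentUsd >= plan.monthlyBudgetUsd) return "Monthly budget reached; escalating to email support.";
  if (!plan.allowedCorpora.includes(corpus)) return "That topic is outside this assistant's scope.";
  return null;
}
```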

Getting Started

Before you build, decide:

Sources of truth: Which spaces are canonical (FAQ, Docs, PDFs, Websites)?

Scope limits: What must it not answer (legal advice, roadmap, custom quotes)?

Refresh cadence: Weekly for marketing docs, daily for pricing/FAQs, on-change for policies.

KPIs: Deflection rate, first-contact resolution, citation click-through, "I don't know" rate, CSAT.

Minimal Implementation Blueprint

Data prep

Normalize sources to PDF/Markdown where possible.

Chunking: 300–800 tokens with semantic overlap; carry titles and section IDs as metadata.

Add metadata (language, version, doc type, product area).
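As a rough illustration of that chunking step, here's a word-count chunker, assuming ~0.75 words per token (so 450 words approximates a ~600-token chunk); a production version would measure real tokens with a tokenizer such as tiktoken:

```typescript
type Chunk = {
  text: string;
  meta: { title: string; sectionId: string; docType: string; language: string };
};

// Word-count chunking as a cheap token proxy (~0.75 words per token).
// overlapWords carries context across chunk boundaries.
function chunkDocument(
  text: string,
  meta: Chunk["meta"],
  maxWords = 450,
  overlapWords = 60
): Chunk[] {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks: Chunk[] = [];
  for (let start = 0; start < words.length; start += maxWords - overlapWords) {
    chunks.push({ text: words.slice(start, start + maxWords).join(" "), meta });
  }
  return chunks;
}
```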

Retrieval

Vector search with semantic similarity.

Tune top-k (start at 5–8) and add metadata filters (doc_type:faq, product:pro).

Penalize stale versions via metadata.
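To make retrieval concrete, here is a self-contained cosine-similarity search with metadata filters and a stale-version penalty. Convex ships its own vector search, so treat this as the idea rather than its API; the 0.7 penalty factor is illustrative:

```typescript
type Doc = { id: string; vector: number[]; meta: Record<string, string> };

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Filter first (e.g. { doc_type: "faq", product: "pro" }), then rank by similarity.
function search(query: number[], docs: Doc[], topK = 6, filters: Record<string, string> = {}) {
  return docs
    .filter((d) => Object.entries(filters).every(([k, v]) => d.meta[k] === v))
    .map((d) => ({
      ...d,
      // Penalize stale versions via metadata, per the tuning note above.
      score: cosine(query, d.vector) * (d.meta.stale === "true" ? 0.7 : 1),
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```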

Answering

System prompt enforces: cite sources, refuse out-of-scope questions, use brand tone.

Format: short answer → bullet details → Sources with anchors.

Confidence threshold: below X → hand off or ask a clarifying question.
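One way to encode those rules, assuming the OpenAI Node SDK and using top retrieval similarity as a crude confidence proxy; the 0.75 threshold and model choice are illustrative:

```typescript
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const SYSTEM_PROMPT = `You answer ONLY from the provided sources.
Cite every claim inline as [source-id]. If the sources do not contain
the answer, reply exactly: "I don't have that info yet."
Format: one-sentence answer, then bullet details, then a Sources list.`;

const CONFIDENCE_THRESHOLD = 0.75; // illustrative; tune on your golden set

async function answerWithSources(
  question: string,
  hits: { id: string; text: string; score: number }[]
): Promise<string> {
  // Low retrieval confidence -> hand off instead of guessing.
  if (hits.length === 0 || hits[0].score < CONFIDENCE_THRESHOLD) {
    return "I don't have that info yet. I'm connecting you with a human.";
  }
  const context = hits.map((h) => `[${h.id}] ${h.text}`).join("\n\n");
  const res = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    max_tokens: 400, // cap answer length
    messages: [
      { role: "system", content: SYSTEM_PROMPT },
      { role: "user", content: `Sources:\n${context}\n\nQuestion: ${question}` },
    ],
  });
  return res.choices[0].message.content ?? "";
}
```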

Freshness & Ops

Convex schedulers: crawl on a schedule; diff detection → selective re-embed.

Alerts if ingestion fails or deflection drops week-over-week.

Versioning: keep embeddings keyed by doc hash to avoid drift.
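Here's a sketch of hash-keyed diff detection using Node's built-in crypto; only documents whose content hash changed get re-embedded. The hash store (an in-memory Map here) would live in Convex in practice:

```typescript
import { createHash } from "node:crypto";

// Previously stored hashes, keyed by document ID.
const storedHashes = new Map<string, string>();

function contentHash(text: string): string {
  return createHash("sha256").update(text).digest("hex");
}

// Returns the docs whose content changed since the last crawl;
// embeddings stay keyed by this hash, so unchanged docs never drift.
function docsToReembed(docs: { id: string; text: string }[]) {
  return docs.filter((doc) => {
    const hash = contentHash(doc.text);
    if (storedHashes.get(doc.id) === hash) return false;
    storedHashes.set(doc.id, hash);
    return true;
  });
}
```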

Evaluation (non-negotiable)

Golden-set Q&A: 30–100 questions your team cares about; re-run on each deployment.

Track a precision/recall proxy (did the cited doc actually contain the answer?).

Regression gate: block releases that degrade accuracy.
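A minimal golden-set harness might look like this, assuming a hypothetical `answer` function that returns the reply text plus the IDs of the documents it cited; the 90% gate is an example, not a recommendation:

```typescript
type GoldenCase = { question: string; mustContain: string; expectedDocId: string };

// Hypothetical: your bot's answer function, returning text plus cited doc IDs.
declare function answer(q: string): Promise<{ text: string; citedDocIds: string[] }>;

// Re-run on every deployment; throw to block the release below the gate.
async function runGoldenSet(cases: GoldenCase[], gate = 0.9): Promise<void> {
  let passed = 0;
  for (const c of cases) {
    const res = await answer(c.question);
    const grounded = res.citedDocIds.includes(c.expectedDocId); // retrieval proxy
    const correct = res.text.toLowerCase().includes(c.mustContain.toLowerCase());
    if (grounded && correct) passed++;
    else console.warn(`FAIL: ${c.question}`);
  }
  const accuracy = passed / cases.length;
  console.log(`Golden set: ${(accuracy * 100).toFixed(1)}% (${passed}/${cases.length})`);
  if (accuracy < gate) {
    throw new Error(`Regression gate: accuracy ${accuracy} is below ${gate}`);
  }
}
```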

Common Pitfalls (and Fixes)

Problem: Bot cites a doc but answers from memory. Fix: Force extract-then-answer and require inline citations with IDs.
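One hypothetical prompt template that forces the extract-then-answer pattern: the model must quote first, and anything it couldn't quote can't appear in the answer:

```typescript
// Step 1: extract verbatim quotes; step 2: answer only from those quotes.
const EXTRACT_THEN_ANSWER = `
First, list verbatim quotes from the sources that are relevant to the
question, each tagged with its source ID, like: [faq-12] "Refunds take 5 days."
If no relevant quotes exist, output NO_EVIDENCE and stop.

Then write the answer using ONLY those quotes, keeping the [source-id]
tags inline next to each claim they support.
`;
```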

Problem: Answers are too long or too vague. Fix: Cap tokens, prefer bullets, and include a one-sentence summary first.

Problem: Index bloat slows retrieval. Fix: Archive old versions, add filters, and split indices by domain.

Problem: Costs creep up. Fix: Cache frequent Q&As, batch embeddings, and compress long sources before chunking.
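A tiny in-memory sketch of the Q&A cache; in production you'd persist it (e.g. in Convex) and invalidate entries when the underlying docs are re-ingested:

```typescript
const qaCache = new Map<string, string>();

// Lowercase and strip punctuation so trivially different phrasings
// of the same question hit the same cache entry.
function normalize(q: string): string {
  return q.toLowerCase().replace(/[^\p{L}\p{N}\s]/gu, "").trim();
}

async function cachedAnswer(
  question: string,
  compute: (q: string) => Promise<string>
): Promise<string> {
  const key = normalize(question);
  const hit = qaCache.get(key);
  if (hit !== undefined) return hit; // skip the LLM call entirely
  const fresh = await compute(question);
  qaCache.set(key, fresh);
  return fresh;
}
```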

What "Good" Looks Like

Every answer includes sources (title + anchor).

"I don't know" path is explicit (handoff, ticket, or contact form).

Weekly eval report with trendlines (deflection, accuracy, CSAT).

Freshness SLA defined per doc type.

Security: PII masked in logs; private indices per tenant if multi-client.

Ready to launch a RAG chatbot that doesn't hallucinate?

---

Install Andy on your site—then plug in your Knowledge Manager. (WhatsApp and Slack integrations coming in Q2 2025)
