VaultLens — case study

A permission-aware, source-grounded document intelligence demo built to show how enterprise teams can reason over internal knowledge without ignoring access control and auditability.

Document intelligence, not open-ended chat. VaultLens is a RAG-ready architecture demo: it implements the trust layer (access control, citations, audit, evaluation) that a production system would wrap around retrieval and optional generation. The public build is deterministic — no live generative model, no embeddings API, no Supabase yet.

Overview

VaultLens is a portfolio project that simulates an internal “ask the policy library” experience. Users pick a demo role (not real authentication), ask a question, and receive answers that are filtered by document permissions, backed by citations, and recorded in a session audit trail. The corpus is 100% synthetic.

Problem

Many document Q&A experiences imply every retrieved chunk is fair game. In a real organization, policies, legal summaries, and HR data sit behind different entitlements. Answers must be grounded, permission-safe, and explainable for GRC, Security, and Operations stakeholders — before any generative model is involved.

Why this is not just a chatbot

Typical conversational UIs answer from whatever context they receive. VaultLens focuses on the control layer around document intelligence: it filters by role before selecting sources, refuses to reveal restricted body text, requires citations for grounded answers, logs audit events, and ships evaluation cases so that permission and citation behavior is regression-tested like any other critical path. That is the hard part of enterprise RAG, independent of which generative model you plug in later.

Why the public demo does not use a live LLM yet

The first goal is to prove permissions, citations, auditability, and evaluation — behaviors that must stay correct no matter what model you use. Deterministic retrieval and answers make the demo testable and predictable for reviewers and CI. A generative model can be added later, but only after access filtering selects allowed chunks; the model should never receive restricted chunks for unauthorized roles.

Solution

A deterministic access filter → retrieval → answer pipeline. Retrieval scores chunks with keyword and metadata overlap; a separate bucket tracks matches in restricted material so the app can show a permission-aware refusal instead of hallucinating or teasing content.

How restricted matches are handled

Retrieval always computes two lists: chunks in documents the role may read, and chunks that matched the question but sit in documents the role cannot read. If the globally strongest match falls in the second bucket, the UI returns a blocked response: it may name the document title and category for transparency, but it never pastes clauses or terms from restricted sources. Switching to a higher-privilege demo role re-runs the same pipeline with a different allow list — that is the intended “compare permissions” story on /ask.
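
The two-bucket rule described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the type names, field names, and the decide function are assumptions made for this example.

```typescript
// Hypothetical shapes for illustration; the real VaultLens types may differ.
type RoleId = string;

interface ScoredChunk {
  chunkId: string;
  documentTitle: string;
  category: string;
  allowedRoles: RoleId[];
  text: string;
  score: number; // retrieval heuristic; higher means a stronger match
}

type RetrievalDecision =
  | { kind: "grounded"; allowed: ScoredChunk[] }
  | { kind: "blocked"; documentTitle: string; category: string }
  | { kind: "no_match" };

function decide(roleId: RoleId, scored: ScoredChunk[]): RetrievalDecision {
  // Keep only real matches, strongest first (filter returns a copy, so the
  // caller's array is not reordered).
  const matches = scored
    .filter((chunk) => chunk.score > 0)
    .sort((a, b) => b.score - a.score);

  // Bucket 1: matches the role may read. Bucket 2: matches it may not.
  const allowed = matches.filter((chunk) => chunk.allowedRoles.includes(roleId));
  const restricted = matches.filter((chunk) => !chunk.allowedRoles.includes(roleId));

  const strongest = matches[0];
  if (strongest === undefined) return { kind: "no_match" };

  // If the globally strongest match is restricted, refuse. Only the title and
  // category are copied into the response; the chunk text never is.
  if (restricted[0] === strongest) {
    return {
      kind: "blocked",
      documentTitle: strongest.documentTitle,
      category: strongest.category,
    };
  }
  return { kind: "grounded", allowed };
}
```

Calling decide again with a higher-privilege role id changes only which bucket each chunk lands in, which is the comparison the /ask page is meant to show.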

Architecture

Role + question
  → access control (allowed / restricted)
  → retrieval (top allowed + restricted match signals)
  → answer engine (grounded or blocked)
  → citations
  → audit log events
  → evaluation cases

The data layer is abstracted (VaultLensDataAdapter) so a local seed adapter can be swapped for Supabase/Postgres without rewriting the domain logic.
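
As a rough sketch of what that abstraction could look like: only the name VaultLensDataAdapter comes from the project. The method, the PolicyDoc shape, LocalSeedAdapter, and readableTitles are assumptions chosen for this example.

```typescript
// Only the interface name VaultLensDataAdapter comes from the case study.
// Every method and field name below is an assumption made for this sketch.
interface PolicyDoc {
  id: string;
  title: string;
  allowedRoleIds: string[];
}

interface VaultLensDataAdapter {
  listDocuments(): Promise<PolicyDoc[]>;
}

// Local adapter backed by an in-memory seed array.
class LocalSeedAdapter implements VaultLensDataAdapter {
  private readonly seed: PolicyDoc[];

  constructor(seed: PolicyDoc[]) {
    this.seed = seed;
  }

  async listDocuments(): Promise<PolicyDoc[]> {
    // Return copies so callers cannot mutate the seed data.
    return this.seed.map((doc) => ({
      ...doc,
      allowedRoleIds: [...doc.allowedRoleIds],
    }));
  }
}

// Domain logic depends on the interface only, so a Supabase/Postgres-backed
// adapter could be passed in here without changing this function.
async function readableTitles(
  adapter: VaultLensDataAdapter,
  roleId: string
): Promise<string[]> {
  const docs = await adapter.listDocuments();
  return docs
    .filter((doc) => doc.allowedRoleIds.includes(roleId))
    .map((doc) => doc.title);
}
```

The interface is asynchronous even though the seed adapter does not need it, on the assumption that a database-backed adapter would.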

Permission model

Each document lists allowed role_id values. Retrieval scores every chunk, but only “allowed” chunks can power citations. If the strongest matches live in disallowed material, the user gets a blocked notice that names the restricted documents by title and never includes their body text.
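
One way to make the "titles, never body text" rule harder to break is to give the blocked notice a type with no field for body text. The sketch below assumes hypothetical field names (allowedRoleIds, body, restrictedTitles); the source only states that each document lists its allowed role_id values.

```typescript
// Field names are assumptions; the case study only says each document lists
// the role_id values allowed to read it.
interface SeedDocument {
  id: string;
  title: string;
  allowedRoleIds: string[];
  body: string;
}

// The notice type has no field meant for body text, which makes an accidental
// leak through this response harder (it is a guard rail, not a proof).
interface BlockedNotice {
  status: "blocked";
  restrictedTitles: string[];
}

function canRead(doc: SeedDocument, roleId: string): boolean {
  return doc.allowedRoleIds.includes(roleId);
}

function buildBlockedNotice(docs: SeedDocument[], roleId: string): BlockedNotice {
  // Collect titles of documents the role cannot read; bodies are never touched.
  const restrictedTitles = docs
    .filter((doc) => !canRead(doc, roleId))
    .map((doc) => doc.title);
  return { status: "blocked", restrictedTitles };
}
```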

Retrieval & citations

Retrieval is intentionally simple: token overlap, keyword boosts, and extra weight when terms hit document titles (so “incident reporting” routes to the incident procedure). Citations are not free-form model output: each citation is tied to a specific chunk id, section title, sensitivity, and the list of demo roles that may read that document. The answer body is assembled only from those allowed chunks (paraphrased excerpts in this build). Stable chunk ids deep-link into the library for reviewer spot-checks. A future vector or hybrid RAG layer would replace the scoring function, not the rule that only allowed chunks may be quoted.
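
A minimal sketch of this kind of scoring, with a citation derived from the chunk record rather than from free text. The weights, the tokenizer, and all type and field names are illustrative assumptions, not the values used in the demo.

```typescript
// All names and weights below are assumptions for illustration.
interface IndexedChunk {
  chunkId: string;
  documentTitle: string;
  sectionTitle: string;
  sensitivity: string;
  allowedRoleIds: string[];
  keywords: string[];
  text: string;
}

interface Citation {
  chunkId: string;
  sectionTitle: string;
  sensitivity: string;
  allowedRoleIds: string[];
}

const TITLE_WEIGHT = 3; // extra weight when a query term hits the document title
const KEYWORD_WEIGHT = 2; // boost for curated keywords
const BODY_WEIGHT = 1; // plain token overlap with the chunk text

// Lowercase, split on non-alphanumerics, drop very short tokens, deduplicate.
function tokenize(input: string): Set<string> {
  const tokens = input
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 2);
  return new Set(tokens);
}

function scoreChunk(question: string, chunk: IndexedChunk): number {
  const titleTokens = tokenize(chunk.documentTitle);
  const bodyTokens = tokenize(chunk.text);
  const keywordTokens = new Set(chunk.keywords.map((k) => k.toLowerCase()));

  let score = 0;
  tokenize(question).forEach((token) => {
    if (titleTokens.has(token)) score += TITLE_WEIGHT;
    if (keywordTokens.has(token)) score += KEYWORD_WEIGHT;
    if (bodyTokens.has(token)) score += BODY_WEIGHT;
  });
  return score;
}

// A citation is copied from the chunk record, so it cannot drift from the source.
function toCitation(chunk: IndexedChunk): Citation {
  return {
    chunkId: chunk.chunkId,
    sectionTitle: chunk.sectionTitle,
    sensitivity: chunk.sensitivity,
    allowedRoleIds: [...chunk.allowedRoleIds],
  };
}
```

Swapping in a vector or hybrid scorer would replace scoreChunk only; toCitation and the allow-list check around it would stay as they are.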

Auditability

The audit page stores session events: query, retrieval scope, status, and citation ids. In production this would land in an append-only log store for compliance. The demo keeps these events in the session only; there is no external persistence yet.
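
A session-scoped, append-only log could be sketched like this. The four recorded fields (query, retrieval scope, status, citation ids) follow the description above; the timestamp and roleId fields, the class name, and the status values are assumptions for this example.

```typescript
// Status values, roleId, and timestamp are assumptions for this sketch.
type AnswerStatus = "grounded" | "blocked" | "no_match";

interface AuditEvent {
  readonly timestamp: string;
  readonly roleId: string;
  readonly query: string;
  readonly retrievalScope: string[]; // ids of documents the role could read
  readonly status: AnswerStatus;
  readonly citationIds: string[];
}

class SessionAuditLog {
  private readonly events: AuditEvent[] = [];

  // Append is the only way to write; there is no update or delete method.
  append(entry: Omit<AuditEvent, "timestamp">): AuditEvent {
    // Object.freeze is shallow: the event object is frozen, nested arrays are not.
    const recorded: AuditEvent = Object.freeze({
      ...entry,
      timestamp: new Date().toISOString(),
    });
    this.events.push(recorded);
    return recorded;
  }

  // Return a copy so callers cannot reorder or truncate the log.
  list(): readonly AuditEvent[] {
    return this.events.slice();
  }
}
```

This lives in memory for the session only, matching the demo's lack of external persistence.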

Evaluation

Fixed cases call the same answerQuery as the UI and assert status, expected source documents, absence of leaked ids, and (for permission tests) that a restricted match was detected. That makes the demo behave like a small trust test suite — useful in interviews to discuss evaluation beyond prompt prettiness. The evaluation dashboard summarizes pass rate, blocked cases, and citation coverage.
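
The assertions listed above could be expressed as a small case runner. In this sketch the answer function is passed in as a parameter so that the real answerQuery could be supplied; its signature, the AnswerResult fields, and the EvalCase shape are assumptions, since the source does not show them.

```typescript
// The result and case shapes are assumptions; the real answerQuery may differ.
type AnswerStatus = "grounded" | "blocked";

interface AnswerResult {
  status: AnswerStatus;
  citedDocumentIds: string[];
  citedChunkIds: string[];
  restrictedMatchDetected: boolean;
}

type AnswerFn = (roleId: string, question: string) => AnswerResult;

interface EvalCase {
  name: string;
  roleId: string;
  question: string;
  expectedStatus: AnswerStatus;
  expectedDocumentIds: string[];
  forbiddenChunkIds: string[]; // ids that must never appear for this role
  expectRestrictedMatch?: boolean; // set on permission tests only
}

// Returns a list of failure messages; an empty list means the case passed.
function checkCase(answer: AnswerFn, evalCase: EvalCase): string[] {
  const result = answer(evalCase.roleId, evalCase.question);
  const failures: string[] = [];

  if (result.status !== evalCase.expectedStatus) {
    failures.push(`status was ${result.status}`);
  }
  for (const docId of evalCase.expectedDocumentIds) {
    if (!result.citedDocumentIds.includes(docId)) failures.push(`missing source ${docId}`);
  }
  for (const chunkId of evalCase.forbiddenChunkIds) {
    if (result.citedChunkIds.includes(chunkId)) failures.push(`leaked chunk ${chunkId}`);
  }
  if (evalCase.expectRestrictedMatch === true && !result.restrictedMatchDetected) {
    failures.push("restricted match not detected");
  }
  return failures;
}

// Fraction of cases with no failures; 0 for an empty suite.
function passRate(answer: AnswerFn, cases: EvalCase[]): number {
  if (cases.length === 0) return 0;
  const passed = cases.filter((c) => checkCase(answer, c).length === 0).length;
  return passed / cases.length;
}
```

Because the runner takes the answer function as an argument, the same cases can exercise the UI's code path rather than a parallel implementation.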

Real vs simulated

Real: TypeScript types, access rules, session audit, deterministic retrieval, blocked paths, and UI.
Simulated / demo: no generative API, no embeddings pipeline, no enterprise IdP, no org-scoped storage, and no PDF ingestion — by design for a safe public portfolio.

Supabase & RAG roadmap

This build does not require Supabase to tell the product story. When you are ready, see docs/supabase-plan.md for tables, RLS sketch, launch timing notes, and migration from the local adapter. Phase 3 can add pgvector, hybrid search, and server-side generation with only allowed chunks in the context window — entitlements enforced in SQL or server code before the model runs.
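
The ordering constraint in that last step (filter first, generate second) might look like the sketch below in server code. No generation exists in the current build, so everything here is hypothetical: GenerateFn stands in for whatever model client is chosen later, and the prompt wording, type names, and function names are placeholders.

```typescript
// Entirely hypothetical: the public build has no generation step.
interface CandidateChunk {
  chunkId: string;
  allowedRoleIds: string[];
  text: string;
}

// Stand-in for a future model client; not a real library API.
type GenerateFn = (prompt: string) => Promise<string>;

interface BuiltPrompt {
  prompt: string;
  contextChunkIds: string[]; // useful for the audit trail
}

function buildPrompt(
  roleId: string,
  question: string,
  candidates: CandidateChunk[]
): BuiltPrompt {
  // Entitlement filter runs before any text is placed in the context window.
  const allowed = candidates.filter((chunk) => chunk.allowedRoleIds.includes(roleId));
  const sources = allowed.map((chunk) => `[${chunk.chunkId}] ${chunk.text}`).join("\n");
  const prompt =
    "Answer using only the sources below and cite chunk ids.\n\n" +
    `Sources:\n${sources}\n\nQuestion: ${question}`;
  return { prompt, contextChunkIds: allowed.map((chunk) => chunk.chunkId) };
}

async function generateAnswer(
  generate: GenerateFn,
  roleId: string,
  question: string,
  candidates: CandidateChunk[]
): Promise<string> {
  // The model only ever sees the prompt built from allowed chunks.
  const { prompt } = buildPrompt(roleId, question, candidates);
  return generate(prompt);
}
```

In a Supabase setup the same filter could instead be enforced by RLS in the query that fetches candidates, so that restricted rows never reach server code for that role.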

What this demonstrates


Responsible use (demo)

  • Public demo uses synthetic documents only. Do not upload private files in v1.
  • Restricted documents are never revealed to roles that cannot access them.
  • Grounded answers require cited sources; support scores are retrieval heuristics, not guarantees.
  • When in doubt, have a subject-matter expert review the answer in production.