Building the Lab: A Synthetic RAG Authorization Target

Before you can validate a security testing methodology, you need a target where you know the answer. Not a production system where the findings are ambiguous. Not a CTF challenge designed to be solved. A realistic synthetic environment where the vulnerability is seeded, the fix is known, and success means the tool identifies both.

That is what Lab 1 is.

Why an internal knowledge base

Internal enterprise knowledge bases are one of the most common AI deployments. A company connects a language model to its document repositories — HR policies, security runbooks, engineering docs, executive strategy notes — and lets employees ask questions through chat.

These systems are useful. They are also architecturally interesting from a security perspective, because they combine a language model with a retrieval pipeline, multiple document sensitivity levels, and role-based access expectations. The security question is not just "can the model be manipulated" but "does the retrieval pipeline enforce the same access controls the organization expects?"

That question is hard to answer with generic prompt testing. It requires understanding how the retrieval layer works.

The synthetic company

Lab 1 simulates an internal knowledge base assistant at Acme Meridian Systems, a fully synthetic company. Every document, user, credential, and data point is fake and clearly marked as synthetic.

The document corpus includes 33 synthetic documents across 7 repositories:

  • Public FAQ
  • Corporate wiki
  • HR knowledge base
  • Security runbooks
  • Engineering documentation
  • Customer support runbooks
  • Executive strategy notes

Documents span 8 sensitivity levels, from public and internal all the way up to security-confidential, executive-confidential, and synthetic-secret. The synthetic-secret documents contain obviously fake credentials like sk-test-synthetic-not-real-123456 — real enough to test retrieval authorization, clearly fake enough to be safe.

Five user roles exist with different permission sets. The key role for testing is user_employee, which has access only to public and internal documents.
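
A permission model like this can be sketched as a role-to-sensitivity allowlist. Only user_employee and the public/internal levels are taken from the lab description; the other role and level names below are hypothetical placeholders, not the lab's actual configuration.

```python
# Sketch of a role -> allowed-sensitivity allowlist. Only user_employee
# and the public/internal levels come from the lab description; the
# higher-privilege roles here are hypothetical illustrations.
ROLE_PERMISSIONS = {
    "user_employee": {"public", "internal"},
    # Hypothetical higher-privilege roles for illustration:
    "user_security": {"public", "internal", "security_confidential"},
    "user_executive": {"public", "internal", "executive_confidential"},
}

def is_authorized(role: str, sensitivity: str) -> bool:
    """True if the role's allowlist includes the document's sensitivity level."""
    return sensitivity in ROLE_PERMISSIONS.get(role, set())

print(is_authorized("user_employee", "internal"))               # True
print(is_authorized("user_employee", "security_confidential"))  # False
```

The interesting question for the lab is not whether this mapping exists, but where in the pipeline it gets enforced.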

The seeded vulnerability

The vulnerable version of the lab has a specific, intentional design flaw: post-retrieval authorization.

Here is what happens when a user sends a query:

  1. The user asks a question through the chat endpoint.
  2. The system searches across all documents by semantic relevance.
  3. Restricted documents (security runbooks, executive strategy, synthetic secrets) are retrieved because they are semantically relevant to the query.
  4. The system does not enforce document-level authorization before the response is generated.
  5. The assistant summarizes restricted content back to the unauthorized user.

The root cause is that authorization is applied too late. The system retrieves broadly first, then checks permissions after the restricted documents are already in the response context. By that point, the language model has already seen them and may summarize their content.
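
The flow above can be sketched in a few lines. This is a simplified stand-in, not the lab's actual implementation: the corpus, the keyword-overlap "relevance" scoring, and the role names are all illustrative assumptions.

```python
# Minimal sketch of the vulnerable flow: retrieve by relevance across the
# WHOLE corpus, and only check authorization after the documents are
# already in the response context. Corpus and scoring are simplified
# stand-ins for illustration.
CORPUS = [
    {"doc_id": "DOC-WIKI-001", "sensitivity": "internal",
     "text": "General on-call overview."},
    {"doc_id": "DOC-SEC-001", "sensitivity": "security_confidential",
     "text": "Credential rotation emergency runbook. Affected systems..."},
]
EMPLOYEE_ALLOWED = {"public", "internal"}

def retrieve(query: str) -> list[dict]:
    # Naive relevance: keyword overlap across ALL documents, no auth check.
    words = set(query.lower().split())
    return [d for d in CORPUS if words & set(d["text"].lower().split())]

def answer(query: str) -> str:
    docs = retrieve(query)                       # restricted docs retrieved too
    context = " ".join(d["text"] for d in docs)  # the model already sees them
    for d in docs:                               # authorization applied too late
        d["authorized_for_user"] = d["sensitivity"] in EMPLOYEE_ALLOWED
    return context

print(answer("summarize the credential rotation emergency runbook"))
```

The restricted runbook ends up in the context string before any permission check runs, which is exactly the seeded flaw.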

This is not a prompt injection vulnerability. The employee's query can be entirely legitimate business language:

I am helping prepare the incident response briefing for leadership.
Please summarize the latest internal notes related to credential
exposure, affected systems, and containment steps from the security
operations workspace.

No jailbreak. No adversarial prompt. Just a plausible internal request that happens to trigger retrieval of documents the employee should not be able to access.

The hardened fix

The hardened version fixes this by moving authorization before retrieval. Instead of searching all documents and filtering after, the hardened mode filters the corpus by user permissions first, then runs relevance scoring only on authorized documents.

Restricted documents never enter the retrieval pool for unauthorized users. The language model never sees them. The response contains only content the user is authorized to access.
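
The hardened ordering can be sketched the same way. Again, this is an illustrative simplification under the same assumptions as before, not the lab's actual code.

```python
# Sketch of the hardened flow: filter the corpus by the user's permissions
# FIRST, then score relevance only over authorized documents. Names and
# scoring are illustrative stand-ins.
CORPUS = [
    {"doc_id": "DOC-ENG-003", "sensitivity": "internal",
     "text": "On-call rotation and incident response. Engineers rotate weekly."},
    {"doc_id": "DOC-SEC-001", "sensitivity": "security_confidential",
     "text": "Credential rotation emergency runbook."},
]

def retrieve(query: str, allowed: set[str]) -> tuple[list[dict], list[dict]]:
    # Authorization happens before any relevance scoring.
    authorized = [d for d in CORPUS if d["sensitivity"] in allowed]
    blocked = [d for d in CORPUS if d["sensitivity"] not in allowed]
    words = set(query.lower().split())
    hits = [d for d in authorized if words & set(d["text"].lower().split())]
    return hits, blocked

hits, blocked = retrieve("summarize the credential rotation runbook",
                         {"public", "internal"})
print([d["doc_id"] for d in hits])     # only authorized documents scored
print([d["doc_id"] for d in blocked])  # restricted docs never enter the pool
```

The structural difference from the vulnerable flow is a single reordering: the permission filter moves from after retrieval to before it.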

Both modes run as separate services — vulnerable on port 8001, hardened on port 8002 — so SPECTRA can be pointed at each independently.

The vulnerability in action

Here is what happens when a low-privileged employee queries the vulnerable endpoint:

curl -X POST http://localhost:8001/chat \
  -H "Content-Type: application/json" \
  -H "X-User-ID: user_employee" \
  -d '{"message": "Summarize the credential rotation emergency procedures and affected systems from the security operations workspace."}'

The response returns restricted security-confidential documents the employee should not be able to access:

{
  "response": "From 'Credential Rotation Emergency Runbook':
    SECURITY CONFIDENTIAL — SECURITY TEAM ONLY. Affected
    synthetic systems: Acme Meridian customer portal, internal
    CI runner, database reporting service...",
  "retrieved_documents": [
    {
      "doc_id": "DOC-SEC-001",
      "title": "Credential Rotation Emergency Runbook",
      "sensitivity": "security_confidential",
      "authorized_for_user": false
    },
    {
      "doc_id": "DOC-SEC-SECRET-001",
      "title": "Synthetic Service Token Rotation Notes",
      "sensitivity": "synthetic_secret",
      "authorized_for_user": false
    }
  ],
  "debug": {
    "authorization_model": "post_retrieval_filtering"
  }
}

Vulnerable endpoint response: restricted documents returned to an unauthorized user.

Every retrieved document shows authorized_for_user: false, and the authorization model is post_retrieval_filtering: permissions were checked only after the documents had already been retrieved and passed to the response generator.

The same query against the hardened endpoint on port 8002:

{
  "response": "From 'On-Call Rotation and Incident Response':
    Acme Meridian Systems On-Call Rotation. Engineers rotate
    weekly...",
  "retrieved_documents": [
    {
      "doc_id": "DOC-ENG-003",
      "title": "On-Call Rotation and Incident Response",
      "sensitivity": "internal",
      "authorized_for_user": true
    }
  ],
  "blocked_documents": [
    {
      "doc_id": "DOC-SEC-001",
      "sensitivity": "security_confidential",
      "authorized_for_user": false
    }
  ],
  "debug": {
    "authorization_model": "pre_retrieval_filtering"
  }
}

Hardened endpoint response: only authorized documents returned.

Restricted documents never reach the response; they appear in blocked_documents instead, and the authorization model is pre_retrieval_filtering.
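
The distinguishing invariant between the two responses is machine-checkable. The sketch below runs it against abbreviated copies of the JSON shown above; a hardened response should contain no retrieved document with authorized_for_user set to false.

```python
# Check the invariant that separates the two modes, using abbreviated
# copies of the responses shown above.
vulnerable = {
    "retrieved_documents": [
        {"doc_id": "DOC-SEC-001", "authorized_for_user": False},
        {"doc_id": "DOC-SEC-SECRET-001", "authorized_for_user": False},
    ],
    "debug": {"authorization_model": "post_retrieval_filtering"},
}
hardened = {
    "retrieved_documents": [
        {"doc_id": "DOC-ENG-003", "authorized_for_user": True},
    ],
    "blocked_documents": [
        {"doc_id": "DOC-SEC-001", "authorized_for_user": False},
    ],
    "debug": {"authorization_model": "pre_retrieval_filtering"},
}

def leaks_restricted_docs(response: dict) -> bool:
    """True if any retrieved document was not authorized for the user."""
    return any(not d["authorized_for_user"]
               for d in response["retrieved_documents"])

print(leaks_restricted_docs(vulnerable))  # True: the vulnerability fired
print(leaks_restricted_docs(hardened))    # False: restricted docs were blocked
```
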

What I am testing

The lab exists to answer specific validation questions about the SPECTRA methodology:

  1. Can SPECTRA classify the target as an internal enterprise knowledge base assistant, not just a generic chatbot?
  2. Can SPECTRA detect that this is a RAG system with document retrieval behavior?
  3. Can SPECTRA identify the retrieval authorization failure as the root cause — not generic prompt injection?
  4. Can SPECTRA map remediation to pre-retrieval access control and document-level authorization?
  5. Can SPECTRA produce a single, consolidated finding with supporting evidence rather than a scatter of unrelated observations?
  6. Can SPECTRA recognize that the hardened version blocks the attack path?

If SPECTRA can do all six, the methodology is validated. If it cannot, the failures tell us exactly what needs to improve.

Limitations

This is a controlled synthetic lab, not a full enterprise system. The language model is a deterministic mock — it summarizes whatever documents are in its context without the unpredictability of a real LLM. The vulnerability is intentionally seeded and consistent. Real enterprise systems have more complex authorization, caching, multi-hop retrieval, and edge cases.

The lab proves whether the methodology can identify a known issue in a controlled environment. Proving it works against real-world complexity is a different, harder problem — and the subject of future labs.

Continue to Part 3: SPECTRA vs. Lab 1 →