AI System Privacy Audit: Model Provider Risks and Mitigations¶
System in scope: doc_quality_compliance_check (backend + orchestrator + observability).
1. System Diagram¶
Model-relevant architecture facts used in this risk sheet:
- The backend/orchestrator uses a model adapter layer (Anthropic, OpenAI-compatible, Nemotron target/scaffold).
- Prompt/output traces are exposed and persisted for observability and audit workflows.
- The product already includes anti-hallucination signals and a validator stage, but model outputs can still be wrong or overconfident.
2. Data Flow Analysis¶
| Data Flow | Source | Destination | Encrypted? | Logged? | Priority | Action Point | Implementation |
|---|---|---|---|---|---|---|---|
| User document/text submission (may include personal data) | Frontend user session | FastAPI API routes (/documents/*, workflow endpoints) | In-transit (HTTPS/TLS) | Request + audit metadata; selected payload fields in audit/observability paths | High | 1. Document metadata (e.g. author, role, time of change, change topic): apply pseudonymization techniques to private data. Role-based data exists in the Postgres DB (mandatory profiles); mask private data in the change-history metadata table depending on role and in the final output. 2. Check document content for PII in text, images, and link targets. | Yes |
| Workflow context assembly for model call | FastAPI/Orchestrator | Model adapter layer | Internal service transport (TLS in deployment) | Yes (workflow/audit events) | High | 1. Orchestrator blocks activities outside of the model's sandbox. 2. Use pydantic/FastAPI to validate that the data meets the required schema (see the schema sketch after this table), similar in spirit to data-loss-prevention frameworks such as MS Purview. | Yes |
| Prompt payload to model provider (current state) | Model adapter | External provider endpoint (Anthropic/OpenAI-compatible/Nemotron target) | In-transit (HTTPS/TLS) | Provider metadata and local audit/trace logs | High | local model sandbox / no external models | Yes |
| Model completion returned to service | External provider | Model adapter + orchestrator flow | In-transit (HTTPS/TLS) | Yes (quality traces, run/step events) | High | Local model sandbox / no external models (same as above) | maybe |
| Prompt/output pair persistence | Orchestrator/quality service | PostgreSQL observability/audit records | In-transit + at-rest DB controls | Yes (explicitly persisted) | High | Prompt check for PII data. | maybe |
| Quality telemetry view (provider, model, prompt/output, rich payload) | Backend observability endpoints | Admin UI (/admin/observability) | In-transit (HTTPS/TLS) | Access path is auditable | High | Check tracing output for PII data. | no |
| On-prem target model invocation (target state) | Model adapter | Internal model gateway/cluster (private network) | In-transit (mTLS recommended) | Yes (internal trace + governance logs) | High | Check tracing & logging output for PII data. | no |
| Shadow evaluation during migration | Adapter dual-run mode | External + internal models (temporarily) | In-transit (TLS/mTLS) | Yes (comparison-only dataset, restricted retention) | Medium | Model in sandbox and usage of synthetic dataset. | no |
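A minimal sketch of the schema-validation action point referenced in the table above, assuming pydantic v2. The model name `WorkflowModelContext` and its fields are illustrative assumptions, not identifiers from the repo.

```python
# Minimal sketch (not the repo's actual models): a pydantic schema that the
# orchestrator could enforce before any workflow context reaches the model
# adapter. Field names such as `document_excerpt` and `requested_checks`
# are illustrative assumptions, not existing identifiers.
from pydantic import BaseModel, Field, field_validator


class WorkflowModelContext(BaseModel):
    workflow_id: str
    document_type: str
    # Cap the free-text excerpt so over-long document dumps are rejected
    # instead of silently forwarded to the model provider.
    document_excerpt: str = Field(max_length=4000)
    requested_checks: list[str] = Field(default_factory=list)

    @field_validator("document_excerpt")
    @classmethod
    def no_obvious_email_addresses(cls, value: str) -> str:
        # Cheap pre-flight check; real PII detection would sit in a
        # dedicated redaction stage (see Risk 2 controls below).
        if "@" in value and "." in value.split("@")[-1]:
            raise ValueError("excerpt appears to contain an email address")
        return value
```

When such a model is used as a FastAPI request body, invalid payloads are rejected with a 422 before any model call is assembled.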
Corrected interpretation for privacy¶
- The critical privacy boundary is not only frontend-to-backend, but backend/orchestrator-to-model-provider.
- The highest-risk payload is the full prompt context plus model output, because both may contain direct or indirect personal data.
- Observability improves governance but increases privacy exposure if traces are over-retained or insufficiently redacted.
3. Sensitive Data¶
Sensitive Data: Prompt Context and Model Output Content¶
- Category: Input/output content containing direct or inferred personal data
- Examples: Names, emails, stakeholder assignments, reviewer identifiers, document passages copied into prompts, generated summaries that may restate personal data
- Why Sensitive: Leaves the primary application context during model inference (current external-provider path)
- Current Protection: Session auth, RBAC, TLS, audit events, optional graceful fallback without external model key
- Risk (or Harm) if Exposed: Unauthorized disclosure, regulatory non-compliance (GDPR purpose limitation/data minimization), reputational harm
Sensitive Data: Provider/Model Telemetry and Rich Trace Payload¶
- Category: Metadata linked to actors, workflows, and potentially sensitive content fingerprints
- Examples: provider, model_used, trace_id, correlation_id, latency/tokens, rich payload entries, prompt/output snapshots
- Why Sensitive: Enables reconstruction of who triggered what model action and when; can become re-identifiable when combined with audit tables
- Current Protection: Backend-protected endpoints, role checks, PostgreSQL persistence
- Risk (or Harm) if Exposed: Excessive internal visibility, inference attacks, unauthorized profiling of users/reviewers
Sensitive Data: Model Credentials and Routing Configuration¶
- Category: Secrets and control-plane configuration
- Examples: API keys, adapter routing flags, provider selection settings
- Why Sensitive: Compromise enables data exfiltration or unauthorized model usage
- Current Protection: Environment-based secret configuration and backend auth controls
- Risk (or Harm) if Exposed: Account abuse, data leakage, loss of integrity of compliance decisions
4. Privacy Risks¶
Risk 1: External model transfer of potentially personal prompt/output data¶
- Priority: High
- Risk Category: Model data transfer and residency
- Potential Harm/Impact: Data leaves controlled environment; third-party processing and cross-border concerns
- Ability to Implement Control: Medium
- Recommended controls:
- Replace external inference path with internal on-prem model gateway as default.
- Keep adapter interface stable so provider swap does not change business logic.
- Enforce strict egress policy: only approved internal model endpoints from production.
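As an illustration of the egress-policy control, the sketch below shows an adapter-level allowlist check driven by an environment variable. `ALLOWED_MODEL_ENDPOINTS` and `ModelEgressError` are hypothetical names, not existing configuration in the repo.

```python
# Sketch of an adapter-level egress guard, assuming a simple environment-driven
# allowlist of approved model hosts (illustrative names, not repo config).
import os
from urllib.parse import urlparse

ALLOWED_MODEL_ENDPOINTS = {
    h.strip()
    for h in os.getenv("ALLOWED_MODEL_ENDPOINTS", "model-gateway.internal").split(",")
    if h.strip()
}


class ModelEgressError(RuntimeError):
    """Raised when the adapter would call a host outside the approved set."""


def assert_endpoint_allowed(base_url: str) -> None:
    # Called by the adapter before any outbound model request is made.
    host = urlparse(base_url).hostname or ""
    if host not in ALLOWED_MODEL_ENDPOINTS:
        raise ModelEgressError(f"model endpoint {host!r} is not on the egress allowlist")
```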
Risk 2: Over-collection and over-retention in model observability traces¶
- Priority: High
- Risk Category: Logging/telemetry minimization
- Potential Harm/Impact: Prompt/output traces may store unnecessary personal data for too long
- Ability to Implement Control: High
- Recommended controls:
- Apply pre-persistence redaction policy for prompt/output trace fields.
- Separate "operational" vs "audit-evidence" retention classes with explicit TTLs.
- Restrict raw trace visibility to least-privilege roles and log all access.
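A minimal sketch of the pre-persistence redaction and retention-class controls, assuming regex-based scrubbing of direct identifiers. The patterns and the `RETENTION_TTLS` mapping are illustrative, not the repo's trace schema or retention mechanism.

```python
# Minimal pre-persistence redaction sketch; patterns and TTLs are assumptions.
import re
from datetime import timedelta

_EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
_PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

RETENTION_TTLS = {
    "operational": timedelta(days=30),      # raw (redacted) prompt/output traces
    "audit-evidence": timedelta(days=365),  # redacted summaries only
}


def redact_trace_field(text: str) -> str:
    """Replace obvious direct identifiers before the trace row is persisted."""
    text = _EMAIL.sub("[REDACTED_EMAIL]", text)
    text = _PHONE.sub("[REDACTED_PHONE]", text)
    return text
```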
Risk 3: Hallucinated model output impacts compliance decisions¶
- Priority: High
- Risk Category: Model output reliability and decision integrity
- Potential Harm/Impact: Wrong legal/compliance guidance, missed obligations, false audit confidence
- Ability to Implement Control: Medium
- Recommended controls:
- Keep mandatory validator stage and add citation-required answer format.
- Add grounding checks against rule packs before output is accepted.
- Require HITL approval for high-impact outputs.
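One way to express the citation-required answer format as a validator-enforceable contract is sketched below, assuming pydantic v2. `ComplianceAnswer` and `RulePackCitation` are hypothetical types, not the repo's validator stage.

```python
# Sketch of a citation-required output contract the validator stage could enforce.
from pydantic import BaseModel, Field, model_validator


class RulePackCitation(BaseModel):
    rule_pack: str  # e.g. an internal rule-pack identifier (illustrative)
    rule_id: str


class ComplianceAnswer(BaseModel):
    verdict: str = Field(pattern="^(compliant|non_compliant|needs_review)$")
    rationale: str
    citations: list[RulePackCitation]

    @model_validator(mode="after")
    def require_citations_for_decisions(self):
        # Grounding check: a definitive verdict without citations is rejected
        # and routed to HITL instead of being accepted as output.
        if self.verdict != "needs_review" and not self.citations:
            raise ValueError("definitive verdict must cite at least one rule")
        return self
```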
Risk 4: Model/version drift reduces reproducibility and explainability¶
- Priority: Medium
- Risk Category: Governance and traceability
- Potential Harm/Impact: Same input yields diverging output without clear reason; weak audit defensibility
- Ability to Implement Control: Medium
- Recommended controls:
- Persist model version, adapter version, prompt template version, and config hash for every run.
- Freeze approved model release sets per environment.
- Run periodic regression suite on reference compliance tasks.
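A sketch of the per-run provenance record and config hash follows; the dataclass and hash recipe are illustrative, and persistence into the existing run/step tables is not shown.

```python
# Sketch of per-run provenance capture for reproducibility (illustrative names).
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ModelRunProvenance:
    model_name: str
    model_version: str
    adapter_version: str
    prompt_template_version: str
    generation_config: dict

    def config_hash(self) -> str:
        # Stable hash over the full provenance record, so any silent change in
        # model, template, or decoding parameters is detectable per run.
        payload = json.dumps(asdict(self), sort_keys=True, default=str)
        return hashlib.sha256(payload.encode()).hexdigest()
```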
Risk 5: Migration to on-prem models may degrade quality if unmanaged¶
- Priority: Medium
- Risk Category: Migration and model quality assurance
- Potential Harm/Impact: Quality drop or hidden failure modes could shift risk from privacy to governance failure
- Ability to Implement Control: High
- Recommended controls:
- Build a benchmark set from real workflows (privacy-scrubbed), including hallucination-sensitive cases.
- Use shadow mode (external vs on-prem) with acceptance thresholds before cutover.
- Define release gates: factuality, citation correctness, latency, and refusal behavior.
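A sketch of the shadow-mode acceptance gate is shown below; the metric names and threshold values are placeholder assumptions to be replaced by the benchmark set's real targets.

```python
# Shadow-mode cutover gate sketch: compare on-prem results against fixed
# acceptance thresholds (placeholder values, not agreed targets).
from dataclasses import dataclass


@dataclass
class ShadowRunMetrics:
    factuality: float           # fraction of benchmark answers judged correct
    hallucination_rate: float   # fraction of answers with unsupported claims
    citation_correctness: float
    p95_latency_s: float


ACCEPTANCE_THRESHOLDS = ShadowRunMetrics(
    factuality=0.95,
    hallucination_rate=0.02,
    citation_correctness=0.90,
    p95_latency_s=8.0,
)


def passes_cutover_gate(on_prem: ShadowRunMetrics) -> bool:
    return (
        on_prem.factuality >= ACCEPTANCE_THRESHOLDS.factuality
        and on_prem.hallucination_rate <= ACCEPTANCE_THRESHOLDS.hallucination_rate
        and on_prem.citation_correctness >= ACCEPTANCE_THRESHOLDS.citation_correctness
        and on_prem.p95_latency_s <= ACCEPTANCE_THRESHOLDS.p95_latency_s
    )
```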
5. On-Prem Migration Plan (Quality-Preserving)¶
Target objective: remove external GenAI dependency for production personal-data paths while maintaining or improving compliance-output quality.
- Adapter-first migration: keep orchestration and API contracts unchanged; switch routing in adapter layer.
- Dual-run validation: run external and on-prem in parallel for a controlled period on the same scrubbed inputs.
- Quality gate: promote on-prem only when benchmark thresholds are met (accuracy, hallucination rate, explainability/citations, latency).
- Privacy hardening: enforce prompt minimization, redaction, and retention classes before and after cutover.
- Operational fallback: if on-prem model confidence/validator checks fail, route to deterministic rules + HITL instead of external provider.
This sequence preserves output quality while reducing privacy exposure and external model dependency.
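As an illustration of the adapter-first step, the sketch below keeps a single routing switch point so business logic never references a concrete provider. The `ModelAdapter` protocol and class names are assumptions, not the repo's adapter API.

```python
# Adapter-first routing sketch: orchestration talks only to `ModelAdapter`,
# so cutover to the on-prem gateway is a routing change, not a code change.
from typing import Protocol


class ModelAdapter(Protocol):
    def complete(self, prompt: str) -> str: ...


class InternalGatewayAdapter:
    def __init__(self, base_url: str) -> None:
        self.base_url = base_url  # e.g. private on-prem model gateway URL

    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call the internal gateway here")


def select_adapter(use_on_prem: bool, external: ModelAdapter, internal: ModelAdapter) -> ModelAdapter:
    # Single switch point: orchestration and API contracts stay unchanged.
    return internal if use_on_prem else external
```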
Mitigations and Controls¶
| Risk | Mitigation or Control | Review | Reduction or Impact Change | Implement? (y/n) |
|---|---|---|---|---|
| Risk 1: External model transfer of potentially personal prompt/output data | On-prem migration plan | We analyzed substituting local models. | Risk of sensitive data exposure to 3rd parties reduced | y |
| Risk 1: External model transfer of potentially personal prompt/output data | Run parallel on-prem and external models on the same scrubbed inputs | We will analyze differences in data leakage between the two solutions. | Quantitative estimate of external data-leakage risk relative to on-prem | y |
| Risk 1: External model transfer of potentially personal prompt/output data | Quality gate: promote on-prem only when benchmark thresholds are met (accuracy, hallucination rate, explainability/citations, latency); use external provider only when on-prem is insufficient | We will analyze conditions where external models meaningfully outperform on-prem. | Risk of sensitive data exposure reduced by limiting reliance on external models | y |
| Risk 1: External model transfer of potentially personal prompt/output data | Operational fallback to human in the loop (HITL) | We analyzed conditions where human review is necessary. | Risk of sensitive data exposure to 3rd parties reduced | y |
| Risk 2: Over-collection and over-retention in model observability traces | | | | n |
| Risk 3: Hallucinated model output | | | | n |
| Risk 4: Model/version drift | | | | n |
| Risk 5: Migration to on-prem models may degrade quality if unmanaged | | | | n |
Appendix¶
Additional information from the repo source code
Subprocessor egress — Anthropic and Perplexity
Perplexity (src/doc_quality/services/research_service.py)¶
When: PERPLEXITY_API_KEY is non-empty.
Prompt content: Built from ResearchRequest: domain, description, target_market, and optional custom_query (or default research question) via prompts/research_prompt_v1.txt.
When key is empty: Static fallback; no HTTP call to Perplexity (still builds query string for ResearchResult).
Research API persistence (src/doc_quality/api/routes/research.py)¶
When result.provider == "perplexity", the API calls create_quality_observation with a payload containing:
- llm_prompt (the full built prompt),
- llm_output (the full model answer),
- provider metadata and citation counts.
These are stored in quality_observations.payload (src/doc_quality/services/quality_service.py). This duplicates potentially sensitive user text on disk in addition to sending it to Perplexity.
Anthropic — risk template AI suggest (src/doc_quality/api/routes/risk_templates.py)¶
When: ANTHROPIC_API_KEY set; route ai-suggest.
Sent to Anthropic: System prompt (FMEA vs RMF), optional request.context (product/system context), partial_row as JSON, and instructions. Not full documents unless context or row fields contain them.
Anthropic — document analysis agent (src/doc_quality/agents/doc_check_agent.py)¶
Implementation: Sends up to 2000 characters of document content plus document_type and rule-based issues in the user message.
Wiring: DocumentCheckAgent is not referenced elsewhere in the Python tree (it is only defined in this file). The live analyze_document path uses document_analyzer.analyze_document only, which is rule-based with no LLM in current API flows. Treat this as a dormant integration risk that becomes live if something imports the agent later.
Compliance agent (src/doc_quality/agents/compliance_agent.py)¶
Creates an Anthropic client when a key exists but check_compliance delegates to compliance_checker only (no Anthropic messages observed in compliance_checker.py).
Orchestrator service (services/orchestrator/)¶
Separate process; AnthropicAdapter is scaffold-style. Boundary review: any workflows you enable may forward document fragments to model backends, so trace data flows per workflow when operating this service.
Subprocessor summary¶
| Vendor | Trigger | Typical data categories |
|---|---|---|
| Perplexity | /api/v1/research/regulations with API key | Product domain, description, markets, optional free-text query → also persisted in DB as prompt+answer. |
| Anthropic | Risk template ai-suggest | Free-text context + partial row JSON. |
| Anthropic | DocumentCheckAgent | First 2k chars of document + issues; not used by default document routes today. |
