# Verixa — User Requirements Specification

# Module 29: Screen Reader / Data Capture and Extraction Governance

| Field | Value |
|---|---|
| Document ID | VRX-URS-29 |
| Version | 1.0 |
| Status | Final — ready for QA, Validation, Regulatory Affairs, Information Security (Primary Owner — credential governance), Manufacturing Head, Site Quality Lead, Qualified Person Authority, and Founder approval. URS approval is separate from validation execution. This document becomes "Approved Controlled URS — released for engineering implementation and validation planning" only after signature capture in the Document Approval block. It becomes "Released for validation execution" only after the module migration evidence gate (URS-29-VAL-008) and validation evidence pack are satisfied. |
| Document Type | User Requirements Specification (URS) |
| GAMP 5 Category | Category 5 — Custom Application |
| Code Modules | Target implementation binding: expected primary code module `screen-reader`, expected API mounts `/api/v1/screen-reader/*` (canonical), expected supporting modules `audit-log`, `auth/rbac`, `electronic_signatures`, `hitl`, `documents` (URS-12: controlled storage for source files + raw evidence retention), expected event-bus emission for `sr_capture_session_created`, `sr_capture_session_closed`, `sr_capture_session_reopened`, `sr_capture_record_added`, `sr_ocr_job_created`, `sr_ocr_job_processed`, `sr_ocr_job_failed`, `sr_extraction_template_created`, `sr_extraction_template_released`, `sr_extraction_template_superseded`, `sr_ingestion_rule_created`, `sr_ingestion_rule_polled`, `sr_ingestion_queue_item_created`, `sr_ingestion_queue_item_processed`, `sr_extraction_created`, `sr_extraction_review_started`, `sr_extraction_approved`, `sr_extraction_rejected`, `sr_extraction_correction_recorded`, `sr_extraction_promoted_to_target`, `sr_capture_program_locked`, `sr_capture_program_reopened`, expected MIRA context integration through `useMiraRecord('sr_capture_session', id)`, `useMiraRecord('sr_extraction', id)`, and `useMiraRecord('sr_extraction_template', id)` mappings, expected URS-12 Document Control linkage for **controlled source-file storage + immutable raw-evidence retention**, expected URS-13 Change Control linkage for extraction-template effective release, expected URS-21 findings emission for stale ingestion queues + repeated extraction rejections, expected URS-18 CAPA emission for chronic capture failures, expected target-record promotion API for downstream modules (URS-14 / URS-15 / URS-16 / URS-22 / URS-23 / URS-24 / URS-25 / URS-26 / URS-27; extraction → target-record is the **promotion path** by which captured data lands in regulated modules), expected Authority Profile + HITL + e-signature integration for non-bypassable extraction approval / rejection / correction / promotion / program lock / reopen, expected `platform_admin` / `super_admin` support / break-glass only paths. Implementation evidence remains subject to repository verification and validation evidence. |
| Architecture Bindings | This module is subject to **ARCH-AI-001 AI Optionality and Manual Continuity** (see SR-architecture-binding row in Module specification). Verixa internally classifies this AI surface as **limited-risk under internal AI governance**, aligned with the limited-risk transparency approach in **EU AI Act (Regulation 2024/1689) Article 13**, escalating to **internal forward-looking AI governance aligned with EU AI Act Annex III concepts** where the captured data feeds a regulated downstream record (e.g., complaint per URS-14, OOS per URS-15, deviation per URS-16, batch record per URS-23). Every Screen Reader / Data Capture AI-assisted surface (OCR-assist, field-extraction suggestion, classification-assist, email-ingestion auto-route) shall provide a fully functional non-AI manual capture, extraction, and review path; **no AI service shall be the sole path to persist, validate, or promote captured data into a GxP record** (canonical binding). MIRA copilot may surface advisory extraction suggestions, advisory classification, and advisory routing recommendations through the controlled `sr_ai_assistance` substructure with full provenance + mandatory human acceptance per DEC-29-13. This module binds ARCH-AI-001 AC-1, AC-2, AC-3, AC-4, AC-5, and AC-7. Verixa treats **EU GMP Annex 22 Draft 2025 §7** as an internal forward-looking architectural control (not an enacted predicate rule); under that internal control, generative AI may draft advisory extraction-field suggestions and classification only with mandatory human acceptance recorded; generative / probabilistic AI is **PROHIBITED from being the sole path to approve an extraction, promote a captured record to a target GxP module, or finalize correction disposition**. Static deterministic AI (regex-based extraction, deterministic templates) is the primary structured-extraction mechanism per DEC-29-06 / DEC-29-10. Jurisdiction-specific legal enforceability of Annex 22 and the EU AI Act remains subject to a future jurisdiction-specific legal assessment. |
| Regulatory Classification | Critical infrastructure substrate: operates the canonical Screen Reader / Data Capture / Extraction governance covering: (a) the capture-session registry with **controlled lifecycle (`draft → in_progress → under_review → closed → reopened` per DEC-29-02)** + close authority gate; (b) the capture-record registry with **provenance + reviewability** per DEC-29-03; (c) the OCR-job registry with **controlled source-file storage** (URS-12 document upload, server-attributed storage path, never client-supplied file paths) per DEC-29-04 + multi-engine processing (Tesseract default, extensible per matching shared schema engine enum); (d) the OCR-job lifecycle with **failure-state audit evidence** per DEC-29-05: `created → processing → succeeded \| failed → archived` with explicit `sr_ocr_job_failed` audit emission; (e) the extraction-template registry with **controlled lifecycle (`draft → under_review → effective → superseded → archived`) + version immutability after `effective` release + URS-13 change-request linkage** per DEC-29-06; (f) the email-ingestion-rule registry with **credential governance via secret store + source-system qualification** per DEC-29-07 (IMAP credentials NOT stored in env files; consumed via tenant-controlled secret store); (g) the ingestion-queue with **immutable source-evidence retention (raw email + attachment hashes preserved in URS-12 controlled document storage) + per-item audit emission** per DEC-29-08; (h) the data-extraction registry with **non-bypassable controlled review/validation/correction workflow** (`pending → under_review → approved \| rejected \| needs_correction → corrected → approved` with **bound e-signature on approval, rejection, and promotion + reviewer SoD**) per DEC-29-09 (replaces generic update-driven validation); (i) the extraction provenance chain with **method (`regex_template` / `ocr_text` / `email_parsing` / `manual_capture` / `ai_assisted`) + source-record link + confidence + correction history** per DEC-29-10; (j) the **target-record promotion API** by which an approved extraction is promoted to a downstream GxP module (URS-14 complaint / URS-15 OOS / URS-16 deviation / URS-22 inspection observation / URS-23 batch entry / URS-24 stability result / URS-25 EM result / URS-26 APQR data collection / URS-27 regulatory feed item), with promotion requiring **bound e-signature + target-module qualification gate (where applicable)** per DEC-29-23; (k) the multi-dimensional context capture (`tenant_id` mandatory, `study_id` optional, `product_id` optional, `target_module` (ENUM) for promotion path, `source_type` (ENUM `screen_capture` / `email` / `pdf_upload` / `manual_keying`)); (l) the canonical API contract `/api/v1/screen-reader/*`; (m) the typed schema validation across every route; (n) the controlled frontend route surface with explicit page coverage for `/screen-reader/extractions`, `/screen-reader/sessions/new`, `/screen-reader/sessions/:id`; (o) the audit-trail coverage with reason-for-change discipline including previously-incomplete OCR failure events and ingestion queue per-item events per DEC-29-11; (p) the Authority/HITL/e-signature substrate on every regulated final action per DEC-29-10; (q) the AI-assisted extraction substrate with **provenance + mandatory human acceptance** per DEC-29-13 (target requirement, ARCH-AI-001 binding); (r) the post-locked record immutability across the capture program; (s) the controlled reopen workflow with executive authority co-sign and Qualified Person co-sign per DEC-29-22; (t) the canonical findings emission to URS-21 for stale ingestion queues / repeated extraction rejections per DEC-29-15; (u) the canonical CAPA emission to URS-18 for chronic capture failures per DEC-29-16; (v) the MIRA copilot read-only context integration on capture records; and the per-jurisdictional regulatory expectations under FDA 21 CFR Part 11 §11.10(a) (validation), §11.10(b) (record copying), §11.10(c) (record protection), §11.10(d) (authority checks), §11.10(e) (audit trails), §11.10(g) (operational system checks), §11.10(h) (input device checks), §11.30 (open systems controls, including TLS for email ingestion), §11.50 (signature manifestations), §11.70 (signature/record linking) as the primary FDA predicate; **EU GMP Annex 11 §4 (Validation), §5 (Data, including data transfer integrity), §7 (Data Storage), §10 (Periodic Evaluation), §12 (Security including credential governance), §14 (Electronic Records / Signatures)** as the primary EU predicate; EU GMP Annex 22 Draft 2025 §7 (HITL; internal forward-looking control); EU AI Act (Regulation 2024/1689) Art. 13 / Annex III (internal forward-looking control); MHRA Data Integrity Guidance (ALCOA+) with **primary applicability** (data capture is the entry point for ALCOA+ across the platform); GAMP 5 Cat 5; **FDA Computer Software Assurance (CSA), September 2025 Final Guidance** (replaces CSV; applicable to data-capture systems); ISO/IEC 27001 (information security: credential governance for IMAP / file-upload); and India CDSCO Schedule M (Revised) §16 (Records and Reports, applicable to data-capture provenance) subject to a future jurisdiction-specific legal assessment for Verixa's exact CDSCO obligations. |
| Date of Issue | 2026-05-07 |
| Module Owner (Engineering) | Quality / Data Capture Squad |
| Module Owner (Quality Validation) | CSV / CSA Lead — Data Capture |
| Module Owner (Compliance) | Information Security (Primary Owner — credential governance), Quality Assurance, Manufacturing, Regulatory Affairs |
| Approving Authority | Founder / Chairman & MD; QA Head; Manufacturing Head; Validation Head; RA Head; **Information Security Head (Primary Owner)**; Qualified Person (QP) Authority; Site Quality Lead |

---

## 0. Document Framing

### 0.1 Purpose of this document

This URS defines the target expected state for Verixa's Screen Reader / Data Capture and Extraction Governance module (Module 29). It is the binding contract between product, engineering, quality validation, regulatory affairs, information security (primary owner for credential governance), manufacturing, the Qualified Person authority, distribution, laboratory operations, and the executive authority for the design, implementation, validation, release, and ongoing periodic review of the regulated screen-reader / data-capture substrate, which comprises:

- the capture-session registry with controlled lifecycle per DEC-29-02;
- the capture-record registry per DEC-29-03;
- the OCR-job registry with controlled source-file storage per DEC-29-04;
- the OCR-job lifecycle with failure-state audit evidence per DEC-29-05;
- the extraction-template registry with controlled lifecycle and version immutability per DEC-29-06;
- the email-ingestion-rule registry with credential governance per DEC-29-07;
- the ingestion-queue with immutable source-evidence retention and per-item audit per DEC-29-08;
- the data-extraction registry with non-bypassable controlled review/validation/correction workflow per DEC-29-09;
- the extraction provenance chain per DEC-29-10;
- the **target-record promotion API** by which approved extractions land in downstream regulated modules per DEC-29-23;
- the canonical API contract `/api/v1/screen-reader/*` per DEC-29-01;
- the typed schema validation;
- the controlled frontend route surface;
- the audit-trail coverage with reason-for-change discipline per DEC-29-11;
- the Authority/HITL/e-signature substrate on every regulated final action per DEC-29-10;
- the AI-assisted extraction substrate with provenance and mandatory human acceptance per DEC-29-13;
- the post-locked record immutability;
- the controlled reopen workflow per DEC-29-22;
- the canonical findings emission to URS-21 per DEC-29-15;
- the canonical CAPA emission to URS-18 per DEC-29-16;
- the MIRA copilot read-only context integration with **AI advisory only, never the sole path to persist, validate, or promote captured data into a GxP record** under the canonical ARCH-AI-001 binding;
- the per-jurisdictional regulatory expectations.

Compliance with this URS is mandatory.

### 0.2 Audience

Engineering, QA, RA, Manufacturing, Qualified Person Authority, Distribution, Laboratory Operations, Validation, **Information Security** (primary owner for credential governance and source-file storage controls), executive authority, the platform's Implementation team, internal and external auditors, and inspectors from regulatory bodies (FDA, EMA, MHRA, Health Canada, CDSCO, PIC/S, PMDA, WHO). The plain-language primer (§0.4) and worked examples (§3.5) make Module 29 accessible to non-domain engineers, product owners, validation engineers, and quality investigators.

### 0.3 Cross-references

- **URS-01** Authentication, Session & Access Control — identity envelope for every capture mutation
- **URS-02** RBAC & Permissions — the `screen_reader:*`, `screen_reader:capture:*`, `screen_reader:ocr:*`, `screen_reader:template:*`, `screen_reader:ingestion:*`, `screen_reader:extraction:*`, `screen_reader:extraction:approve:*`, `screen_reader:extraction:promote:*`, `screen_reader:secret_store:*` permission set
- **URS-03** Context Gate & Approval Scope — context-gate enforcement for capture scope dimensions
- **URS-04** Workflow / HITL / E-Signature / Approval Authority — Controlled Approval Modal contract for extraction approval / rejection / correction / promotion / program lock / reopen
- **URS-05** Authority Profile / Delegation / SoD — Authority Profiles consumed (`sr_capture_session_authority`, `sr_template_release_authority`, `sr_ingestion_rule_authority`, `sr_extraction_reviewer_authority`, `sr_extraction_promoter_authority`, `qualified_person_authority`, `final_quality_approver`, `executive_authority`, `information_security_authority`)
- **URS-06** Audit Trail / Hash Chain / Tamper-Evident — append-only audit substrate
- **URS-07** Study Management — optional study-scope dimension
- **URS-08** Tenant Management Lifecycle — tenant context for capture records
- **URS-09** Site / Facility Management — site-scope inherited via `studies` parent (where study-bound)
- **URS-10** Product / SKU / Drug Master Data — optional product-scope dimension
- **URS-12** Document Control / SOP — **primary linkage**: controlled source-file storage; immutable raw-evidence retention for OCR uploads + ingested email + attachment payloads (per DEC-29-04 / DEC-29-08)
- **URS-13** Change Control — extraction-template effective-release linkage; ingestion-rule effective-release linkage
- **URS-14** Complaints — **promotion target**: an approved extraction can be promoted to a complaint record per DEC-29-23
- **URS-15** OOS / OOT — promotion target: extraction → OOS record
- **URS-16** Deviations — promotion target: extraction → deviation record
- **URS-17** RCA — read-only consumer
- **URS-18** CAPA — primary downstream consumer for chronic capture-failure CAPA per DEC-29-16
- **URS-19** Risk Assessment — promotion target: extraction → risk record
- **URS-20** Reviews — periodic capture program review consumer
- **URS-21** Findings — primary downstream consumer for stale ingestion queues + repeated extraction rejections per DEC-29-15
- **URS-22** Inspection Mgmt — promotion target: extraction → inspection observation; URS-22 also retrieves capture-records as evidence
- **URS-23** Batch Records — promotion target: extraction → batch-record entry / IPC value (with URS-23 immutability and URS-23 ownership validation per URS-23 DEC-23-09 enforced at promotion)
- **URS-24** Stability — promotion target: extraction → stability result
- **URS-25** Environmental Monitoring — promotion target: extraction → EM result
- **URS-26** APQR — promotion target: extraction → APQR data-collection row
- **URS-27** Regulatory Intelligence — promotion target: extraction → regulatory feed item (where applicable)
- **URS-28** Training — qualification gate consumed for promoter authority (the user promoting an extraction MUST be qualified for the target module per URS-28 DEC-28-23)
- **URS-30** Notifications — notification delivery
- **URS-31** DQG — data-quality gate evidence
- **URS-32** MIRA AI — read-only MIRA copilot context integration; AI advisory drafting only with mandatory human acceptance per DEC-29-13
- **URS-33** GMP Manufacturing — promotion path
- **URS-34** GDP Distribution — promotion path
- **URS-35** Infrastructure / Backup-Restore — operational continuity; **secret store linkage for IMAP credentials** per DEC-29-07

### 0.4 Plain-language primer

In a regulated pharmaceutical operation, **screen reader / data capture and extraction governance** is the controlled entry point through which non-system data (paper records scanned to PDF, screenshots from external systems, emails received from suppliers / regulators / labs, manual keyed-in data) becomes regulated electronic records inside the QMS. Module 29 is the **paper-to-electronic / external-to-internal bridge**. It is the most consequential ALCOA+ entry point in the platform because: (1) every captured datum becomes the system-of-record source for downstream regulated modules (URS-14 complaints, URS-15 OOS, URS-16 deviations, URS-22 inspection observations, URS-23 batch records, URS-24 stability, URS-25 EM, URS-26 APQR, URS-27 regulatory intelligence); (2) MHRA Data Integrity Guidance ALCOA+ (Attributable, Legible, Contemporaneous, Original, Accurate, plus Complete, Consistent, Enduring, Available) applies first at the capture point; (3) EU GMP Annex 11 §5 (Data — including data transfer integrity), §7 (Data Storage), §12 (Security), §14 (Electronic Records / Signatures), and 21 CFR Part 11 §11.10(b) (record copying), §11.10(c) (record protection), §11.10(g) (operational system checks), §11.10(h) (input device checks), §11.50 (signature manifestations), §11.70 (signature/record linking) are all directly engaged at the capture boundary; (4) the **FDA Computer Software Assurance (CSA) — September 2025 Final Guidance** treats data-capture systems as a high-process-risk class. Module 29 is the target specification for this regulated workflow.

The most common mistake in regulated data-capture handling is **client-supplied file paths for OCR jobs**. The regulator's tell-tale at inspection is an OCR job pointing at `/tmp/uploads/file_x.pdf` with no controlled storage chain, no document-version tracking, and no integrity hash. Module 29 enforces the pathway: source files MUST be uploaded through controlled storage (URS-12 Document Control) with server-attributed storage path, content hash, document version per DEC-29-04; client-supplied file paths are rejected. The second most common mistake is **email-ingestion credentials in environment files**. Module 29 enforces the pathway: IMAP credentials are governed via tenant-controlled secret store (e.g., AWS Secrets Manager / HashiCorp Vault / Azure Key Vault) per DEC-29-07; env-file credentials are rejected. The third most common mistake is **extraction validation as a generic status update** without reviewer authority, intent, or evidence. Module 29 enforces the pathway: extraction validation is a **non-bypassable controlled review/validation/correction workflow** with `sr_extraction_reviewer_authority` + HITL + bound e-signature per DEC-29-09; reviewer SoD enforced; correction history preserved.
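
The first two controls above can be pictured as request-level guards. The following is a minimal sketch only: the request shapes, field names (`filePath`, `imapPassword`, `secretStoreRef`), the SHA-256 assumption for the content hash, and the error codes are all hypothetical illustrations, not the module's schema or the enumerated error contract in §6.7.

```typescript
// Hypothetical request shapes; names and codes are illustrative assumptions.
interface OcrJobRequest {
  documentId: string;      // reference to the URS-12 controlled document
  documentVersion: number; // document version tracked by URS-12
  contentHash: string;     // integrity hash (SHA-256 hex assumed here)
  filePath?: string;       // client-supplied path: always rejected
}

interface IngestionRuleRequest {
  mailbox: string;
  secretStoreRef: string;  // reference resolved by the tenant secret store
  imapPassword?: string;   // inline credential: always rejected
}

type Validation = { ok: true } | { ok: false; code: string };

const SHA256_HEX = /^[0-9a-f]{64}$/;

// Accept only server-attributed, hashed, versioned source files.
function validateOcrJobRequest(req: OcrJobRequest): Validation {
  if (req.filePath !== undefined) {
    return { ok: false, code: "CLIENT_FILE_PATH_REJECTED" };
  }
  if (
    req.documentId.trim() === "" ||
    !Number.isInteger(req.documentVersion) ||
    req.documentVersion < 1
  ) {
    return { ok: false, code: "CONTROLLED_DOCUMENT_REQUIRED" };
  }
  if (!SHA256_HEX.test(req.contentHash)) {
    return { ok: false, code: "CONTENT_HASH_INVALID" };
  }
  return { ok: true };
}

// Accept only a secret-store reference, never an inline credential.
function validateIngestionRule(req: IngestionRuleRequest): Validation {
  if (req.imapPassword !== undefined) {
    return { ok: false, code: "INLINE_CREDENTIAL_REJECTED" };
  }
  if (req.secretStoreRef.trim() === "") {
    return { ok: false, code: "SECRET_STORE_REF_REQUIRED" };
  }
  return { ok: true };
}
```

In this sketch the mere presence of a path or password field is enough to reject the request, so a caller cannot satisfy the guard by also supplying a valid controlled-storage reference.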

The fourth (and most consequential) mistake is **promoting captured data to a regulated GxP record without controlled handoff**. Module 29 introduces the **target-record promotion API** per DEC-29-23: approved extractions are promoted to URS-14 / URS-15 / URS-16 / URS-22 / URS-23 / URS-24 / URS-25 / URS-26 / URS-27 via a controlled handoff requiring `sr_extraction_promoter_authority` + HITL + bound e-signature + URS-28 qualification gate (per URS-28 DEC-28-23) where the target module requires qualified personnel; the target module respects its own validation rules at promotion time.
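
As an illustration of the controlled handoff described above, the sketch below collects every unmet precondition before a promotion is allowed to proceed. The `PromotionRequest` shape and the blocker codes are hypothetical; how authority, HITL confirmation, and qualification are actually resolved is defined by URS-04, URS-05, and URS-28, not by this example.

```typescript
type ExtractionStatus =
  | "pending" | "under_review" | "approved"
  | "rejected" | "needs_correction" | "corrected";

// Hypothetical, already-resolved inputs to the promotion gate.
interface PromotionRequest {
  extractionStatus: ExtractionStatus;
  hasPromoterAuthority: boolean;        // sr_extraction_promoter_authority held
  hitlConfirmed: boolean;               // human-in-the-loop confirmation captured
  eSignatureId: string | null;          // bound e-signature reference
  targetRequiresQualification: boolean; // target module needs qualified personnel
  promoterQualifiedForTarget: boolean;  // URS-28 qualification gate result
}

// Returns every unmet precondition; an empty list means promotion may proceed.
function promotionBlockers(req: PromotionRequest): string[] {
  const blockers: string[] = [];
  if (req.extractionStatus !== "approved") blockers.push("EXTRACTION_NOT_APPROVED");
  if (!req.hasPromoterAuthority) blockers.push("PROMOTER_AUTHORITY_MISSING");
  if (!req.hitlConfirmed) blockers.push("HITL_NOT_CONFIRMED");
  if (req.eSignatureId === null) blockers.push("E_SIGNATURE_MISSING");
  if (req.targetRequiresQualification && !req.promoterQualifiedForTarget) {
    blockers.push("PROMOTER_NOT_QUALIFIED");
  }
  return blockers;
}
```

Returning the full list rather than failing on the first problem is a design choice of this sketch; it lets a reviewer see every missing control in one pass. The target module's own validation rules would still run after this gate.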

The **AI-assistance** dimension is critical, and the Module specification's ARCH-AI-001 binding states it explicitly: "no AI service shall be the sole path to persist, validate, or promote captured data into a GxP record". Static deterministic AI (regex-based extraction templates, deterministic field parsers) is the **primary** structured-extraction mechanism per DEC-29-06 / DEC-29-10. **Generative AI (LLMs / MIRA copilot) is PROHIBITED from being the sole path to persist, validate, or promote captured data**; AI may surface advisory extraction-field suggestions, advisory classification, and advisory routing recommendations only via the controlled `sr_ai_assistance` substructure with full provenance + mandatory human acceptance per DEC-29-13. The qualified human extraction reviewer / promoter's signed disposition is the system of record.
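
One way to picture "advisory only with mandatory human acceptance" is a function that can never produce a persisted value from an AI suggestion alone. This is a sketch under stated assumptions: the type names, the provenance fields, and the `editedByHuman` flag are hypothetical and do not describe the actual `sr_ai_assistance` structure.

```typescript
// Hypothetical shapes for an advisory suggestion and its human acceptance.
interface AiSuggestion {
  field: string;
  suggestedValue: string;
  modelId: string;
  confidence: number;
}

interface HumanAcceptance {
  reviewerId: string;
  acceptedValue: string; // may differ from the suggestion if the reviewer edits it
  acceptedAt: string;    // ISO 8601 timestamp supplied by the server in practice
}

interface PersistedField {
  field: string;
  value: string;
  extractionMethod: "ai_assisted";
  provenance: {
    modelId: string;
    suggestedValue: string;
    confidence: number;
    acceptedBy: string;
    acceptedAt: string;
    editedByHuman: boolean;
  };
}

// Without a recorded human acceptance nothing is produced for persistence.
function acceptSuggestion(
  s: AiSuggestion,
  a: HumanAcceptance | null,
): PersistedField | null {
  if (a === null || a.reviewerId.trim() === "") return null;
  return {
    field: s.field,
    value: a.acceptedValue,
    extractionMethod: "ai_assisted",
    provenance: {
      modelId: s.modelId,
      suggestedValue: s.suggestedValue,
      confidence: s.confidence,
      acceptedBy: a.reviewerId,
      acceptedAt: a.acceptedAt,
      editedByHuman: a.acceptedValue !== s.suggestedValue,
    },
  };
}
```

The persisted value always comes from the human acceptance record, never directly from the suggestion, while the original suggestion is kept alongside it as provenance.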

The **two-step release path** mirrors every other Module: this URS becomes "Approved Controlled URS — released for engineering implementation and validation planning" upon signature capture in the Document Approval block; it becomes "Released for validation execution" only after URS-29-VAL-008 (Migration Evidence Gate) and the §17 validation evidence pack are satisfied.

### 0.5 Conventions

Each requirement has a unique identifier. "MUST" denotes a mandatory requirement; "SHOULD" denotes a strong recommendation; "MAY" denotes an option. The document is self-contained: front end (§5), back end (§6), data model (§6.2), application programming interface (§6.3), workflow (§6.4), business rules (§6.5), audit (§6.6), security (§12), regulatory mapping (§14), test cases (§16), and validation evidence (§17) are all in this single file. Every requirement is mandatory unless explicitly marked SHOULD or MAY.

### 0.6 Glossary

| Term | Definition |
|---|---|
| Capture session | A controlled grouping of capture activity; lifecycle `draft → in_progress → under_review → closed → reopened` per DEC-29-02. |
| Capture record | A raw capture-evidence record under a session (e.g., a screenshot, a PDF upload metadata record); provenance preserved per DEC-29-03. |
| OCR job | A controlled job to extract text from an image / PDF using OCR; uses controlled source-file storage per DEC-29-04; lifecycle `created → processing → succeeded \| failed → archived` with explicit failure audit per DEC-29-05. |
| Extraction template | A controlled-template document defining structured-data extraction rules (regex patterns, field labels, data types); lifecycle `draft → under_review → effective → superseded → archived` with version immutability per DEC-29-06. |
| Email ingestion rule | A controlled rule defining IMAP polling parameters (mailbox, filter, target template); credentials via secret store per DEC-29-07. |
| Ingestion queue | The queue of pending email-derived items awaiting processing; per-item audit + immutable source evidence per DEC-29-08. |
| Data extraction | The primary extracted record (regex-extracted fields, OCR text, email parsed fields, manually keyed fields) with full provenance per DEC-29-10; lifecycle `pending → under_review → approved \| rejected \| needs_correction → corrected → approved` per DEC-29-09. |
| Correction history | Immutable history of corrections to an extraction (the original extracted value is preserved; corrections are append-only with reviewer attribution + bound e-signature) per DEC-29-10. |
| Target-record promotion | The controlled handoff by which an approved extraction becomes a record in a downstream regulated module (URS-14 complaint, URS-15 OOS, URS-16 deviation, URS-22 observation, URS-23 batch entry, URS-24 stability result, URS-25 EM result, URS-26 APQR data row, URS-27 feed item) per DEC-29-23. |
| Reopen | A governed transition event from `locked → in_progress` requiring `executive_authority` co-sign AND `qualified_person_authority` co-sign + documented reason; appends a new program iteration without mutating prior locked evidence per DEC-29-22. |
| ARCH-AI-001 | Platform architecture binding requiring manual continuity for every AI surface (AC-1, AC-2, AC-3, AC-4, AC-5, AC-7) — the Module specification explicitly invokes ARCH-AI-001 AC-5 for this module given the AI prohibition on sole-path promotion. |
| Annex 22 | EU GMP Annex 22 (Draft 2025) §7. Verixa treats Annex 22 §7 + EU AI Act high-risk / transparency as internal forward-looking AI governance controls. AI may draft advisory extraction / classification / routing only with mandatory human acceptance; AI is prohibited from being the sole path to persist, validate, or promote captured data. Binding predicate-rule obligations remain those listed in §14. |
| MIRA | The platform's read-only AI copilot service; for Module 29 MIRA may propose advisory drafts via `sr_ai_assistance`; no MIRA write paths to system-of-record extraction fields without explicit human confirmation; no AI signs extraction approvals; no AI promotes extractions. |
| Secret store | Tenant-controlled secret-management substrate (AWS Secrets Manager / HashiCorp Vault / Azure Key Vault / GCP Secret Manager) for IMAP credentials and other sensitive integration credentials per DEC-29-07. |
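
The data-extraction lifecycle in the glossary can be read as a transition table. The sketch below encodes the lifecycle string literally; whether `corrected` returns through `under_review` before approval (as the correction loop in the §0.7 diagram suggests) is a workflow detail for §6.4, so the `corrected → approved` edge here is an assumption. The reviewer segregation-of-duties rule shown (reviewer differs from capturer) is likewise an assumed reading of "reviewer SoD".

```typescript
type ExtractionStatus =
  | "pending" | "under_review" | "approved"
  | "rejected" | "needs_correction" | "corrected";

// Allowed next states, read directly from the glossary lifecycle string.
const TRANSITIONS: Record<ExtractionStatus, readonly ExtractionStatus[]> = {
  pending: ["under_review"],
  under_review: ["approved", "rejected", "needs_correction"],
  needs_correction: ["corrected"],
  corrected: ["approved"],
  approved: [],
  rejected: [],
};

function canTransition(from: ExtractionStatus, to: ExtractionStatus): boolean {
  return TRANSITIONS[from].indexOf(to) !== -1;
}

// Assumed SoD rule: the person who captured the data may not review it.
function satisfiesReviewerSoD(reviewerId: string, capturedBy: string): boolean {
  return reviewerId !== capturedBy;
}
```

Because `approved` and `rejected` have no outgoing edges in this table, a generic status update cannot move a record out of a signed disposition.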

### 0.7 Module-29 architectural picture

```mermaid
flowchart TD
 USR[Capture User] --> SES[/Capture Sessions — controlled lifecycle/]
 SES --> CR[/Capture Records — provenance/]
 USR --> UPLOAD[/PDF Upload via URS-12 Doc Control/]
 UPLOAD --> OCR[/OCR Jobs — controlled source-file storage/]
 OCR --> OCR_PROC[OCR Processing — Tesseract or extensible engine]
 OCR_PROC --> OCR_OK[Succeeded — text + confidence + audit]
 OCR_PROC --> OCR_FAIL[Failed — explicit failure audit — DEC-29-05]
 EMAIL_RULE[/Email Ingestion Rules — credentials via secret store/]
 EMAIL_RULE --> POLL[Polling — IMAP via secret-store-resolved credentials]
 POLL --> QUEUE[/Ingestion Queue — per-item audit + raw evidence in URS-12/]
 QUEUE --> EXT_E[Extraction from email parsed fields]
 OCR_OK --> EXT_O[Extraction from OCR text via regex template]
 CR --> EXT_M[Manual keyed extraction]
 AI[MIRA AI] --> AIA[/SR AI Assistance — advisory + provenance + acceptance/]
 AIA -. advisory only .-> EXT_O & EXT_E & EXT_M
 EXT_O & EXT_E & EXT_M --> EXT[/"Data Extractions: controlled review workflow (DEC-29-09)"/]
 EXT --> REV[Review — sr_extraction_reviewer_authority + HITL + bound e-sign]
 REV --> APPROVED[Approved]
 REV --> REJECTED[Rejected]
 REV --> NEED_CORR[Needs Correction]
 NEED_CORR --> CORR[Correction — append-only history — DEC-29-10]
 CORR --> REV
 APPROVED --> PROMOTE[/Target-Record Promotion — DEC-29-23/]
 PROMOTE --> M14[URS-14 Complaints]
 PROMOTE --> M15[URS-15 OOS]
 PROMOTE --> M16[URS-16 Deviations]
 PROMOTE --> M22[URS-22 Inspection Observations]
 PROMOTE --> M23[URS-23 Batch Entries]
 PROMOTE --> M24[URS-24 Stability Results]
 PROMOTE --> M25[URS-25 EM Results]
 PROMOTE --> M26[URS-26 APQR Data]
 PROMOTE --> M27[URS-27 Feed Items]
 M28[URS-28 Training] -- qualification gate --> PROMOTE
 M12[URS-12 Document Control] -- raw evidence --> UPLOAD & QUEUE
 M13[URS-13 Change Control] --> EXT_TPL[/Extraction Templates — controlled lifecycle/]
 EXT_TPL --> EXT_O
 EXT -- "rejected pattern, stale queue findings" --> M21[URS-21 Findings]
 EXT -- "chronic capture failure CAPA" --> M18[URS-18 CAPA]
 LOCK[Capture Program Lock] --> SES
 LOCK -. "governed reopen + executive + QP co-sign" .-> SES
```
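
The OCR branch of the diagram (processing ends in either a succeeded or a failed state, and a failure always leaves audit evidence) can be sketched as a single completion function. The event names `sr_ocr_job_processed` and `sr_ocr_job_failed` come from the Code Modules binding; the `OcrOutcome` shape, the `detail` field, and the function itself are hypothetical.

```typescript
type OcrStatus = "created" | "processing" | "succeeded" | "failed" | "archived";

// Hypothetical engine outcome: either recognised text or an error.
type OcrOutcome =
  | { kind: "text"; text: string; confidence: number }
  | { kind: "error"; message: string };

interface OcrAuditEvent {
  event: "sr_ocr_job_processed" | "sr_ocr_job_failed";
  jobId: string;
  detail: string;
}

// Every completion, including failure, yields a status and an audit event.
function completeOcrJob(
  jobId: string,
  current: OcrStatus,
  outcome: OcrOutcome,
): { status: OcrStatus; audit: OcrAuditEvent } {
  if (current !== "processing") {
    throw new Error(`OCR job ${jobId} is not in processing state`);
  }
  if (outcome.kind === "error") {
    return {
      status: "failed",
      audit: { event: "sr_ocr_job_failed", jobId, detail: outcome.message },
    };
  }
  return {
    status: "succeeded",
    audit: {
      event: "sr_ocr_job_processed",
      jobId,
      detail: `confidence=${outcome.confidence}`,
    },
  };
}
```

Returning the audit event together with the new status means, in this sketch, that a caller cannot record a failed job without also holding the failure evidence to emit.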

The platform shall implement: controlled capture-session lifecycle per DEC-29-02; capture-record provenance per DEC-29-03; **controlled source-file storage** via URS-12 Document Control for OCR jobs per DEC-29-04 (no client-supplied file paths); explicit OCR failure-state audit per DEC-29-05; controlled extraction-template lifecycle per DEC-29-06; **credential governance via secret store** for email ingestion per DEC-29-07; immutable source-evidence retention via URS-12 + per-item ingestion-queue audit per DEC-29-08; **non-bypassable controlled review/validation/correction workflow** per DEC-29-09; full extraction provenance chain per DEC-29-10; **target-record promotion API** to downstream regulated modules per DEC-29-23 with URS-28 qualification gate; canonical API contract `/api/v1/screen-reader/*`; typed schema validation; controlled frontend route surface; audit-trail coverage with reason-for-change discipline per DEC-29-11; Authority/HITL/e-signature substrate per DEC-29-10; AI-assisted extraction substrate with provenance + mandatory human acceptance per DEC-29-13; post-locked record immutability; governed reopen with executive + QP co-sign per DEC-29-22; canonical findings emission per DEC-29-15; canonical CAPA emission per DEC-29-16; MIRA copilot read-only with advisory drafting only — **AI prohibited from sole-path persist / validate / promote**; and per-jurisdictional regulatory expectations.
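
The append-only correction history behind the provenance chain can be illustrated as follows. This is a sketch, not the `correction_history_json` schema: the entry fields, the sequence numbering, and the error message are assumptions made for the example.

```typescript
// Hypothetical correction entry; every field is immutable once written.
interface CorrectionEntry {
  readonly seq: number;
  readonly previousValue: string;
  readonly correctedValue: string;
  readonly reviewerId: string;
  readonly eSignatureId: string;
  readonly reason: string;
}

interface ExtractedField {
  readonly originalValue: string; // the value as first extracted, never changed
  readonly corrections: readonly CorrectionEntry[];
}

// The effective value is the latest correction, or the original if none exist.
function currentValue(f: ExtractedField): string {
  const last = f.corrections[f.corrections.length - 1];
  return last === undefined ? f.originalValue : last.correctedValue;
}

// Returns a new field with one more entry; the input is never mutated.
function recordCorrection(
  f: ExtractedField,
  correctedValue: string,
  reviewerId: string,
  eSignatureId: string,
  reason: string,
): ExtractedField {
  if (eSignatureId.trim() === "" || reason.trim() === "") {
    throw new Error("correction requires a bound e-signature and a reason");
  }
  const entry: CorrectionEntry = {
    seq: f.corrections.length + 1,
    previousValue: currentValue(f),
    correctedValue,
    reviewerId,
    eSignatureId,
    reason,
  };
  return { originalValue: f.originalValue, corrections: [...f.corrections, entry] };
}
```

Each entry stores the value it replaced, so the chain from the original extracted value to the current one can be replayed without consulting any other record.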

### 0.8 Locked Launch Controls

| Locked control | Authority | Rationale |
|---|---|---|
| Two-step release path: signature → engineering implementation → validation execution | DEC-29-01 / VAL-008 | Mirrors every other Module. |
| "No Module 29 internal decisions outstanding" | §11.6 | Captured in locked decisions DEC-29-01..DEC-29-23 (§3.2). |
| `platform_admin` / `super_admin` support / break-glass only | DEC-29-20 / SoD-29-07 | Operating-tenant capture ownership is the regulated path. |
| Target implementation binding language | Module bindings | URS specifies expected implementation. |
| AI overclaim posture as **internal forward-looking governance** with **canonical ARCH-AI-001 prohibition on AI-only sole-path persist / validate / promote** | Architecture Bindings | Canonical Module specification explicitly invokes ARCH-AI-001 AC-5; static deterministic AI (regex) is primary; GenAI advisory only. |
| Enumerated error codes | §6.7 | Stable machine-readable error contract. |
| JSON multi-signature evidence as **derived snapshot** | §6.6 | The `electronic_signatures` substrate is the system of record. |
| India CDSCO §16 row | §14 | India CDSCO Records and Reports captured (subject to a future jurisdiction-specific legal assessment). |
| Version 1.0 posture | Header | First binding version. |
| Canonical API mount `/api/v1/screen-reader/*` | DEC-29-01 / SR-001 | Frontend hooks aligned to canonical `/api/v1/screen-reader/*` mount; route comments aligned. |
| Frontend route surface alignment | DEC-29-12 / SR-001 / SR-011 | Routes `/screen-reader/extractions`, `/screen-reader/sessions/new`, `/screen-reader/sessions/:id` declared in App.tsx. |
| Capture-session controlled lifecycle | DEC-29-02 / SR-002 | `draft → in_progress → under_review → closed → reopened` with close authority gate. |
| Capture-record provenance | DEC-29-03 / SR-003 | Source-link + capture-method + timestamp + capturer attribution. |
| OCR controlled source-file storage | DEC-29-04 / SR-004 | Source files via URS-12 Document Control upload with server-attributed storage path + content hash; client-supplied file paths rejected. |
| OCR failure-state audit | DEC-29-05 / SR-005 | `sr_ocr_job_failed` event + audit-trail entry with error detail. |
| Extraction-template controlled lifecycle + version immutability | DEC-29-06 / SR-006 | `draft → under_review → effective → superseded → archived`; URS-13 CR linkage on effective release. |
| Email-ingestion credential governance via secret store | DEC-29-07 / SR-007 | IMAP credentials NOT in env files; tenant-controlled secret store reference (`secret_store_ref`); env-file credentials rejected. |
| Ingestion-queue immutable source-evidence retention + per-item audit | DEC-29-08 / SR-008 | Raw email + attachment payloads stored in URS-12 with content hash; per-item audit emission. |
| Non-bypassable controlled extraction review/validation/correction workflow | DEC-29-09 / SR-009 | `pending → under_review → approved \| rejected \| needs_correction → corrected → approved`; `sr_extraction_reviewer_authority` + HITL + bound e-sign + reviewer SoD. |
| Extraction provenance chain | DEC-29-10 / SR-010 | `extraction_method`, `source_record_id`, `confidence`, `correction_history_json` (append-only). |
| Target-record promotion API to downstream regulated modules (target requirement) | DEC-29-23 | Approved extractions promoted via `sr_extraction_promoter_authority` + HITL + bound e-sign + URS-28 qualification gate where applicable. |
| AI-assisted extraction substrate with provenance + mandatory human acceptance (target requirement) | DEC-29-13 / ARCH-AI-001 AC-5 | `sr_ai_assistance` table; advisory only; AI cannot sign approval / promote. |
| Authority/HITL/e-sign on every regulated final action | DEC-29-10 / SR-009 | Extraction approval / rejection / correction / promotion / program lock / reopen all gated. |
| Audit-trail coverage including OCR failure + ingestion queue per-item events | DEC-29-11 / SR-008 / SR-012 | Audit gaps closed. |
| Bound e-signature persistence on every regulated final action | DEC-29-10 / DEC-29-23 | Extraction approval / rejection / correction / promotion / program lock / reopen — all carry bound e-signature. |
| Governed reopen pattern (`locked → in_progress`) | DEC-29-22 / SoD-29-06 | Append-only iteration; executive + QP co-sign; does NOT mutate prior locked evidence. |

---

## 1. Scope and Out-of-Scope

### 1.1 In-scope

- The capture-session registry with controlled lifecycle.
- The capture-record registry with provenance.
- The OCR-job registry with controlled source-file storage and failure-state audit.
- The extraction-template registry with controlled lifecycle and version immutability.
- The email-ingestion-rule registry with secret-store credential governance.
- The ingestion-queue with immutable source-evidence retention and per-item audit.
- The data-extraction registry with non-bypassable controlled review/validation/correction workflow.
- The extraction provenance chain.
- The target-record promotion API to downstream regulated modules.
- The Authority/HITL/e-signature substrate on every regulated final action.
- The audit-trail coverage with reason-for-change discipline.
- The MIRA copilot read-only context integration (advisory drafting only).
- The AI-assisted extraction substrate with provenance + mandatory human acceptance.
- The findings emission to URS-21.
- The CAPA emission to URS-18.
- The change-control linkage to URS-13.
- The URS-12 Document Control integration for raw evidence.
- The URS-28 qualification-gate consumer.
- The governed reopen workflow.
- The per-jurisdictional regulatory expectations.

### 1.2 Out-of-scope

- The CAPA register itself (URS-18).
- The change-control register itself (URS-13).
- The findings register itself (URS-21).
- The document-control register itself (URS-12).
- The MIRA copilot service itself (URS-32).
- The secret-store substrate itself (URS-35 owns secret-store integration; this URS consumes secret-store references).
- Vendor-specific OCR engines beyond the default Tesseract engine (extensible per shared schema engine enum, but vendor connectors are future-state).
- Direct integration with external screen-capture tools (e.g., Zapier / Make / IFTTT) — out of scope for v1.0.
- Email ingestion via protocols other than IMAP (e.g., POP3, Exchange Web Services) — future-state.

---

## 2. Preconditions, Dependencies, Constraints

### 2.1 Operating preconditions

The following preconditions MUST hold for this URS to apply at validation time. Each bullet is a binding precondition; deviations require a controlled exception per URS-13 Change Control.

- The platform's authentication and session substrate (URS-01), RBAC (URS-02), context gate (URS-03), HITL / e-sign (URS-04), Authority Profile registry (URS-05), audit-trail hash-chain (URS-06), document-control (URS-12), change-control (URS-13), training qualification-gate (URS-28), MIRA AI (URS-32), and infrastructure secret-store (URS-35) are released and operational at validation time.
- Capture users, extraction reviewers, promoters, the Qualified Person, and the Information Security authority are trained, attributable users with documented authority.
- AI-assisted capture surfaces are advisory only; the human reviewer / promoter's signed disposition is the system of record.
- The tenant operating jurisdiction(s) are configured.

### 2.2 Dependencies

- URS-01..URS-28, URS-30..URS-35 platform contracts.
- The `electronic_signatures` substrate.
- The `authority` substrate.
- The `hitl` substrate.
- The `audit_trail` substrate.
- The `documents` substrate (URS-12 — primary raw-evidence storage).
- The `secret_store` substrate (URS-35 — IMAP credential governance per DEC-29-07).
- The `change_control` substrate (URS-13).
- The training `qualification-gate` substrate (URS-28).
- All promotion-target modules (URS-14, URS-15, URS-16, URS-22, URS-23, URS-24, URS-25, URS-26, URS-27).
- The `notifications` substrate (URS-30).

### 2.3 Constraints

- The canonical API mount is `/api/v1/screen-reader/*`. No frontend hook may use `/api/screen-reader/*`, which duplicates the `/api` prefix already supplied by the mount.
- AI-assisted content is advisory-only; **no AI service shall be the sole path to persist, validate, or promote captured data into a GxP record** (canonical ARCH-AI-001 binding — most strongly worded across the URS pack).
- OCR source files MUST flow through controlled URS-12 Document Control upload; client-supplied file paths rejected per DEC-29-04.
- IMAP credentials MUST be sourced from secret store; env-file credentials rejected per DEC-29-07.
- Extraction validation is non-bypassable controlled workflow per DEC-29-09; generic update-driven validation rejected.
- Effective extraction-template versions are immutable; revisions create new version rows.
- Target-record promotion respects target-module immutability + ownership rules at promotion time per DEC-29-23.
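
The two payload-level constraints above (client-supplied file paths per DEC-29-04 and inline credentials per DEC-29-07) can be sketched as a pair of request guards. This is a minimal illustration under stated assumptions, not the module's implementation: the function names `checkOcrJobPayload` and `checkIngestionRulePayload` and the `GuardResult` shape are hypothetical; the error codes and the forbidden field names are taken from this URS.

```typescript
// Hypothetical result shape for a payload guard; not defined by this URS.
type GuardResult = { ok: true } | { ok: false; code: string };

// Field names that DEC-29-07 says rule creation must reject.
const FORBIDDEN_CREDENTIAL_FIELDS = ["password", "imap_password", "smtp_password"];

// DEC-29-04: an OCR job references a URS-12 document, never a client file path.
function checkOcrJobPayload(body: Record<string, unknown>): GuardResult {
  if ("file_path" in body) {
    return { ok: false, code: "SR_OCR_CLIENT_FILE_PATH_FORBIDDEN" };
  }
  return { ok: true };
}

// DEC-29-07: credentials are referenced via secret_store_ref, never sent inline.
function checkIngestionRulePayload(body: Record<string, unknown>): GuardResult {
  const hasInlineCredential = FORBIDDEN_CREDENTIAL_FIELDS.some((f) => f in body);
  if (hasInlineCredential) {
    return { ok: false, code: "SR_CREDENTIALS_IN_PAYLOAD_FORBIDDEN" };
  }
  return { ok: true };
}
```

Both guards test for the presence of the key rather than its value, so an empty `file_path` or `password` is rejected as well.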

---

## 3. Closed Launch Decisions

### 3.1 Decision register

| Decision ID | Title | Locked decision |
|---|---|---|
| DEC-29-01 | Two-step release path + canonical API contract | Module 29 follows the same two-step release path; canonical API mount `/api/v1/screen-reader/*`; all hooks use canonical relative `/screen-reader/*` paths; route comments aligned. |
| DEC-29-02 | Capture-session controlled lifecycle | Capture-session lifecycle is `draft → in_progress → under_review → closed → reopened`; close requires `sr_capture_session_authority` + HITL + bound e-signature; reopen is governed transition per DEC-29-22. |
| DEC-29-03 | Capture-record provenance | Capture records persist `source_link` (FK to URS-12 document where applicable), `capture_method` (ENUM `screenshot` / `pdf_upload` / `email` / `manual_keying`), `captured_at` (TIMESTAMPTZ — server-attributed), `captured_by` (FK), `raw_evidence_hash` (TEXT). |
| DEC-29-04 | OCR controlled source-file storage | OCR-job source files MUST be uploaded through URS-12 Document Control (`document_id` + `document_version_id` + `content_hash`); the OCR-job persists `document_id`, `document_version_id`, `content_hash`, `storage_path` (server-attributed); client-supplied `file_path` is rejected with `SR_OCR_CLIENT_FILE_PATH_FORBIDDEN`; controlled storage with tenant-isolated bucket / path namespace. |
| DEC-29-05 | OCR failure-state audit + lifecycle | OCR-job lifecycle is `created → processing → succeeded \| failed → archived`; failure state emits explicit `sr_ocr_job_failed` event with `error_message`, `error_class`, `engine_id`, `engine_version` + audit-trail entry; success state persists `extracted_text`, `confidence`, `processing_time_ms`, `engine_id`, `engine_version`. |
| DEC-29-06 | Extraction-template controlled lifecycle + version immutability | Extraction-template lifecycle is `draft → under_review → effective → superseded → archived`; release to `effective` requires `sr_template_release_authority` + HITL + bound e-signature + URS-13 change-request linkage per DEC-29-19; effective versions are immutable; revisions create new version rows; template defines regex patterns + field labels + data types + AI-assist flag (`ai_assisted: BOOLEAN`) — note: `ai_assisted = true` does NOT exempt the extraction from the controlled review workflow per DEC-29-09. |
| DEC-29-07 | Email-ingestion credential governance via secret store | Email ingestion rules persist `secret_store_ref` (TEXT — reference to tenant-controlled secret-store path, e.g., `aws-secrets-manager:tenant-id/imap-creds`) — NOT inline credentials; rule creation rejects payloads containing `password` / `imap_password` / `smtp_password` fields; resolution at polling time fetches credentials from secret store; secret rotation supported via secret-store TTL; access to secret store gated by `information_security_authority`. |
| DEC-29-08 | Ingestion-queue immutable source-evidence retention + per-item audit | Each ingestion-queue item persists `raw_email_document_id` (FK to URS-12 — raw email.eml stored in controlled document storage with content hash) + `attachment_document_ids` (UUID[] FK to URS-12 — attachments stored separately with content hashes); each queue-item creation, processing, and rejection emits explicit audit-trail entry per DEC-29-11. |
| DEC-29-09 | Non-bypassable controlled extraction review/validation/correction workflow | Data-extraction lifecycle is `pending → under_review → approved \| rejected \| needs_correction → corrected → approved`; review requires `sr_extraction_reviewer_authority` + HITL + bound e-signature; reviewer MUST be SoD-distinct from the user who created the extraction (SoD-29-02); `needs_correction` triggers correction flow with append-only correction history per DEC-29-10; rejection captures rejection reason; `validateExtraction` endpoint is replaced by the controlled review endpoint; generic `updateExtraction` cannot set terminal status. |
| DEC-29-10 | Extraction provenance chain + correction history | Extractions persist `extraction_method` (ENUM `regex_template` / `ocr_text` / `email_parsing` / `manual_capture` / `ai_assisted`), `extraction_template_id` (FK nullable), `source_record_type` (ENUM `ocr_job` / `ingestion_queue_item` / `capture_record` / `manual`), `source_record_id` (UUID), `confidence` (NUMERIC nullable), `correction_history_json` (JSONB — append-only history with reviewer attribution + bound e-signature reference per correction); the original extracted value is preserved across corrections. |
| DEC-29-11 | Audit-trail coverage + reason-for-change discipline | Every mutation route emits audit-trail entries including OCR failure events per DEC-29-05, ingestion-queue per-item events per DEC-29-08, and extraction correction events per DEC-29-10; high-risk status changes (terminal-state transitions, correction, rejection) require structured reason-for-change captured in audit `details` JSON. |
| DEC-29-12 | Frontend route surface alignment | Frontend routes `/screen-reader/extractions`, `/screen-reader/sessions/new`, `/screen-reader/sessions/:id`, `/screen-reader/templates`, `/screen-reader/ingestion-rules`, `/screen-reader/ingestion-queue` are declared in `App.tsx`; dashboard CTAs resolve to real pages. |
| DEC-29-13 | AI-assisted extraction substrate with provenance + mandatory human acceptance | `sr_ai_assistance` table persists per-record columns including `assistance_type` (ENUM `extraction_field_suggestion` / `classification_assist` / `routing_recommendation` / `correction_suggestion`), `narrative_text` / `field_value_text`, `model_id`, `model_version`, `prompt_version`, `confidence`, `citation_snapshot_json`, `proposed_at`, `proposed_by_system`, `accepted_by`, `accepted_at`, `acceptance_e_signature_id`, `accepted_text_immutable`, `rejection_reason`, `status` (ENUM `proposed` / `accepted` / `rejected`); AI-generated content is advisory until accepted; promotion to system-of-record requires explicit human confirmation captured in `acceptance_e_signature_id`; **AI cannot persist captured data into the system-of-record extraction record without acceptance, AI cannot sign extraction approval, AI cannot promote extractions to target modules** per ARCH-AI-001 AC-5 (target requirement, ARCH-AI-001 binding, parallel pattern to URS-26 DEC-26-11 / URS-27 DEC-27-13 / URS-28 DEC-28-13). |
| DEC-29-14 | Multi-dimensional context model | `tenant_id` mandatory, `study_id` optional, `product_id` optional, `target_module` (ENUM where promotion path defined), `source_type` (ENUM per DEC-29-03). |
| DEC-29-15 | Findings emission to URS-21 | Stale ingestion queue items (≥ configurable age threshold without processing), repeated extraction rejections (≥ configurable threshold per template / per source), and OCR engine error patterns emit `sr_finding_created` event to URS-21 with `screen_reader_capture` source type. |
| DEC-29-16 | CAPA emission to URS-18 | Chronic capture failures (e.g., recurring OCR engine failures, repeated email-ingestion authentication failures, persistent extraction rejection patterns) escalated to CAPA emit `sr_capture_failure_capa_linked` event consumed by URS-18 (`screen_reader_capture` source type). |
| DEC-29-17 | URS-12 Document Control linkage | OCR source files + raw email + attachments stored as URS-12 documents per DEC-29-04 / DEC-29-08. |
| DEC-29-18 | URS-13 change-control linkage | Extraction-template effective release + ingestion-rule effective release require URS-13 change-request linkage per DEC-29-06 / DEC-29-19. |
| DEC-29-19 | Ingestion-rule effective release with URS-13 CR | Ingestion-rule lifecycle is `draft → under_review → effective → suspended → archived`; effective release requires `sr_ingestion_rule_authority` + HITL + bound e-signature + URS-13 change-request linkage. |
| DEC-29-20 | platform_admin / super_admin | `platform_admin` / `super_admin` are support / break-glass only paths. |
| DEC-29-21 | Reason-for-change on material updates | Captured per DEC-29-11. |
| DEC-29-22 | Capture program reopen as governed transition | Program `locked → in_progress` requires `executive_authority` co-sign AND `qualified_person_authority` co-sign + documented reason; appends a new program iteration without mutating prior locked evidence (consistent with M14..M28 reopen pattern). |
| DEC-29-23 | Target-record promotion API | Approved extractions are promoted to a downstream regulated module via `POST /screen-reader/extractions/:id/promote` with `target_module` (ENUM `complaints` / `oos` / `deviations` / `inspection_observations` / `batch_entries` / `stability_results` / `em_results` / `apqr_data` / `regulatory_feed_items`); promotion requires `sr_extraction_promoter_authority` + HITL + bound e-signature; **the promoter MUST satisfy the URS-28 qualification gate** for the target module's role (e.g., promoting to URS-23 batch entry requires the promoter to be a qualified batch executor per URS-28 DEC-28-23); the target module enforces its own ownership / immutability / lifecycle rules at promotion time (e.g., URS-23 batch-entry MBR-step ownership per URS-23 DEC-23-09); promotion is logged with bound e-signature persisted; the source extraction's `promoted_to_record_type`, `promoted_to_record_id`, `promoted_at`, `promoted_by`, `promotion_e_signature_id` captured. |
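
The DEC-29-09 extraction lifecycle can be read as a small transition table. The sketch below is one possible encoding, offered as an illustration only: the names `EXTRACTION_TRANSITIONS`, `canTransition`, and `genericUpdateMaySetStatus` are hypothetical, and treating `approved` and `rejected` as the terminal statuses is an interpretation of DEC-29-09, not a statement from it.

```typescript
type ExtractionStatus =
  | "pending"
  | "under_review"
  | "approved"
  | "rejected"
  | "needs_correction"
  | "corrected";

// Allowed transitions, read from the DEC-29-09 lifecycle string.
const EXTRACTION_TRANSITIONS: Record<ExtractionStatus, ExtractionStatus[]> = {
  pending: ["under_review"],
  under_review: ["approved", "rejected", "needs_correction"],
  needs_correction: ["corrected"],
  corrected: ["approved"],
  approved: [],
  rejected: [],
};

// True only when the lifecycle permits moving from `from` to `to`.
function canTransition(from: ExtractionStatus, to: ExtractionStatus): boolean {
  return EXTRACTION_TRANSITIONS[from].indexOf(to) !== -1;
}

// Assumption: "terminal status" in DEC-29-09 means approved or rejected.
const TERMINAL_STATUSES: ExtractionStatus[] = ["approved", "rejected"];

// The generic update path may never set a terminal status; those are
// reachable only through the controlled review endpoint.
function genericUpdateMaySetStatus(to: ExtractionStatus): boolean {
  return TERMINAL_STATUSES.indexOf(to) === -1;
}
```

Keeping the table as data makes it straightforward to assert in validation tests that no path reaches `approved` without passing through `under_review` or `corrected`.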

### 3.2 Locked-decision rationale narrative

The decisions above define the binding launch posture for Module 29 v1.0. The most consequential locked controls are:

- (a) DEC-29-04 routes OCR uploads through controlled URS-12 Document Control with content hash + version tracking.
- (b) DEC-29-07 mandates tenant-controlled secret-store reference for IMAP credentials.
- (c) DEC-29-08 stores raw emails + attachments as URS-12 documents with content hash and emits per-item audit events for immutable source evidence.
- (d) DEC-29-09 requires a non-bypassable controlled review/validation/correction workflow with reviewer authority + HITL + bound e-signature + reviewer SoD.
- (e) DEC-29-10 requires `extraction_method`, `source_record_id`, `confidence`, `correction_history_json` (append-only) for full extraction provenance.
- (f) **DEC-29-23 introduces the target-record promotion API**, the keystone integration that makes Module 29 the controlled bridge from captured data to regulated GxP records, with bound e-signature + URS-28 qualification gate.
- (g) DEC-29-13 introduces the AI-assisted extraction substrate with provenance + mandatory human acceptance, parallel to URS-26 / URS-27 / URS-28 patterns and explicitly bound to the canonical ARCH-AI-001 AC-5 prohibition.
- (h) DEC-29-22 defines reopen as a governed append-only transition consistent with the Module-14..-28 reopen pattern.

### 3.3 Closed launch decisions: cross-link to items

| Specification item ID | Specification item | Locked decision |
|---|---|---|
| SR-001 | Canonical route + frontend route surface | DEC-29-01 / DEC-29-12 |
| SR-002 | Capture-session lifecycle thin | DEC-29-02 |
| SR-003 | Capture-record provenance thin | DEC-29-03 |
| SR-004 | Client-supplied file paths for OCR | DEC-29-04 |
| SR-005 | OCR failure-state audit incomplete | DEC-29-05 |
| SR-006 | Extraction-template lifecycle missing | DEC-29-06 / DEC-29-18 |
| SR-007 | Credential governance missing | DEC-29-07 |
| SR-008 | Ingestion-queue per-item audit + immutable raw evidence missing | DEC-29-08 / DEC-29-11 |
| SR-009 | Generic update-driven validation | DEC-29-09 |
| SR-010 | Extraction provenance chain incomplete | DEC-29-10 |
| SR-011 | Frontend route coverage requirement | DEC-29-12 |
| SR-012 | Audit + reason-for-change gaps | DEC-29-11 / DEC-29-21 |

### 3.4 Locked-decision authority

Each locked decision is approved by the Founder / Chairman & MD on signature capture in the Document Approval block of this URS (§19). Decisions cannot be unlocked except through controlled URS revision under the URS change-control process and re-approval.

### 3.5 Worked examples

**Worked example 1 — PDF upload → OCR → extraction → URS-14 complaint promotion.**
A complaint intake user receives a paper complaint letter from a customer and scans it to PDF. The user uploads the PDF via URS-12 Document Control which assigns `document_id = DOC-CMP-001` + `document_version_id = v1` + `content_hash = sha256:abc.` + server-attributed storage path. The user creates a capture session `SR-SES-2026-08-12-001` (study-scoped optional; for this scenario tenant-wide). The user creates an OCR job referencing `DOC-CMP-001`; per DEC-29-04 client-supplied file path is rejected. OCR processes via Tesseract; extracts text with confidence 0.91; persists `extracted_text`, `confidence`, `processing_time_ms`, `engine_id = tesseract`, `engine_version = 4.1.1` per DEC-29-05. An effective extraction template `EXT-TPL-Complaint-Letter-v3` (effective per DEC-29-06 + URS-13 CC-2026-0033) extracts structured fields (complaint_text, customer_name, batch_number, complaint_date, severity_indicator) via regex; persists `extraction_method = regex_template`, `extraction_template_id`, `source_record_type = ocr_job`, `source_record_id`, `confidence` per DEC-29-10. The extraction enters `pending`. A reviewer (SoD-distinct from the capture user per SoD-29-02) reviews; finds the extracted `customer_name` is partially garbled by OCR; transitions to `needs_correction`. The reviewer corrects the field; correction is appended to `correction_history_json` with bound e-signature per DEC-29-10. The reviewer approves the corrected extraction with bound e-signature per DEC-29-09. The promoter (a qualified complaint-handling user — qualification verified via URS-28 DEC-28-23 qualification gate) calls `POST /screen-reader/extractions/:id/promote` with `target_module = complaints`; promotion creates a URS-14 complaint record with the extracted fields; bound e-signature on promotion per DEC-29-23; source extraction marked `promoted_to_record_type = complaint`, `promoted_to_record_id = COMPLAINT-2026-08-12-001`, `promoted_at`, `promoted_by`, `promotion_e_signature_id`. 
The URS-14 complaint module receives the promoted record and runs through its own controlled lifecycle (URS-14 owns the complaint record from this point forward).
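
The correction step in worked example 1 relies on the append-only history required by DEC-29-10. A minimal sketch of that behaviour follows, assuming a simplified in-memory shape: `original_values`, `current_values`, the `CorrectionEntry` fields, and `applyCorrection` are hypothetical names for illustration; only `correction_history_json` is named by this URS.

```typescript
// Hypothetical entry shape: reviewer attribution plus e-signature reference.
interface CorrectionEntry {
  field: string;
  previous_value: string;
  corrected_value: string;
  corrected_by: string;
  e_signature_id: string;
  corrected_at: string;
}

interface ExtractionValues {
  original_values: Readonly<Record<string, string>>; // never rewritten
  current_values: Record<string, string>;
  correction_history_json: readonly CorrectionEntry[];
}

// Returns a new object: the history grows by one entry, the original
// extracted values are carried over untouched.
function applyCorrection(
  ex: ExtractionValues,
  entry: Omit<CorrectionEntry, "previous_value">
): ExtractionValues {
  return {
    original_values: ex.original_values,
    current_values: { ...ex.current_values, [entry.field]: entry.corrected_value },
    correction_history_json: [
      ...ex.correction_history_json,
      { ...entry, previous_value: ex.current_values[entry.field] ?? "" },
    ],
  };
}
```

Because the function copies rather than mutates, the pre-correction state stays available for comparison, which mirrors the requirement that the original extracted value is preserved across corrections.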

**Worked example 2 — Email ingestion → extraction → URS-15 OOS promotion.**
An effective email-ingestion rule `EMAIL-RULE-LabResults-v2` (effective per DEC-29-19 + URS-13 CC-2026-0044) polls IMAP mailbox `lab-results@aeonn.com` for new messages from `partner-lab@example.com` matching subject pattern `OOS Notification — Batch.*`. IMAP credentials resolved via `secret_store_ref = aws-secrets-manager:tenant-id/lab-results-imap` per DEC-29-07. A new email arrives; raw `.eml` stored as URS-12 document with content hash; attachments (PDF lab certificate of analysis) stored as separate URS-12 documents per DEC-29-08. An ingestion-queue item is created with `raw_email_document_id` and `attachment_document_ids`; per-item audit emitted per DEC-29-08. The queue item is processed; an effective extraction template extracts (batch_id, test_name, result_value, spec_low, spec_high, oos_flag) from the email body + attachment OCR; extraction enters `pending`. MIRA proposes an advisory classification `assistance_type = classification_assist` indicating `severity = critical` based on out-of-spec magnitude + product class — persisted in `sr_ai_assistance` with full provenance per DEC-29-13 (advisory only). The reviewer accepts the AI classification with bound e-signature; the locked text is written to the system-of-record. The reviewer approves the extraction. The promoter (qualified OOS-handling user — qualification verified via URS-28) promotes to `target_module = oos`; URS-15 OOS record created; URS-15 lifecycle proceeds.
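
The acceptance step in worked example 2 can be sketched as a guarded state change on an `sr_ai_assistance` row. This is an illustrative sketch, not the module's code: the column names come from DEC-29-13, while `acceptSuggestion`, the `Acceptance` shape, and the `isHuman` flag are assumptions introduced for the example.

```typescript
type AssistanceStatus = "proposed" | "accepted" | "rejected";

// Subset of the DEC-29-13 columns relevant to acceptance.
interface AiAssistance {
  status: AssistanceStatus;
  field_value_text: string;
  accepted_by: string | null;
  accepted_at: string | null;
  acceptance_e_signature_id: string | null;
  accepted_text_immutable: string | null;
}

// Hypothetical acceptance input; isHuman stands in for the principal check.
interface Acceptance {
  userId: string;
  isHuman: boolean;
  eSignatureId: string;
  at: string;
}

function acceptSuggestion(s: AiAssistance, a: Acceptance): AiAssistance {
  if (s.status !== "proposed") throw new Error("only a proposed suggestion can be accepted");
  if (!a.isHuman) throw new Error("acceptance requires a human user");
  if (a.eSignatureId.trim() === "") throw new Error("acceptance requires a bound e-signature");
  return {
    ...s,
    status: "accepted",
    accepted_by: a.userId,
    accepted_at: a.at,
    acceptance_e_signature_id: a.eSignatureId,
    accepted_text_immutable: s.field_value_text, // locked copy of the accepted text
  };
}
```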

**Worked example 3 — OCR failure with audit evidence.**
An OCR job fails because the source PDF is encrypted. Per DEC-29-05 the failure state emits `sr_ocr_job_failed` event with `error_message = "PDF encrypted — cannot extract text"`, `error_class = "encryption_error"`, `engine_id = tesseract`, `engine_version`. Audit-trail entry persisted. URS-30 Notifications consumes the failure event. The capture user is notified to obtain an unencrypted copy and resubmit.
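
The failure path in worked example 3 might be shaped as below. The event name and its four detail fields are taken from DEC-29-05; the `OcrJob` shape, the `failOcrJob` helper, and the restriction that only a `processing` job can fail are assumptions made for the sketch. Writing the audit-trail entry is out of scope here.

```typescript
type OcrJobStatus = "created" | "processing" | "succeeded" | "failed" | "archived";

// Hypothetical minimal job shape.
interface OcrJob {
  id: string;
  status: OcrJobStatus;
  engine_id: string;
  engine_version: string;
}

interface OcrJobFailedEvent {
  event: "sr_ocr_job_failed";
  job_id: string;
  error_message: string;
  error_class: string;
  engine_id: string;
  engine_version: string;
}

// Moves a processing job to failed and builds the event that carries the
// error detail required by DEC-29-05.
function failOcrJob(
  job: OcrJob,
  error_message: string,
  error_class: string
): { job: OcrJob; event: OcrJobFailedEvent } {
  if (job.status !== "processing") {
    throw new Error(`cannot fail a job in status ${job.status}`);
  }
  return {
    job: { ...job, status: "failed" },
    event: {
      event: "sr_ocr_job_failed",
      job_id: job.id,
      error_message,
      error_class,
      engine_id: job.engine_id,
      engine_version: job.engine_version,
    },
  };
}
```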

**Worked example 4 — Stale ingestion queue → URS-21 finding.**
On `2027-01-15` the stale-queue detector identifies 47 ingestion-queue items older than 30 days without processing. An `sr_finding_created` event is emitted to URS-21 per DEC-29-15 with `severity = major`. A standalone URS-21 finding is created. If the pattern persists, a URS-18 CAPA opens via the `sr_capture_failure_capa_linked` event per DEC-29-16.
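
The detector in worked example 4 reduces to an age filter over unprocessed queue items. The sketch below assumes a simplified `QueueItem` shape with a three-value status and an ISO-8601 `created_at`; those names, and `findStaleItems` itself, are hypothetical. The comparison uses "age at or above the threshold", following the "≥ configurable age threshold" wording of DEC-29-15.

```typescript
// Hypothetical queue-item shape; the URS does not enumerate these statuses here.
interface QueueItem {
  id: string;
  status: "pending" | "processed" | "rejected";
  created_at: string; // ISO-8601 timestamp
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns pending items whose age is at least thresholdDays at time `now`.
function findStaleItems(items: QueueItem[], now: Date, thresholdDays: number): QueueItem[] {
  const cutoff = now.getTime() - thresholdDays * DAY_MS;
  return items.filter(
    (item) => item.status === "pending" && Date.parse(item.created_at) <= cutoff
  );
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the detector deterministic under test.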

**Worked example 5 — AI-only promotion attempt rejected.**
A misconfigured automation attempts to call `POST /screen-reader/extractions/:id/promote` with the AI service as the acting principal. Per DEC-29-13 + ARCH-AI-001 AC-5 + DEC-29-23, the system rejects with `SR_AI_CANNOT_PROMOTE_TO_TARGET`. The promotion requires a qualified human user with `sr_extraction_promoter_authority` + HITL + bound e-signature.
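
The rejection in worked example 5 can be pictured as the first of several ordered checks in front of the promotion handler. The sketch is illustrative only: `SR_AI_CANNOT_PROMOTE_TO_TARGET` and `TRN_QUALIFICATION_GATE_FAILED` appear in this URS, whereas `SR_PROMOTER_AUTHORITY_REQUIRED`, `SR_EXTRACTION_NOT_APPROVED`, and `SR_E_SIGNATURE_REQUIRED` are placeholder codes invented for the example, as are the `Principal` and `PromotionRequest` shapes.

```typescript
interface Principal {
  kind: "human" | "ai_service";
  authorities: string[];
}

// Hypothetical request shape; `qualified` stands in for the URS-28 gate result.
interface PromotionRequest {
  principal: Principal;
  extractionStatus: string;
  qualified: boolean;
  eSignatureId: string | null;
}

type PromotionCheck = { ok: true } | { ok: false; code: string };

function checkPromotion(req: PromotionRequest): PromotionCheck {
  // ARCH-AI-001 AC-5: an AI service can never be the promoting principal.
  if (req.principal.kind === "ai_service") {
    return { ok: false, code: "SR_AI_CANNOT_PROMOTE_TO_TARGET" };
  }
  if (req.principal.authorities.indexOf("sr_extraction_promoter_authority") === -1) {
    return { ok: false, code: "SR_PROMOTER_AUTHORITY_REQUIRED" }; // placeholder code
  }
  if (req.extractionStatus !== "approved") {
    return { ok: false, code: "SR_EXTRACTION_NOT_APPROVED" }; // placeholder code
  }
  if (!req.qualified) {
    return { ok: false, code: "TRN_QUALIFICATION_GATE_FAILED" }; // URS-28 error code
  }
  if (!req.eSignatureId) {
    return { ok: false, code: "SR_E_SIGNATURE_REQUIRED" }; // placeholder code
  }
  return { ok: true };
}
```

Checking the principal kind first means an AI caller receives the AI-specific code even if it also lacks authority or qualification.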

**Worked example 6 — Governed reopen of locked capture program.**
On `2027-04-15` an inspection finding (URS-22) reveals a previously locked capture program may have under-recorded one ingestion event. The Manufacturing Head initiates a reopen; per DEC-29-22 + SoD-29-06, both `executive_authority` co-sign AND `qualified_person_authority` co-sign + documented reason are required. On both co-signs the program transitions `locked → in_progress` and a new program iteration is appended; the prior locked evidence is NOT mutated.
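
The reopen in worked example 6 is append-only: the locked iteration stays as it was and a new `in_progress` iteration is added after both co-signs. A rough sketch under simplifying assumptions follows; the `ProgramIteration` and `ReopenRequest` shapes and the `reopenProgram` helper are hypothetical, and authority verification of each co-signer is omitted.

```typescript
// Hypothetical iteration shape; evidence is a list of opaque evidence IDs.
interface ProgramIteration {
  iteration: number;
  status: "locked" | "in_progress";
  evidence: readonly string[];
}

interface ReopenRequest {
  executiveCoSigner: string | null; // holder of executive_authority
  qpCoSigner: string | null;        // holder of qualified_person_authority
  reason: string;
}

// Appends a new iteration; never edits or removes an existing one.
function reopenProgram(
  iterations: readonly ProgramIteration[],
  req: ReopenRequest
): ProgramIteration[] {
  const latest = iterations[iterations.length - 1];
  if (!latest || latest.status !== "locked") {
    throw new Error("only a locked program can be reopened");
  }
  if (!req.executiveCoSigner || !req.qpCoSigner) {
    throw new Error("both executive and QP co-signs are required");
  }
  if (req.reason.trim() === "") {
    throw new Error("a documented reason is required");
  }
  return [
    ...iterations,
    { iteration: latest.iteration + 1, status: "in_progress", evidence: [] },
  ];
}
```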

---

## 4. End-to-End User Journeys (28 launch journeys)

| # | Journey | Actor | Pre-condition | Path | Post-condition |
|---|---|---|---|---|---|
| 1 | Create capture session | Capture User | `screen_reader:capture:create` | Create session in `draft` | Session `draft`; audit entry |
| 2 | Add capture record (screenshot / PDF / email / manual) | Capture User | Session active | Add record with provenance per DEC-29-03 | Record persisted; audit entry |
| 3 | Upload PDF via URS-12 for OCR | Capture User | `screen_reader:ocr:create` | Upload via URS-12; controlled storage path + content hash | Document persisted in URS-12 |
| 4 | Create OCR job | Capture User | Document uploaded | Create OCR job referencing `document_id` (NOT `file_path` per DEC-29-04) | OCR job `created` |
| 5 | Reject client-supplied file path | System (validation) | Body contains `file_path` | Reject with `SR_OCR_CLIENT_FILE_PATH_FORBIDDEN` per DEC-29-04 | Operation rejected |
| 6 | Process OCR job (success) | System | OCR job `created` | OCR runs; persists `extracted_text`, `confidence`, `engine_id`, `engine_version`; transition to `succeeded` per DEC-29-05 | OCR job `succeeded`; audit entry |
| 7 | Process OCR job (failure) | System | OCR engine error | Persist `error_message`, `error_class`, `engine_id`; transition to `failed`; emit `sr_ocr_job_failed` event per DEC-29-05 | OCR job `failed`; audit entry |
| 8 | Author extraction template draft | Template Author | `screen_reader:template:create` | Create template in `draft` with regex patterns + field labels | Template `draft`; audit entry |
| 9 | Release extraction template to effective | `sr_template_release_authority` | Template `draft`; URS-13 CR | HITL + bound e-sign + URS-13 CR linkage; transition `draft → effective`; supersede prior version per DEC-29-06 | Template `effective`; bound e-signature |
| 10 | Author email-ingestion rule | Information Security Authority | `screen_reader:ingestion:create` | Create rule with `secret_store_ref` (NOT inline credentials) per DEC-29-07 | Rule `draft`; audit entry |
| 11 | Reject inline credentials in ingestion rule | System (validation) | Body contains `password` / `imap_password` | Reject with `SR_CREDENTIALS_IN_PAYLOAD_FORBIDDEN` per DEC-29-07 | Operation rejected |
| 12 | Release ingestion rule to effective | `sr_ingestion_rule_authority` | Rule `draft`; URS-13 CR | HITL + bound e-sign + URS-13 CR linkage per DEC-29-19; transition `draft → effective` | Rule `effective`; bound e-signature |
| 13 | Poll email ingestion | System (scheduled) | Rule `effective` | Resolve credentials from secret store; poll IMAP; ingest new messages | Ingestion-queue items created |
| 14 | Store raw email + attachments in URS-12 | System | Email ingested | Persist `raw_email_document_id` + `attachment_document_ids` with content hash per DEC-29-08 | Raw evidence in URS-12 |
| 15 | Process ingestion-queue item | System or User | Queue item `pending` | Process; emit per-item audit per DEC-29-08; transition to `processed` | Audit entry per item |
| 16 | Create extraction (regex template) | System or User | OCR job `succeeded` or queue item `processed` | Run effective template; create extraction with full provenance per DEC-29-10 | Extraction `pending` |
| 17 | MIRA proposes AI extraction suggestion | System (MIRA) | Authorized user requests | Persist `sr_ai_assistance` advisory per DEC-29-13 | Suggestion `proposed`; advisory only |
| 18 | Accept AI suggestion | Capture User / Reviewer | Suggestion `proposed` | HITL + bound e-sign; persist `accepted_text_immutable`, `acceptance_e_signature_id` per DEC-29-13 | Suggestion `accepted`; locked text |
| 19 | Reject AI suggestion | Capture User / Reviewer | Suggestion `proposed` | Reject with reason | Suggestion rejected |
| 20 | Review extraction | `sr_extraction_reviewer_authority` (SoD-distinct from creator per SoD-29-02) | Extraction `pending` | HITL + bound e-sign per DEC-29-09; transition `pending → under_review → approved \| rejected \| needs_correction` | Extraction reviewed; bound e-signature |
| 21 | Reject review by extraction creator | System (SoD validation) | Reviewer = creator | Reject with `SR_REVIEWER_CREATOR_SOD_VIOLATION` per SoD-29-02 | Operation rejected |
| 22 | Correct extraction (needs_correction → corrected) | Capture User | Extraction `needs_correction` | Apply correction; append to `correction_history_json` with bound e-signature per DEC-29-10 | Correction history appended; original value preserved |
| 23 | Re-review corrected extraction | Reviewer | Extraction `corrected` | HITL + bound e-sign; transition `corrected → approved` | Extraction approved |
| 24 | Promote extraction to target module | `sr_extraction_promoter_authority` (qualified per URS-28 for target module) | Extraction `approved` | HITL + bound e-sign per DEC-29-23; URS-28 qualification gate verified; create target-module record (URS-14 / -15 / -16 / -22 / -23 / -24 / -25 / -26 / -27); persist `promoted_to_record_*` | Target-module record created; bound e-signature |
| 25 | Reject promotion without URS-28 qualification | System (qualification gate) | URS-28 returns `qualified = false` | Reject with `TRN_QUALIFICATION_GATE_FAILED` (URS-28 error code) | Promotion rejected |
| 26 | Reject AI-only promotion attempt | System (validation) | AI service as acting principal | Reject with `SR_AI_CANNOT_PROMOTE_TO_TARGET` per DEC-29-13 | Operation rejected |
| 27 | Generate stale ingestion queue finding | System (scheduled) | Ingestion-queue items > 30 days unprocessed | Emit `sr_finding_created` to URS-21 per DEC-29-15 | URS-21 finding created |
| 28 | Reopen locked capture program (governed transition) | Manufacturing Head + Executive Authority + Qualified Person | Program `locked` | Executive co-sign AND QP co-sign + reason; transition `locked → in_progress`; append new iteration per DEC-29-22 | Program `in_progress`; new iteration appended; prior locked evidence NOT mutated |
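
Journeys 20 and 21 hinge on the reviewer being a different user from the extraction's creator. The check can be sketched as follows, as an illustration rather than a prescribed design: `SR_REVIEWER_CREATOR_SOD_VIOLATION` is the code named in journey 21, while `SR_REVIEWER_AUTHORITY_REQUIRED`, the `ReviewAttempt` shape, and `checkReviewer` are hypothetical.

```typescript
// Hypothetical input to the review gate.
interface ReviewAttempt {
  extractionCreatedBy: string;
  reviewerId: string;
  reviewerAuthorities: string[];
}

type ReviewCheck = { ok: true } | { ok: false; code: string };

function checkReviewer(a: ReviewAttempt): ReviewCheck {
  if (a.reviewerAuthorities.indexOf("sr_extraction_reviewer_authority") === -1) {
    return { ok: false, code: "SR_REVIEWER_AUTHORITY_REQUIRED" }; // placeholder code
  }
  // SoD-29-02: the creator of an extraction may not review it.
  if (a.reviewerId === a.extractionCreatedBy) {
    return { ok: false, code: "SR_REVIEWER_CREATOR_SOD_VIOLATION" };
  }
  return { ok: true };
}
```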

---

## 5. Front-end Requirements

### 5.1 Screen Reader Dashboard

The dashboard (URS-29-FE-001) renders capture sessions, OCR jobs, extraction stats, ingestion rules, ingestion queue with filters; uses canonical `/screen-reader/*` hooks per DEC-29-01; **all CTAs resolve to real pages** per DEC-29-12.
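
One way to keep hooks on the canonical mount is to resolve every relative hook path through a single helper that refuses anything not rooted at `/screen-reader/`. The sketch below assumes the HTTP client's base is `/api/v1`; `API_BASE`, `MODULE_PREFIX`, and `resolveHookPath` are hypothetical names, not part of the existing frontend.

```typescript
// Assumption: the shared HTTP client is mounted at /api/v1 (per DEC-29-01).
const API_BASE = "/api/v1";
const MODULE_PREFIX = "/screen-reader/";

// Accepts only canonical relative paths such as /screen-reader/extractions
// and returns the full request path under the canonical mount.
function resolveHookPath(relativePath: string): string {
  if (relativePath.indexOf(MODULE_PREFIX) !== 0) {
    throw new Error(`non-canonical hook path: ${relativePath}`);
  }
  return API_BASE + relativePath;
}
```

A hook that passes `/api/screen-reader/extractions` fails immediately instead of silently producing a path with a duplicated prefix.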

### 5.2 Capture Session Console

The capture session console (URS-29-FE-002) supports session draft / lifecycle UI; new routes `/screen-reader/sessions/new` and `/screen-reader/sessions/:id` per DEC-29-12.

### 5.3 OCR Jobs Page

The OCR jobs page (URS-29-FE-003) supports OCR job creation **only via URS-12 Document Control upload** (no client-supplied file paths per DEC-29-04); rendering of success / failure states + error details.

### 5.4 Extraction Template Editor

The extraction template editor (URS-29-FE-004) supports template draft authoring; release flow with HITL + e-signature gates linked to URS-13 CR per DEC-29-06.

### 5.5 Email Ingestion Rule Console

The email ingestion rule console (URS-29-FE-005) supports rule authoring with **`secret_store_ref` field only** (no password fields in UI per DEC-29-07); release flow with HITL + e-signature.

### 5.6 Ingestion Queue Console

The ingestion queue console (URS-29-FE-006) renders queue with raw-evidence preview (links to URS-12 documents); per-item processing history per DEC-29-08.

### 5.7 Extraction Console

The extraction console (URS-29-FE-007) renders extractions with provenance badges, review ceremony with HITL + bound e-signature per DEC-29-09, correction ceremony per DEC-29-10; new route `/screen-reader/extractions` per DEC-29-12.

### 5.8 Extraction Detail

The extraction detail (URS-29-FE-008) renders extraction with full provenance chain, correction history, AI assistance proposals, promotion ceremony with bound e-signature + URS-28 qualification gate UI.

### 5.9 AI Assistance Console

The AI assistance console (URS-29-FE-009) renders AI-generated assistance with provenance, accept ceremony with bound e-signature, reject ceremony with reason; advisory-only labeling per DEC-29-13 + ARCH-AI-001 AC-5.

### 5.10 Promotion Console

The promotion console (URS-29-FE-010) supports target-module selection (URS-14 / -15 / -16 / -22 / -23 / -24 / -25 / -26 / -27); URS-28 qualification gate UI feedback; bound e-signature ceremony per DEC-29-23.

### 5.11 MIRA Copilot Integration

MIRA copilot (URS-29-FE-011) is read-only context. **AI-generated content is advisory only with mandatory human acceptance per DEC-29-13; AI cannot persist captured data; AI cannot sign extraction approval; AI cannot promote extractions** per ARCH-AI-001 AC-5.

### 5.12 Accessibility

All Module 29 front-end pages MUST conform to WCAG 2.1 Level AA.

---

## 6. Back-end Requirements

### 6.1 Module structure

`packages/backend/src/modules/screen-reader/` with `plugin.ts`, `routes.ts` (typed schemas), `service.ts` (controlled OCR source-file storage; secret-store credential resolution; non-bypassable extraction review workflow; correction history; target-record promotion; audit-trail with reason-for-change), `schemas.ts`, `events.ts`, `secret-store-resolver.ts` (per DEC-29-07), `extraction-review-engine.ts` (per DEC-29-09), `target-promotion-handler.ts` (per DEC-29-23).

### 6.2 Data model

#### 6.2.1 `sr_capture_sessions`

`id`, `tenant_id`, `study_id` (FK nullable), `session_code`, `title`, `status` (ENUM `draft` / `in_progress` / `under_review` / `closed` / `reopened` per DEC-29-02), `closed_by`, `closed_at`, `closure_e_signature_id` (FK), `reopened_at`, `reopened_by`, `reopen_executive_co_signer`, `reopen_qp_co_signer`, `reopen_reason`, audit columns. RLS enabled.

#### 6.2.2 `sr_capture_records`

`id`, `tenant_id`, `session_id` (FK), `capture_method` (ENUM per DEC-29-03), `source_link_document_id` (FK to URS-12 nullable), `raw_evidence_hash` (TEXT), `captured_at` (TIMESTAMPTZ — server-attributed), `captured_by` (FK), audit columns.

#### 6.2.3 `sr_ocr_jobs`

`id`, `tenant_id`, `session_id` (FK nullable), `document_id` (FK to URS-12 NOT NULL per DEC-29-04), `document_version_id` (FK to URS-12 NOT NULL), `content_hash` (TEXT NOT NULL), `storage_path` (TEXT — server-attributed), `engine_id` (TEXT), `engine_version` (TEXT), `extracted_text` (TEXT nullable), `confidence` (NUMERIC nullable), `processing_time_ms` (INTEGER nullable), `status` (ENUM `created` / `processing` / `succeeded` / `failed` / `archived` per DEC-29-05), `error_message` (TEXT nullable), `error_class` (TEXT nullable), audit columns. Note: legacy client-supplied `file_path` field removed per DEC-29-04.
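
The controlled-source rule for OCR jobs (DEC-29-04) can be pictured as a request check that runs before any row is written. The following TypeScript is a minimal sketch, not the module's actual schema: `OcrJobCreateRequest`, `OcrValidationResult`, and `validateOcrJobCreate` are hypothetical names, and only the two machine codes are taken from this specification.

```typescript
// Hypothetical request shape. Field names follow §6.2.3; the validator is illustrative.
type OcrJobCreateRequest = {
  document_id?: string;
  document_version_id?: string;
  [key: string]: unknown;
};

type OcrValidationResult =
  | { ok: true }
  | { ok: false; code: "SR_OCR_CLIENT_FILE_PATH_FORBIDDEN" | "SR_OCR_DOCUMENT_REQUIRED" };

function validateOcrJobCreate(body: OcrJobCreateRequest): OcrValidationResult {
  // The legacy field is rejected outright, even when a document_id is also present.
  if ("file_path" in body) {
    return { ok: false, code: "SR_OCR_CLIENT_FILE_PATH_FORBIDDEN" };
  }
  // A controlled URS-12 document reference is mandatory.
  if (typeof body.document_id !== "string" || body.document_id.length === 0) {
    return { ok: false, code: "SR_OCR_DOCUMENT_REQUIRED" };
  }
  return { ok: true };
}
```

Checking for the forbidden field first means a payload that mixes `file_path` with a valid `document_id` is still refused, which is one way to read the "removed" wording above.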

#### 6.2.4 `sr_extraction_templates`

`id`, `tenant_id`, `template_code`, `version` (INTEGER NOT NULL), `effective_from`, `effective_to`, `supersedes_template_id` (self-FK nullable), `release_change_request_id` (FK to URS-13 per DEC-29-18), `regex_patterns_json` (JSONB), `field_definitions_json` (JSONB), `ai_assisted` (BOOLEAN), `approved_by`, `approved_at`, `e_signature_id` (FK), `status` (ENUM `draft` / `under_review` / `effective` / `superseded` / `archived` per DEC-29-06), audit columns.

#### 6.2.5 `sr_email_ingestion_rules`

`id`, `tenant_id`, `rule_code`, `imap_host`, `imap_port`, `imap_username`, `secret_store_ref` (TEXT NOT NULL per DEC-29-07), `imap_folder`, `subject_pattern`, `from_pattern`, `target_extraction_template_id` (FK), `version`, `effective_from`, `effective_to`, `release_change_request_id` (FK to URS-13 per DEC-29-19), `status` (ENUM `draft` / `under_review` / `effective` / `suspended` / `archived`), audit columns. Note: legacy `imap_password` field removed per DEC-29-07.
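
A payload check for the credential rule (DEC-29-07) might look like the sketch below. It is an illustration under assumptions: the deny-list of key names, the `validateIngestionRule` function, and the example reference format are all hypothetical, since this URS does not enumerate which payload keys count as inline credentials. The two machine codes come from §6.7.

```typescript
// Illustrative deny-list only; the URS does not define the exact keys.
const FORBIDDEN_CREDENTIAL_KEYS = ["imap_password", "password", "secret", "token"];

type IngestionRuleCheck =
  | { ok: true }
  | { ok: false; code: "SR_CREDENTIALS_IN_PAYLOAD_FORBIDDEN" | "SR_SECRET_STORE_REF_REQUIRED" };

function validateIngestionRule(payload: Record<string, unknown>): IngestionRuleCheck {
  // Any credential-like key in the payload is refused before anything is persisted.
  const hasInlineCredential = Object.keys(payload).some(
    (key) => FORBIDDEN_CREDENTIAL_KEYS.indexOf(key.toLowerCase()) !== -1,
  );
  if (hasInlineCredential) {
    return { ok: false, code: "SR_CREDENTIALS_IN_PAYLOAD_FORBIDDEN" };
  }
  // The rule must point at the secret store instead.
  const ref = payload["secret_store_ref"];
  if (typeof ref !== "string" || ref.trim() === "") {
    return { ok: false, code: "SR_SECRET_STORE_REF_REQUIRED" };
  }
  return { ok: true };
}
```

For example, `validateIngestionRule({ rule_code: "R-1", secret_store_ref: "vault://tenant-a/imap/r-1" })` passes, where the `vault://` string is a made-up reference format used only for this example.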

#### 6.2.6 `sr_ingestion_queue`

`id`, `tenant_id`, `rule_id` (FK), `raw_email_document_id` (FK to URS-12 NOT NULL per DEC-29-08), `attachment_document_ids` (UUID[] FK to URS-12), `received_at`, `subject`, `from_address`, `status` (ENUM `pending` / `processed` / `rejected` / `failed`), `processed_at`, `processed_by` (nullable), audit columns.

#### 6.2.7 `sr_data_extractions`

`id`, `tenant_id`, `session_id` (FK nullable), `extraction_method` (ENUM per DEC-29-10), `extraction_template_id` (FK nullable), `extraction_template_version_snapshot` (INTEGER nullable), `source_record_type` (ENUM per DEC-29-10), `source_record_id` (UUID), `extracted_fields_json` (JSONB), `confidence` (NUMERIC nullable), `correction_history_json` (JSONB DEFAULT '[]'), `status` (ENUM `pending` / `under_review` / `approved` / `rejected` / `needs_correction` / `corrected` per DEC-29-09), `reviewed_by` (FK nullable), `reviewed_at`, `review_e_signature_id` (FK nullable), `rejection_reason` (TEXT nullable), `promoted_to_record_type` (ENUM nullable per DEC-29-23), `promoted_to_record_id` (UUID nullable), `promoted_at` (TIMESTAMPTZ nullable), `promoted_by` (FK nullable), `promotion_e_signature_id` (FK nullable), `created_by` (FK), audit columns. Constraint: `reviewed_by != created_by` (reviewer ≠ creator per SoD-29-02).
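
To make the append-only intent of `correction_history_json` concrete, here is a small sketch of a correction step. The entry shape (`CorrectionEntry`) and the `recordCorrection` helper are assumptions for illustration; the URS names the column but does not define its JSON structure, and it also leaves open whether the corrected value overwrites `extracted_fields_json` as it does here.

```typescript
// Hypothetical entry shape; the URS does not define the JSON structure.
type CorrectionEntry = {
  field: string;
  previous_value: string | null;
  corrected_value: string;
  reason: string;
  corrected_by: string;
  corrected_at: string; // ISO 8601, supplied by the server in this sketch
};

type ExtractionSnapshot = {
  extracted_fields_json: Readonly<Record<string, string>>;
  correction_history_json: ReadonlyArray<CorrectionEntry>;
};

function recordCorrection(
  current: ExtractionSnapshot,
  field: string,
  correctedValue: string,
  reason: string,
  correctedBy: string,
  correctedAtIso: string,
): ExtractionSnapshot {
  if (reason.trim() === "") throw new Error("SR_REASON_FOR_CHANGE_REQUIRED");
  const entry: CorrectionEntry = {
    field,
    previous_value: current.extracted_fields_json[field] ?? null,
    corrected_value: correctedValue,
    reason,
    corrected_by: correctedBy,
    corrected_at: correctedAtIso,
  };
  // New objects are returned; existing history entries are never edited or removed.
  return {
    extracted_fields_json: { ...current.extracted_fields_json, [field]: correctedValue },
    correction_history_json: current.correction_history_json.concat([entry]),
  };
}
```

Because each entry carries `previous_value`, the first entry for a field retains the originally extracted value even after several corrections. In the database the same guarantee would come from the trigger mentioned in §9.2, not from application code alone.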

#### 6.2.8 `sr_ai_assistance`

`id`, `tenant_id`, `assistance_type` (ENUM per DEC-29-13), `linked_record_type` (ENUM `extraction` / `extraction_field` / `ingestion_queue_item` / `ocr_job`), `linked_record_id` (UUID), `narrative_text` / `field_value_text`, `model_id`, `model_version`, `prompt_version`, `confidence`, `citation_snapshot_json`, `proposed_at`, `proposed_by_system`, `accepted_by`, `accepted_at`, `acceptance_e_signature_id`, `accepted_text_immutable`, `rejection_reason`, `status` (ENUM `proposed` / `accepted` / `rejected`), audit columns.
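
The acceptance step for an AI proposal can be sketched as a guard that only a human actor with a bound e-signature can pass. This is a simplified illustration: the `SrActor` type with its `kind` discriminator and the `acceptAssistance` function are hypothetical, and the real ceremony also involves HITL decision capture, which is omitted here.

```typescript
type AssistanceStatus = "proposed" | "accepted" | "rejected";

// Hypothetical actor model; the URS does not say how AI services are identified.
type SrActor = { id: string; kind: "human" | "ai_service" };

type AiAssistanceRow = {
  id: string;
  status: AssistanceStatus;
  accepted_by: string | null;
  acceptance_e_signature_id: string | null;
};

function acceptAssistance(
  row: AiAssistanceRow,
  actor: SrActor,
  eSignatureId: string | null,
): AiAssistanceRow {
  // Only a human may move a proposal into the system of record.
  if (actor.kind !== "human") throw new Error("SR_AI_CANNOT_PERSIST");
  // Only a still-open proposal can be accepted.
  if (row.status !== "proposed") throw new Error("SR_INVALID_TRANSITION");
  // Acceptance is a regulated final action and needs a bound e-signature.
  if (!eSignatureId) throw new Error("SR_E_SIGNATURE_REQUIRED");
  return {
    ...row,
    status: "accepted",
    accepted_by: actor.id,
    acceptance_e_signature_id: eSignatureId,
  };
}
```

The machine codes are taken from §6.7; which code applies to each failed check is an assumption of this sketch.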

#### 6.2.9 `sr_capture_program_locks`

`id`, `tenant_id`, `period_start`, `period_end`, `locked_by`, `locked_at`, `lock_e_signature_id`, `reopened_at`, `reopened_by`, `reopen_executive_co_signer`, `reopen_qp_co_signer`, `reopen_reason`, audit columns.

#### 6.2.10 RLS

All Module 29 tables have RLS enabled.

### 6.3 API contract

| Route | Method | Permission | Status |
|---|---|---|---|
| `/api/v1/screen-reader/capture-sessions` | GET / POST | `screen_reader:capture:read` / `screen_reader:capture:create` | |
| `/api/v1/screen-reader/capture-sessions/:id` | GET / PATCH | `screen_reader:capture:read` / `screen_reader:capture:update` | |
| `/api/v1/screen-reader/capture-sessions/:id/close` | POST | `sr_capture_session_authority` + HITL + bound e-sign per DEC-29-02 | target route |
| `/api/v1/screen-reader/capture-sessions/:id/records` | GET / POST | `screen_reader:capture:record:read` / `screen_reader:capture:record:create` | |
| `/api/v1/screen-reader/ocr-jobs` | GET / POST | `screen_reader:ocr:read` / `screen_reader:ocr:create` (rejects client-supplied `file_path` per DEC-29-04) | |
| `/api/v1/screen-reader/ocr-jobs/:id/process` | POST | `screen_reader:ocr:process` | |
| `/api/v1/screen-reader/extraction-templates` | GET / POST | `screen_reader:template:read` / `screen_reader:template:create` | |
| `/api/v1/screen-reader/extraction-templates/:id` | GET / PATCH (draft only) | `screen_reader:template:read` / `screen_reader:template:update` | |
| `/api/v1/screen-reader/extraction-templates/:id/release` | POST | `sr_template_release_authority` + HITL + bound e-sign + URS-13 CR per DEC-29-06 | target route |
| `/api/v1/screen-reader/ingestion-rules` | GET / POST | `screen_reader:ingestion:read` / `screen_reader:ingestion:create` (rejects inline credentials per DEC-29-07) | |
| `/api/v1/screen-reader/ingestion-rules/:id` | GET / PATCH (draft only) | `screen_reader:ingestion:read` / `screen_reader:ingestion:update` | |
| `/api/v1/screen-reader/ingestion-rules/:id/release` | POST | `sr_ingestion_rule_authority` + HITL + bound e-sign + URS-13 CR per DEC-29-19 | target route |
| `/api/v1/screen-reader/ingestion-rules/:id/poll` | POST | `screen_reader:ingestion:poll` (resolves credentials from secret store per DEC-29-07) | |
| `/api/v1/screen-reader/ingestion-queue` | GET | `screen_reader:ingestion:queue:read` | |
| `/api/v1/screen-reader/ingestion-queue/:id/process` | POST | `screen_reader:ingestion:queue:process` (per-item audit per DEC-29-08) | |
| `/api/v1/screen-reader/extractions` | GET / POST | `screen_reader:extraction:read` / `screen_reader:extraction:create` | |
| `/api/v1/screen-reader/extractions/:id` | GET / PATCH (terminal status excluded per DEC-29-09) | `screen_reader:extraction:read` / `screen_reader:extraction:update` | |
| `/api/v1/screen-reader/extractions/:id/review` | POST | `sr_extraction_reviewer_authority` (SoD-distinct from creator per SoD-29-02) + HITL + bound e-sign per DEC-29-09 | target route |
| `/api/v1/screen-reader/extractions/:id/correct` | POST | `sr_extraction_reviewer_authority` (or correction owner) + HITL + bound e-sign + reason per DEC-29-10 | target route |
| `/api/v1/screen-reader/extractions/:id/promote` | POST | `sr_extraction_promoter_authority` + HITL + bound e-sign + URS-28 qualification gate per DEC-29-23 | target route |
| `/api/v1/screen-reader/ai-assistance` | GET / POST | `screen_reader:ai_assistance:read` / `screen_reader:ai_assistance:propose` (advisory only per DEC-29-13) | target route |
| `/api/v1/screen-reader/ai-assistance/:id/accept` | POST | `screen_reader:ai_assistance:accept` + HITL + bound e-sign | target route |
| `/api/v1/screen-reader/ai-assistance/:id/reject` | POST | `screen_reader:ai_assistance:reject` + reason | target route |
| `/api/v1/screen-reader/program-locks` | POST | `final_quality_approver` + HITL + bound e-sign | target route |
| `/api/v1/screen-reader/program-locks/:id/reopen` | POST | `executive_authority` co-sign AND `qualified_person_authority` co-sign + HITL + reason per DEC-29-22 | target route |

### 6.4 Workflow

#### 6.4.1 Capture-session lifecycle

```mermaid
stateDiagram-v2
 [*] --> draft: create
 draft --> in_progress: start
 in_progress --> under_review: submit for review
 under_review --> closed: close (sr_capture_session_authority + HITL + e-sign — DEC-29-02)
 closed --> reopened: governed reopen (executive + QP co-sign — DEC-29-22)
```

#### 6.4.2 OCR-job lifecycle

```mermaid
stateDiagram-v2
 [*] --> created: create (with controlled URS-12 source-file storage — DEC-29-04)
 created --> processing: process
 processing --> succeeded: success (text + confidence + audit)
 processing --> failed: failure (sr_ocr_job_failed event + audit — DEC-29-05)
 succeeded --> archived: archive
 failed --> archived: archive
```

#### 6.4.3 Extraction-template lifecycle

```mermaid
stateDiagram-v2
 [*] --> draft: create
 draft --> under_review: submit for review
 under_review --> effective: release (sr_template_release_authority + HITL + e-sign + URS-13 CR — DEC-29-06)
 effective --> superseded: revise (new version)
 superseded --> archived: archive
```

#### 6.4.4 Extraction lifecycle

```mermaid
stateDiagram-v2
 [*] --> pending: create (provenance per DEC-29-10)
 pending --> under_review: review starts
 under_review --> approved: approve (sr_extraction_reviewer_authority + HITL + bound e-sign + reviewer SoD — DEC-29-09)
 under_review --> rejected: reject (with reason)
 under_review --> needs_correction: needs correction
 needs_correction --> corrected: correction (append-only history — DEC-29-10)
 corrected --> under_review: re-review
 approved --> promoted: target-record promotion (sr_extraction_promoter_authority + HITL + bound e-sign + URS-28 qualification gate — DEC-29-23)
```
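
The extraction diagram above translates naturally into a transition table. The sketch below is one possible encoding, with hypothetical names (`EXTRACTION_TRANSITIONS`, `CEREMONY_ONLY`, `checkExtractionTransition`). Two assumptions are worth flagging: `promoted` is treated as a status because the diagram shows it as a state, although the `status` ENUM in §6.2.7 does not list it; and the set of statuses that a plain PATCH may not set is a guess, since the URS refers to "terminal values" without enumerating them.

```typescript
type ExtractionStatus =
  | "pending"
  | "under_review"
  | "approved"
  | "rejected"
  | "needs_correction"
  | "corrected"
  | "promoted";

// Edges copied from the state diagram in this section.
const EXTRACTION_TRANSITIONS: Record<ExtractionStatus, ExtractionStatus[]> = {
  pending: ["under_review"],
  under_review: ["approved", "rejected", "needs_correction"],
  needs_correction: ["corrected"],
  corrected: ["under_review"],
  approved: ["promoted"],
  rejected: [],
  promoted: [],
};

// Assumption: statuses reachable only through a signed ceremony route.
const CEREMONY_ONLY: ExtractionStatus[] = ["approved", "rejected", "promoted"];

// Returns a machine code from §6.7, or null when the transition is allowed.
function checkExtractionTransition(
  from: ExtractionStatus,
  to: ExtractionStatus,
  via: "patch" | "ceremony",
): string | null {
  if (via === "patch" && CEREMONY_ONLY.indexOf(to) !== -1) {
    return "SR_TERMINAL_STATE_PATCH_FORBIDDEN";
  }
  if (EXTRACTION_TRANSITIONS[from].indexOf(to) === -1) {
    return "SR_INVALID_TRANSITION";
  }
  return null;
}
```

Keeping the edges in one table lets the review, correct, and promote routes share a single check, while authority, HITL, and e-signature gates stay in their own layers.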

### 6.5 Business rules

- BR-29-01: Capture-session lifecycle is `draft → in_progress → under_review → closed → reopened` per DEC-29-02.
- BR-29-02: Capture-record provenance persists `source_link`, `capture_method`, `captured_at` (server-attributed), `captured_by`, `raw_evidence_hash` per DEC-29-03.
- BR-29-03: OCR source files MUST flow through URS-12 Document Control with `document_id` + `content_hash` per DEC-29-04.
- BR-29-04: Client-supplied `file_path` rejected with `SR_OCR_CLIENT_FILE_PATH_FORBIDDEN`.
- BR-29-05: OCR failure state emits `sr_ocr_job_failed` event + audit-trail entry per DEC-29-05.
- BR-29-06: Extraction-template effective release requires `sr_template_release_authority` + HITL + bound e-sign + URS-13 CR per DEC-29-06.
- BR-29-07: Effective extraction-template versions are immutable.
- BR-29-08: Email-ingestion rule MUST persist `secret_store_ref` (never inline credentials) per DEC-29-07.
- BR-29-09: Inline credentials in ingestion-rule payload rejected with `SR_CREDENTIALS_IN_PAYLOAD_FORBIDDEN`.
- BR-29-10: Ingestion-queue item persists `raw_email_document_id` + `attachment_document_ids` (URS-12) per DEC-29-08.
- BR-29-11: Ingestion-queue per-item creation, processing, and rejection emit explicit audit-trail entries per DEC-29-08 / DEC-29-11.
- BR-29-12: Extraction lifecycle is `pending → under_review → approved | rejected | needs_correction`, with `needs_correction → corrected → under_review` for re-review (see §6.4.4), per DEC-29-09.
- BR-29-13: Extraction review requires `sr_extraction_reviewer_authority` + HITL + bound e-sign per DEC-29-09.
- BR-29-14: Extraction reviewer MUST be SoD-distinct from creator per SoD-29-02.
- BR-29-15: Extraction provenance persists `extraction_method`, `extraction_template_id` (where applicable), `source_record_type`, `source_record_id`, `confidence`, `correction_history_json` per DEC-29-10.
- BR-29-16: Correction history is append-only; original extracted value preserved per DEC-29-10.
- BR-29-17: Direct PATCH cannot set extraction `status` to terminal values per DEC-29-09.
- BR-29-18: Target-record promotion requires `sr_extraction_promoter_authority` + HITL + bound e-sign per DEC-29-23.
- BR-29-19: Promotion respects URS-28 qualification gate where target module requires qualified personnel per DEC-29-23.
- BR-29-20: Promotion respects target-module ownership / immutability rules at promotion time per DEC-29-23.
- BR-29-21: AI assistance is advisory until human-accepted; promotion to system-of-record requires bound e-signature per DEC-29-13.
- BR-29-22: **AI cannot persist captured data into the system-of-record extraction record without human acceptance, AI cannot sign extraction approval, AI cannot promote extractions to target modules** per ARCH-AI-001 AC-5 / DEC-29-13.
- BR-29-23: Material updates after draft require structured reason-for-change per DEC-29-21.
- BR-29-24: Bound e-signature persistence on every regulated final action per DEC-29-10 / DEC-29-23.
- BR-29-25: Program reopen `locked → in_progress` requires `executive_authority` co-sign AND `qualified_person_authority` co-sign + reason per DEC-29-22.
- BR-29-26: `platform_admin` / `super_admin` are support / break-glass only paths per DEC-29-20.
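
The promotion rules (BR-29-18 to BR-29-20, together with the AI prohibition in BR-29-22) can be read as an ordered series of gates. The sketch below returns the first gate that fails. The `PromotionAttempt` shape and `firstPromotionBlocker` are hypothetical; the ordering of authority before HITL before e-signature follows SEC-29-03 and SEC-29-04, while the position of the qualification and ownership checks is an assumption.

```typescript
// Hypothetical summary of everything the promote route has resolved so far.
type PromotionAttempt = {
  extractionStatus: string;
  actorIsAiService: boolean;
  hasPromoterAuthority: boolean; // sr_extraction_promoter_authority resolved
  qualifiedForTarget: boolean; // outcome of the URS-28 qualification gate
  targetAcceptsPromotion: boolean; // outcome of target-module pre-validation
  hitlDecisionId: string | null;
  eSignatureId: string | null;
};

// Returns the first failing machine code from §6.7, or null when all gates pass.
function firstPromotionBlocker(a: PromotionAttempt): string | null {
  if (a.actorIsAiService) return "SR_AI_CANNOT_PROMOTE_TO_TARGET";
  if (a.extractionStatus !== "approved") return "SR_INVALID_TRANSITION";
  if (!a.hasPromoterAuthority) return "SR_AUTHORITY_REQUIRED";
  if (!a.qualifiedForTarget) return "SR_PROMOTION_QUALIFICATION_GATE_FAILED";
  if (!a.targetAcceptsPromotion) return "SR_PROMOTION_TARGET_OWNERSHIP_VIOLATION";
  if (!a.hitlDecisionId) return "SR_HITL_DECISION_REQUIRED";
  if (!a.eSignatureId) return "SR_E_SIGNATURE_REQUIRED";
  return null;
}
```

Running the cheap, non-interactive checks before the HITL and e-signature steps avoids asking a person to sign an action that would be refused anyway.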

### 6.6 Audit trail

Every Module 29 record mutation persists an audit-trail entry. Material updates after draft persist `reason_for_change` per DEC-29-21. OCR failure events are audited per DEC-29-05, ingestion-queue per-item events per DEC-29-08, and correction events per DEC-29-10. Regulated final actions persist a bound e-signature via the `electronic_signatures` substrate. The audit trail is append-only.

### 6.7 Error handling

| Code | HTTP | Meaning |
|---|---|---|
| `SR_VALIDATION_FAILED` | 400 | Schema validation failure |
| `SR_UNAUTHORIZED` | 401 | Authentication required |
| `SR_FORBIDDEN` | 403 | RBAC denied |
| `SR_NOT_FOUND` | 404 | Resource not found |
| `SR_DUPLICATE_KEY` | 409 | Uniqueness violation |
| `SR_INVALID_TRANSITION` | 422 | Lifecycle transition not permitted |
| `SR_TERMINAL_STATE_PATCH_FORBIDDEN` | 422 | Direct PATCH attempted on terminal status per DEC-29-09 |
| `SR_OCR_CLIENT_FILE_PATH_FORBIDDEN` | 422 | Client-supplied `file_path` per DEC-29-04 |
| `SR_OCR_DOCUMENT_REQUIRED` | 422 | OCR job missing controlled `document_id` per DEC-29-04 |
| `SR_CREDENTIALS_IN_PAYLOAD_FORBIDDEN` | 422 | Inline credentials in ingestion-rule payload per DEC-29-07 |
| `SR_SECRET_STORE_REF_REQUIRED` | 422 | Ingestion rule missing `secret_store_ref` per DEC-29-07 |
| `SR_SECRET_STORE_RESOLUTION_FAILED` | 502 | Secret store resolution failed at polling time |
| `SR_INGESTION_QUEUE_DOCUMENT_REQUIRED` | 422 | Queue item missing controlled raw evidence per DEC-29-08 |
| `SR_REVIEWER_CREATOR_SOD_VIOLATION` | 422 | Reviewer = creator per SoD-29-02 |
| `SR_AUTHORITY_REQUIRED` | 422 | Authority Profile missing |
| `SR_HITL_DECISION_REQUIRED` | 422 | HITL decision capture missing |
| `SR_E_SIGNATURE_REQUIRED` | 422 | Bound e-signature persistence missing |
| `SR_REASON_FOR_CHANGE_REQUIRED` | 422 | Material update / terminal transition without reason-for-change |
| `SR_AI_CANNOT_PERSIST` | 422 | AI service attempted to persist captured data without human acceptance per ARCH-AI-001 |
| `SR_AI_CANNOT_SIGN_APPROVAL` | 422 | AI service attempted to sign extraction approval per ARCH-AI-001 |
| `SR_AI_CANNOT_PROMOTE_TO_TARGET` | 422 | AI service attempted to promote extraction to target module per ARCH-AI-001 AC-5 |
| `SR_AI_ASSISTANCE_NOT_ACCEPTED` | 422 | Attempt to promote AI assistance to system-of-record without human acceptance per DEC-29-13 |
| `SR_PROMOTION_QUALIFICATION_GATE_FAILED` | 422 | Promoter not qualified for target module per URS-28 |
| `SR_PROMOTION_TARGET_OWNERSHIP_VIOLATION` | 422 | Promotion violates target-module ownership / immutability rules per DEC-29-23 |
| `SR_REOPEN_AUTHORITY_REQUIRED` | 422 | Reopen attempted without executive AND QP co-sign per DEC-29-22 |
| `SR_CONTEXT_FILTER_MISMATCH` | 422 | Query against context column not present in schema |
| `SR_INTERNAL` | 500 | Sanitized server error |
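
One way to keep the catalogue above and the platform error envelope (described in §11.7) in step is a typed lookup from machine code to HTTP status. The sketch below covers only an excerpt of the table, and the envelope key names (`message`, `code`, `details`, `correlationId`) are assumptions: §11.7 names the parts of the envelope but not its keys.

```typescript
// Excerpt of the catalogue above; a full version would list every row.
const SR_HTTP_STATUS = {
  SR_VALIDATION_FAILED: 400,
  SR_UNAUTHORIZED: 401,
  SR_FORBIDDEN: 403,
  SR_NOT_FOUND: 404,
  SR_DUPLICATE_KEY: 409,
  SR_INVALID_TRANSITION: 422,
  SR_SECRET_STORE_RESOLUTION_FAILED: 502,
  SR_INTERNAL: 500,
} as const;

type SrErrorCode = keyof typeof SR_HTTP_STATUS;

// Hypothetical key names for the standard platform envelope.
type SrErrorEnvelope = {
  message: string;
  code: SrErrorCode;
  details?: Record<string, unknown>;
  correlationId: string;
};

function toErrorResponse(
  code: SrErrorCode,
  message: string,
  correlationId: string,
  details?: Record<string, unknown>,
): { httpStatus: number; body: SrErrorEnvelope } {
  const body: SrErrorEnvelope = { message, code, correlationId };
  // Details are optional and omitted from the body when not supplied.
  if (details !== undefined) body.details = details;
  return { httpStatus: SR_HTTP_STATUS[code], body };
}
```

Because `SrErrorCode` is derived from the table, a route that tries to return a code missing from the catalogue fails at compile time.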

### 6.8 Configuration rules

- Stale ingestion queue threshold (default 30 days) configurable per tenant per DEC-29-15.
- Repeated extraction rejection threshold configurable per template per tenant.
- OCR engine selection (Tesseract default; extensible) configured at platform level.
- Secret-store provider (AWS Secrets Manager / Vault / Azure / GCP) configured per tenant per DEC-29-07.
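
As an illustration of the first rule above, a per-tenant override of the 30-day default could be resolved as follows. The config key `staleQueueThresholdDays`, the `QueueItemAge` shape, and the choices to count only `pending` items and to use a strict "older than" comparison are assumptions of this sketch.

```typescript
const DEFAULT_STALE_QUEUE_DAYS = 30; // default stated in the first rule above
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical per-tenant configuration key.
type TenantCaptureConfig = { staleQueueThresholdDays?: number };

type QueueItemAge = {
  status: "pending" | "processed" | "rejected" | "failed";
  received_at: Date;
};

function isQueueItemStale(
  item: QueueItemAge,
  now: Date,
  config: TenantCaptureConfig = {},
): boolean {
  // Assumption: only items still waiting for processing can go stale.
  if (item.status !== "pending") return false;
  const thresholdDays = config.staleQueueThresholdDays ?? DEFAULT_STALE_QUEUE_DAYS;
  return now.getTime() - item.received_at.getTime() > thresholdDays * MS_PER_DAY;
}
```

An item received on 1 January and checked on 5 February (35 days later) is stale under the default but not for a tenant that sets the threshold to 60 days.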

---

## 7. Non-functional Requirements

- NFR-29-01: List pagination (default 50, max 200).
- NFR-29-02: List p95 < 800ms (1M extractions, 100k OCR jobs per tenant).
- NFR-29-03: OCR processing p95 < 30s per page (Tesseract default).
- NFR-29-04: Email polling cycle p95 < 60s per rule.
- NFR-29-05: Promotion handoff p95 < 1.5s including URS-28 qualification gate.
- NFR-29-06: Audit-trail append p99 < 200ms.
- NFR-29-07: Concurrent capture users per tenant: 100.
- NFR-29-08: Storage scalability: 10M extractions per tenant; 1M OCR jobs per tenant.
- NFR-29-09: Backup / restore RPO ≤ 15 min; RTO ≤ 4 hours per URS-35.
- NFR-29-10: Bound e-signature persistence transaction p95 < 1.5s.
- NFR-29-11: Secret-store resolution p95 < 200ms.

---

## 8. Localization

English (en-US, en-GB), Hindi (hi-IN), Marathi (mr-IN), Japanese (ja-JP) at launch.

---

## 9. Migration

### 9.1 Migration scope

Greenfield at launch.

### 9.2 Schema migration

The migration baseline is aligned with the target migration columns:

- Remove client-supplied `file_path` from `sr_ocr_jobs` (replaced by `document_id` FK to URS-12, `document_version_id` FK, `content_hash`, `storage_path`) per DEC-29-04.
- Remove `imap_password` from `sr_email_ingestion_rules` (replaced by `secret_store_ref`) per DEC-29-07.
- Add `raw_email_document_id` + `attachment_document_ids` FK to URS-12 on `sr_ingestion_queue` per DEC-29-08.
- Add `correction_history_json` (append-only via DB trigger), `extraction_method`, `source_record_type`, `source_record_id`, `confidence`, `promoted_to_record_*` on `sr_data_extractions` per DEC-29-10 / DEC-29-23.
- Add `sr_ai_assistance` table per DEC-29-13.
- Add `sr_capture_program_locks` table per DEC-29-22.
- Add reviewer ≠ creator constraint on `sr_data_extractions` per SoD-29-02.
- Add `release_change_request_id` FK to URS-13 on `sr_extraction_templates` and `sr_email_ingestion_rules` per DEC-29-06 / DEC-29-19.

### 9.3 Migration evidence gate (URS-29-VAL-008)

- (a) all migrations applied;
- (b) RLS verified;
- (c) typed schema validation verified;
- (d) capture-session lifecycle verified;
- (e) OCR controlled source-file storage verified (client `file_path` rejection tested);
- (f) OCR failure-state audit verified;
- (g) extraction-template lifecycle + URS-13 CR linkage verified;
- (h) email-ingestion credential governance verified (inline credentials rejection tested);
- (i) secret-store resolution verified;
- (j) ingestion-queue immutable raw evidence + per-item audit verified;
- (k) extraction review/correction workflow + reviewer SoD verified;
- (l) extraction provenance + correction history append-only verified;
- (m) target-record promotion + URS-28 qualification gate verified;
- (n) AI assistance substrate + AI-only-promotion rejection verified;
- (o) cross-module event emission verified (URS-12, URS-13, URS-18, URS-21, URS-30, all promotion targets);
- (p) audit-trail coverage verified including OCR failures + queue per-item;
- (q) governed reopen verified;
- (r) §17 validation evidence pack signed.

---

## 10. Decommissioning

Module 29 records are subject to the platform record-retention policy and are retained per regulatory record-retention rules, including raw-evidence retention via URS-12 Document Control. On tenant decommissioning, records are exported per URS-35.

---

## 11. Decisions, Dependencies, Risks, and Error Handling

### 11.1 Closed decision posture

**No Module 29 internal decisions outstanding.** Launch decisions are captured in the locked decisions above.

### 11.2 External dependencies

- URS-12 Document Control must support raw-evidence storage with content hash per DEC-29-04 / DEC-29-08.
- URS-13 change-control register must support extraction-template + ingestion-rule release linkage per DEC-29-06 / DEC-29-19.
- URS-18 CAPA register must accept `screen_reader_capture` source type per DEC-29-16.
- URS-21 findings register must accept `screen_reader_capture` source type per DEC-29-15.
- URS-28 training must expose qualification-gate API for promotion authority verification per DEC-29-23.
- URS-32 MIRA AI must support read-only `useMiraRecord(...)` mappings; AI drafting is advisory only, with mandatory human acceptance.
- URS-35 infrastructure must support secret-store integration per DEC-29-07.
- All promotion-target modules (URS-14, URS-15, URS-16, URS-22, URS-23, URS-24, URS-25, URS-26, URS-27) must support promoted-record creation per DEC-29-23.

### 11.3 Risks

- Risk-29-01: Secret-store provider availability (AWS / Vault / Azure / GCP outages). Mitigation: NFR-29-11 latency budget; configurable timeout; fallback to read-only mode (no email polling) when the secret store is unreachable.
- Risk-29-02: OCR engine accuracy variance under low-quality scans. Mitigation: confidence threshold + reviewer correction workflow per DEC-29-10.
- Risk-29-03: AI-assistance acceptance rate may be high if reviewers rubber-stamp. Mitigation: acceptance-rate audit; periodic review.
- Risk-29-04: Promotion failure if target-module ownership rules violated. Mitigation: target-module pre-validation at promotion time per DEC-29-23.
- Risk-29-05: The dual co-sign requirement of the reopen workflow may delay urgent investigations. Mitigation: documented reopen SLA.

### 11.4 Out-of-scope risks tracked elsewhere

- Vendor-specific OCR engines (future-state).
- Direct integration with screen-capture tools (future-state).
- Email ingestion via non-IMAP protocols (future-state).

### 11.5 Risk owner

Module-29 risk register owned by Quality / Data Capture Squad with quarterly review by **Information Security Head (Primary Owner)** + QA Head + Validation Head + Qualified Person Authority.

### 11.6 Decision discipline

No Module 29 internal decisions outstanding.

### 11.7 Error Handling and Negative Paths

This section defines the controlled error envelope, the enumerated machine-code catalogue, and the negative-path response contract required for this module. The error envelope is the standard platform envelope (human message, machine code in upper-snake-case, optional structured details, correlation identifier). Errors are returned with the appropriate HTTP status; the UI surfaces inline errors at the field of cause where applicable, otherwise a controlled error toast or modal. Every error path is logged to the URS-06 audit substrate when the originating action is regulated; errors that occur before authentication are logged without `userId`. Audit-trail write failure on a state-changing action MUST cause the originating action to NOT commit (atomic write per URS-04 BR-04-15). The enumerated machine codes for this module's negative paths are defined alongside the corresponding lifecycle gates, segregation-of-duties controls, and authority-resolution outcomes throughout §6 (Back-end Requirements) and §13 (Segregation of Duties); engineering MUST surface every enumerated machine code through the standard envelope and MUST NOT swallow errors silently. Cross-module error propagation follows the §20 Cross-Module Event Contract.
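
The atomicity requirement above (a state-changing action must not commit when its audit-trail write fails) can be sketched with a unit-of-work wrapper. `UnitOfWork`, `AuditEntryDraft`, and `commitWithAudit` are hypothetical stand-ins for whatever the platform's data layer provides; the sketch is synchronous for brevity, whereas a real database transaction would be asynchronous.

```typescript
type AuditEntryDraft = { action: string; recordId: string; reasonForChange?: string };

// Hypothetical transaction interface standing in for the platform's DB layer.
interface UnitOfWork {
  applyChange(): void;
  writeAudit(entry: AuditEntryDraft): void; // throws when the audit write fails
  commit(): void;
  rollback(): void;
}

function commitWithAudit(uow: UnitOfWork, entry: AuditEntryDraft): void {
  try {
    uow.applyChange();
    uow.writeAudit(entry);
    // Commit is reached only when both the change and its audit entry succeeded.
    uow.commit();
  } catch (err) {
    // Undo the change and surface the failure; nothing is swallowed.
    uow.rollback();
    throw err;
  }
}
```

Rethrowing after the rollback lets the route map the failure onto the standard envelope instead of hiding it.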


---

## 12. Security

- SEC-29-01: Tenant isolation enforced at RLS on every Module 29 table.
- SEC-29-02: RBAC enforced on every route via `requirePermission(...)`.
- SEC-29-03: Authority resolution enforced on regulated final actions before HITL + e-signature.
- SEC-29-04: HITL decision capture enforced before bound e-signature persistence.
- SEC-29-05: Bound e-signature persistence via `electronic_signatures` substrate.
- SEC-29-06: PII redaction in logs (capture records may contain PII; redaction enforced).
- SEC-29-07: Audit-trail integrity via URS-06 hash chain.
- SEC-29-08: AI-request provenance via `ai_requests` linked to `sr_ai_assistance`; **AI cannot persist, validate, or promote captured data per ARCH-AI-001 AC-5**; AI may draft advisory only.
- SEC-29-09: `platform_admin` / `super_admin` break-glass actions logged per DEC-29-20.
- SEC-29-10: **OCR source files via URS-12 Document Control with content hash per DEC-29-04 — client-supplied file paths rejected (path-traversal vector eliminated)**.
- SEC-29-11: **IMAP credentials via tenant-controlled secret store per DEC-29-07 — env-file credentials rejected; secret rotation supported**.
- SEC-29-12: Raw email + attachments stored as URS-12 documents with content hash; tampering detected at retrieval.
- SEC-29-13: Reviewer ≠ creator DB-level constraint per SoD-29-02.
- SEC-29-14: Promotion respects target-module ownership rules at promotion time per DEC-29-23.

---

## 13. Segregation of Duties

| SoD ID | Constraint |
|---|---|
| SoD-29-01 | The capture-session closer MUST NOT be the only capture-record creator when tenant policy requires content-creator/closer separation. |
| SoD-29-02 | The extraction reviewer MUST NOT be the extraction creator (DB-level constraint per DEC-29-09). |
| SoD-29-03 | The extraction promoter MUST be qualified for the target module per URS-28 qualification gate (DEC-29-23). |
| SoD-29-04 | The extraction-template releaser MUST NOT be the template author when tenant policy requires content-author/releaser separation. |
| SoD-29-05 | The ingestion-rule releaser MUST be `information_security_authority` or higher. |
| SoD-29-06 | The reopen co-signers (executive AND Qualified Person per DEC-29-22) MUST NOT be the original lock signer. |
| SoD-29-07 | The `platform_admin` / `super_admin` support / break-glass action MUST NOT be a regulated production action; logged and reviewed per DEC-29-20. |
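
SoD-29-06 combined with the co-sign rule in BR-29-25 can be checked before a program reopen is accepted. In the sketch below, `ReopenRequest` and `checkReopen` are hypothetical names. Reusing `SR_REOPEN_AUTHORITY_REQUIRED` for the case where a co-signer is also the lock signer is an assumption, since the catalogue in §6.7 defines no dedicated code for it.

```typescript
// Hypothetical request shape; identifiers are user IDs.
type ReopenRequest = {
  lockedBy: string;
  executiveCoSigner: string | null;
  qpCoSigner: string | null;
  reason: string;
};

// Returns a machine code from §6.7, or null when the reopen may proceed.
function checkReopen(req: ReopenRequest): string | null {
  // Both co-signatures are mandatory (executive AND Qualified Person).
  if (!req.executiveCoSigner || !req.qpCoSigner) {
    return "SR_REOPEN_AUTHORITY_REQUIRED";
  }
  // SoD-29-06: neither co-signer may be the original lock signer.
  if (req.executiveCoSigner === req.lockedBy || req.qpCoSigner === req.lockedBy) {
    return "SR_REOPEN_AUTHORITY_REQUIRED";
  }
  if (req.reason.trim() === "") {
    return "SR_REASON_FOR_CHANGE_REQUIRED";
  }
  return null;
}
```

The table does not say whether the executive and Qualified Person co-signers must be two different people, so this sketch does not enforce that.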

---

## 14. Regulatory Mapping

| Predicate rule | Section | Module 29 binding |
|---|---|---|
| **FDA 21 CFR Part 11 §11.10(a)** | Validation | URS-29-VAL-008 |
| **FDA 21 CFR Part 11 §11.10(b)** | Record copying | OCR-extracted text + raw evidence retention via URS-12 |
| **FDA 21 CFR Part 11 §11.10(c)** | Record protection | URS-12 controlled storage + content hash |
| **FDA 21 CFR Part 11 §11.10(d)** | Authority checks | Authority/HITL/e-sign substrate |
| **FDA 21 CFR Part 11 §11.10(e)** | Audit trails | Audit-trail substrate including OCR failure + queue per-item |
| **FDA 21 CFR Part 11 §11.10(g)** | Operational system checks | Controlled extraction review workflow per DEC-29-09 |
| **FDA 21 CFR Part 11 §11.10(h)** | Input device checks | Controlled source-file storage per DEC-29-04 |
| **FDA 21 CFR Part 11 §11.30** | Open systems controls | TLS for IMAP polling + secret-store credential governance per DEC-29-07 |
| **FDA 21 CFR Part 11 §11.50** | Signature manifestations | Bound e-signature manifestations |
| **FDA 21 CFR Part 11 §11.70** | Signature/record linking | Bound e-signature linked via `electronic_signatures` substrate |
| **EU GMP Annex 11 §4** | Validation | URS-29-VAL-008 |
| **EU GMP Annex 11 §5** | Data — including data transfer integrity | Captured data integrity from source through extraction to promotion |
| **EU GMP Annex 11 §7** | Data Storage | URS-12 Document Control storage |
| **EU GMP Annex 11 §10** | Periodic Evaluation | Periodic review pack §17 |
| **EU GMP Annex 11 §12** | Security including credential governance | Secret-store credential governance per DEC-29-07 |
| **EU GMP Annex 11 §14** | Electronic Records / Signatures | Bound e-signature on every regulated final action |
| EU GMP Annex 22 Draft 2025 | §7 — HITL / GenAI advisory only | Internal forward-looking control |
| EU AI Act (Regulation 2024/1689) | Annex III; Art. 13 transparency | Internal forward-looking control |
| **MHRA Data Integrity Guidance** | ALCOA+ — primary applicability (data capture is the entry point for ALCOA+) | Module 29 system of record for capture provenance + raw evidence retention |
| GAMP 5 Cat 5 | Custom-application validation lifecycle | URS-29 validation evidence pack per URS-29-VAL-008 |
| **FDA Computer Software Assurance (CSA) — September 2025 Final Guidance** | Replaces CSV; data-capture systems are high-process-risk class | URS-29 risk-based validation aligned with CSA |
| **ISO/IEC 27001** | Information security — credential governance | Secret-store credential governance per DEC-29-07 |
| **India CDSCO Schedule M (Revised) §16** | Records and Reports | Captured data provenance subject to a future jurisdiction-specific legal assessment |

---

## 15. Code Modules

| Code module | Path | Status |
|---|---|---|
| `screen-reader` plugin | `packages/backend/src/modules/screen-reader/plugin.ts` | (canonical mount) |
| `screen-reader` routes | `packages/backend/src/modules/screen-reader/routes.ts` | (typed schemas; route additions per §6.3; client `file_path` and inline credentials rejected) |
| `screen-reader` service | `packages/backend/src/modules/screen-reader/service.ts` | (controlled source-file storage; secret-store credential resolution; non-bypassable extraction review; correction history; target-record promotion; audit + reason-for-change) |
| `screen-reader` schemas | `packages/backend/src/modules/screen-reader/schemas.ts` | (client `file_path` and inline credentials removed) |
| `screen-reader` events | `packages/backend/src/modules/screen-reader/events.ts` | target module |
| `screen-reader` secret-store-resolver | `packages/backend/src/modules/screen-reader/secret-store-resolver.ts` | target module per DEC-29-07 |
| `screen-reader` extraction-review-engine | `packages/backend/src/modules/screen-reader/extraction-review-engine.ts` | target module per DEC-29-09 |
| `screen-reader` target-promotion-handler | `packages/backend/src/modules/screen-reader/target-promotion-handler.ts` | target module per DEC-29-23 |
| Migration | `packages/backend/src/db/migrations/.` | (per §9.2) |
| Shared types | `packages/shared/src/types/screen-reader.ts` | |
| Shared schemas | `packages/shared/src/schemas/screen-reader.schema.ts` | |
| Frontend hooks | `packages/frontend/src/api/hooks/useScreenReader.ts` | |
| Frontend dashboard | `packages/frontend/src/pages/ScreenReaderDashboard.tsx` | (CTAs resolve to real routes) |
| Frontend OCR jobs | `packages/frontend/src/pages/OcrJobsPage.tsx` | (URS-12 upload only) |
| Frontend extraction detail | `packages/frontend/src/pages/ExtractionDetail.tsx` | |
| Frontend capture session console | `packages/frontend/src/pages/ScreenReaderCaptureSessions.tsx` | target route per DEC-29-12 |
| Frontend extraction template editor | `packages/frontend/src/pages/ScreenReaderExtractionTemplates.tsx` | target route per DEC-29-06 |
| Frontend ingestion rule console | `packages/frontend/src/pages/ScreenReaderIngestionRules.tsx` | target route per DEC-29-07 |
| Frontend ingestion queue | `packages/frontend/src/pages/ScreenReaderIngestionQueue.tsx` | target route per DEC-29-08 |
| Frontend extraction list | `packages/frontend/src/pages/ScreenReaderExtractions.tsx` | target route per DEC-29-12 |
| Frontend AI assistance | `packages/frontend/src/pages/ScreenReaderAIAssistance.tsx` | target route per DEC-29-13 |
| Frontend promotion console | `packages/frontend/src/pages/ScreenReaderPromotion.tsx` | target route per DEC-29-23 |
| App routing | `packages/frontend/src/App.tsx` | (per DEC-29-12) |

---

## 16. Test Cases

### 16.1 Unit tests

- TC-29-U-001: Capture session uniqueness rejects duplicate `session_code` per tenant.
- TC-29-U-002: Capture session close without authority rejects.
- TC-29-U-003: Capture record without provenance rejects.
- TC-29-U-004: OCR job with client-supplied `file_path` rejects with `SR_OCR_CLIENT_FILE_PATH_FORBIDDEN`.
- TC-29-U-005: OCR job without `document_id` rejects with `SR_OCR_DOCUMENT_REQUIRED`.
- TC-29-U-006: OCR failure emits `sr_ocr_job_failed` event + audit per DEC-29-05.
- TC-29-U-007: Extraction template effective release without URS-13 CR rejects.
- TC-29-U-008: Effective extraction template edit rejects.
- TC-29-U-009: Ingestion rule with inline `password` rejects with `SR_CREDENTIALS_IN_PAYLOAD_FORBIDDEN`.
- TC-29-U-010: Ingestion rule without `secret_store_ref` rejects with `SR_SECRET_STORE_REF_REQUIRED`.
- TC-29-U-011: Secret-store resolution failure handled gracefully.
- TC-29-U-012: Ingestion queue item without `raw_email_document_id` rejects.
- TC-29-U-013: Ingestion queue per-item events emit audit per DEC-29-08.
- TC-29-U-014: Direct PATCH on extraction terminal status rejects with `SR_TERMINAL_STATE_PATCH_FORBIDDEN`.
- TC-29-U-015: Extraction review by creator rejects with `SR_REVIEWER_CREATOR_SOD_VIOLATION`.
- TC-29-U-016: Extraction review without `sr_extraction_reviewer_authority` rejects.
- TC-29-U-017: Correction appended to `correction_history_json` is append-only.
- TC-29-U-018: Original extracted value preserved across corrections.
- TC-29-U-019: Promotion without `sr_extraction_promoter_authority` rejects.
- TC-29-U-020: Promotion without URS-28 qualification rejects with `SR_PROMOTION_QUALIFICATION_GATE_FAILED`.
- TC-29-U-021: AI service attempting persist / sign / promote rejects with appropriate error code.
- TC-29-U-022: AI assistance promotion without acceptance rejects with `SR_AI_ASSISTANCE_NOT_ACCEPTED`.
- TC-29-U-023: Reopen without executive AND QP co-sign rejects.
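
The payload-rejection cases above (TC-29-U-004, TC-29-U-005, TC-29-U-009, TC-29-U-010) can be illustrated with a minimal sketch. Only the four error codes come from this URS; the function names, the list of credential field names, and the choice to return a code rather than throw are assumptions made for illustration and do not describe the actual `schemas.ts` implementation.

```typescript
// Sketch only: function names and the credential key list are hypothetical.
// The error codes are the ones named in TC-29-U-004, -005, -009 and -010.
type Payload = Record<string, unknown>;

// Assumed set of field names treated as inline credentials.
const INLINE_CREDENTIAL_KEYS = ["password", "imap_password", "access_token"];

function isNonEmptyString(value: unknown): boolean {
  return typeof value === "string" && value.length > 0;
}

// Returns an error code, or null when the OCR job payload is acceptable.
// The forbidden-field check runs first, so a client `file_path` is rejected
// even when a valid `document_id` accompanies it.
function validateOcrJobPayload(payload: Payload): string | null {
  if ("file_path" in payload) return "SR_OCR_CLIENT_FILE_PATH_FORBIDDEN";
  if (!isNonEmptyString(payload["document_id"])) return "SR_OCR_DOCUMENT_REQUIRED";
  return null;
}

// Returns an error code, or null when the ingestion rule payload is acceptable.
function validateIngestionRulePayload(payload: Payload): string | null {
  const hasInlineCredential = INLINE_CREDENTIAL_KEYS.some((key) => key in payload);
  if (hasInlineCredential) return "SR_CREDENTIALS_IN_PAYLOAD_FORBIDDEN";
  if (!isNonEmptyString(payload["secret_store_ref"])) return "SR_SECRET_STORE_REF_REQUIRED";
  return null;
}

console.log(validateOcrJobPayload({ file_path: "../etc/passwd", document_id: "doc-1" }));
// SR_OCR_CLIENT_FILE_PATH_FORBIDDEN
console.log(validateIngestionRulePayload({ host: "imap.example.test" }));
// SR_SECRET_STORE_REF_REQUIRED
```

Returning a code instead of throwing keeps each unit test a single equality assertion; a route handler could map a non-null code to its rejection response.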

### 16.2 Integration tests

- TC-29-I-001: PDF upload → OCR → extraction → URS-14 complaint promotion per Worked Example 1.
- TC-29-I-002: Email ingestion → extraction → URS-15 OOS promotion per Worked Example 2.
- TC-29-I-003: OCR failure with audit evidence per Worked Example 3.
- TC-29-I-004: Stale ingestion queue → URS-21 finding per Worked Example 4.
- TC-29-I-005: AI-only promotion attempt rejected per Worked Example 5.
- TC-29-I-006: Governed reopen of locked program per Worked Example 6.
- TC-29-I-007: Promotion to URS-23 batch entry with URS-23 ownership validation enforced.
- TC-29-I-008: Promotion to URS-24 stability result.
- TC-29-I-009: Promotion to URS-25 EM result.
- TC-29-I-010: Promotion to URS-22 inspection observation.
- TC-29-I-011: Promotion to URS-26 APQR data row.
- TC-29-I-012: Cross-module event emission (URS-12, URS-13, URS-18, URS-21, URS-30, all promotion targets).
- TC-29-I-013: Secret-store integration with AWS Secrets Manager mock.
- TC-29-I-014: Cross-tenant `platform_admin` break-glass logged.
- TC-29-I-015: MIRA copilot read-only context; advisory drafting only.

### 16.3 End-to-end tests

- TC-29-E-001: Complaint capture per Worked Example 1.
- TC-29-E-002: OOS email ingestion per Worked Example 2.
- TC-29-E-003: OCR failure scenario per Worked Example 3.
- TC-29-E-004: Stale queue scenario per Worked Example 4.
- TC-29-E-005: AI-only promotion rejection per Worked Example 5.
- TC-29-E-006: Reopen scenario per Worked Example 6.
- TC-29-E-007: 100 concurrent capture users (NFR-29-07).
- TC-29-E-008: India CDSCO Schedule M §16 capture provenance scenario.

### 16.4 Performance tests

- TC-29-P-001: List p95 latency (NFR-29-02).
- TC-29-P-002: OCR processing p95 latency (NFR-29-03).
- TC-29-P-003: Email polling p95 latency (NFR-29-04).
- TC-29-P-004: Promotion handoff p95 latency including URS-28 qualification gate (NFR-29-05).
- TC-29-P-005: Bound e-signature p95 latency (NFR-29-10).
- TC-29-P-006: Secret-store resolution p95 latency (NFR-29-11).
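
Each performance case above compares a p95 latency against an NFR threshold. This section does not define how p95 is computed, so the sketch below assumes the nearest-rank method. The function names are hypothetical, and the threshold passed in is a placeholder because the NFR values are defined outside this section.

```typescript
// Sketch only: the nearest-rank percentile is an assumed method, not a URS requirement.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  // Sort a copy so the caller's array is left untouched.
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  // 1-based nearest rank: the smallest rank covering at least p percent of samples.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// True when the measured p95 is at or below the threshold (placeholder value).
function meetsP95(samplesMs: number[], thresholdMs: number): boolean {
  return percentile(samplesMs, 95) <= thresholdMs;
}

// Usage with synthetic latencies 1..100 ms.
const syntheticLatencies: number[] = [];
for (let ms = 1; ms <= 100; ms++) syntheticLatencies.push(ms);
console.log(percentile(syntheticLatencies, 95)); // 95
```

Nearest-rank always returns an observed sample rather than an interpolated value, which keeps the reported figure traceable to a specific measurement in the evidence pack.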

### 16.5 Security tests

- TC-29-S-001: Cross-tenant access rejected by RLS.
- TC-29-S-002: Missing RBAC rejected.
- TC-29-S-003: Missing Authority Profile rejected.
- TC-29-S-004: Missing HITL rejected.
- TC-29-S-005: Missing bound e-signature rejected.
- TC-29-S-006: SQL injection rejected.
- TC-29-S-007: Audit-trail UPDATE / DELETE rejected.
- TC-29-S-008: AI service attempting persist / sign / promote rejected.
- TC-29-S-009: PII redaction in logs verified.
- TC-29-S-010: Path-traversal attempt via client `file_path` rejected.
- TC-29-S-011: Inline IMAP credentials in payload rejected.
- TC-29-S-012: Reviewer ≠ creator DB constraint enforced.
- TC-29-S-013: Raw email content hash tampering detected.
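
TC-29-S-013 depends on comparing a stored content hash against the raw evidence as currently held. A minimal sketch follows, assuming SHA-256 and a hex-encoded stored hash; this section does not name the hash algorithm, and the function names are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Sketch only: SHA-256 and hex encoding are assumptions, not URS requirements.
function sha256Hex(content: string | Uint8Array): string {
  return createHash("sha256").update(content).digest("hex");
}

// Returns true when the retained raw content still matches the hash
// recorded at ingestion time; false indicates tampering or corruption.
function isRawEvidenceIntact(content: string | Uint8Array, storedHashHex: string): boolean {
  return sha256Hex(content) === storedHashHex.toLowerCase();
}

// Usage: record the hash at ingestion, verify it later.
const rawEmail = "From: lab@example.test\r\nSubject: OOS result\r\n\r\nbody";
const recordedHash = sha256Hex(rawEmail);
console.log(isRawEvidenceIntact(rawEmail, recordedHash)); // true
console.log(isRawEvidenceIntact(rawEmail + " ", recordedHash)); // false
```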

---

## 17. Validation Evidence

### 17.1 URS-29-VAL-001: Requirements traceability matrix

Complete RTM mapping every URS-29 requirement (DEC-29-01 to DEC-29-23, BR-29-01 to BR-29-26, NFR-29-01 to NFR-29-11, SoD-29-01 to SoD-29-07, SEC-29-01 to SEC-29-14) to test cases (TC-29-U-001 to TC-29-U-023, TC-29-I-001 to TC-29-I-015, TC-29-E-001 to TC-29-E-008, TC-29-P-001 to TC-29-P-006, TC-29-S-001 to TC-29-S-013) and code modules (§15).
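
A completeness check over the RTM can be sketched as follows. The ID ranges are the ones listed in the paragraph above; the RTM shape (requirement ID mapped to a list of test case IDs) and all function names are assumptions for illustration. The two sample rows use links already stated in §16 (TC-29-U-006 cites DEC-29-05; TC-29-E-007 cites NFR-29-07).

```typescript
// Sketch only: the RTM data shape and helper names are hypothetical.
// Expands a numbered ID range, e.g. ("TC-29-U-", 23, 3) yields TC-29-U-001 to TC-29-U-023.
function expandIds(prefix: string, last: number, width: number): string[] {
  const ids: string[] = [];
  for (let n = 1; n <= last; n++) {
    let digits = String(n);
    while (digits.length < width) digits = "0" + digits;
    ids.push(prefix + digits);
  }
  return ids;
}

const requirementIds: string[] = ([] as string[]).concat(
  expandIds("DEC-29-", 23, 2),
  expandIds("BR-29-", 26, 2),
  expandIds("NFR-29-", 11, 2),
  expandIds("SoD-29-", 7, 2),
  expandIds("SEC-29-", 14, 2)
);

const testCaseIds: string[] = ([] as string[]).concat(
  expandIds("TC-29-U-", 23, 3),
  expandIds("TC-29-I-", 15, 3),
  expandIds("TC-29-E-", 8, 3),
  expandIds("TC-29-P-", 6, 3),
  expandIds("TC-29-S-", 13, 3)
);

// RTM rows: requirement ID mapped to the test case IDs that verify it.
type Rtm = Record<string, string[]>;

interface RtmGaps {
  untracedRequirements: string[]; // requirements with no test case
  unknownTestCases: string[];     // referenced IDs that are not in §16
}

function findRtmGaps(rtm: Rtm): RtmGaps {
  const untracedRequirements = requirementIds.filter(
    (id) => !(id in rtm) || rtm[id].length === 0
  );
  const unknownTestCases: string[] = [];
  for (const id of Object.keys(rtm)) {
    for (const tc of rtm[id]) {
      if (testCaseIds.indexOf(tc) === -1) unknownTestCases.push(tc);
    }
  }
  return { untracedRequirements, unknownTestCases };
}

// Usage with a deliberately incomplete RTM.
const partialRtm: Rtm = { "DEC-29-05": ["TC-29-U-006"], "NFR-29-07": ["TC-29-E-007"] };
console.log(findRtmGaps(partialRtm).untracedRequirements.length); // 79
```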

### 17.2 URS-29-VAL-002: Design qualification (DQ)

Architecture, data model, API contract, workflow, business rules, audit trail, security, integration; signed by Validation Head, QA Head, RA Head, Manufacturing Head, **Information Security Head (Primary Owner)**, Qualified Person Authority.

### 17.3 URS-29-VAL-003: Installation qualification (IQ)

Migration application + RLS verification + route mount verification + frontend hook resolution + secret-store integration verification.

### 17.4 URS-29-VAL-004: Operational qualification (OQ)

Execution of every test case in §16, including the negative (rejection) cases, with evidence captures.

### 17.5 URS-29-VAL-005: Performance qualification (PQ)

NFR-29-01 to NFR-29-11 verification including secret-store latency.

### 17.6 URS-29-VAL-006: AI/ML governance evidence

Per ARCH-AI-001 AC-5 (canonical binding): (a) MIRA read-only context integration; (b) AI advisory drafting only with mandatory human acceptance via `sr_ai_assistance`; (c) **AI cannot persist, validate, or promote captured data into a GxP record** verification (canonical prohibition); (d) Annex 22 §7 + EU AI Act Annex III internal forward-looking control compliance evidence.

### 17.7 URS-29-VAL-007: Regulatory mapping evidence

FDA 21 CFR Part 11 §§11.10(a)–(h), §11.30, §11.50, §11.70; **EU GMP Annex 11 §§4, 5, 7, 10, 12, 14**; Annex 22 Draft 2025 §7; EU AI Act Art. 13 / Annex III; **MHRA Data Integrity Guidance (primary applicability)**; GAMP 5 Cat 5; **FDA CSA September 2025 Final Guidance**; ISO/IEC 27001; India CDSCO Schedule M §16.

### 17.8 URS-29-VAL-008: Migration evidence gate

Per §9.3.

### 17.9 URS-29-VAL-009: Signature manifest

QA Head, RA Head, Validation Head, Manufacturing Head, **Information Security Head (Primary Owner)**, Qualified Person Authority, Site Quality Lead, Founder / Chairman & MD per §19.

### 17.10 URS-29-VAL-010: Post-launch periodic-review pack

(a) Capture metrics (sessions, OCR success/failure, ingestion volume, extraction approval/rejection rates); (b) AI-assistance acceptance rate; (c) audit-trail integrity; (d) reopen-event audit; (e) cross-tenant break-glass audit; (f) cross-module event integrity; (g) secret-store reliability; (h) promotion success / qualification-gate failure metrics; (i) reviewer SoD compliance. The pack is reviewed quarterly by the Information Security Head, QA Head, Validation Head, and Qualified Person Authority.

---

## 18. Document Change History

| Version | Date | Author | Change Summary |
|---|---|---|---|
| 1.0 | 2026-05-07 | Founder Doctrine — Verixa URS Cell | First issued user requirements specification for Module 29. |

---

## 19. Document Approval

| Role | Name | Signature | Date |
|---|---|---|---|
| Founder / Chairman & MD | Vimal | __________ | __________ |
| QA Head | __________ | __________ | __________ |
| RA Head | __________ | __________ | __________ |
| Validation Head | __________ | __________ | __________ |
| Manufacturing Head | __________ | __________ | __________ |
| Information Security Head (Primary Owner) | __________ | __________ | __________ |
| Qualified Person Authority | __________ | __________ | __________ |
| Site Quality Lead | __________ | __________ | __________ |

---

## 20. Cross-Module Event Contract

| Event | Emitter | Consumer | Payload key fields |
|---|---|---|---|
| `sr_capture_session_created` | Module 29 | URS-30 | `session_id`, `tenant_id` |
| `sr_capture_session_closed` | Module 29 | URS-30 | `session_id`, `closed_by`, `closure_e_signature_id` |
| `sr_capture_record_added` | Module 29 | URS-30 | `record_id`, `session_id`, `capture_method` |
| `sr_ocr_job_created` | Module 29 | URS-30 | `ocr_job_id`, `document_id`, `engine_id` |
| `sr_ocr_job_processed` | Module 29 | URS-30 | `ocr_job_id`, `confidence`, `processing_time_ms` |
| `sr_ocr_job_failed` | Module 29 | URS-30 | `ocr_job_id`, `error_message`, `error_class` |
| `sr_extraction_template_released` | Module 29 | URS-30, URS-13 | `template_id`, `change_request_id`, `released_by`, `e_signature_id` |
| `sr_ingestion_rule_polled` | Module 29 | URS-30 | `rule_id`, `items_ingested` |
| `sr_ingestion_queue_item_created` | Module 29 | URS-30 | `item_id`, `raw_email_document_id` |
| `sr_ingestion_queue_item_processed` | Module 29 | URS-30 | `item_id`, `processed_by` |
| `sr_extraction_created` | Module 29 | URS-30 | `extraction_id`, `extraction_method`, `source_record_id` |
| `sr_extraction_approved` | Module 29 | URS-30 | `extraction_id`, `reviewed_by`, `review_e_signature_id` |
| `sr_extraction_rejected` | Module 29 | URS-30 | `extraction_id`, `rejection_reason` |
| `sr_extraction_correction_recorded` | Module 29 | URS-30 | `extraction_id`, `correction_index`, `corrected_by`, `e_signature_id` |
| `sr_extraction_promoted_to_target` | Module 29 | **URS-14/-15/-16/-22/-23/-24/-25/-26/-27 (target consumers)**, URS-30 | `extraction_id`, `target_module`, `promoted_to_record_id`, `promoted_by`, `promotion_e_signature_id` |
| `sr_finding_created` | Module 29 | **URS-21 (Findings — primary consumer)**, URS-30 | `finding_id` (URS-21), `severity`, `finding_type` |
| `sr_capture_failure_capa_linked` | Module 29 | **URS-18 (CAPA — primary consumer)**, URS-30 | `capa_id`, `linked_by`, `source_type = screen_reader_capture` |
| `sr_ai_assistance_proposed` | Module 29 | URS-30 | `assistance_id`, `model_id`, `confidence` |
| `sr_ai_assistance_accepted` | Module 29 | URS-30 | `assistance_id`, `accepted_by`, `acceptance_e_signature_id` |
| `sr_capture_program_locked` | Module 29 | URS-30 | `program_lock_id`, `locked_by`, `lock_e_signature_id` |
| `sr_capture_program_reopened` | Module 29 | URS-30, URS-21 (governed-reopen audit) | `program_lock_id`, `reopened_by`, `executive_co_signer`, `qp_co_signer`, `reopen_reason` |
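
The contract table above can be mirrored as typed payloads so that emitters and consumers share one definition. The sketch below covers three of the events; the payload key names are copied from the table, while the `string` value types, the interface and helper names, and the runtime key check are assumptions for illustration.

```typescript
// Sketch only: key names come from the contract table; value types are assumed.
interface SrEventPayloads {
  sr_ocr_job_failed: {
    ocr_job_id: string;
    error_message: string;
    error_class: string;
  };
  sr_extraction_approved: {
    extraction_id: string;
    reviewed_by: string;
    review_e_signature_id: string;
  };
  sr_extraction_promoted_to_target: {
    extraction_id: string;
    target_module: string;
    promoted_to_record_id: string;
    promoted_by: string;
    promotion_e_signature_id: string;
  };
}

type SrEventName = keyof SrEventPayloads;

// Compile-time check: the payload must match the event name.
function buildEvent<N extends SrEventName>(eventName: N, payload: SrEventPayloads[N]) {
  return { name: eventName, payload: payload };
}

// Runtime mirror of the interface, for messages arriving untyped off the bus.
const REQUIRED_KEYS: { [N in SrEventName]: Array<keyof SrEventPayloads[N] & string> } = {
  sr_ocr_job_failed: ["ocr_job_id", "error_message", "error_class"],
  sr_extraction_approved: ["extraction_id", "reviewed_by", "review_e_signature_id"],
  sr_extraction_promoted_to_target: [
    "extraction_id",
    "target_module",
    "promoted_to_record_id",
    "promoted_by",
    "promotion_e_signature_id",
  ],
};

// Lists the contract keys absent from a received payload.
function missingPayloadKeys(eventName: SrEventName, payload: Record<string, unknown>): string[] {
  const required: string[] = REQUIRED_KEYS[eventName];
  return required.filter((key) => !(key in payload));
}

console.log(missingPayloadKeys("sr_ocr_job_failed", { ocr_job_id: "job-1" }));
// [ 'error_message', 'error_class' ]
```

Keeping the runtime key list typed against the interface means a key renamed in one place but not the other fails at compile time rather than during event-integrity review.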

---

## 21. References

- ARCH-AI-001 — AI Optionality and Manual Continuity (binding architecture; Module specification explicitly invokes AC-5)
- VRX-SPEC-URS-029-Screen-Reader-Data-Capture-and-Extraction-Governance.md (Module specification)
- URS-01 to URS-28, URS-30 to URS-35 (cross-module contracts)
- **FDA 21 CFR Part 11** §§11.10(a)–(h), §11.30, §11.50, §11.70
- **EU GMP Annex 11** §§4, 5, 7, 10, 12, 14
- EU GMP Annex 22 (Draft 2025) §7 — internal forward-looking control
- EU AI Act (Regulation 2024/1689) Art. 13 / Annex III — internal forward-looking control
- **MHRA Data Integrity Guidance (2018)** — ALCOA+ — primary applicability
- GAMP 5 Cat 5
- **FDA Computer Software Assurance (CSA) — September 2025 Final Guidance**
- **ISO/IEC 27001** — information security
- India CDSCO Schedule M (Revised) §16

---

**END OF VRX-URS-29 — SCREEN READER / DATA CAPTURE AND EXTRACTION GOVERNANCE — VERSION 1.0**
