{
  "manifest_id": "Verixa_Synthetic_Data_Manifest_v1.0",
  "generated_utc": "2026-06-08T00:00:00+00:00",
  "owner": "CEO, Verixa, A Vertical of Aeonn Health",
  "doctrine_reference": "Verixa Synthetic Data Plan v1.0 (VRX-SDP-001); ARCH-AI-001 AC-2 (advisory-only).",
  "non_negotiable_rules": [
    "Synthetic data is ALWAYS advisory. Never the system of record for any regulatory claim.",
    "Every generator is deterministic: same seed -> SHA-256-identical output (Gate G6).",
    "Once a row is assigned to a partition, it stays there forever. Never re-partition.",
    "A dataset that fails any applicable gate cannot be locked.",
    "Cite Gold Standard for external claims; synthetic for internal engineering verification only."
  ],
  "datasets": [
    {
      "dataset_id": "wf_apqr_001_review",
      "workflow_id": "WF-APQR-001",
      "workflow_name": "Annual Product Quality Review (APQR)",
      "urs_refs": [
        "URS-26",
        "URS-24",
        "URS-12"
      ],
      "generator_id": "wf_apqr_001_review",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-apqr-001-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "e613a0a5167c10ff1103c2e186233c41061d5d4aed45ec6ecf78c69cb571825d",
      "output_file": "synthetic_data/wf_apqr_001_review_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 2000,
      "content_sha256": "a3c7c631a1c06e386eff4dfd2903bc3ed44ca3a34b4cfee61bc5830a3edf5af8",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "e23b9c30fd82db322748b9a6c3c47b9d0f95f75304788ffcd63fac095ced3153",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_apqr_001_review_v1.0.review_sample.xlsx",
      "review_sample_row_count": 830,
      "review_sample_sha256": "e618374f20823f5ea8f1033df72397ff1488ef081e078f6d5a7cf39bd834429e",
      "partition_map_file": "synthetic_data/wf_apqr_001_review_v1.0.partition_map.parquet",
      "partition_map_hash": "9d2237cfbfa0b44ffe47a33bf9c5b44af88cb81326e121b50bebd564e39ab495",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 1400,
        "evaluation": 400,
        "edge_case": 200
      },
      "edge_case_coverage_pct": 0.165,
      "edge_case_by_category": {
        "negative": 140,
        "boundary": 90,
        "historical-failure": 50,
        "adversarial": 50
      },
      "bias_test_results": {
        "report_year": {
          "column": "report_year",
          "top_value": "2023",
          "top_share": 0.1735,
          "ceiling": 0.4,
          "passed": true
        },
        "status": {
          "column": "status",
          "top_value": "approved",
          "top_share": 0.381,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": "Engineering verification of the APQR capstone (URS-26): the CROSS-DATASET ROLLUP RECONCILIATION control - every stored aggregate (n_deviations/n_oos/n_complaints/n_recalls/n_capas/n_change_controls/n_batches/n_stability_studies per (product, year); n_em_excursions per (site, year)) is COMPUTED AT GENERATION TIME from the locked child datasets and is INDEPENDENTLY RECOMPUTABLE from them (the audit re-derives and asserts equality); plus the AI-insights advisory boundary (Cpk/Ppk + counts are deterministic system-of-record, AI narrative advisory-only, a recommendation triggers a CAPA only after human acceptance); and multi-step SoD approval (prepared_by != reviewed_by != approved_by, e-signature, no draft->approved bypass, version increment). Grounded in the apqr module (mig 033). The reconciliation logic is not in the service code (computed at gen time / by a job) - see assumption_register. NO real LLM.",
      "canonicalization_note": null,
      "model_grounding": {
        "source": "apqr module mig 033 (apqr_reports status/prepared_by/reviewed_by/approved_by/version, apqr_data_collections rollups, apqr_statistical_analyses cpk/ppk, apqr_recommendations source, apqr_approvals multi-step); service.ts:655 SoD (preparer!=approver), e-sig server-extracted, no draft->approved bypass; PHASE_1D_BUILD_PLAN.md:95 reconciliation requirement",
        "load_bearing": "cross-dataset rollup reconciliation (8 product-keyed + EM site-keyed, each on its OWN date field, recomputed from the locked children - non-tautology via the aggregate_count_mismatch edge) + AI-insights advisory boundary + multi-step SoD approval",
        "chain_link": "product_id -> MD-002; manufacturing_site_id -> MD-001; prepared_by/reviewed_by/approved_by -> user; child counts <- QE-001/002/004/007, CRC-001/004, MFG-004, LAB-003, LAB-005; ai_request_id -> ai_request namespace (APQR range); ai_model_config_id -> MIRA-003; hitl_decision_id -> hitl_decision namespace."
      },
      "assumption_register": [
        "Reconciliation/aggregation logic is NOT in the apqr service (confirmed); the synthetic APQR computes rollups at generation time from the locked child datasets and stores them - the audit independently re-derives.",
        "n_em_excursions is SITE-scoped (per (site, year)), not product-scoped, because LAB-005 EM has no product_id (EM monitors cleanrooms). APQR grain carries manufacturing_site_id; the metric is labeled site-scoped.",
        "Cpk/Ppk + batch_yield are deterministic engineered values (MFG-prior); the real module computes them from batch CQA data (no aggregation code found).",
        "product_id is the canonical MD-002 uuid (the real schema's VARCHAR product_id is a quirk; canonical uuid used so reconciliation joins resolve)."
      ],
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 2000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-APQR-001: rollups are DERIVED aggregates (not anchored to a real distribution); Cpk/Ppk reuse the MFG engineered prior. G2 N/A - the load-bearing property is reconciliation, not distribution fit. Corpus stays 5 real-G2 datasets."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 330,
            "total": 2000,
            "pct": 0.165,
            "min_required": 0.15,
            "by_category": {
              "negative": 140,
              "boundary": 90,
              "historical-failure": 50,
              "adversarial": 50
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "report_year": {
                "column": "report_year",
                "top_value": "2023",
                "top_share": 0.1735,
                "ceiling": 0.4,
                "passed": true
              },
              "status": {
                "column": "status",
                "top_value": "approved",
                "top_share": 0.381,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "a3c7c631a1c06e386eff4dfd2903bc3ed44ca3a34b4cfee61bc5830a3edf5af8",
            "hash_run_2": "a3c7c631a1c06e386eff4dfd2903bc3ed44ca3a34b4cfee61bc5830a3edf5af8",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_crc_001_change_control",
      "workflow_id": "WF-CRC-001",
      "workflow_name": "Change Control",
      "urs_refs": [
        "URS-13"
      ],
      "generator_id": "wf_crc_001_change_control",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-crc-001-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "645b36e813c078949a56db525215a111c39a7e097923a28d06403a47c60f0a0b",
      "output_file": "synthetic_data/wf_crc_001_change_control_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 5000,
      "content_sha256": "f5d04aeaf850b71a0d801e3ea2ab67fb6ef25d7ec97db6d986029a8d485db6cc",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "cec8a30a3f967c2eed5f0f6c9cfdd9ec0c0337527fbe3bef836d08e472a0b4cc",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_crc_001_change_control_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1650,
      "review_sample_sha256": "61dae1ae6c56cb3d29de85b08184c930a5c78955735a58b0f855e5102e9c68ef",
      "partition_map_file": "synthetic_data/wf_crc_001_change_control_v1.0.partition_map.parquet",
      "partition_map_hash": "3a4cd3973464362e20bf983ce5d66e3a7061b915284a922cc9da1b4675c727ad",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 3500,
        "evaluation": 1000,
        "edge_case": 500
      },
      "edge_case_coverage_pct": 0.17,
      "edge_case_by_category": {
        "negative": 300,
        "boundary": 220,
        "adversarial": 210,
        "historical-failure": 120
      },
      "bias_test_results": {
        "classification": {
          "column": "change_classification",
          "top_value": "minor",
          "top_share": 0.3472,
          "ceiling": 0.4,
          "passed": true
        },
        "product": {
          "column": "product_id",
          "top_value": "6eae6951-e794-58c5-bab5-1aabcb0970b5",
          "top_share": 0.003,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-13 (DEC-13-04 required category matrix, DEC-13-05 CAB quorum, DEC-13-12 closure prerequisites, SoD-13-01/03/07, DEC-13-21 critical-system exec co-sign, DEC-13-22 reopen, DEC-13-24 ICH Q12 auto-major)",
        "lifecycle": "draft -> impact_assessment -> cab_review -> approved|approved_with_conditions|rejected; approved -> implementing -> verification -> closed; closed -> superseded|reopened; draft/impact_assessment/cab_review -> withdrawn",
        "classification": "major|minor|administrative|like_for_like; CAB quorum 5/2/1/2 (DEC-13-05)",
        "sod": "SoD-13-01 originator!=approver; SoD-13-03 no double-slot; SoD-13-07 closure authority!=originator"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "scope_anchor_rates": {
          "source": "engineering assumption",
          "values": {
            "site": 0.7,
            "supplier": 0.25,
            "study": 0.15,
            "related_deviation": 0.15
          }
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 5000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-CRC-001 Change Control: no Gold Standard change-control corpus. G2 N/A. Classification matrix, CAB quorum, lifecycle states, SoD (originator!=approver, no double-slot), closure prerequisites and audit codes grounded in URS-13."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 850,
            "total": 5000,
            "pct": 0.17,
            "min_required": 0.15,
            "by_category": {
              "negative": 300,
              "boundary": 220,
              "adversarial": 210,
              "historical-failure": 120
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "classification": {
                "column": "change_classification",
                "top_value": "minor",
                "top_share": 0.3472,
                "ceiling": 0.4,
                "passed": true
              },
              "product": {
                "column": "product_id",
                "top_value": "6eae6951-e794-58c5-bab5-1aabcb0970b5",
                "top_share": 0.003,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "f5d04aeaf850b71a0d801e3ea2ab67fb6ef25d7ec97db6d986029a8d485db6cc",
            "hash_run_2": "f5d04aeaf850b71a0d801e3ea2ab67fb6ef25d7ec97db6d986029a8d485db6cc",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_crc_002_risk",
      "workflow_id": "WF-CRC-002",
      "workflow_name": "Risk Assessment",
      "urs_refs": [
        "URS-19"
      ],
      "generator_id": "wf_crc_002_risk",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-crc-002-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "42e0bca1b82fc9bc2745f0b2016e4060fbe40b45b193bf8ff9d8151e356741f4",
      "output_file": "synthetic_data/wf_crc_002_risk_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 3000,
      "content_sha256": "c24bcf77cf1ba2793ca7a975d0ef3c37f1f4cfb394139c489f7007a7f7702fe1",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "9d084fd9df54cb5a2015039d98fbce055427d2abfc61caae143d4b6c77ce6753",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_crc_002_risk_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1110,
      "review_sample_sha256": "7c71d62cf1bcbec0741a81f4d02c661abf232f38ba332d5cece3141b51767f24",
      "partition_map_file": "synthetic_data/wf_crc_002_risk_v1.0.partition_map.parquet",
      "partition_map_hash": "7ee4b03ff0d838f454381efed51e7178ac251bd32526f2a0d4dde35b7b09a0b3",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 2100,
        "evaluation": 600,
        "edge_case": 300
      },
      "edge_case_coverage_pct": 0.2033,
      "edge_case_by_category": {
        "boundary": 230,
        "negative": 140,
        "adversarial": 130,
        "historical-failure": 110
      },
      "bias_test_results": {
        "product": {
          "column": "product_id",
          "top_value": "b36118c3-caf2-50ec-867f-420da0086c7a",
          "top_share": 0.003,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "28535f9d-463b-5337-92bc-94210a63f9c7",
          "top_share": 0.0087,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-19 (DEC-19-02 state machine, DEC-19-04 source linkage 9 types, DEC-19-05 methodologies, DEC-19-08 deterministic RPN=SxLxD, DEC-19-06 risk-level bands, DEC-19-14 RPN propagation to deviation, DEC-19-16 closure residual re-score, DEC-19-20 high-risk CAPA cascade, DEC-19-21 critical exec co-sign, DEC-19-22 reopen, SoD-19-01/04)",
        "lifecycle": "draft -> in_progress -> completed -> approved -> closed; closed -> in_progress (governed reopen)",
        "scoring": "RPN = severity(1-10) x likelihood(1-10) x detectability(1-10); risk_level by threshold bands (negligible/low/medium/high/critical). Deterministic, never AI-set (DEC-19-08).",
        "chain_link": "source_id resolves: deviation -> QE-001, change_control -> CRC-001 (forward 1.0); high/critical approved -> CAPA cascade (capa_id -> CRC-004); deviation-sourced approved -> RPN propagated to deviation.",
        "sod": "SoD-19-01 approver!=creator; SoD-19-04 closure authority!=creator"
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 3000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-CRC-002 Risk: RPN/risk-level distribution is a deterministic function of S/L/D draws (ICH Q9 methodology), not an empirical Gold Standard frequency; no public quantitative risk-register corpus for a like-for-like KS anchor. G2 N/A."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 610,
            "total": 3000,
            "pct": 0.2033,
            "min_required": 0.15,
            "by_category": {
              "boundary": 230,
              "negative": 140,
              "adversarial": 130,
              "historical-failure": 110
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "product": {
                "column": "product_id",
                "top_value": "b36118c3-caf2-50ec-867f-420da0086c7a",
                "top_share": 0.003,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "28535f9d-463b-5337-92bc-94210a63f9c7",
                "top_share": 0.0087,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "c24bcf77cf1ba2793ca7a975d0ef3c37f1f4cfb394139c489f7007a7f7702fe1",
            "hash_run_2": "c24bcf77cf1ba2793ca7a975d0ef3c37f1f4cfb394139c489f7007a7f7702fe1",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_crc_003_rca",
      "workflow_id": "WF-CRC-003",
      "workflow_name": "Root Cause Analysis",
      "urs_refs": [
        "URS-17"
      ],
      "generator_id": "wf_crc_003_rca",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-crc-003-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "a81e581bc5658a21d901cfdd09275f1bb40b2d03012b70c8f48776222894931d",
      "output_file": "synthetic_data/wf_crc_003_rca_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 1607,
      "content_sha256": "04b5e5bafd4c013b687a05dd7de9ae20b53197dd4ecf7e12d441afddd1510c4f",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "c025a860d39fc8475e02fca96c4853065c32e8267e574aa5caab275c0606b466",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_crc_003_rca_v1.0.review_sample.xlsx",
      "review_sample_row_count": 759,
      "review_sample_sha256": "a7478c77c2cca9ea78018a0bdc4b9615756c0510143f0ce01e2d23c29daf7a65",
      "partition_map_file": "synthetic_data/wf_crc_003_rca_v1.0.partition_map.parquet",
      "partition_map_hash": "94b76bf65c93e9436100abffab0ee7460f5d80084623f05119893df293f4b18d",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 1125,
        "evaluation": 321,
        "edge_case": 161
      },
      "edge_case_coverage_pct": 0.1823,
      "edge_case_by_category": {
        "boundary": 100,
        "adversarial": 88,
        "negative": 76,
        "historical-failure": 29
      },
      "bias_test_results": {
        "method": {
          "column": "method",
          "top_value": "five_why",
          "top_share": 0.374,
          "ceiling": 0.4,
          "passed": true
        },
        "product": {
          "column": "product_id",
          "top_value": "61ff552b-f311-5972-b9c3-098e444d598d",
          "top_share": 0.0037,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-17 (DEC-17-02 state machine, DEC-17-04 mandatory multi-source linkage, DEC-17-06 methods five_why/fishbone/fault_tree, DEC-17-12 conclusion, DEC-17-13 approval, DEC-17-15 immutability, DEC-17-18 GenAI prohibition, DEC-17-21 critical multi-sign, DEC-17-22 reopen, SoD-17-01/02/04)",
        "lifecycle": "draft -> in_progress -> root_cause_identified -> completed -> approved -> closed; closed -> in_progress (governed reopen); voided (rare)",
        "bidirectional": "RCA derived from QE-001/002/004 rca_id refs; RCA.deviation_id/oos_id/complaint_id point back to events that reference this RCA; RCA.capa_id = the same event's capa (CAPA subset of RCA) closing the deviation->R, deviation->C, R->C triangle.",
        "sod": "SoD-17-01 approver!=creator; SoD-17-02 investigation lead!=creator; SoD-17-04 closure authority!=creator"
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 1607,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-CRC-003 RCA: demand-derived from internal event links; method/lifecycle mix is URS-17-grounded engineering distribution. GS 483/citation corpora describe inspector observations (downstream findings), not internal RCA method frequency, so no like-for-like KS anchor. G2 N/A."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 293,
            "total": 1607,
            "pct": 0.1823,
            "min_required": 0.15,
            "by_category": {
              "boundary": 100,
              "adversarial": 88,
              "negative": 76,
              "historical-failure": 29
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "method": {
                "column": "method",
                "top_value": "five_why",
                "top_share": 0.374,
                "ceiling": 0.4,
                "passed": true
              },
              "product": {
                "column": "product_id",
                "top_value": "61ff552b-f311-5972-b9c3-098e444d598d",
                "top_share": 0.0037,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "04b5e5bafd4c013b687a05dd7de9ae20b53197dd4ecf7e12d441afddd1510c4f",
            "hash_run_2": "04b5e5bafd4c013b687a05dd7de9ae20b53197dd4ecf7e12d441afddd1510c4f",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_crc_004_capa",
      "workflow_id": "WF-CRC-004",
      "workflow_name": "CAPA",
      "urs_refs": [
        "URS-18"
      ],
      "generator_id": "wf_crc_004_capa",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-crc-004-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "a34d0635975d0eaf7015f21137f5372732630c8efc7e27525235f050a855b6b3",
      "output_file": "synthetic_data/wf_crc_004_capa_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 1685,
      "content_sha256": "497e2eb0cec8907adacb38670f2cd5a39403707141a161d1b26a61c23882ee70",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "211d867db79664db4debd6e5b0ff5c73996689c49887386ff080579b5c5d3794",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_crc_004_capa_v1.0.review_sample.xlsx",
      "review_sample_row_count": 901,
      "review_sample_sha256": "0eff348e2737ddacff5551c95ce43f22699355e46889204c505175e5461ba5c8",
      "partition_map_file": "synthetic_data/wf_crc_004_capa_v1.0.partition_map.parquet",
      "partition_map_hash": "7f3d2c33d53938c1647647ab6a7b8ce21d63ae5425dfe415ae0ef018235c5a25",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 1180,
        "evaluation": 337,
        "edge_case": 168
      },
      "edge_case_coverage_pct": 0.1715,
      "edge_case_by_category": {
        "negative": 105,
        "boundary": 85,
        "adversarial": 74,
        "historical-failure": 25
      },
      "bias_test_results": {
        "product": {
          "column": "product_id",
          "top_value": "cdd3bb93-fcbc-562b-9130-b197e94cffb7",
          "top_share": 0.0047,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "33759aa1-7868-5a39-ac11-bc332a9037dd",
          "top_share": 0.0095,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-18 (DEC-18-02 8-state machine, DEC-18-04 mandatory source linkage 8 types, DEC-18-07 effectiveness effective/partial/ineffective, DEC-18-11/13 bound e-sig, DEC-18-15 immutability, DEC-18-18 GenAI prohibition, DEC-18-21 critical exec co-sign, DEC-18-22 reopen, SoD-18-01..07)",
        "lifecycle": "draft -> open -> assigned -> in_progress -> completed -> effectiveness_check -> verified -> closed; closed -> in_progress (reopen); cancelled",
        "bidirectional": "CAPA.deviation_id/oos_id/complaint_id point back to events referencing this CAPA; CAPA.linked_rca_id = the event's rca (CAPA subset of RCA) resolving RCA<->CAPA; ineffective effectiveness spawns re_capa_id (DEC-18-07).",
        "sod": "SoD-18-01 approver!=creator/owner; SoD-18-03 effectiveness reviewer!=owner; SoD-18-04 closer!=creator/owner",
        "re_capa": "ineffective effectiveness (DEC-18-07) spawns re_capa_id resolving to a real CAPA PK in the dataset (forward 1.0) — traceable follow-on, not a dangling reference."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 1685,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-CRC-004 CAPA: demand-derived from internal event links; type/lifecycle/effectiveness mix is URS-18-grounded. No public CAPA-outcome corpus for a like-for-like KS anchor. G2 N/A."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 289,
            "total": 1685,
            "pct": 0.1715,
            "min_required": 0.15,
            "by_category": {
              "negative": 105,
              "boundary": 85,
              "adversarial": 74,
              "historical-failure": 25
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "product": {
                "column": "product_id",
                "top_value": "cdd3bb93-fcbc-562b-9130-b197e94cffb7",
                "top_share": 0.0047,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "33759aa1-7868-5a39-ac11-bc332a9037dd",
                "top_share": 0.0095,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "497e2eb0cec8907adacb38670f2cd5a39403707141a161d1b26a61c23882ee70",
            "hash_run_2": "497e2eb0cec8907adacb38670f2cd5a39403707141a161d1b26a61c23882ee70",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_crc_005_findings",
      "workflow_id": "WF-CRC-005",
      "workflow_name": "Findings",
      "urs_refs": [
        "URS-21"
      ],
      "generator_id": "wf_crc_005_findings",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-crc-005-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "612fa95bf1c34bcf02aa4db22b38c66514bc9c6563542a3ca92b952575a61737",
      "output_file": "synthetic_data/wf_crc_005_findings_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 2000,
      "content_sha256": "a190968e7cd20becec96fcf8c9e39b826789999461e4a8f3cf3677ce0f4a1c53",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "203b047b36b85a6680007d5751c2c17a09c18955e29a77afa1dae3e107c34b33",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_crc_005_findings_v1.0.review_sample.xlsx",
      "review_sample_row_count": 900,
      "review_sample_sha256": "645eb9d3a1b7c796b72c3b205e6acc0d9ed8f79d2975af465854dbea0e433deb",
      "partition_map_file": "synthetic_data/wf_crc_005_findings_v1.0.partition_map.parquet",
      "partition_map_hash": "aae856793fea48f3af447488cf02a52372255bb7ba718f7e3fb7be1b128826e1",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 1400,
        "evaluation": 400,
        "edge_case": 200
      },
      "edge_case_coverage_pct": 0.22,
      "edge_case_by_category": {
        "boundary": 140,
        "negative": 130,
        "adversarial": 90,
        "historical-failure": 80
      },
      "bias_test_results": {
        "severity": {
          "column": "severity",
          "top_value": "major",
          "top_share": 0.371,
          "ceiling": 0.4,
          "passed": true
        },
        "source_type": {
          "column": "source_type",
          "top_value": "inspection_observation",
          "top_share": 0.2385,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [
        {
          "gold_standard_id": "GS-002",
          "derived_anchor": "gold_standard/GS-002_cfr211_drug_freq.json",
          "what": "FDA 21 CFR Part 211 cGMP citation-section frequency (drug subset)",
          "seeds": "WF-CRC-005 Findings.regulation_ref",
          "gate": "G2 chi-square goodness-of-fit",
          "note": "Anchor JSON is vendored + hash-pinned; derivation (Program Area=Drugs, Act/CFR Number -> 21 CFR 211.NNN) documented in the anchor file with GS-002 sha256."
        }
      ],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-21 (DEC-21-02 state machine, DEC-21-04 source linkage, DEC-21-08 severity, DEC-21-09 severity-gated mandatory CAPA, DEC-21-11 RCA trigger, DEC-21-13 bound e-sig, DEC-21-15 immutability, DEC-21-21 critical closure matrix, DEC-21-22 reopen, SoD-21-01..07)",
        "lifecycle": "open -> in_progress -> resolved -> closed; -> deferred; closed -> in_progress (governed reopen)",
        "chain_link": "Severity-gated: critical/major findings link >=1 CAPA (capa_id -> WF-CRC-004 CAPA, forward); subset trigger an RCA (triggered_investigation_rca_id -> WF-CRC-003, forward). Findings source is trend/inspection (external), not a single event FK, per URS-21 DEC-21-04.",
        "sod": "SoD-21-03 resolver!=creator (critical/major); SoD-21-04 closure authority!=creator"
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 2000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_categorical",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "chi2_stat": 19.98352385507109,
            "p_value": 0.3955702370350207,
            "alpha": 0.05,
            "categories": 20,
            "df": 19,
            "n_observed": 1670,
            "anchor_total": 33116,
            "min_detectable_effect_w": 0.111,
            "effect_band_detectable": "small",
            "power_note": "PASS = not inconsistent with the Gold Standard at n=1670 (detects Cohen's w >= 0.111 at 80% power); demonstrates consistency, not high-confidence identity."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 440,
            "total": 2000,
            "pct": 0.22,
            "min_required": 0.15,
            "by_category": {
              "boundary": 140,
              "negative": 130,
              "adversarial": 90,
              "historical-failure": 80
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "severity": {
                "column": "severity",
                "top_value": "major",
                "top_share": 0.371,
                "ceiling": 0.4,
                "passed": true
              },
              "source_type": {
                "column": "source_type",
                "top_value": "inspection_observation",
                "top_share": 0.2385,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "a190968e7cd20becec96fcf8c9e39b826789999461e4a8f3cf3677ce0f4a1c53",
            "hash_run_2": "a190968e7cd20becec96fcf8c9e39b826789999461e4a8f3cf3677ce0f4a1c53",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_doc_001_document_control",
      "workflow_id": "WF-DOC-001",
      "workflow_name": "Document Control & SOP Lifecycle",
      "urs_refs": [
        "URS-12"
      ],
      "generator_id": "wf_doc_001_document_control",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-doc-001-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "91e9e1b2431af7f7651d4b361cead6ce37f3c97e9c2c59e4e96a764799719624",
      "output_file": "synthetic_data/wf_doc_001_document_control_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 5000,
      "content_sha256": "b549e94d60b670bdd2ad9635aeed9a6cfd79ba0273f79d2f5f443f5e164c0490",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "8e388dc299637eec2417148c5180ea99a98bc6d4d868e416d3ec11f81c43bf08",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_doc_001_document_control_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1770,
      "review_sample_sha256": "b61985eaf7dd578a20040bb8254fceb8c9aee3e16e5a992138b053f094367e9c",
      "partition_map_file": "synthetic_data/wf_doc_001_document_control_v1.0.partition_map.parquet",
      "partition_map_hash": "9c9c2d2061758697b5262c9d8d7e56d17326d7629f87336a0c6add92ee6c98d9",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 3500,
        "evaluation": 1000,
        "edge_case": 500
      },
      "edge_case_coverage_pct": 0.18,
      "edge_case_by_category": {
        "boundary": 280,
        "adversarial": 250,
        "negative": 220,
        "historical-failure": 150
      },
      "bias_test_results": {
        "document_type": {
          "column": "document_type",
          "top_value": "sop",
          "top_share": 0.3128,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "2359cb34-1fc2-7b8c-8c89-b6ff0c25a0aa",
          "top_share": 0.012,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": {
        "source": "URS-12 (DEC-12-02 lifecycle, DEC-12-03 versioning, DEC-12-06 types, DEC-12-07 review cadence, DEC-12-23 retention, SoD-12-01/02, §6.6 audit codes)",
        "lifecycle_states": "draft | in_review | approved | effective | superseded | retired | withdrawn | held",
        "review_cadence": "SOP/WI 2yr, policy 3yr; missed review -> regulatory_concern auto-suspension",
        "retention_classes": "gmp_master_record(10y) | gmp_quality_record(5y) | glp_study(7y) | gcp_clinical(25y) | regulatory_submission(indefinite) | non_gxp_*(1-3y)",
        "sod": "SoD-12-01 author != approver; SoD-12-02 review initiator != approver; SoD-12-06 hold author != hold releaser"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "document_type_mix": {
          "source": "URS-12 §11 representative distribution",
          "values": {
            "sop": 0.3,
            "wi": 0.15,
            "specification": 0.12,
            "policy": 0.1,
            "protocol": 0.08,
            "form": 0.08,
            "report": 0.08,
            "master_file_component": 0.05,
            "evidence_package": 0.04
          }
        },
        "gxp_mix": {
          "source": "URS-12 §11",
          "values": {
            "gmp": 0.55,
            "glp": 0.18,
            "gcp": 0.12,
            "gdp": 0.08,
            "non_gxp": 0.05,
            "multi": 0.02
          }
        },
        "training_required_rate": {
          "source": "URS-12 §11 (Process Library SOP/WI)",
          "value": 0.8
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 5000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-DOC-001 Document Control: no Gold Standard SOP-master corpus. G2 N/A. Document types, lifecycle, versioning, periodic review, retention classes and audit codes grounded in URS-12."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 900,
            "total": 5000,
            "pct": 0.18,
            "min_required": 0.15,
            "by_category": {
              "boundary": 280,
              "adversarial": 250,
              "negative": 220,
              "historical-failure": 150
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "document_type": {
                "column": "document_type",
                "top_value": "sop",
                "top_share": 0.3128,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "2359cb34-1fc2-7b8c-8c89-b6ff0c25a0aa",
                "top_share": 0.012,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "b549e94d60b670bdd2ad9635aeed9a6cfd79ba0273f79d2f5f443f5e164c0490",
            "hash_run_2": "b549e94d60b670bdd2ad9635aeed9a6cfd79ba0273f79d2f5f443f5e164c0490",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_doc_002_training",
      "workflow_id": "WF-DOC-002",
      "workflow_name": "Training Management & Qualification",
      "urs_refs": [
        "URS-28"
      ],
      "generator_id": "wf_doc_002_training",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-doc-002-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "a3d10d6a981ea55c3525fb24873d9ade2458fa31ee3184538bfd2e1e1438f6a5",
      "output_file": "synthetic_data/wf_doc_002_training_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 100000,
      "content_sha256": "1cf75008f3d2b8b861cd96221c5a01fb0d93e0ac541bb9478768d22001075a54",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "986f2de0f5e1533b68afc36d2ebe73ec886b8dad1b5a89a0768421ebcf0a6e30",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_doc_002_training_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1600,
      "review_sample_sha256": "fa0a3037a0e06bb5794020954be5673e2d638e182fade9699430eb022f094461",
      "partition_map_file": "synthetic_data/wf_doc_002_training_v1.0.partition_map.parquet",
      "partition_map_hash": "7d96a7e8d973b366a1a291ebfc2a23bdbc14b79fb2d2da40c4b7f1fdba987627",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 70000,
        "evaluation": 20000,
        "edge_case": 10000
      },
      "edge_case_coverage_pct": 0.18,
      "edge_case_by_category": {
        "adversarial": 6500,
        "historical-failure": 4500,
        "negative": 3500,
        "boundary": 3500
      },
      "bias_test_results": {
        "assignment_source": {
          "column": "assignment_source",
          "top_value": "manual",
          "top_share": 0.329,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "8064a089-1d99-e333-ba6d-df54fe71c7ca",
          "top_share": 0.0098,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": {
        "source": "URS-28 (DEC-28-04 assignment lifecycle, DEC-28-05 record lifecycle, DEC-28-06 assessment, DEC-28-09 triggers, DEC-28-23 qualification gate, SoD-28-02, audit codes)",
        "assignment_lifecycle": "assigned -> in_progress -> completed -> waived|exempted",
        "record_lifecycle": "created -> in_progress -> completed -> verified",
        "assignment_sources": "manual | sop_revision_trigger | change_request_trigger | capa_trigger | periodic_trigger | new_hire_trigger",
        "sod": "SoD-28-02 verified_by != user_id (trainee)",
        "qualification_gate": "GET /training/qualification -> TRN_QUALIFICATION_GATE_FAILED if no verified record / waiver / exemption"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "assignment_source_mix": {
          "source": "engineering assumption",
          "values": {
            "manual": 0.35,
            "periodic_trigger": 0.2,
            "new_hire_trigger": 0.15,
            "sop_revision_trigger": 0.15,
            "change_request_trigger": 0.1,
            "capa_trigger": 0.05
          }
        },
        "first_attempt_pass_rate": {
          "source": "URS-28 §11 (~85%)",
          "value": 0.85
        },
        "pass_threshold": {
          "source": "URS-28 DEC-28-06",
          "value": 0.7
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 100000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-DOC-002 Training: no Gold Standard training-event corpus. G2 N/A. Assignment/record lifecycle, effectiveness scoring, requalification, qualification gate and audit codes grounded in URS-28."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 18000,
            "total": 100000,
            "pct": 0.18,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 6500,
              "historical-failure": 4500,
              "negative": 3500,
              "boundary": 3500
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "assignment_source": {
                "column": "assignment_source",
                "top_value": "manual",
                "top_share": 0.329,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "8064a089-1d99-e333-ba6d-df54fe71c7ca",
                "top_share": 0.0098,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "1cf75008f3d2b8b861cd96221c5a01fb0d93e0ac541bb9478768d22001075a54",
            "hash_run_2": "1cf75008f3d2b8b861cd96221c5a01fb0d93e0ac541bb9478768d22001075a54",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_doc_003_dqg_intake",
      "workflow_id": "WF-DOC-003",
      "workflow_name": "Document Quality Gateway (DQG) Intake",
      "urs_refs": [
        "URS-31",
        "URS-32"
      ],
      "generator_id": "wf_doc_003_dqg_intake",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-doc-003-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "db9efac30e69e49ce54063b02edfcbde594a461711e7a378f0179494768722fa",
      "output_file": "synthetic_data/wf_doc_003_dqg_intake_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 20000,
      "content_sha256": "22de61df1c99e11e59dacb5e24410534738a8fc9cf2899a9b7e9f39048acc8a3",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "56f05936a2076b46df1182ef13e73990ff1178b18b3451788d098ed2318caa41",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_doc_003_dqg_intake_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1600,
      "review_sample_sha256": "80af24120629f1d517c72c550707dc19619459a18be60cba92e93f879e44abaa",
      "partition_map_file": "synthetic_data/wf_doc_003_dqg_intake_v1.0.partition_map.parquet",
      "partition_map_hash": "f2dd4143d30527a9656169e048ebf598d8936aa386a2359c74bfd7805d32e62f",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 14000,
        "evaluation": 4000,
        "edge_case": 2000
      },
      "edge_case_coverage_pct": 0.18,
      "edge_case_by_category": {
        "negative": 1000,
        "boundary": 1000,
        "adversarial": 800,
        "historical-failure": 800
      },
      "bias_test_results": {
        "received_via": {
          "column": "received_via",
          "top_value": "email_imap",
          "top_share": 0.3313,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "b42b8d99-f78e-5c8f-5ca9-2e6185bbec15",
          "top_share": 0.0102,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": "Engineering verification of the DQG advisory/HITL control boundary (advisory-only classification, mandatory human review, AI-never-finalizes, attribution XOR) and audit-event wiring. This dataset does NOT and cannot validate real generative-model behaviour (hallucination, confidence calibration, prompt-injection resistance) — the advisory output is deterministically synthesized by a rule engine. Real-model validation is a separate intended use deferred to Phase 2B (MIRA).",
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-31 (DEC-31-03 16-state lifecycle, DEC-31-13 AI advisory substrate, DEC-31-14 attribution XOR, DEC-31-23 HITL qualification gate, ARCH-AI-001 AC-1)",
        "ai_advisory_boundary": "MANDATORY HITL review on every artifact (DEC-31-03). AI proposes classification with confidence + provenance but CANNOT finalize HITL approval/rejection, approve filing, withdraw, or release taxonomy (7 prohibitions). 'No AI service shall be the only path to advance a regulated DQG workflow' (ARCH-AI-001 AC-1).",
        "deterministic_synthesis_note": "Per kickoff determinism rule, AI classification is synthesized DETERMINISTICALLY (model_id=deterministic_rule_engine) rather than via live LLM, to preserve G6 bit-determinism and avoid temp=0 non-reproducibility. The advisory boundary (human_review_required=true, ai never finalizes) is modelled faithfully; real LLM narrative is Phase 2B (MIRA).",
        "received_via": "email_imap | webhook_hmac | direct_upload | partner_api_oauth2",
        "human_disposition": "approved | rejected | requires_remediation"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions for advisory classification confidence.",
        "received_via_mix": {
          "source": "engineering assumption",
          "values": {
            "email_imap": 0.34,
            "direct_upload": 0.3,
            "partner_api_oauth2": 0.2,
            "webhook_hmac": 0.16
          }
        },
        "normal_disposition_mix": {
          "source": "engineering assumption",
          "values": {
            "approved": 0.85,
            "requires_remediation": 0.1,
            "rejected": 0.05
          }
        },
        "normal_confidence_range": {
          "source": "engineering assumption",
          "high": [
            0.85,
            0.99
          ],
          "medium": [
            0.55,
            0.84
          ]
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 20000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-DOC-003 DQG Intake: no Gold Standard intake corpus. G2 N/A. Intake lifecycle, classification taxonomy, AI advisory boundary and audit codes grounded in URS-31."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 3600,
            "total": 20000,
            "pct": 0.18,
            "min_required": 0.15,
            "by_category": {
              "negative": 1000,
              "boundary": 1000,
              "adversarial": 800,
              "historical-failure": 800
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "received_via": {
                "column": "received_via",
                "top_value": "email_imap",
                "top_share": 0.3313,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "b42b8d99-f78e-5c8f-5ca9-2e6185bbec15",
                "top_share": 0.0102,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "22de61df1c99e11e59dacb5e24410534738a8fc9cf2899a9b7e9f39048acc8a3",
            "hash_run_2": "22de61df1c99e11e59dacb5e24410534738a8fc9cf2899a9b7e9f39048acc8a3",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_doc_004_screen_reader",
      "workflow_id": "WF-DOC-004",
      "workflow_name": "Screen Reader / Data Capture (OCR + structured extraction)",
      "urs_refs": [
        "URS-29",
        "URS-32"
      ],
      "generator_id": "wf_doc_004_screen_reader",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-doc-004-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "ea652988b58d9218b4d2420ddf9e2d925aa316e5e4860b58e3d7e0c27e3e670c",
      "output_file": "synthetic_data/wf_doc_004_screen_reader_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 10000,
      "content_sha256": "c86616b2e9c9f68ba9ef9e1705ebd39d624330b32c559ef5d8e46a6d1e4116a4",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "4402e51cc96e1a48533f90369125117ba45dcaa818d4c1e6655e95561650453b",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_doc_004_screen_reader_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1850,
      "review_sample_sha256": "449f8db8c7134e78e237030f05d5972dcba308a37e753b283103aa7ce5a3e810",
      "partition_map_file": "synthetic_data/wf_doc_004_screen_reader_v1.0.partition_map.parquet",
      "partition_map_hash": "da3e2227c1f335fe0f3e2ace8e4c7438d13434e8736ed918b05555908bc9ad31",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 7000,
        "evaluation": 2000,
        "edge_case": 1000
      },
      "edge_case_coverage_pct": 0.18,
      "edge_case_by_category": {
        "boundary": 600,
        "negative": 500,
        "historical-failure": 350,
        "adversarial": 350
      },
      "bias_test_results": {
        "document_type": {
          "column": "document_type",
          "top_value": "batch_record",
          "top_share": 0.2208,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "d443a785-858e-048d-8003-64455e21228e",
          "top_share": 0.0119,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": "Engineering verification of the Screen Reader advisory/HITL control boundary (mandatory human review, AI-never-persists/promotes per ARCH-AI-001 AC-5, SoD reviewer!=creator) and audit-event wiring. This dataset does NOT validate real OCR/model behaviour — OCR confidence is deterministically simulated. Real-model validation is deferred to Phase 2B (MIRA).",
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-29 (DEC-29-09 review workflow, DEC-29-10 extraction provenance/correction history, DEC-29-13 AI advisory substrate, DEC-29-23 promotion gate, SoD-29-02, ARCH-AI-001 AC-5)",
        "ai_advisory_boundary": "MANDATORY human review on every extraction (DEC-29-09). 'No AI service shall be the sole path to persist, validate, or promote captured data into a GxP record' (ARCH-AI-001 AC-5). Confidence is an advisory provenance signal, NOT a gating criterion.",
        "deterministic_synthesis_note": "Per kickoff determinism rule, OCR confidence is SIMULATED statistically (deterministic), not real OCR; model_id=deterministic_rule_engine. Advisory boundary modelled faithfully.",
        "sod": "SoD-29-02 reviewed_by != created_by (DB constraint)",
        "human_disposition": "accepted | field_corrected | rejected"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions for simulated OCR confidence.",
        "document_type_mix": {
          "source": "engineering assumption",
          "values": {
            "batch_record": 0.22,
            "coa": 0.18,
            "lab_notebook": 0.16,
            "log_sheet": 0.16,
            "complaint_letter": 0.12,
            "lab_result_email": 0.1,
            "form": 0.06
          }
        },
        "normal_confidence_range": {
          "source": "engineering assumption",
          "values": [
            0.82,
            0.99
          ]
        },
        "normal_disposition_mix": {
          "source": "engineering assumption",
          "values": {
            "accepted": 0.8,
            "field_corrected": 0.15,
            "rejected": 0.05
          }
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 10000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-DOC-004 Screen Reader/OCR: no Gold Standard OCR-capture corpus. G2 N/A. Capture/extraction lifecycle, confidence model, AI advisory boundary and audit codes grounded in URS-29."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 1800,
            "total": 10000,
            "pct": 0.18,
            "min_required": 0.15,
            "by_category": {
              "boundary": 600,
              "negative": 500,
              "historical-failure": 350,
              "adversarial": 350
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "document_type": {
                "column": "document_type",
                "top_value": "batch_record",
                "top_share": 0.2208,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "d443a785-858e-048d-8003-64455e21228e",
                "top_share": 0.0119,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "c86616b2e9c9f68ba9ef9e1705ebd39d624330b32c559ef5d8e46a6d1e4116a4",
            "hash_run_2": "c86616b2e9c9f68ba9ef9e1705ebd39d624330b32c559ef5d8e46a6d1e4116a4",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_fnd_001_authentication",
      "workflow_id": "WF-FND-001",
      "workflow_name": "Authentication & Session",
      "urs_refs": [
        "URS-01"
      ],
      "generator_id": "wf_fnd_001_authentication",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-fnd-001-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "74d5333fcbe4e29762b6305538346ed70bfa6fe29cdc38d597ee3e9909a84442",
      "output_file": "synthetic_data/wf_fnd_001_authentication_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 50000,
      "content_sha256": "fc678f9c61c1097ccf6314575b4e18b2bad5e66417793233c2e1cbf92f87563c",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "0a05f649f9392296f0470a54040d790046034e2faa816a2e34b4500762f8f0bd",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_fnd_001_authentication_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2550,
      "review_sample_sha256": "a0a89acf01bda0c80f6d67806bcf97653802efc04ea738a11f0e9fc8954836ce",
      "partition_map_file": "synthetic_data/wf_fnd_001_authentication_v1.0.partition_map.parquet",
      "partition_map_hash": "d78150592b5d1e5816b17dafadc7e57a7da363c8ee11dec8b865e499a7575988",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 35000,
        "evaluation": 10000,
        "edge_case": 5000
      },
      "edge_case_coverage_pct": 0.176,
      "edge_case_by_category": {
        "adversarial": 5600,
        "boundary": 2200,
        "negative": 500,
        "historical-failure": 500
      },
      "bias_test_results": {
        "jurisdiction_by_geo": {
          "column": "geo_country",
          "top_value": "US",
          "top_share": 0.3187,
          "ceiling": 0.4,
          "passed": true
        },
        "jurisdiction_by_tenant": {
          "column": "tenant_id",
          "top_value": "8920bb04-c1ad-794a-b30d-0ef33fe2ecdc",
          "top_share": 0.0107,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": null,
      "assumption_register": {
        "note": "Resolves audit finding F4. Every distributional prior below is a DECLARED engineering assumption, not an empirical fit. None is a Gold Standard claim.",
        "normal_event_type_weights": {
          "source": "engineering assumption (typical QMS auth telemetry shape; login-dominant)",
          "values": {
            "login_success": 0.43,
            "login_failure": 0.12,
            "mfa_challenge": 0.1,
            "mfa_success": 0.09,
            "mfa_failure": 0.03,
            "session_create": 0.08,
            "session_expire": 0.06,
            "session_idle_timeout": 0.03,
            "password_change": 0.025,
            "password_expiry": 0.015,
            "account_lockout": 0.012,
            "account_unlock": 0.008
          }
        },
        "mfa_method_mix": {
          "source": "engineering assumption (enterprise MFA adoption; TOTP-dominant)",
          "values": {
            "totp": 0.55,
            "sms": 0.25,
            "hardware_key": 0.12,
            "none": 0.08
          }
        },
        "geo_prior": {
          "source": "Decision B regulatory-jurisdiction blend (32 IN / 32 US / 15 EU / 10 CN / 11 other)",
          "use_boundary": "See use_boundary below."
        },
        "home_login_probability": {
          "source": "engineering assumption",
          "value": 0.9
        },
        "primary_device_probability": {
          "source": "engineering assumption",
          "value": 0.85
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 50000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-FND-001 Authentication has zero public Gold Standard corpus (Synthetic Data Plan v1.0 §2.1: platform-internal behavior; no public auth-log dataset is locked in gold_standard_manifest_v1.0.json). G2 statistical fidelity is therefore N/A. No real-distribution claim is made; the event mix is engineered from documented rule-based assumptions (see assumption_register), not fitted to any Gold Standard anchor."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 8800,
            "total": 50000,
            "pct": 0.176,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 5600,
              "boundary": 2200,
              "negative": 500,
              "historical-failure": 500
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "jurisdiction_by_geo": {
                "column": "geo_country",
                "top_value": "US",
                "top_share": 0.3187,
                "ceiling": 0.4,
                "passed": true
              },
              "jurisdiction_by_tenant": {
                "column": "tenant_id",
                "top_value": "8920bb04-c1ad-794a-b30d-0ef33fe2ecdc",
                "top_share": 0.0107,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "fc678f9c61c1097ccf6314575b4e18b2bad5e66417793233c2e1cbf92f87563c",
            "hash_run_2": "fc678f9c61c1097ccf6314575b4e18b2bad5e66417793233c2e1cbf92f87563c",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_fnd_002_rbac",
      "workflow_id": "WF-FND-002",
      "workflow_name": "RBAC / Permissions",
      "urs_refs": [
        "URS-02"
      ],
      "generator_id": "wf_fnd_002_rbac",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-fnd-002-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "481e0ebfcc9153401dbcbf4f1d2aa87c6d73c7080d380de65a7a1d1fcf935c4a",
      "output_file": "synthetic_data/wf_fnd_002_rbac_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 20000,
      "content_sha256": "38c1f69b1ad61ad248cf70e9f6e127f0052d05ff4c46b679b4c56a8ffa5caa1d",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "0cd6ef582dbfe4ed25da420e12ff86c30a106bbe726358900d3e54d0f6731774",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_fnd_002_rbac_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2800,
      "review_sample_sha256": "245b61a06e9c01acdad34494f6b7bedb38e49726ecdfdd27ae07968eca573f7c",
      "partition_map_file": "synthetic_data/wf_fnd_002_rbac_v1.0.partition_map.parquet",
      "partition_map_hash": "6876d95c003e65853655ecb84b16b3c816286ede0f37566831778c2aa0d40825",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 14000,
        "evaluation": 4000,
        "edge_case": 2000
      },
      "edge_case_coverage_pct": 0.18,
      "edge_case_by_category": {
        "adversarial": 2100,
        "boundary": 700,
        "negative": 600,
        "historical-failure": 200
      },
      "bias_test_results": {
        "role": {
          "column": "role",
          "top_value": "viewer",
          "top_share": 0.2382,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "2c77bf25-fd7c-dda5-c16e-c7c5b69653bf",
          "top_share": 0.0108,
          "ceiling": 0.4,
          "passed": true
        },
        "resource_type": {
          "column": "resource_type",
          "top_value": "roles",
          "top_share": 0.0392,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": {
        "source": "URS-02_RBAC-Permissions.Target-State.md (§3 roles/hierarchy, §6.1 taxonomy, §6.5/BR-02 decision model, §3.4 SoD, §12 enforcement chain)",
        "role_hierarchy": "viewer(1) < reviewer(2) < auditor(3) < quality_lead(4) < admin(5) < platform_admin < super_admin",
        "action_taxonomy_note": "URS-02 §6.1.2 defines 36 launch actions; this generator uses a representative 13-verb subset spanning the regulated/non-regulated spectrum, declared in assumption_register. 'unknown' is a sentinel for untrusted input that did not resolve to a known action.",
        "deny_by_default": "Every (role,resource,action) defaults is_allowed=false until granted (BR-02-04).",
        "no_wildcards_no_inheritance": "URS-02 §6.3.2: strict tuple-based, no glob/regex permissions, no role inheritance. wildcard_boundary edge tests rejection of wildcard-style requests.",
        "regulated_actions_require_esig": "approve, reject, delete, manage_users, manage_roles, configure require e-signature + authority (URS-02 §6.7)."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions, not empirical fits. None is a Gold Standard claim.",
        "role_mix": {
          "source": "engineering assumption (viewer-heavy org)",
          "values": {
            "viewer": 0.26,
            "reviewer": 0.2,
            "quality_lead": 0.18,
            "auditor": 0.12,
            "admin": 0.12,
            "platform_admin": 0.07,
            "super_admin": 0.05
          }
        },
        "action_mix": {
          "source": "engineering assumption (read-dominant permission-check telemetry)"
        },
        "context_gate_pass_rate_normal": {
          "source": "engineering assumption",
          "value": 0.97
        },
        "resource_distribution": {
          "source": "engineering assumption",
          "value": "uniform over 35 URS-02 resources"
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 20000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-FND-002 RBAC has zero public Gold Standard corpus (Plan §2.1: platform-internal behavior; no public RBAC permission-check dataset is locked). G2 N/A. The permission matrix and decision model are grounded in URS-02 Target State (roles, hierarchy, resource/action taxonomy, deny-by-default, SoD), not fitted to any Gold Standard anchor."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 3600,
            "total": 20000,
            "pct": 0.18,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 2100,
              "boundary": 700,
              "negative": 600,
              "historical-failure": 200
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "role": {
                "column": "role",
                "top_value": "viewer",
                "top_share": 0.2382,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "2c77bf25-fd7c-dda5-c16e-c7c5b69653bf",
                "top_share": 0.0108,
                "ceiling": 0.4,
                "passed": true
              },
              "resource_type": {
                "column": "resource_type",
                "top_value": "roles",
                "top_share": 0.0392,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "38c1f69b1ad61ad248cf70e9f6e127f0052d05ff4c46b679b4c56a8ffa5caa1d",
            "hash_run_2": "38c1f69b1ad61ad248cf70e9f6e127f0052d05ff4c46b679b4c56a8ffa5caa1d",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_fnd_003_context_gate",
      "workflow_id": "WF-FND-003",
      "workflow_name": "Context Gate / Active Scope Enforcement",
      "urs_refs": [
        "URS-03"
      ],
      "generator_id": "wf_fnd_003_context_gate",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-fnd-003-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "b1c5dfd23cbf7d41c8733c8f693827db95efe514dc597eaa3163e49f66098d87",
      "output_file": "synthetic_data/wf_fnd_003_context_gate_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 30000,
      "content_sha256": "b168926ff27834406a86e0dcbf603d517338637021e1f47a8e30bde74754111c",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "ec78a9adc5dc9ec700c5a6f35e19fa409cb5ab2e6ba89c31000bba1bdf5592c0",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_fnd_003_context_gate_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2350,
      "review_sample_sha256": "e6a9649fed0e6c8c23471bb58de7fd7b654a96cefb3d35c49a61a48a3894a50d",
      "partition_map_file": "synthetic_data/wf_fnd_003_context_gate_v1.0.partition_map.parquet",
      "partition_map_hash": "6211c5c9e2bccb154650c3b3fe5d7678d4a4908dc0d2d21b99b84012dcf5e3a4",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 21000,
        "evaluation": 6000,
        "edge_case": 3000
      },
      "edge_case_coverage_pct": 0.1583,
      "edge_case_by_category": {
        "adversarial": 2700,
        "boundary": 950,
        "negative": 550,
        "historical-failure": 550
      },
      "bias_test_results": {
        "tenant": {
          "column": "active_tenant_id",
          "top_value": "56ad8e4c-6b0b-8efd-b42a-076961b4a850",
          "top_share": 0.0096,
          "ceiling": 0.4,
          "passed": true
        },
        "resource_type": {
          "column": "requested_resource_type",
          "top_value": "governance",
          "top_share": 0.0351,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": {
        "source": "URS-03_Context-Gate-Approval-Scope.Target-State.md (§6.1-6.4 active context + record scope resolver, §6.6 audit codes, §11.2 error codes, §12.2 enforcement chain)",
        "active_context_dimensions": "tenant_id (mandatory) + study_id, site_id, product_id, supplier_id (nullable). DEC-03-07: default context at login = tenant_id only.",
        "scope_check": "Intersection of active context vs target-record owner dimensions on every required dimension; non-empty intersection on all required dims = pass. tenant_wide and global_super_authority bypass dimension checks (snapshot still written).",
        "batch_axis_reconciliation": "Plan §7.1 names a tenant×site×product×batch matrix; URS-03 §6.1 active-context dimensions are tenant/study/site/product/supplier — there is NO batch dimension in session context. Following URS-03 (authoritative target state); batch axis superseded by study+supplier dimensions.",
        "audit_write_failure_rollback": "BR-03-13: if the audit write fails, the originating action rolls back (audit_gap_rollback edge)."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions, not empirical fits.",
        "event_type_mix": {
          "source": "engineering assumption",
          "values": {
            "scope_check": 0.62,
            "context_switched": 0.18,
            "context_selected": 0.12,
            "context_reset": 0.08
          }
        },
        "active_dim_presence": {
          "source": "engineering assumption",
          "site": 0.6,
          "product": 0.5,
          "study": 0.3,
          "supplier": 0.2
        },
        "normal_scope_check_pass_rate": {
          "source": "engineering assumption",
          "value": 0.88
        },
        "tenant_wide_rate_normal": {
          "source": "engineering assumption",
          "value": 0.04
        },
        "super_authority_rate_normal": {
          "source": "engineering assumption",
          "value": 0.01
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 30000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-FND-003 Context Gate has zero public Gold Standard corpus (Plan §2.1: platform-internal behavior). G2 N/A. Active-context dimensions, scope-check intersection logic, decision outcomes and audit-event codes are grounded in URS-03 Target State, not fitted to any Gold Standard anchor."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 4750,
            "total": 30000,
            "pct": 0.1583,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 2700,
              "boundary": 950,
              "negative": 550,
              "historical-failure": 550
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "tenant": {
                "column": "active_tenant_id",
                "top_value": "56ad8e4c-6b0b-8efd-b42a-076961b4a850",
                "top_share": 0.0096,
                "ceiling": 0.4,
                "passed": true
              },
              "resource_type": {
                "column": "requested_resource_type",
                "top_value": "governance",
                "top_share": 0.0351,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "b168926ff27834406a86e0dcbf603d517338637021e1f47a8e30bde74754111c",
            "hash_run_2": "b168926ff27834406a86e0dcbf603d517338637021e1f47a8e30bde74754111c",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_fnd_004_workflow_esign",
      "workflow_id": "WF-FND-004",
      "workflow_name": "Workflow / HITL / E-Signature / Approval Authority",
      "urs_refs": [
        "URS-04"
      ],
      "generator_id": "wf_fnd_004_workflow_esign",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-fnd-004-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "1575efde5d898795a436aa47b0c4d8f08aec7607298ba0e95947980aa4c0ee29",
      "output_file": "synthetic_data/wf_fnd_004_workflow_esign_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 25000,
      "content_sha256": "e26b25d7aaf5161af754b864b102dadd874266b7ca487c04eb9aa6d7eaef92b9",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "d20264f43e99089a90d4ffc64d3e2dc3317600c4c8b092b9d6e78d3a78200c4d",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_fnd_004_workflow_esign_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2950,
      "review_sample_sha256": "7c6ac5d7de0a1b647f299385ab82eb50c65e8b7cc340a4939737d93f9fa6ce67",
      "partition_map_file": "synthetic_data/wf_fnd_004_workflow_esign_v1.0.partition_map.parquet",
      "partition_map_hash": "ed25647ec49af6dc3e3de9322f6f717b8284013815c9494461c7fd3bf23e5c6a",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 17500,
        "evaluation": 5000,
        "edge_case": 2500
      },
      "edge_case_coverage_pct": 0.184,
      "edge_case_by_category": {
        "adversarial": 2300,
        "boundary": 1100,
        "negative": 700,
        "historical-failure": 500
      },
      "bias_test_results": {
        "entity_type": {
          "column": "entity_type",
          "top_value": "change_control",
          "top_share": 0.107,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "b26c2b31-7256-23f1-dee2-02defcf1788a",
          "top_share": 0.0097,
          "ceiling": 0.4,
          "passed": true
        },
        "node_key": {
          "column": "node_key",
          "top_value": "final_approval",
          "top_share": 0.1784,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": {
        "source": "URS-04_Workflow-HITL-ESignature-Approval-Authority.Target-State.md (§5.4 lifecycle, §6.1-6.2 approval modes + e-sig + snapshots, §6.7 reauth/MFA, §11.2 error codes, §12.2 enforcement chain)",
        "approval_modes": "single, dual, sequential (signing_order enforced), parallel (unique signer per slot).",
        "esig_payload": "signed_by, tenant_id, ip, user_agent, signed_at (server-derived), meaning (>=8 chars; >=80 for override/high-risk), reason (>=8), mfa_step_up_used, content_fingerprint (SHA-256).",
        "hash_chain": "approval_authority_snapshots hash-chained per (tenant_id, entity_type, target_record_id): record_hash = SHA-256(content); previous_hash = prior record_hash (BR-04-16). REAL chains computed in this dataset — verifiable.",
        "sod": "requires_sod nodes exclude created_by/last_modified_by; sod_verdict in {passed, failed, not_applicable}.",
        "reauth_mfa": "re-authentication mandatory per regulated decision; MFA step-up mandatory for high-risk nodes (§6.7.1).",
        "replay_note": "Replay prevention not explicitly specified in URS-04 (inherited from URS-01 session/CSRF); modelled here as a per-signature replay_prevention_token nonce, with a replay_attempt edge that reuses a prior token."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions, not empirical fits.",
        "approval_mode_mix": {
          "source": "engineering assumption",
          "values": {
            "single": 0.55,
            "dual": 0.2,
            "sequential": 0.15,
            "parallel": 0.1
          }
        },
        "decision_mix_final_slot": {
          "source": "engineering assumption",
          "values": {
            "approved": 0.82,
            "rejected": 0.1,
            "returned_for_correction": 0.08
          }
        },
        "high_risk_nodes": {
          "source": "URS-04 §6.7.1",
          "values": [
            "final_approval",
            "qp_release"
          ]
        },
        "entity_type_distribution": {
          "source": "engineering assumption",
          "value": "uniform over 10 regulated entity families"
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 25000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-FND-004 Workflow/HITL/E-Signature has zero public Gold Standard corpus (Plan §2.1: platform-internal behavior). G2 N/A. Approval modes, e-signature payload, hash-chain, SoD verdicts, reauth/MFA step-up and error model are grounded in URS-04 Target State, not fitted to any Gold Standard anchor."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 4600,
            "total": 25000,
            "pct": 0.184,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 2300,
              "boundary": 1100,
              "negative": 700,
              "historical-failure": 500
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "entity_type": {
                "column": "entity_type",
                "top_value": "change_control",
                "top_share": 0.107,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "b26c2b31-7256-23f1-dee2-02defcf1788a",
                "top_share": 0.0097,
                "ceiling": 0.4,
                "passed": true
              },
              "node_key": {
                "column": "node_key",
                "top_value": "final_approval",
                "top_share": 0.1784,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "e26b25d7aaf5161af754b864b102dadd874266b7ca487c04eb9aa6d7eaef92b9",
            "hash_run_2": "e26b25d7aaf5161af754b864b102dadd874266b7ca487c04eb9aa6d7eaef92b9",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_fnd_005_authority_delegation",
      "workflow_id": "WF-FND-005",
      "workflow_name": "Authority Profile / Delegation / SoD",
      "urs_refs": [
        "URS-05"
      ],
      "generator_id": "wf_fnd_005_authority_delegation",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-fnd-005-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "3e103a9f04d82e3f57d3f3d1b1e1d32a65a6e1c0e415fa4d63067d611d6ad57b",
      "output_file": "synthetic_data/wf_fnd_005_authority_delegation_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 12000,
      "content_sha256": "c76accdab2494bb7a6fed96ea1e2ae46984f7392e39192872f76ddc43131f4f6",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "e10dff7f01bec8020393feb12bd5f2571e4671c454ad8bca5297dd5ef3207ec6",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_fnd_005_authority_delegation_v1.0.review_sample.xlsx",
      "review_sample_row_count": 3350,
      "review_sample_sha256": "bc446c28134c2de998a0980867726611e98a7c5d10e55186147a57f363d0e231",
      "partition_map_file": "synthetic_data/wf_fnd_005_authority_delegation_v1.0.partition_map.parquet",
      "partition_map_hash": "bf861b1cc2c3da94a5ec4204f51635f78faa8bd306ca695ea7685a0a0ef57f7e",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 8400,
        "evaluation": 2400,
        "edge_case": 1200
      },
      "edge_case_coverage_pct": 0.1842,
      "edge_case_by_category": {
        "adversarial": 1100,
        "negative": 500,
        "boundary": 450,
        "historical-failure": 160
      },
      "bias_test_results": {
        "authority_key": {
          "column": "authority_key",
          "top_value": "qa_release_uk",
          "top_share": 0.0782,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "2ea61e07-4093-fa6a-8070-bce3afddc202",
          "top_share": 0.0108,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": {
        "source": "URS-05_Authority-Profile-Delegation-SoD.Target-State.md (§3 profiles, §6.4 delegation lifecycle + resolver, DEC-05-06 chain-depth/30-day, DEC-05-08 SoD launch set)",
        "delegation_lifecycle": "pending_acknowledgement -> active -> (used|revoked|expired); pending_acknowledgement -> expired_unacknowledged.",
        "chain_depth_1": "Delegate cannot re-delegate (DEC-05-06); attempt -> DELEGATION_CHAIN_DEPTH_EXCEEDED.",
        "duration_cap_days": 30,
        "scope_subset_rule": "Delegation scope MUST be a strict subset of the delegator's scope (BR-05-04).",
        "same_key_only": "Delegation is same-key only (qp_eu->qp_eu); cross-key -> DELEGATION_KEY_MISMATCH.",
        "non_eligible_profiles": [
          "global_quality_oversight",
          "platform_super_authority"
        ],
        "sod_tier1_rules": [
          "AUTHOR_NEQ_APPROVER",
          "REVIEWER_NEQ_FINAL_APPROVER",
          "DELEGATOR_NEQ_DELEGATE",
          "CREATOR_NEQ_EFFECTIVENESS_VERIFIER",
          "SAME_USER_TWO_PARALLEL_SLOTS_FORBIDDEN"
        ],
        "sod_eval_timing": "SoD evaluated at use time (signature), not at grant time (DEC-05-10). Verdicts: passed|denied|excepted.",
        "revocation_semantics": "Revocation/expiry does NOT invalidate already-signed decisions (DEC-05-14); only blocks future use."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions, not empirical fits.",
        "delegation_terminal_mix": {
          "source": "engineering assumption",
          "values": {
            "used": 0.55,
            "revoked": 0.2,
            "expired": 0.25
          }
        },
        "sod_normal_verdict_mix": {
          "source": "engineering assumption",
          "values": {
            "passed": 0.93,
            "excepted": 0.07
          }
        },
        "authority_key_distribution": {
          "source": "engineering assumption",
          "value": "uniform over delegation-eligible Tier-1 keys"
        },
        "delegation_path_mix_sod": {
          "source": "engineering assumption",
          "direct": 0.8,
          "via_delegation": 0.2
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 12000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-FND-005 Authority/Delegation/SoD has zero public Gold Standard corpus (Plan §2.1: platform-internal behavior). G2 N/A. Authority-profile keys, delegation lifecycle, chain-depth-1 rule, 30-day cap, the five Tier-1 SoD rules and use-time evaluation are grounded in URS-05 Target State, not fitted to any Gold Standard anchor."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 2210,
            "total": 12000,
            "pct": 0.1842,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 1100,
              "negative": 500,
              "boundary": 450,
              "historical-failure": 160
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "authority_key": {
                "column": "authority_key",
                "top_value": "qa_release_uk",
                "top_share": 0.0782,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "2ea61e07-4093-fa6a-8070-bce3afddc202",
                "top_share": 0.0108,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "c76accdab2494bb7a6fed96ea1e2ae46984f7392e39192872f76ddc43131f4f6",
            "hash_run_2": "c76accdab2494bb7a6fed96ea1e2ae46984f7392e39192872f76ddc43131f4f6",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_fnd_006_audit_trail",
      "workflow_id": "WF-FND-006",
      "workflow_name": "Audit Trail / Hash Chain / Tamper-Evident",
      "urs_refs": [
        "URS-06"
      ],
      "generator_id": "wf_fnd_006_audit_trail",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-fnd-006-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "1ce2a65016d5d70eca6bcd0c2ed8031d49ed5350b7deda0386e649cdfcdca2c8",
      "output_file": "synthetic_data/wf_fnd_006_audit_trail_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 1000000,
      "content_sha256": "8edf5928a45a4caa9dd74804639e9d75b02ee007b789ab972f1d5da2d591d038",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "2108518994b16a001980142439c78a010df24299f50c28374c895a16ed91f042",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_fnd_006_audit_trail_v1.0.review_sample.xlsx",
      "review_sample_row_count": 3350,
      "review_sample_sha256": "345fd2e1431a1246fb66daaeae3ae43ba6cfb4b4ecabd1b220727173e814050b",
      "partition_map_file": null,
      "partition_map_hash": "cf49e73418142634c62b39fb48837400c800bb650c5314f9efa8cd5227a3c7a9",
      "large_dataset": true,
      "data_artifact_committed": false,
      "regeneration_note": "Heavy data artifact (.parquet + partition_map) is gitignored. Regenerate from generator + locked seed and verify against content_sha256 (the portable G6 anchor); the Parquet file itself is byte-identical only within the recorded build_environment. Generator+seed is the ALCOA Original (Plan §4).",
      "partition_split": {
        "training": 700000,
        "evaluation": 200000,
        "edge_case": 100000
      },
      "edge_case_coverage_pct": 0.175,
      "edge_case_by_category": {
        "historical-failure": 100000,
        "adversarial": 39000,
        "boundary": 30000,
        "negative": 6000
      },
      "bias_test_results": {
        "tenant": {
          "column": "tenant_id",
          "top_value": "7a8cb375-2734-d707-f7dc-7ad981bf61ce",
          "top_share": 0.007,
          "ceiling": 0.4,
          "passed": true
        },
        "action_code": {
          "column": "action_code",
          "top_value": "CAPA_CREATED",
          "top_share": 0.1573,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": "URS-06 specifies RFC-8785 canonical JSON. For 1M-row tractability this generator uses a deterministic pipe-joined, alphabetically-ordered content serialization. The cryptographic property is identical (tamper-evident, linkage-verifiable); the test suite re-runs the URS-06 verifier algorithm with the SAME canonicalization and confirms clean chains verify and tampered chains are detected at the right sequence.",
      "model_grounding": {
        "source": "URS-06_Audit-Trail-HashChain-TamperEvident.Target-State.md (DEC-06-02 scopes, DEC-06-03 record_hash, DEC-06-04 fields/sequence, DEC-06-17 genesis, DEC-06-22 chain_id, §5/§7 tamper detection + verifier)",
        "chain_scopes": "per_entity | per_tenant | global. One event -> exactly one chain.",
        "chain_id_derivation": "per_entity = SHA-256(tenant_id:entity_type:target_record_id); per_tenant = SHA-256(tenant_id:PER_TENANT); global = SHA-256(GLOBAL).",
        "record_hash": "SHA-256(previous_hash || canonical(content)). Genesis previous_hash = SHA-256(chain_id::genesis_ts) (DEC-06-17). chain_sequence monotonic per chain, starts at 1 (genesis).",
        "canonicalization_note": "URS-06 specifies RFC-8785 canonical JSON. For 1M-row tractability this generator uses a deterministic pipe-joined, alphabetically-ordered content serialization. The cryptographic property is identical (tamper-evident, linkage-verifiable); the test suite re-runs the URS-06 verifier algorithm with the SAME canonicalization and confirms clean chains verify and tampered chains are detected at the right sequence.",
        "tamper_detection": "Verifier recomputes record_hash and checks previous_hash continuity + sequence gaps. Violations: HASH_MISMATCH, CHAIN_BREAK, SEQUENCE_GAP."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions, not empirical fits.",
        "scope_distribution": {
          "source": "URS-06 §10 guideline",
          "per_entity": 0.979,
          "per_tenant": 0.02,
          "global": 0.001
        },
        "action_code_mix": {
          "source": "engineering assumption (domain-event-dominant audit stream)"
        },
        "entity_chain_count": {
          "source": "engineering assumption",
          "value": 8000
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema_vectorized",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 1000000,
            "failures": 0,
            "by_column": {},
            "extra_columns": [],
            "missing_required": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-FND-006 Audit Trail has zero public Gold Standard corpus (Plan §2.1: platform-internal substrate). G2 N/A. Multi-scope hash-chain construction, deterministic chain_id derivation, monotonic sequencing, genesis anchoring, tamper detection and the audit-event vocabulary are grounded in URS-06 Target State, not fitted to any Gold Standard anchor."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 175000,
            "total": 1000000,
            "pct": 0.175,
            "min_required": 0.15,
            "by_category": {
              "historical-failure": 100000,
              "adversarial": 39000,
              "boundary": 30000,
              "negative": 6000
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "tenant": {
                "column": "tenant_id",
                "top_value": "7a8cb375-2734-d707-f7dc-7ad981bf61ce",
                "top_share": 0.007,
                "ceiling": 0.4,
                "passed": true
              },
              "action_code": {
                "column": "action_code",
                "top_value": "CAPA_CREATED",
                "top_share": 0.1573,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "8edf5928a45a4caa9dd74804639e9d75b02ee007b789ab972f1d5da2d591d038",
            "hash_run_2": "8edf5928a45a4caa9dd74804639e9d75b02ee007b789ab972f1d5da2d591d038",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_fnd_007_notifications",
      "workflow_id": "WF-FND-007",
      "workflow_name": "Notifications & Acknowledgments",
      "urs_refs": [
        "URS-30"
      ],
      "generator_id": "wf_fnd_007_notifications",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-fnd-007-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "332b7193081bd8a4d38ae71736e8d12c1ab8d7a84913674760aca34a474f9838",
      "output_file": "synthetic_data/wf_fnd_007_notifications_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 30000,
      "content_sha256": "b59bbcfbad40c91f822e1b5be653dda481c77d3520caba5798b163e044792497",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "3970b34516e399b2f0ee779f6f3af4a9f977135b4c6eb3cd6ec7ba346ec52b49",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_fnd_007_notifications_v1.0.review_sample.xlsx",
      "review_sample_row_count": 3600,
      "review_sample_sha256": "b4a2206e9eb939846dac88379760137eb805073c8d2165feaefb39102b2ed257",
      "partition_map_file": "synthetic_data/wf_fnd_007_notifications_v1.0.partition_map.parquet",
      "partition_map_hash": "15adcf55f9b01c9c152902c7ba6bbe1f1c09a4815d8dbf095bc8a8c626fc6e37",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 21000,
        "evaluation": 6000,
        "edge_case": 3000
      },
      "edge_case_coverage_pct": 0.19,
      "edge_case_by_category": {
        "adversarial": 1900,
        "historical-failure": 1800,
        "boundary": 1200,
        "negative": 800
      },
      "bias_test_results": {
        "event_type": {
          "column": "event_type",
          "top_value": "regulatory_submission_approved",
          "top_share": 0.0706,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "0c128840-cd4b-6977-4084-a81b5169df9e",
          "top_share": 0.0125,
          "ceiling": 0.4,
          "passed": true
        },
        "channel": {
          "column": "channel",
          "top_value": "email",
          "top_share": 0.3469,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": {
        "source": "URS-30_Notifications.Target-State.md (6.2 records, 6.4 lifecycle, DEC-30-05 mandatory-alert forcing, DEC-30-06 retry, DEC-30-09 ack signature, DEC-30-15/16 chronic-failure->finding->CAPA, NFR-30-12 5-min SLA)",
        "channels": "email, slack, teams, webhook, in_app (smtp deprecated -> email).",
        "delivery_lifecycle": "email: pending->sent | bounced | failed->pending(retry 60s/5m/30m/2h/12h, max 5)->terminal_failed. inbox: unread->read->soft_deleted.",
        "mandatory_alert_forcing": "Mandatory-alert events force email + in_app regardless of recipient preferences (DEC-30-05); must deliver on >=1 channel within 5-min SLA (NFR-30-12) or NTF_MANDATORY_ALERT_DELIVERY_FAILED + URS-21 finding.",
        "ack_signature": "Events flagged requires_acknowledgment_signature (e.g. regulatory_submission_approved) capture a bound e-signature on acknowledgment (DEC-30-09).",
        "chronic_failure": ">=5 terminal failures/channel/24h -> notification_finding_created (URS-21) -> notification_capa_linked (URS-18).",
        "not_specified_in_urs30": "ack timeout SLA, bounce_type enum, digest-batching rules for non-critical — represented with documented engineering assumptions."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions, not empirical fits.",
        "channel_mix": {
          "source": "engineering assumption",
          "values": {
            "in_app": 0.34,
            "email": 0.3,
            "slack": 0.14,
            "teams": 0.12,
            "webhook": 0.1
          }
        },
        "normal_delivery_success_rate": {
          "source": "engineering assumption",
          "value": 0.95
        },
        "in_app_ack_rate": {
          "source": "engineering assumption",
          "value": 0.7
        },
        "ack_timeout_minutes": {
          "source": "engineering assumption (URS-30 leaves SLA unspecified)",
          "value": 1440
        },
        "mandatory_sla_minutes": {
          "source": "URS-30 NFR-30-12",
          "value": 5
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 30000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-FND-007 Notifications has zero public Gold Standard corpus (Plan §2.1: platform-internal dispatcher). G2 N/A. Channels, delivery state machine, retry/terminal policy, mandatory-alert forcing, acknowledgment-signature and event-trigger catalogue are grounded in URS-30 Target State, not fitted to any Gold Standard anchor."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 5700,
            "total": 30000,
            "pct": 0.19,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 1900,
              "historical-failure": 1800,
              "boundary": 1200,
              "negative": 800
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "event_type": {
                "column": "event_type",
                "top_value": "regulatory_submission_approved",
                "top_share": 0.0706,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "0c128840-cd4b-6977-4084-a81b5169df9e",
                "top_share": 0.0125,
                "ceiling": 0.4,
                "passed": true
              },
              "channel": {
                "column": "channel",
                "top_value": "email",
                "top_share": 0.3469,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "b59bbcfbad40c91f822e1b5be653dda481c77d3520caba5798b163e044792497",
            "hash_run_2": "b59bbcfbad40c91f822e1b5be653dda481c77d3520caba5798b163e044792497",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_fnd_008_backup_restore",
      "workflow_id": "WF-FND-008",
      "workflow_name": "Backup / Restore / Operational Resilience",
      "urs_refs": [
        "URS-35"
      ],
      "generator_id": "wf_fnd_008_backup_restore",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-fnd-008-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "5931a7c214755663c832cb1e6d745d4de32dda417f92f247acd62bb6b8b5d464",
      "output_file": "synthetic_data/wf_fnd_008_backup_restore_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 3750,
      "content_sha256": "ea572bcf11cb17580a58787cc2ef76b4cfc5f2019079524b526de93ec214f313",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "99ceae733914520cb8363d6a4e8ca83c8d6cb4faac7d768c374e14bef86f5e3c",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_fnd_008_backup_restore_v1.0.review_sample.xlsx",
      "review_sample_row_count": 850,
      "review_sample_sha256": "1563652ee6bebb2cfb04d873ac88ebce38816ab98c90d385436fa1a3938c2139",
      "partition_map_file": "synthetic_data/wf_fnd_008_backup_restore_v1.0.partition_map.parquet",
      "partition_map_hash": "df9df0d0ccaaab21c34a22458da9f83142547b8544bdc1e211631669b533bd7b",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 2625,
        "evaluation": 750,
        "edge_case": 375
      },
      "edge_case_coverage_pct": 0.1813,
      "edge_case_by_category": {
        "adversarial": 180,
        "boundary": 180,
        "historical-failure": 170,
        "negative": 150
      },
      "bias_test_results": {
        "region": {
          "column": "region",
          "top_value": "us",
          "top_share": 0.3664,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "491b7fbc-5e68-dbfd-ccc0-449cdb243f08",
          "top_share": 0.0107,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": {
        "source": "URS-35_Infrastructure.Target-State.md (DEC-35-05 schedules, DEC-35-06 verification, DEC-35-07 restore lifecycle, DEC-35-08 RTO/RPO, DEC-35-11 e-sig, DEC-35-14 key rotation, DEC-35-15 system_context attribution, DEC-35-23 multi-region)",
        "backup_status": "in_execution -> executed | failed -> verified (verification record + e-sig required for verified).",
        "restore_lifecycle": "requested -> approved -> in_execution -> executed -> verified -> released_to_service | rejected_post_verification (QP co-sign at release).",
        "rto_rpo": "DR plan default RTO 240 min, RPO 15 min; DR tests record actual_rto/rpo + within_target; miss -> INFRA_DR_RTO_MISSED/RPO_MISSED -> URS-21 finding + URS-18 CAPA.",
        "verification": "checksum_only | manifest_review | restore_test; checksum_algorithm SHA-256 (DEC-35-06).",
        "system_actor": "Scheduled backups are system-generated. URS-35 DEC-35-15 permits system_context attribution for system events; per QS-2 + FND-006 QC DI-1 this generator uses NAMED principal system@verixa.internal (never blank).",
        "plan_reconciliation": "Plan §7.1 names ~3,650 daily backups + 100 restores; this set models the fuller URS-35 resilience picture (backup + restore + DR test + key rotation) summing to 3,750, backup-dominant."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions, not empirical fits.",
        "backup_type_mix": {
          "source": "engineering assumption (weekly full + daily incremental)",
          "values": {
            "incremental": 0.78,
            "full": 0.16,
            "differential": 0.06
          }
        },
        "region_mix": {
          "source": "URS-35 launch regions us/eu/in",
          "values": {
            "in": 0.36,
            "us": 0.36,
            "eu": 0.28
          }
        },
        "normal_backup_verified_rate": {
          "source": "engineering assumption",
          "value": 0.9
        },
        "rto_target_minutes": {
          "source": "URS-35 DEC-35-08",
          "value": 240
        },
        "rpo_target_minutes": {
          "source": "URS-35 DEC-35-08",
          "value": 15
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 3750,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-FND-008 Backup/Restore has zero public Gold Standard corpus (Plan §2.1: platform-internal resilience). G2 N/A. Backup/restore lifecycle, verification, RTO/RPO model, system-actor attribution and audit codes are grounded in URS-35 Target State."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 680,
            "total": 3750,
            "pct": 0.1813,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 180,
              "boundary": 180,
              "historical-failure": 170,
              "negative": 150
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "region": {
                "column": "region",
                "top_value": "us",
                "top_share": 0.3664,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "491b7fbc-5e68-dbfd-ccc0-449cdb243f08",
                "top_share": 0.0107,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "ea572bcf11cb17580a58787cc2ef76b4cfc5f2019079524b526de93ec214f313",
            "hash_run_2": "ea572bcf11cb17580a58787cc2ef76b4cfc5f2019079524b526de93ec214f313",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_fnd_009_tenant_lifecycle",
      "workflow_id": "WF-FND-009",
      "workflow_name": "Tenant Lifecycle",
      "urs_refs": [
        "URS-08"
      ],
      "generator_id": "wf_fnd_009_tenant_lifecycle",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-fnd-009-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "c366952e31ecdcf5ec6d9af6c2cb63e0a03dae3ae574f7ad26e710a9faee0f0f",
      "output_file": "synthetic_data/wf_fnd_009_tenant_lifecycle_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 500,
      "content_sha256": "b49be3adb859f2e44130d36fae1c3126f5ccb908d797b14e232070722a351e88",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "9823fbfa046e8cc05100c79a61e81599d424981392017acd1345f45a43d02935",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_fnd_009_tenant_lifecycle_v1.0.review_sample.xlsx",
      "review_sample_row_count": 280,
      "review_sample_sha256": "226570a927ff8d7ea7d460b0c9ed0a02b6bfe8b9462d19a0d0fcb21d50cb86ef",
      "partition_map_file": "synthetic_data/wf_fnd_009_tenant_lifecycle_v1.0.partition_map.parquet",
      "partition_map_hash": "ca6361f995b35e2bb6e0fbe75b86d8a3062f58ed998230b7101472492f706bd2",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 350,
        "evaluation": 100,
        "edge_case": 50
      },
      "edge_case_coverage_pct": 0.19,
      "edge_case_by_category": {
        "adversarial": 35,
        "boundary": 27,
        "negative": 22,
        "historical-failure": 11
      },
      "bias_test_results": {
        "jurisdiction": {
          "column": "legal_entity_jurisdiction",
          "top_value": "US",
          "top_share": 0.338,
          "ceiling": 0.4,
          "passed": true
        },
        "region": {
          "column": "data_residency_region",
          "top_value": "us",
          "top_share": 0.378,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "model_grounding": {
        "source": "URS-08_Tenant-Management-Lifecycle.Target-State.md (DEC-08-01 record, DEC-08-02 states, DEC-08-04 exec co-sign, DEC-08-06 suspension categories, DEC-08-07 offboarding, DEC-08-09 entitlements, DEC-08-10 residency, DEC-08-12 high-risk verticals, DEC-08-14 KYC)",
        "lifecycle_states": "pending | in_setup | active | suspended | in_offboarding | offboarded | rejected | withdrawn (DEC-08-02).",
        "suspension_categories": "payment_default | regulatory_concern | security_incident | customer_requested (DEC-08-06).",
        "offboarding_gate": "Blocked by open studies / active collaborations / open delegations / open findings (DEC-08-08).",
        "high_risk_verticals": "cell_and_gene_therapy, sterile_injectables, compounding_pharmacy, clinical_trial_sponsor, controlled_substance_manufacturer, infant_formula, vaccine_manufacturer require Founder review (DEC-08-12).",
        "residency_regions": "us | eu | in (launch); uk | ca roadmap (DEC-08-05).",
        "jurisdiction_application": "legal_entity_jurisdiction uses Decision B firm-jurisdiction mix (32 IN/32 US/15 EU/10 CN/11 other); data_residency_region derived (in/us/eu) with non-launch jurisdictions distributed eu-weighted to hold G4 headroom."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions, not empirical fits.",
        "state_distribution": {
          "source": "URS-08 §summary guidance (active-dominant)"
        },
        "seat_count_range": {
          "source": "engineering assumption",
          "min": 5,
          "max": 500
        },
        "residency_for_nonlaunch_jurisdiction": {
          "source": "engineering assumption (eu-weighted for G4 headroom)",
          "values": {
            "in": 0.2,
            "us": 0.2,
            "eu": 0.6
          }
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 500,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-FND-009 Tenant Lifecycle has zero public Gold Standard corpus (Plan §2.1: platform-internal). G2 N/A. The 8-state lifecycle, transition authorities, suspension categories, offboarding gate, residency model and entitlements are grounded in URS-08 Target State."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 95,
            "total": 500,
            "pct": 0.19,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 35,
              "boundary": 27,
              "negative": 22,
              "historical-failure": 11
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "jurisdiction": {
                "column": "legal_entity_jurisdiction",
                "top_value": "US",
                "top_share": 0.338,
                "ceiling": 0.4,
                "passed": true
              },
              "region": {
                "column": "data_residency_region",
                "top_value": "us",
                "top_share": 0.378,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "b49be3adb859f2e44130d36fae1c3126f5ccb908d797b14e232070722a351e88",
            "hash_run_2": "b49be3adb859f2e44130d36fae1c3126f5ccb908d797b14e232070722a351e88",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_gdp_001_receipt",
      "workflow_id": "WF-GDP-001",
      "workflow_name": "Goods Receipt & Inbound Inspection",
      "urs_refs": [
        "URS-34",
        "URS-09",
        "URS-10",
        "URS-11",
        "URS-23",
        "URS-33"
      ],
      "generator_id": "wf_gdp_001_receipt",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-gdp-001-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "02701ee59cf1ee28e3c658e349abde2337a3026187eb38fdaf110979f5550af9",
      "output_file": "synthetic_data/wf_gdp_001_receipt_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 50000,
      "content_sha256": "1207086ba1c4e86b0df3490c3d7df1d8a665f27a81f3dafc15487da73f6a5577",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "7f894b0c2377cf85258bc4b0ee768be9c39bdf89af1b4b54cf1fbdab98798027",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_gdp_001_receipt_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2250,
      "review_sample_sha256": "089b0e41d839333a83544f964390b4a9a966b4e90deda92cb561a8fbfc4ae2e0",
      "partition_map_file": "synthetic_data/wf_gdp_001_receipt_v1.0.partition_map.parquet",
      "partition_map_hash": "dcbacc276f2e51b11313f577a055e2ce29fb4e80ffeded38ad0cba6dbbcc0909",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 35000,
        "evaluation": 10000,
        "edge_case": 5000
      },
      "edge_case_coverage_pct": 0.162,
      "edge_case_by_category": {
        "boundary": 2900,
        "negative": 1900,
        "historical-failure": 1700,
        "adversarial": 1600
      },
      "bias_test_results": {
        "product": {
          "column": "product_id",
          "top_value": "53b43c60-4159-5ede-919c-eec5ea375e60",
          "top_share": 0.0016,
          "ceiling": 0.4,
          "passed": true
        },
        "supplier": {
          "column": "supplier_id",
          "top_value": "9ab4152d-2964-552f-a631-8fb19d2144e5",
          "top_share": 0.0026,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-34 (gdp_warehouse_receipts §6.2.1, gdp_inventory_lots §6.2.2, DEC-34-04 URS-33 QP-release inflow gate, DEC-34-05 warehouse release + e-sig, DEC-34-07 quarantine blocks dispatch, DEC-34-16 mandatory userId); Plan v1.0 §7.4 WF-GDP-001",
        "lifecycle": "received -> quarantined -> testing -> released_for_distribution | rejected | destroyed",
        "release_gate": "released_for_distribution requires qc_disposition=passed + quarantine_active=False + release e-sig by gdp_warehouse_authority != receiver (SoD).",
        "chain_link": "site_id -> MD-001; supplier_id -> MD-003; product_id -> MD-002; batch_id -> MFG-004; cold-chain-breach/rejected -> linked_deviation_id (QE-001), linked_oos_id (QE-002)."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 50000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-GDP-001: goods-receipt status/disposition mix is URS-34-grounded engineering distribution; no public warehouse-receipt corpus for a like-for-like anchor. G2 N/A."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 8100,
            "total": 50000,
            "pct": 0.162,
            "min_required": 0.15,
            "by_category": {
              "boundary": 2900,
              "negative": 1900,
              "historical-failure": 1700,
              "adversarial": 1600
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "product": {
                "column": "product_id",
                "top_value": "53b43c60-4159-5ede-919c-eec5ea375e60",
                "top_share": 0.0016,
                "ceiling": 0.4,
                "passed": true
              },
              "supplier": {
                "column": "supplier_id",
                "top_value": "9ab4152d-2964-552f-a631-8fb19d2144e5",
                "top_share": 0.0026,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "1207086ba1c4e86b0df3490c3d7df1d8a665f27a81f3dafc15487da73f6a5577",
            "hash_run_2": "1207086ba1c4e86b0df3490c3d7df1d8a665f27a81f3dafc15487da73f6a5577",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_gdp_002_cold_chain_receipt",
      "workflow_id": "WF-GDP-002",
      "workflow_name": "Cold Chain Data Verification at Receipt",
      "urs_refs": [
        "URS-34",
        "URS-09",
        "URS-10",
        "URS-11",
        "URS-15",
        "URS-16",
        "URS-23"
      ],
      "generator_id": "wf_gdp_002_cold_chain_receipt",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-gdp-002-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "c4907548e3fea70860db46a500f1f7707d52c64dc8b00f49e4fcf41c624b94e2",
      "output_file": "synthetic_data/wf_gdp_002_cold_chain_receipt_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 30000,
      "content_sha256": "a794d6153bd4c8cb0bbc4c312864d7a568aa4b498a24689f96d70f8fd02c18f8",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "891255544c7c2d81853898a2938064497f5338290b0d12690db86321dd7d0e10",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_gdp_002_cold_chain_receipt_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1600,
      "review_sample_sha256": "69815b148083d15a46f087ddb619789c58f9343d04790252ba369335046de616",
      "partition_map_file": "synthetic_data/wf_gdp_002_cold_chain_receipt_v1.0.partition_map.parquet",
      "partition_map_hash": "249263ea5939847a75d9a953afdc435e4ff84bf2a7af747423ff5184d4d82270",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 21000,
        "evaluation": 6000,
        "edge_case": 3000
      },
      "edge_case_coverage_pct": 0.16,
      "edge_case_by_category": {
        "boundary": 1800,
        "adversarial": 1200,
        "historical-failure": 1050,
        "negative": 750
      },
      "bias_test_results": {
        "product": {
          "column": "product_id",
          "top_value": "fb7bd7c2-3b26-5d01-a2f9-b5d376a4ad3b",
          "top_share": 0.0016,
          "ceiling": 0.4,
          "passed": true
        },
        "supplier": {
          "column": "supplier_id",
          "top_value": "1af1b3bb-21a2-5fb7-a787-5e26746a93d4",
          "top_share": 0.0028,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-34 (gdp_warehouse_receipts §6.2.1, gdp_cold_chain_readings §6.2.5, DEC-34-09 transport excursion -> URS-15 OOS / URS-16 deviation); Plan v1.0 §7.4 WF-GDP-002; USP <1150> MKT; WHO TRS 961",
        "mkt": "transit MKT (Arrhenius, dH/R=83144/8.314~10000K computed) adjudicates the receipt cold-chain; brief excursions within deviation budget (raw breach, MKT ok) are accepted; MKT excursion above action -> quarantine + deviation (raw_breach_minutes kept as metadata).",
        "chain_link": "MKT excursions -> linked_deviation_id (QE-001), linked_oos_id (QE-002); severe -> recall_id resolving to QE-007 cold_chain_excursion recalls (bidirectional round-trip); site->MD-001, supplier->MD-003, product->MD-002, batch->MFG-004."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 30000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-GDP-002: transit temperature/MKT are physical/computed quantities; cold-chain disposition mix is URS-34-grounded. No public cold-chain-receipt frequency corpus exists for a like-for-like anchor (Plan cites WHO TRS 961 as a design prior). G2 N/A; MKT adjudication is deterministic."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 4800,
            "total": 30000,
            "pct": 0.16,
            "min_required": 0.15,
            "by_category": {
              "boundary": 1800,
              "adversarial": 1200,
              "historical-failure": 1050,
              "negative": 750
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "product": {
                "column": "product_id",
                "top_value": "fb7bd7c2-3b26-5d01-a2f9-b5d376a4ad3b",
                "top_share": 0.0016,
                "ceiling": 0.4,
                "passed": true
              },
              "supplier": {
                "column": "supplier_id",
                "top_value": "1af1b3bb-21a2-5fb7-a787-5e26746a93d4",
                "top_share": 0.0028,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "a794d6153bd4c8cb0bbc4c312864d7a568aa4b498a24689f96d70f8fd02c18f8",
            "hash_run_2": "a794d6153bd4c8cb0bbc4c312864d7a568aa4b498a24689f96d70f8fd02c18f8",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_gdp_003_dispatch",
      "workflow_id": "WF-GDP-003",
      "workflow_name": "Outbound Dispatch & Shipment (FEFO-outbound, pick-pack-release-ship)",
      "urs_refs": [
        "URS-34",
        "URS-09",
        "URS-10",
        "URS-11",
        "URS-23"
      ],
      "generator_id": "wf_gdp_003_dispatch",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-gdp-003-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "238ca80f7aef0d88fed2ab4bb278422a6f6217aee0e3d4257261108b9c546575",
      "output_file": "synthetic_data/wf_gdp_003_dispatch_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 100000,
      "content_sha256": "5d70203ed1d026f2571ce23b76e370908b435b60681bbe3146b112840a7193ce",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "a1443a2fe4dfc1d0bb027a3c9680b483cd328839fe1b7b03ed4d7d5c0cdf3af8",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_gdp_003_dispatch_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2650,
      "review_sample_sha256": "0e78330fcee6327d99c43f3bf2c95dad38125ec91e1744dc107fa27bc4654b9e",
      "partition_map_file": "synthetic_data/wf_gdp_003_dispatch_v1.0.partition_map.parquet",
      "partition_map_hash": "c389b8b10aa2ec2349ce6ab5df62b241be6c01dbc6110709a67123804d65fd48",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 70000,
        "evaluation": 20000,
        "edge_case": 10000
      },
      "edge_case_coverage_pct": 0.1585,
      "edge_case_by_category": {
        "boundary": 6360,
        "negative": 5583,
        "adversarial": 3200,
        "historical-failure": 705
      },
      "bias_test_results": {
        "product": {
          "column": "product_id",
          "top_value": "c5af3905-705b-54cf-afe9-37de122b9cc3",
          "top_share": 0.0031,
          "ceiling": 0.4,
          "passed": true
        },
        "site": {
          "column": "site_id",
          "top_value": "f57f616a-1e4a-5df6-a50c-fd0ebb0ed249",
          "top_share": 0.0738,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-34 (gdp_dispatch_orders URS-034-005 draft->allocated->picked->packed->released; gdp_shipments v2 URS-034-006/007 ->ready->released->picked_up->in_transit->delivered, release gate authority+esig+SoD; listInventoryFEFO expiry ASC excl expired; gdp_inventory_lots status gating); Plan GDP block (outbound dispatch)",
        "fefo": "earliest-expiry-first outbound allocation among released, released_zone, non-expired, in-stock lots per (product, site); deterministic + independently recomputable (fefo_rank, fefo_earliest_expiry_date). Authorized fefo_override is a real URS-34 path (authority+esig+reason).",
        "chain_link": "dispatch lot -> product (MD-002), batch (MFG-004), site (MD-001), received_lot_id (GDP-001 received lots), consignee, carrier; transport excursion -> linked_deviation_id (QE-001), recall_id (QE-007 cold_chain_excursion round-trip)."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 100000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-GDP-003: outbound dispatch movement mix is URS-34-grounded; no public dispatch-frequency corpus. G2 N/A; FEFO-outbound + dispatch-block are deterministic rules."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 15848,
            "total": 100000,
            "pct": 0.1585,
            "min_required": 0.15,
            "by_category": {
              "boundary": 6360,
              "negative": 5583,
              "adversarial": 3200,
              "historical-failure": 705
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "product": {
                "column": "product_id",
                "top_value": "c5af3905-705b-54cf-afe9-37de122b9cc3",
                "top_share": 0.0031,
                "ceiling": 0.4,
                "passed": true
              },
              "site": {
                "column": "site_id",
                "top_value": "f57f616a-1e4a-5df6-a50c-fd0ebb0ed249",
                "top_share": 0.0738,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "5d70203ed1d026f2571ce23b76e370908b435b60681bbe3146b112840a7193ce",
            "hash_run_2": "5d70203ed1d026f2571ce23b76e370908b435b60681bbe3146b112840a7193ce",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_gdp_005_storage_fefo",
      "workflow_id": "WF-GDP-005",
      "workflow_name": "Storage Segregation Management (FEFO, status zones)",
      "urs_refs": [
        "URS-34",
        "URS-09",
        "URS-10",
        "URS-23"
      ],
      "generator_id": "wf_gdp_005_storage_fefo",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-gdp-005-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "dc8ed75da64177df4dab896c43e2f64543037098c7c150af40ae83af765f6cc0",
      "output_file": "synthetic_data/wf_gdp_005_storage_fefo_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 100000,
      "content_sha256": "93082cde8a0dba7768a1eaf5758aea86424d9cc0c8ee5a7904ec6bdf733335e1",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "f6d66bbcad77242ad68efb418d3de1dabf57d5657f925828ca495a3a9daba4d2",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_gdp_005_storage_fefo_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1700,
      "review_sample_sha256": "92ec024f7c8767f89659a248056517738cfe9fb630b3330b28e6f494afbd9353",
      "partition_map_file": "synthetic_data/wf_gdp_005_storage_fefo_v1.0.partition_map.parquet",
      "partition_map_hash": "21a8e944e2c45f3315d02b1af1392a8c9ee6a44aacbd3b9515e6266110d7c6ef",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 70000,
        "evaluation": 20000,
        "edge_case": 10000
      },
      "edge_case_coverage_pct": 0.155,
      "edge_case_by_category": {
        "adversarial": 4500,
        "negative": 4000,
        "historical-failure": 3500,
        "boundary": 3500
      },
      "bias_test_results": {
        "product": {
          "column": "product_id",
          "top_value": "1817192d-a228-553e-b381-83fe03f87104",
          "top_share": 0.0038,
          "ceiling": 0.4,
          "passed": true
        },
        "site": {
          "column": "site_id",
          "top_value": "8e20a928-5ba5-57fe-852e-14418ffa0a32",
          "top_share": 0.0731,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-34 (gdp_inventory_lots §6.2.2, DEC-34-06 FEFO allocation + lifecycle, DEC-34-07 quarantine blocks allocation); Plan v1.0 §7.4 WF-GDP-005",
        "fefo": "earliest-expiry-first allocation among released, non-expired, in-stock lots per (product, site); deterministic + independently recomputable (fefo_rank, fefo_earliest_expiry_date).",
        "chain_link": "lot -> product (MD-002), batch (MFG-004), site (MD-001), received_lot_id (GDP-001 received lots); FEFO/zone violations -> linked_deviation_id (QE-001)."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 100000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-GDP-005: storage movement / FEFO mix is URS-34-grounded; no public warehouse-movement frequency corpus. G2 N/A; FEFO allocation is a deterministic rule."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 15500,
            "total": 100000,
            "pct": 0.155,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 4500,
              "negative": 4000,
              "historical-failure": 3500,
              "boundary": 3500
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "product": {
                "column": "product_id",
                "top_value": "1817192d-a228-553e-b381-83fe03f87104",
                "top_share": 0.0038,
                "ceiling": 0.4,
                "passed": true
              },
              "site": {
                "column": "site_id",
                "top_value": "8e20a928-5ba5-57fe-852e-14418ffa0a32",
                "top_share": 0.0731,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "93082cde8a0dba7768a1eaf5758aea86424d9cc0c8ee5a7904ec6bdf733335e1",
            "hash_run_2": "93082cde8a0dba7768a1eaf5758aea86424d9cc0c8ee5a7904ec6bdf733335e1",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_gdp_006_cold_chain",
      "workflow_id": "WF-GDP-006",
      "workflow_name": "Cold Chain Storage Monitoring",
      "urs_refs": [
        "URS-34",
        "URS-09",
        "URS-15",
        "URS-16",
        "URS-23"
      ],
      "generator_id": "wf_gdp_006_cold_chain",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-gdp-006-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "bc2bbd5ef6059a4259713a031d5e712cf78ac8d2d209b35378db5aa64e318c8e",
      "output_file": "synthetic_data/wf_gdp_006_cold_chain_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 26280000,
      "content_sha256": "693003f5f47c8bb86316c0415f2372f2065147e67f79e9336036a3e377ab7a0b",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "9186d9327c7f03ab2448be3ec3fe3f2b9bd7bad38c8c6a1c0538af0cb459cdd1",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_gdp_006_cold_chain_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1500,
      "review_sample_sha256": "da7aea12d76b2f3482d5a29be11c7e11ec3cfbe547587353027a7a3c3cd47e83",
      "partition_map_file": null,
      "partition_map_hash": "c29ad5e452698e803e1ec9c78a39b31bac74bc651c001c45c96cd9ab8d99d981",
      "large_dataset": true,
      "data_artifact_committed": false,
      "regeneration_note": "Heavy data artifact (.parquet + partition_map) is gitignored. Regenerate bit-identical from generator + locked seed; verify against content_sha256. Generator+seed is the ALCOA Original (Plan §4).",
      "partition_split": {
        "training": 18396000,
        "evaluation": 5256000,
        "edge_case": 2628000
      },
      "edge_case_coverage_pct": 0.1652,
      "edge_case_by_category": {
        "boundary": 2660013,
        "adversarial": 780179,
        "historical-failure": 540221,
        "negative": 360095
      },
      "bias_test_results": {
        "storage_area": {
          "column": "storage_area_id",
          "top_value": "0a5d0551-019b-86de-3cf2-47422c91f435",
          "top_share": 0.02,
          "ceiling": 0.4,
          "passed": true
        },
        "site": {
          "column": "site_id",
          "top_value": "f57f616a-1e4a-5df6-a50c-fd0ebb0ed249",
          "top_share": 0.14,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-34 (gdp_cold_chain_readings append-only, DEC-34-09 transport excursion -> URS-15 OOS); Plan v1.0 §7.4 WF-GDP-006; USP <1150>/<1160> MKT; EU GMP Annex 11/15; WHO TRS 961",
        "mkt": "MKT = (dH/R) / -ln( mean_window( exp(-(dH/R)/T_kelvin) ) ), dH/R computed as 83144/8.314 ~ 10000 K (never hardcoded); excursion adjudicated by MKT vs labeled condition, NOT raw breach (raw_breach kept as metadata).",
        "chain_link": "MKT excursions -> linked_deviation_id (QE-001), linked_oos_id (QE-002); severe sustained -> recall_id resolving to QE-007 recalls source_type='cold_chain_excursion' (bidirectional round-trip); site_id -> MD-001."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema_vectorized",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 26280000,
            "failures": 0,
            "by_column": {},
            "extra_columns": [],
            "missing_required": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-GDP-006: temperature/humidity/MKT are physical/computed quantities with no public Gold-Standard frequency corpus (Plan cites WHO TRS 961 + stability profiles as design priors, not an empirical anchor). G2 N/A; MKT adjudication is deterministic (Arrhenius)."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 4340508,
            "total": 26280000,
            "pct": 0.1652,
            "min_required": 0.15,
            "by_category": {
              "boundary": 2660013,
              "adversarial": 780179,
              "historical-failure": 540221,
              "negative": 360095
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "storage_area": {
                "column": "storage_area_id",
                "top_value": "0a5d0551-019b-86de-3cf2-47422c91f435",
                "top_share": 0.02,
                "ceiling": 0.4,
                "passed": true
              },
              "site": {
                "column": "site_id",
                "top_value": "f57f616a-1e4a-5df6-a50c-fd0ebb0ed249",
                "top_share": 0.14,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "693003f5f47c8bb86316c0415f2372f2065147e67f79e9336036a3e377ab7a0b",
            "hash_run_2": "693003f5f47c8bb86316c0415f2372f2065147e67f79e9336036a3e377ab7a0b",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_lab_003_stability",
      "workflow_id": "WF-LAB-003",
      "workflow_name": "Stability Program (ICH Q1A)",
      "urs_refs": [
        "URS-24"
      ],
      "generator_id": "wf_lab_003_stability",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-lab-003-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "c090ab1c83ad154ac3ddaa19e609161801a6376a478044a2e7ef26a7b73fd8ee",
      "output_file": "synthetic_data/wf_lab_003_stability_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 49920,
      "content_sha256": "244d7ba63bf4d49f9766af1f8f92eaa554ddb5a09b402890755552ca0610c1b5",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "8b710483aa5631381928dbfaf5cf01dd1114a8cc6c3f1cf357494de0b7feac99",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_lab_003_stability_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2200,
      "review_sample_sha256": "4ffd90ad1caf484874cdaa70661502634227232d34fc2da62f8fb1587700631f",
      "partition_map_file": "synthetic_data/wf_lab_003_stability_v1.0.partition_map.parquet",
      "partition_map_hash": "daa7bcdf515109573ac78ebe09a98dc646663009425e92fdc969673824ffe356",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 34944,
        "evaluation": 9984,
        "edge_case": 4992
      },
      "edge_case_coverage_pct": 0.1583,
      "edge_case_by_category": {
        "negative": 2700,
        "boundary": 2300,
        "historical-failure": 1700,
        "adversarial": 1200
      },
      "bias_test_results": {
        "product": {
          "column": "product_id",
          "top_value": "c4153bc1-237e-5627-8343-c625b150ddb3",
          "top_share": 0.0038,
          "ceiling": 0.4,
          "passed": true
        },
        "test_name": {
          "column": "test_name",
          "top_value": "Assay (%)",
          "top_share": 0.25,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "stability module URS-024 (036_stability.sql tables; service.ts:868-873 spec auto-classify, 1383-1445 shelf-life, 907-973 OOS->oos_investigations); stability.schema.ts storage_type/result_status enums; Phase-1C plan LAB-003 (50k, ICH Q1A engineered prior)",
        "load_bearing": "(1) spec-breach auto-classification (result_status IFF outside spec) and (2) shelf-life = earliest-fail-1 - both independently recomputable from stored results",
        "chain_link": "product_id -> MD-002; site_id -> MD-001; batch_id -> MFG-004; linked_oos_id -> QE-002 (stability-OOS round-trip both ways)."
      },
      "assumption_register": [
        "ICH Q1A(R2) conditions (25/60, 30/65, 40/75) and timepoints (0,3,6,9,12,18,24,36 long-term; 0,3,6 accelerated; 0,3,6,9,12 intermediate) - engineered (module stores as data, not constants).",
        "Per-attribute degradation rate (linear month_rate x condition accel_factor) - engineered parametric prior; real module persists no regression (shelf-life is earliest-fail-1 heuristic).",
        "Spec limits per attribute (Assay 95-105, Impurities <=2.0, Water <=5.0, Dissolution >=80) - engineered ICH/USP-typical.",
        "linked_oos_id FK to QE-002 - the real module links stability OOS to oos_investigations via audit trail (no DB FK); modeled as an explicit FK for the synthetic join."
      ],
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 49920,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-LAB-003: ICH Q1A degradation is an ENGINEERED PARAMETRIC PRIOR (no public stability dataset), documented in assumption_register per Phase-1C plan 2.4. G2 N/A."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 7900,
            "total": 49920,
            "pct": 0.1583,
            "min_required": 0.15,
            "by_category": {
              "negative": 2700,
              "boundary": 2300,
              "historical-failure": 1700,
              "adversarial": 1200
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "product": {
                "column": "product_id",
                "top_value": "c4153bc1-237e-5627-8343-c625b150ddb3",
                "top_share": 0.0038,
                "ceiling": 0.4,
                "passed": true
              },
              "test_name": {
                "column": "test_name",
                "top_value": "Assay (%)",
                "top_share": 0.25,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "244d7ba63bf4d49f9766af1f8f92eaa554ddb5a09b402890755552ca0610c1b5",
            "hash_run_2": "244d7ba63bf4d49f9766af1f8f92eaa554ddb5a09b402890755552ca0610c1b5",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_lab_005_em",
      "workflow_id": "WF-LAB-005",
      "workflow_name": "Environmental Monitoring (sterile pilot)",
      "urs_refs": [
        "URS-25",
        "URS-09",
        "URS-15",
        "URS-16",
        "URS-23"
      ],
      "generator_id": "wf_lab_005_em",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-lab-005-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "32ef01c3a5c16579c8abd089f196dd3e93ab336eba328c718fcbcbffd6a29430",
      "output_file": "synthetic_data/wf_lab_005_em_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 876000,
      "content_sha256": "b8a4e66caa96ae897b5b64c3ee4c03445d8e717c7ad04d60e0ce84d2c9d8ac69",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "2434865d807963a1d4667b2fe3a352bbff3a6e44963eb7c7709a4a5530755536",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_lab_005_em_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1500,
      "review_sample_sha256": "bdf6128c6cf042658d6b6cbda400c774c89c31820eb8e0edc51c2fcaa24ef844",
      "partition_map_file": null,
      "partition_map_hash": "31cf31f20936fca636b64495870a1417d7a8bdd54056a6d1f955d993f1bcbf06",
      "large_dataset": true,
      "data_artifact_committed": false,
      "regeneration_note": "Heavy data artifact (.parquet + partition_map) is gitignored. Regenerate bit-identical from generator + locked seed; verify against content_sha256. Generator+seed is the ALCOA Original (Plan §4).",
      "partition_split": {
        "training": 613200,
        "evaluation": 175200,
        "edge_case": 87600
      },
      "edge_case_coverage_pct": 0.153,
      "edge_case_by_category": {
        "boundary": 89800,
        "historical-failure": 22000,
        "adversarial": 18600,
        "negative": 3600
      },
      "bias_test_results": {
        "sample_type": {
          "column": "sample_type",
          "top_value": "particulate",
          "top_share": 0.23,
          "ceiling": 0.4,
          "passed": true
        },
        "sampling_point": {
          "column": "sampling_point_id",
          "top_value": "07cc18b7-9a66-63e0-4777-c24e23f7d8ed",
          "top_share": 0.01,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [
        {
          "gold_standard_id": "GS-002",
          "derived_anchor": "gold_standard/GS-002_cfr_em_cluster_freq.json",
          "what": "FDA Part-211 facility/microbiology (EM) citation-section frequency",
          "seeds": "EM excursion implicated_cfr_section",
          "gate": "G2 chi-square goodness-of-fit (excursion subset)"
        }
      ],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-25 (DEC-25-03 grade/sample_type, DEC-25-05 alert/action/spec auto-excursion + is_alert/is_action_exceeded, DEC-25-06 record->review->approve + SoD-25-02/03, DEC-25-07 excursion lifecycle open->under_investigation->linked_to_deviation->closed, DEC-25-17 excursion->URS-16 deviation primary consumer, DEC-25-13 batch/chamber context, DEC-25-23 GenAI prohibition); ISO 14644-1; EU GMP Annex 1 (2022); USP <1116>",
        "excursion": "result_value_numeric > applied_action_value -> action excursion (mandatory investigation + URS-16 deviation); > alert -> alert excursion; at-exactly-action-limit is inclusive (action). Grade A (ISO 5) zero-tolerance: any viable>0 = action.",
        "chain_link": "action excursions link linked_deviation_id -> WF-QE-001, linked_oos_id -> WF-QE-002; site_id -> MD-001; media-fill reads batch_id -> MFG-004."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema_vectorized",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 876000,
            "failures": 0,
            "by_column": {},
            "extra_columns": [],
            "missing_required": []
          }
        },
        "G2": {
          "gate": "G2_categorical",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "chi2_stat": 0.0018907668751979763,
            "p_value": 0.9999999998592771,
            "alpha": 0.05,
            "categories": 7,
            "df": 6,
            "n_observed": 6269,
            "anchor_total": 7689,
            "min_detectable_effect_w": 0.047,
            "effect_band_detectable": "small",
            "power_note": "PASS = not inconsistent with the Gold Standard at n=6269 (detects Cohen's w >= 0.047 at 80% power); demonstrates consistency, not high-confidence identity."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 134000,
            "total": 876000,
            "pct": 0.153,
            "min_required": 0.15,
            "by_category": {
              "boundary": 89800,
              "historical-failure": 22000,
              "adversarial": 18600,
              "negative": 3600
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "sample_type": {
                "column": "sample_type",
                "top_value": "particulate",
                "top_share": 0.23,
                "ceiling": 0.4,
                "passed": true
              },
              "sampling_point": {
                "column": "sampling_point_id",
                "top_value": "07cc18b7-9a66-63e0-4777-c24e23f7d8ed",
                "top_share": 0.01,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "b8a4e66caa96ae897b5b64c3ee4c03445d8e717c7ad04d60e0ce84d2c9d8ac69",
            "hash_run_2": "b8a4e66caa96ae897b5b64c3ee4c03445d8e717c7ad04d60e0ce84d2c9d8ac69",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_md_001_sites",
      "workflow_id": "WF-MD-001",
      "workflow_name": "Site / Facility Setup",
      "urs_refs": [
        "URS-09"
      ],
      "generator_id": "wf_md_001_sites",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-md-001-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "fc21efb1d023fa3c869ea0d08b2a1f016571e008b46633a091d649f6a3811a91",
      "output_file": "synthetic_data/wf_md_001_sites_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 100,
      "content_sha256": "51a804f60fe0e96b0b4584e6992a2d17e2c3f4dcd07f122efd0f8a8b54a9b101",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "5f77b830d078ff5a2647774b8165c0d996c6413bcdf812ae85de8f1234e6c4a7",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_md_001_sites_v1.0.review_sample.xlsx",
      "review_sample_row_count": 100,
      "review_sample_sha256": "0aa63a39d89f797878c835140d94545cdde3f3a2ec281fd4dd1dbcac909e051d",
      "partition_map_file": "synthetic_data/wf_md_001_sites_v1.0.partition_map.parquet",
      "partition_map_hash": "a099d6c8e817aed3dca8df012bcf740374dc1533efd68fe3d1922cb54b4cc4db",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 70,
        "evaluation": 20,
        "edge_case": 10
      },
      "edge_case_coverage_pct": 0.2,
      "edge_case_by_category": {
        "boundary": 7,
        "adversarial": 6,
        "negative": 4,
        "historical-failure": 3
      },
      "bias_test_results": {
        "country": {
          "column": "country",
          "top_value": "IN",
          "top_share": 0.36,
          "ceiling": 0.4,
          "passed": true
        },
        "site_type": {
          "column": "site_type",
          "top_value": "manufacturing",
          "top_share": 0.37,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-09 (DEC-09-01 site types, DEC-09-03 record/lifecycle, DEC-09-04 cleanroom, DEC-09-15 high-risk, §6.6 audit codes)",
        "lifecycle_states": "planned | in_qualification | operational | suspended | decommissioned | withdrawn",
        "gxp_classification": "gmp | glp | gcp | gdp | multi",
        "cleanroom_grades": "A | B | C | D | not_applicable (EU GMP Annex 1)",
        "high_risk_types": "sterile_injectable_aseptic/terminal, biologic, controlled_substance, clinical_phase_1/2/3, compounding_pharmacy (DEC-09-15)"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "site_type_mix": {
          "source": "URS-09 §8 representative distribution",
          "values": {
            "manufacturing": 0.24,
            "warehouse": 0.22,
            "distribution_centre": 0.15,
            "laboratory": 0.14,
            "packaging": 0.09,
            "r_and_d": 0.07,
            "clinical_site": 0.06,
            "compounding_pharmacy": 0.02,
            "label_printing": 0.01
          }
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 100,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MD-001 Sites: no Gold Standard site-master corpus locked. G2 N/A. Site types, gxp classification, cleanroom grades, lifecycle and audit codes grounded in URS-09 Target State."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 20,
            "total": 100,
            "pct": 0.2,
            "min_required": 0.15,
            "by_category": {
              "boundary": 7,
              "adversarial": 6,
              "negative": 4,
              "historical-failure": 3
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "country": {
                "column": "country",
                "top_value": "IN",
                "top_share": 0.36,
                "ceiling": 0.4,
                "passed": true
              },
              "site_type": {
                "column": "site_type",
                "top_value": "manufacturing",
                "top_share": 0.37,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "51a804f60fe0e96b0b4584e6992a2d17e2c3f4dcd07f122efd0f8a8b54a9b101",
            "hash_run_2": "51a804f60fe0e96b0b4584e6992a2d17e2c3f4dcd07f122efd0f8a8b54a9b101",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_md_002_products",
      "workflow_id": "WF-MD-002",
      "workflow_name": "Product / SKU / Drug Master",
      "urs_refs": [
        "URS-10"
      ],
      "generator_id": "wf_md_002_products",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-md-002-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "8a33582b257a1a99ee76c2c77c1de03bc31d29239689b2a9731a9d15e1c37669",
      "output_file": "synthetic_data/wf_md_002_products_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 1000,
      "content_sha256": "28a0e477a6819ec6b479e457460a342907d4bac9dfd1a4e1c4b1fdf7d6a5b5d9",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "9d484cfe607bd828d38e0c7c8efffc1443d12b8c18c526f5547f35071187d82a",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_md_002_products_v1.0.review_sample.xlsx",
      "review_sample_row_count": 691,
      "review_sample_sha256": "3a6091b79ece558310828260cd90b90692b8f9c2ea0f8700cef82721c1aeffb0",
      "partition_map_file": "synthetic_data/wf_md_002_products_v1.0.partition_map.parquet",
      "partition_map_hash": "304a8dfbac5f24c0465930ae5c279b67c4fa81231d0dc621cdb34702f701af99",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 700,
        "evaluation": 200,
        "edge_case": 100
      },
      "edge_case_coverage_pct": 0.18,
      "edge_case_by_category": {
        "adversarial": 60,
        "negative": 50,
        "boundary": 40,
        "historical-failure": 30
      },
      "bias_test_results": {
        "product_category": {
          "column": "product_category",
          "top_value": "generic",
          "top_share": 0.312,
          "ceiling": 0.4,
          "passed": true
        },
        "registration_market": {
          "column": "registration_market",
          "top_value": "IN",
          "top_share": 0.319,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-10 (DEC-10-02 dose form/substance class, DEC-10-03 lifecycle, DEC-10-04 registrations, DEC-10-14 high-risk, §6.6 audit codes)",
        "lifecycle_states": "in_development | in_registration | commercial | suspended | discontinued | withdrawn",
        "substance_classes": "small_molecule_chemical | biological | vaccine | cell_therapy | gene_therapy | controlled_substance | radiopharmaceutical | combination_product",
        "registrations": "multi-jurisdiction (fda/ema/cdsco/pmda/health_canada) with pathway (nda/anda/bla/maa/cdsco_form_26/44...) and per-registration status."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "product_category_mix": {
          "source": "Plan §7.2 (generic-dominant, rebalanced for G4)",
          "values": {
            "generic": 0.33,
            "small_molecule": 0.22,
            "otc": 0.18,
            "biologic": 0.14,
            "vaccine": 0.09,
            "atmp": 0.04
          }
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 1000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MD-002 Products: no Gold Standard drug-master corpus locked for KS testing. G2 N/A. Dose forms, substance classes, lifecycle, registrations and audit codes grounded in URS-10."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 180,
            "total": 1000,
            "pct": 0.18,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 60,
              "negative": 50,
              "boundary": 40,
              "historical-failure": 30
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "product_category": {
                "column": "product_category",
                "top_value": "generic",
                "top_share": 0.312,
                "ceiling": 0.4,
                "passed": true
              },
              "registration_market": {
                "column": "registration_market",
                "top_value": "IN",
                "top_share": 0.319,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "28a0e477a6819ec6b479e457460a342907d4bac9dfd1a4e1c4b1fdf7d6a5b5d9",
            "hash_run_2": "28a0e477a6819ec6b479e457460a342907d4bac9dfd1a4e1c4b1fdf7d6a5b5d9",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_md_003_suppliers",
      "workflow_id": "WF-MD-003",
      "workflow_name": "Supplier Master & Qualification Lifecycle",
      "urs_refs": [
        "URS-11"
      ],
      "generator_id": "wf_md_003_suppliers",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-md-003-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "2e0beac7c01e06e5efd65ab1db3b88c360c7b80a50143024410a0394664c6dd5",
      "output_file": "synthetic_data/wf_md_003_suppliers_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 500,
      "content_sha256": "b2f61b47a42d740a5e51691b850f1efc9d84931eb994ccc862a1811951744146",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "a12a77556996077900e9cfb95c336204d2d61ecec944df815d65a84e741076f7",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_md_003_suppliers_v1.0.review_sample.xlsx",
      "review_sample_row_count": 500,
      "review_sample_sha256": "5a637b91a43b543a884fae7a0efdbd7142542d8611a73b0fbb110ff07c1b25af",
      "partition_map_file": "synthetic_data/wf_md_003_suppliers_v1.0.partition_map.parquet",
      "partition_map_hash": "e3c55c9f74e07d6fedf93fb6e2421a6a1ae13dfeebb66b32767828f5c0dce116",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 350,
        "evaluation": 100,
        "edge_case": 50
      },
      "edge_case_coverage_pct": 0.19,
      "edge_case_by_category": {
        "adversarial": 51,
        "boundary": 18,
        "historical-failure": 14,
        "negative": 12
      },
      "bias_test_results": {
        "jurisdiction": {
          "column": "legal_entity_jurisdiction",
          "top_value": "IN",
          "top_share": 0.342,
          "ceiling": 0.4,
          "passed": true
        },
        "supplier_type": {
          "column": "supplier_type",
          "top_value": "excipient_supplier",
          "top_share": 0.204,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [
        "GS-004"
      ],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-11 (DEC-11-01 types, DEC-11-02 criticality, DEC-11-03 lifecycle, batch material gate pre-checks, §audit codes)",
        "lifecycle_states": "under_evaluation | provisionally_qualified | qualified | suspended | disqualified | rejected",
        "criticality": "critical | major | minor (drives audit cadence: annual/biennial/triennial)",
        "supplier_types": "api_supplier, excipient_supplier, cmo, cdmo, cro, ctl, packaging_material_supplier, ... (multi-select; primary captured)",
        "statistical_seeding": "Plan §7.2: repeat-offender pattern from Gold Standard recalls (GS-004). historical_quality_event_count sampled from empirical recalls-per-firm distribution."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "supplier_type_mix": {
          "source": "engineering assumption",
          "values": {
            "excipient_supplier": 0.2,
            "api_supplier": 0.16,
            "packaging_material_supplier": 0.16,
            "cmo": 0.1,
            "cdmo": 0.08,
            "cro": 0.08,
            "ctl": 0.07,
            "equipment_vendor": 0.07,
            "distribution_provider": 0.08
          }
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 500,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "ks_stat": 0.015076923076923076,
            "p_value": 1.0,
            "alpha": 0.05
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 95,
            "total": 500,
            "pct": 0.19,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 51,
              "boundary": 18,
              "historical-failure": 14,
              "negative": 12
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "jurisdiction": {
                "column": "legal_entity_jurisdiction",
                "top_value": "IN",
                "top_share": 0.342,
                "ceiling": 0.4,
                "passed": true
              },
              "supplier_type": {
                "column": "supplier_type",
                "top_value": "excipient_supplier",
                "top_share": 0.204,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "b2f61b47a42d740a5e51691b850f1efc9d84931eb994ccc862a1811951744146",
            "hash_run_2": "b2f61b47a42d740a5e51691b850f1efc9d84931eb994ccc862a1811951744146",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_md_004_supplier_material_gate",
      "workflow_id": "WF-MD-004",
      "workflow_name": "Supplier-Product-Material Approval & Batch Material Gate",
      "urs_refs": [
        "URS-11",
        "URS-10",
        "URS-23",
        "URS-09"
      ],
      "generator_id": "wf_md_004_supplier_material_gate",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-md-004-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "6622a583c2d8b65b078c8972b5ed658f232b3f8f7058a25b9a7f3e81e20e40c2",
      "output_file": "synthetic_data/wf_md_004_supplier_material_gate_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 50000,
      "content_sha256": "33d92874621326fc697b719a10f8c94eb668e56480ed1f51535591348770afcf",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "67f2d0e1c079f9982643d6d699e9ef1e65130b429f1850e1b1a10703bf1dfb55",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_md_004_supplier_material_gate_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1650,
      "review_sample_sha256": "e40b4b5e0a0147e936129d6f648c2c21637628384a9043ff89b1bcafedb6ee1d",
      "partition_map_file": "synthetic_data/wf_md_004_supplier_material_gate_v1.0.partition_map.parquet",
      "partition_map_hash": "132c5fba5dca343b00bb6c3cb5cb9f783583de1621bb6c2536868ebe2b2decca",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 35000,
        "evaluation": 10000,
        "edge_case": 5000
      },
      "edge_case_coverage_pct": 0.18,
      "edge_case_by_category": {
        "adversarial": 5100,
        "negative": 2400,
        "historical-failure": 1000,
        "boundary": 500
      },
      "bias_test_results": {
        "material_type": {
          "column": "material_type",
          "top_value": "excipient",
          "top_share": 0.2611,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "8b7f7551-1ef4-5649-9c20-2de391a89107",
          "top_share": 0.0098,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-11 batch material gate pre-checks (supplier lifecycle_state==qualified, qualification not expired, not suspended, supplier_product_linkage exists, CMO site qualified, not disqualified); EDGE-001..007.",
        "gate_decisions": "pass | deny_supplier_unqualified | deny_supplier_expired | deny_supplier_suspended | deny_supplier_disqualified | deny_not_approved_for_product | deny_not_approved_for_site",
        "six_prechecks": "[1] qualified [2] not expired [3] not suspended [4] product linkage exists [5] CMO site qualified [6] not disqualified"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "material_type_mix": {
          "source": "engineering assumption",
          "values": {
            "excipient": 0.26,
            "api": 0.22,
            "packaging_primary": 0.18,
            "packaging_secondary": 0.14,
            "raw_material": 0.12,
            "processing_aid": 0.08
          }
        },
        "normal_pass_rate": {
          "source": "URS-11 §summary (~95% pass)",
          "value": 0.95
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 50000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MD-004 batch material gate: no Gold Standard gate-decision corpus. G2 N/A. Gate decision model + 6 pre-gate supplier-qualification checks grounded in URS-11 (forward-referenced to URS-23)."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 9000,
            "total": 50000,
            "pct": 0.18,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 5100,
              "negative": 2400,
              "historical-failure": 1000,
              "boundary": 500
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "material_type": {
                "column": "material_type",
                "top_value": "excipient",
                "top_share": 0.2611,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "8b7f7551-1ef4-5649-9c20-2de391a89107",
                "top_share": 0.0098,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "33d92874621326fc697b719a10f8c94eb668e56480ed1f51535591348770afcf",
            "hash_run_2": "33d92874621326fc697b719a10f8c94eb668e56480ed1f51535591348770afcf",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_md_005_studies",
      "workflow_id": "WF-MD-005",
      "workflow_name": "Study Configuration",
      "urs_refs": [
        "URS-07"
      ],
      "generator_id": "wf_md_005_studies",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-md-005-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "97bcbceb59a1dfeb043743979b6162dcfe201dee1bcf8b920bdc84dd4fcf15fc",
      "output_file": "synthetic_data/wf_md_005_studies_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 200,
      "content_sha256": "2e98981eb7fdfd85d74c67783abd7e086a4f039f451814469a190ad74c9a0089",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "71a10a26243272019878359ecec1cf8a86f194333d37082aababb21f184aef03",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_md_005_studies_v1.0.review_sample.xlsx",
      "review_sample_row_count": 200,
      "review_sample_sha256": "834805c3874230e5a993bfd495ede0f2d0d527821e73b5a7b7a839c983748d20",
      "partition_map_file": "synthetic_data/wf_md_005_studies_v1.0.partition_map.parquet",
      "partition_map_hash": "e95a2a523810edbf56af8fc2917ebd63dd0b1741a9f5e9c67b9af29c4bbcde93",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 140,
        "evaluation": 40,
        "edge_case": 20
      },
      "edge_case_coverage_pct": 0.195,
      "edge_case_by_category": {
        "adversarial": 14,
        "boundary": 11,
        "negative": 10,
        "historical-failure": 4
      },
      "bias_test_results": {
        "study_type": {
          "column": "study_type",
          "top_value": "stability",
          "top_share": 0.365,
          "ceiling": 0.4,
          "passed": true
        },
        "jurisdiction": {
          "column": "jurisdiction",
          "top_value": "US",
          "top_share": 0.34,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-07 (DEC-07-01 study types, DEC-07-02 lifecycle, DEC-07-06 amendments, DEC-07-13 retention, §6.6 audit codes)",
        "study_types": "stability | validation | method_validation | cleaning_validation | equipment_qualification | process_validation | audit_study | manufacturing_campaign | bioequivalence (clinical_phase_* disabled at launch)",
        "lifecycle_states": "draft | in_setup | active | on_hold | closed | archived | withdrawn",
        "plan_reconciliation": "Plan §7.2 names stability/process_validation/EM. Environmental Monitoring is URS-25 domain (not a URS-07 study_type); MD-005 covers the 9 URS-07 study-configuration types, stability/process_validation dominant.",
        "pull_schedule_acceptance_note": "URS-07 §2.2: pull_schedule and acceptance_criteria are NOT URS-07 core fields (owned by domain modules URS-13/24/25). Included here as domain-derived summary strings for the study-config fixture per Plan schema."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions for domain-derived fields (pull schedules per ICH Q1A, PV stages, acceptance criteria)."
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 200,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MD-005 Studies: no Gold Standard study-config corpus. G2 N/A. Study types, lifecycle, amendment model, retention classes and audit codes grounded in URS-07."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 39,
            "total": 200,
            "pct": 0.195,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 14,
              "boundary": 11,
              "negative": 10,
              "historical-failure": 4
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "study_type": {
                "column": "study_type",
                "top_value": "stability",
                "top_share": 0.365,
                "ceiling": 0.4,
                "passed": true
              },
              "jurisdiction": {
                "column": "jurisdiction",
                "top_value": "US",
                "top_share": 0.34,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "2e98981eb7fdfd85d74c67783abd7e086a4f039f451814469a190ad74c9a0089",
            "hash_run_2": "2e98981eb7fdfd85d74c67783abd7e086a4f039f451814469a190ad74c9a0089",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_mfg_004_bmr",
      "workflow_id": "WF-MFG-004",
      "workflow_name": "Batch Manufacturing Record / MBR",
      "urs_refs": [
        "URS-23"
      ],
      "generator_id": "wf_mfg_004_bmr",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-mfg-004-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "1cf279134d70155d55f6a3a15e4c98373b66853789889ce0c51269fac4d31555",
      "output_file": "synthetic_data/wf_mfg_004_bmr_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 2000,
      "content_sha256": "62d77607450417399a0060774684c4b3d6b3e424020b9d4f676c0417eabda316",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "53e543ec891f7f835fab3c6d5d347066ec792a967c921ba5e0fdb16b528e8e3b",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_mfg_004_bmr_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1060,
      "review_sample_sha256": "4b981838862e5b7ea1c1cd168800316c648e0ea21971be5602ae41ea00dc0d8f",
      "partition_map_file": "synthetic_data/wf_mfg_004_bmr_v1.0.partition_map.parquet",
      "partition_map_hash": "fa150e036a4507791523988774b5f6eccefc33016ead8b8296bd490ff5ccbe31",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 1400,
        "evaluation": 400,
        "edge_case": 200
      },
      "edge_case_coverage_pct": 0.185,
      "edge_case_by_category": {
        "negative": 120,
        "adversarial": 100,
        "boundary": 80,
        "historical-failure": 70
      },
      "bias_test_results": {
        "site": {
          "column": "site_id",
          "top_value": "5bd817d9-520e-598a-9cec-88b19a9cc2ba",
          "top_share": 0.0165,
          "ceiling": 0.4,
          "passed": true
        },
        "product": {
          "column": "product_id",
          "top_value": "601331b1-f202-55df-8025-0536962e33f5",
          "top_share": 0.004,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-23 (DEC-23-08 terminal-state gate, DEC-23-09 ownership, DEC-23-11 completeness, DEC-23-12 e-sig, DEC-23-13 SoD QA!=QP, DEC-23-22 governed reopen, audit codes)",
        "lifecycle": "created -> in_progress -> execution_complete -> under_qa_review -> pending_release -> released|rejected -> locked; governed reopen locked -> in_progress (append-only iteration)",
        "ipc": "in_spec vs [spec_low,spec_high]; out-of-spec must link a URS-16 deviation (completeness DEC-23-11)",
        "sod": "SoD-23-04 QA!=executor; SoD-23-05 QP!=QA (waiver only with documented policy)"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "normal_yield_range": {
          "source": "engineering assumption",
          "values": [
            90,
            102
          ]
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 2000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MFG-004 BMR: no Gold Standard batch-record corpus. G2 N/A. Batch lifecycle, IPC structure, yield, embedded-deviation linkage, SoD and audit codes grounded in URS-23."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 370,
            "total": 2000,
            "pct": 0.185,
            "min_required": 0.15,
            "by_category": {
              "negative": 120,
              "adversarial": 100,
              "boundary": 80,
              "historical-failure": 70
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "site": {
                "column": "site_id",
                "top_value": "5bd817d9-520e-598a-9cec-88b19a9cc2ba",
                "top_share": 0.0165,
                "ceiling": 0.4,
                "passed": true
              },
              "product": {
                "column": "product_id",
                "top_value": "601331b1-f202-55df-8025-0536962e33f5",
                "top_share": 0.004,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "62d77607450417399a0060774684c4b3d6b3e424020b9d4f676c0417eabda316",
            "hash_run_2": "62d77607450417399a0060774684c4b3d6b3e424020b9d4f676c0417eabda316",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_mfg_005_batch_release",
      "workflow_id": "WF-MFG-005",
      "workflow_name": "Batch Release & Disposition",
      "urs_refs": [
        "URS-23",
        "URS-04",
        "URS-05"
      ],
      "generator_id": "wf_mfg_005_batch_release",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-mfg-005-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "c1067f51572dd3de32b3e1bb2c98dae4b2cfe146994c4f1b03a56406a315dee1",
      "output_file": "synthetic_data/wf_mfg_005_batch_release_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 2000,
      "content_sha256": "ba4a96969f55ee5d66d6e65dc1273e67024d6786345cf70525d10284451bffd6",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "9d09e5cb1ae456e2e960d41582c7253170658a3078a2e49b87f49046368a40c8",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_mfg_005_batch_release_v1.0.review_sample.xlsx",
      "review_sample_row_count": 670,
      "review_sample_sha256": "5d642b692e03b56473de43d0fdfef987296cb381e4d7b8637a58f5388c396e8c",
      "partition_map_file": "synthetic_data/wf_mfg_005_batch_release_v1.0.partition_map.parquet",
      "partition_map_hash": "dde240d804675aee8c11aaa55047de894718476917c3b3fb747a60d4e7bf855a",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 1400,
        "evaluation": 400,
        "edge_case": 200
      },
      "edge_case_coverage_pct": 0.185,
      "edge_case_by_category": {
        "adversarial": 160,
        "negative": 100,
        "boundary": 80,
        "historical-failure": 30
      },
      "bias_test_results": {
        "site": {
          "column": "site_id",
          "top_value": "ab9a0cdc-487b-5f49-848c-4a22bab627f3",
          "top_share": 0.0165,
          "ceiling": 0.4,
          "passed": true
        },
        "product": {
          "column": "product_id",
          "top_value": "bace0adc-4d82-560b-a85c-60b945c56e9c",
          "top_share": 0.0035,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-23 (DEC-23-11 pre-release completeness, DEC-23-12 e-sig, DEC-23-13 QA/QP SoD); Verification Plan §8 control #9 (release blocked when any pre-release control fails: open deviation, expired training, expired SOP, supplier suspended, equipment out of cal-status)",
        "release_lifecycle": "pending_review -> under_qa_review -> qa_reviewed -> pending_release -> released | rejected",
        "disposition": "released | rejected | quarantined | conditional",
        "pre_release_gate": "release blocked when any control fails (URS-15 OOS unresolved, URS-16 open deviation, URS-28 expired training, URS-12 superseded SOP, URS-11 supplier suspended)",
        "sod": "SoD-23-05 QP releaser != QA reviewer (waiver only with documented policy)"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions."
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 2000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MFG-005 Batch Release: no Gold Standard release-decision corpus. G2 N/A. Release lifecycle, disposition, QP authority, pre-release gate blocking and audit codes grounded in URS-23 (+§8 GxP-critical control #9 Batch Release Gate)."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 370,
            "total": 2000,
            "pct": 0.185,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 160,
              "negative": 100,
              "boundary": 80,
              "historical-failure": 30
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "site": {
                "column": "site_id",
                "top_value": "ab9a0cdc-487b-5f49-848c-4a22bab627f3",
                "top_share": 0.0165,
                "ceiling": 0.4,
                "passed": true
              },
              "product": {
                "column": "product_id",
                "top_value": "bace0adc-4d82-560b-a85c-60b945c56e9c",
                "top_share": 0.0035,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "ba4a96969f55ee5d66d6e65dc1273e67024d6786345cf70525d10284451bffd6",
            "hash_run_2": "ba4a96969f55ee5d66d6e65dc1273e67024d6786345cf70525d10284451bffd6",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_mfg_gate_gmp",
      "workflow_id": "WF-MFG-GATE",
      "workflow_name": "GMP Manufacturing Gates (10-point composite)",
      "urs_refs": [
        "URS-33",
        "URS-23",
        "URS-11",
        "URS-10",
        "URS-09",
        "URS-28",
        "URS-12",
        "URS-16",
        "URS-15",
        "URS-18",
        "URS-04",
        "URS-05",
        "URS-06"
      ],
      "generator_id": "wf_mfg_gate_gmp",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-mfg-gate-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "d73f6937c574d05205a0d0d23507e257cc6774cc8bf8bdf93b4ae95a1f1b5373",
      "output_file": "synthetic_data/wf_mfg_gate_gmp_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 100000,
      "content_sha256": "858f7c371728e2a633d063cfe5468968141cbbb633715c3af9a79db6083c8b2c",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "a0de872dad54df4848bab0bdb1c66136e11eae9d8a674fae0b1e6110b3f37b05",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_mfg_gate_gmp_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1700,
      "review_sample_sha256": "8d0af074be9a5fabebb999b880c4de040ccb12dfb433196866c063e80789f4db",
      "partition_map_file": "synthetic_data/wf_mfg_gate_gmp_v1.0.partition_map.parquet",
      "partition_map_hash": "9c2ddbeb88412770527757d8d46f3fdc614a53f01c86142612932c8c87cd48d2",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 70000,
        "evaluation": 20000,
        "edge_case": 10000
      },
      "edge_case_coverage_pct": 0.18,
      "edge_case_by_category": {
        "adversarial": 7400,
        "boundary": 3600,
        "negative": 3600,
        "historical-failure": 3400
      },
      "bias_test_results": {
        "site": {
          "column": "site_id",
          "top_value": "b6e3c07e-8cc8-53f9-9ad3-e76acedbbe8a",
          "top_share": 0.0109,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "d346697c-a25f-510a-b4cb-c12ed2fe887b",
          "top_share": 0.0097,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "Verification Plan v1.0 §6.5 (the composite Batch-Level Pre-Execution and Pre-Release Gate — 10 sub-checks; §8 GxP-critical controls #9/#10); URS-33 (DEC-33-07/10/11 cross-module gates: material released-for-use, equipment qualified, process/cleaning validation)",
        "reconciliation": "WF-MFG-GATE models the Verification Plan §6.5 COMPOSITE 10-point gate (validation-scope control). URS-33 implements a subset (3-4 cross-module gates) at batch_start; the §6.5 composite additionally spans URS-11 supplier, URS-28 training, URS-12 SOP, URS-16/15/18 open quality events, URS-04/05 authority/e-sig, URS-06 atomic audit. Documented per no-drift discipline.",
        "ten_checks": [
          "material_lot_approved (URS-11/23)",
          "supplier_product_material_link_active (URS-11/10)",
          "supplier_site_qualification_valid (URS-11/09)",
          "operator_training_active (URS-28)",
          "current_sop_mbr_effective (URS-12/23)",
          "equipment_qual_cal_pm_status (URS-33 read-only/attestation §13)",
          "env_stability_status_acceptable (URS-25/24)",
          "open_deviation_oos_capa_do_not_block (URS-16/15/18)",
          "authority_esignature_for_regulated_transition (URS-04/05)",
          "audit_event_written_atomically (URS-06)"
        ],
        "gate_decision": "pass (all 10 true) | block_<check> (first failing); each sub-check exercised positive AND negative per §6.5"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "gate_type_mix": {
          "source": "engineering assumption",
          "values": {
            "pre_execution": 0.6,
            "pre_release": 0.4
          }
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 100000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MFG-GATE: no Gold Standard gate corpus. G2 N/A. The 10-point composite gate is grounded in Verification Plan §6.5 (named GxP-critical control); URS-33 implements a cross-module subset."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 18000,
            "total": 100000,
            "pct": 0.18,
            "min_required": 0.15,
            "by_category": {
              "adversarial": 7400,
              "boundary": 3600,
              "negative": 3600,
              "historical-failure": 3400
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "site": {
                "column": "site_id",
                "top_value": "b6e3c07e-8cc8-53f9-9ad3-e76acedbbe8a",
                "top_share": 0.0109,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "d346697c-a25f-510a-b4cb-c12ed2fe887b",
                "top_share": 0.0097,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "858f7c371728e2a633d063cfe5468968141cbbb633715c3af9a79db6083c8b2c",
            "hash_run_2": "858f7c371728e2a633d063cfe5468968141cbbb633715c3af9a79db6083c8b2c",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_mira_001_provenance",
      "workflow_id": "WF-MIRA-001",
      "workflow_name": "AI Inference Provenance & Advisory Segregation",
      "urs_refs": [
        "URS-32",
        "URS-12"
      ],
      "generator_id": "wf_mira_001_provenance",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-mira-001-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "9eefa158e36181755d237789e09c7926891e060456fd0ee022a253a04529f76d",
      "output_file": "synthetic_data/wf_mira_001_provenance_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 60000,
      "content_sha256": "27cad3fd3658a9a2f8a92a5587185f1969f2eb15055aa4a77f1dcfb1d44ddae3",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "d315b9a9cc13bfa6df0daf3ebac4049c6e36b069af01e9c60aead58a47fcd7be",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_mira_001_provenance_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2450,
      "review_sample_sha256": "5f3423829c3aa5f0d10164f3a362d0ea8535fead84a6494a2d5a4413857beda6",
      "partition_map_file": "synthetic_data/wf_mira_001_provenance_v1.0.partition_map.parquet",
      "partition_map_hash": "2cd3c47453f41f582b0bd93b4c46e64f1a5bcc9c332c638ca4022f76efa55f80",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 42000,
        "evaluation": 12000,
        "edge_case": 6000
      },
      "edge_case_coverage_pct": 0.1667,
      "edge_case_by_category": {
        "negative": 4400,
        "boundary": 2200,
        "adversarial": 1800,
        "historical-failure": 1600
      },
      "bias_test_results": {
        "tenant": {
          "column": "tenant_id",
          "top_value": "06fecce3-10e2-5911-8892-6be114cd5b21",
          "top_share": 0.0833,
          "ceiling": 0.4,
          "passed": true
        },
        "request_type": {
          "column": "request_type",
          "top_value": "document_review",
          "top_share": 0.1262,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": "Engineering verification of the AI-inference provenance & advisory-segregation control surface: every inference logged to ai_requests/llm_audit_log with model name+version, prompt/input hashes, output, confidence band, triggering userId and human accept/reject/modify; advisory output NEVER written to a GxP record field as a system-generated fact; confidence->HITL gating; override provenance (was_overridden + override_by + override_reason + original_output); and llm_audit_log SHA-256 hash-chain integrity. This dataset is DETERMINISTICALLY synthesized - it does NOT, and cannot, validate real generative-model behaviour (hallucination, confidence calibration, prompt-injection resistance); that is a separate intended use deferred to Phase 2B (MIRA real-model validation). Grounded in ai_requests (mig 016/227) + llm_audit_log (mig 049).",
      "canonicalization_note": null,
      "model_grounding": {
        "source": "ai_requests (mig 016, model_metadata mig 227), llm_audit_log (mig 049 hash chain; llm-audit.service.ts logCall/verifyChain), ai-gateway.service.ts (confidence gating, override tracking, ADVISORY_AI_PURPOSES_SKIP_HITL_DECISION), mira.service.ts SD-035 advisory-only; ARCH-AI-001 AC-2/3/5, DEC-31-13/14",
        "load_bearing": "llm_audit_log SHA-256 hash chain (independently recomputable) + advisory-never-system-of-record segregation",
        "chain_link": "tenant_id -> tenant pool; user_id/decided_by/override_by -> user pool; model_id -> ai_model pool (shared with WF-MIRA-003 registry); hitl_decision_id -> hitl_decision pool (shared with WF-MIRA-002)."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 60000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MIRA-001: AI-inference provenance mix is URS-32-grounded; no public AI-governance-event frequency corpus. G2 N/A; the hash chain + gating are deterministic rules."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 10000,
            "total": 60000,
            "pct": 0.1667,
            "min_required": 0.15,
            "by_category": {
              "negative": 4400,
              "boundary": 2200,
              "adversarial": 1800,
              "historical-failure": 1600
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "tenant": {
                "column": "tenant_id",
                "top_value": "06fecce3-10e2-5911-8892-6be114cd5b21",
                "top_share": 0.0833,
                "ceiling": 0.4,
                "passed": true
              },
              "request_type": {
                "column": "request_type",
                "top_value": "document_review",
                "top_share": 0.1262,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "27cad3fd3658a9a2f8a92a5587185f1969f2eb15055aa4a77f1dcfb1d44ddae3",
            "hash_run_2": "27cad3fd3658a9a2f8a92a5587185f1969f2eb15055aa4a77f1dcfb1d44ddae3",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_mira_002_hitl",
      "workflow_id": "WF-MIRA-002",
      "workflow_name": "HITL Gating, Escalation & Override",
      "urs_refs": [
        "URS-32",
        "URS-12"
      ],
      "generator_id": "wf_mira_002_hitl",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-mira-002-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "0dd1c641997175e24d9ed12102aa520f6875881d7768f4f94170640a25ac9861",
      "output_file": "synthetic_data/wf_mira_002_hitl_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 40000,
      "content_sha256": "2bf6b1316e945ca3359e99df4c04ab30f6c25d1c5555e6e14b7900994c19e4a4",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "bb78f7c2f4e1d0e6b57120131aad5947ab4f8131192b1a2b9ef1d7c087f80d9c",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_mira_002_hitl_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2450,
      "review_sample_sha256": "82ec190aeeed87d3c2c70d8659e14a5cd0a33cc7efcdaf9d66617558fa8a6105",
      "partition_map_file": "synthetic_data/wf_mira_002_hitl_v1.0.partition_map.parquet",
      "partition_map_hash": "77c3ffb11cffff1bde2b22777d3882fe96e6aff0d719ca14e2fee2ebde1315f5",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 28000,
        "evaluation": 8000,
        "edge_case": 4000
      },
      "edge_case_coverage_pct": 0.1825,
      "edge_case_by_category": {
        "negative": 3300,
        "boundary": 2000,
        "adversarial": 1000,
        "historical-failure": 1000
      },
      "bias_test_results": {
        "entity_type": {
          "column": "entity_type",
          "top_value": "risk_assessment",
          "top_share": 0.2024,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "06fecce3-10e2-5911-8892-6be114cd5b21",
          "top_share": 0.0864,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": "Engineering verification of the Human-in-the-Loop (HITL) decision control surface: AI-cannot-finalize (every completed/finalized decision attributable to a HUMAN with an e-signature, never the AI), separation-of-duties (decider != assigner), escalation legality (reason in sla_breach/max_iterations/manual/auto_escalation, escalated to higher authority), iteration bound (iteration_number <= gate max_iterations), and SLA enforcement (due_at / sla_warning_sent / breach->escalation). Grounded in hitl_decisions (mig 015) + hitl_escalation_log (mig 015) + hitl_gate_config (mig 014). DETERMINISTIC control-surface - NOT real-model validation (Phase 2B). Closes the hitl_decision_id join opened by WF-MIRA-001.",
      "canonicalization_note": null,
      "model_grounding": {
        "source": "hitl_decisions (mig 015), hitl_escalation_log (mig 015), hitl_gate_config (mig 014); hitl/MODULE.md (submit/override e-signature required); ai-gateway.service.ts HITL creation; ARCH-AI-001 AC-4/6, J-22, DEC-31-23",
        "load_bearing": "AI-cannot-finalize (human + e-sig on every completed decision; AI never finalizes) + escalation legality + iteration/SLA bounds - independently recomputable against the gate rules",
        "chain_link": "id -> hitl_decision pool (closes WF-MIRA-001.hitl_decision_id); ai_request_id -> the triggering AI inference (WF-MIRA-001 surface); tenant -> tenant pool; assigned_to/decided_by/escalated_to -> user pool; entity_id -> the GxP record (deviation QE-001 / capa / finding / risk pools)."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 40000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MIRA-002: HITL decision/escalation mix is URS-32-grounded; no public HITL-decision frequency corpus. G2 N/A; gating/escalation are deterministic rules."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 7300,
            "total": 40000,
            "pct": 0.1825,
            "min_required": 0.15,
            "by_category": {
              "negative": 3300,
              "boundary": 2000,
              "adversarial": 1000,
              "historical-failure": 1000
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "entity_type": {
                "column": "entity_type",
                "top_value": "risk_assessment",
                "top_share": 0.2024,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "06fecce3-10e2-5911-8892-6be114cd5b21",
                "top_share": 0.0864,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "2bf6b1316e945ca3359e99df4c04ab30f6c25d1c5555e6e14b7900994c19e4a4",
            "hash_run_2": "2bf6b1316e945ca3359e99df4c04ab30f6c25d1c5555e6e14b7900994c19e4a4",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_mira_003_model_registry",
      "workflow_id": "WF-MIRA-003",
      "workflow_name": "Model Registry, Change-Control & Drift",
      "urs_refs": [
        "URS-32",
        "URS-12"
      ],
      "generator_id": "wf_mira_003_model_registry",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-mira-003-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "62522561066d98daec55f8b6a8f8841eb29498f1b4085ac67fca0687a35f8322",
      "output_file": "synthetic_data/wf_mira_003_model_registry_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 24000,
      "content_sha256": "c282212f3e96cbe17be40afb056ca912fc79f0a33dd4344dc4aa8c6bc1b8aaa2",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "cd3d130337285fd08aaae5cf6b01bc58eddcdf134ee3650c8c8070c0c3570dbf",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_mira_003_model_registry_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2300,
      "review_sample_sha256": "051784d17f6bc50c7636b6495d19b689a6bf55ad4f89cb691c72354a14775913",
      "partition_map_file": "synthetic_data/wf_mira_003_model_registry_v1.0.partition_map.parquet",
      "partition_map_hash": "18f6335f3eac4ad9da5f22748f22d75e40fb36bda0032b61580fe524903db0a8",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 16800,
        "evaluation": 4800,
        "edge_case": 2400
      },
      "edge_case_coverage_pct": 0.1708,
      "edge_case_by_category": {
        "negative": 1700,
        "boundary": 1200,
        "historical-failure": 600,
        "adversarial": 600
      },
      "bias_test_results": {
        "model_type": {
          "column": "model_type",
          "top_value": "deviation_predictor",
          "top_share": 0.2,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "dbe35d12-d4db-550a-b2ca-8bc24ee425c4",
          "top_share": 0.089,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": "Engineering verification of the AI model registry / change-control / drift-monitoring control surface: model qualification lifecycle (pending->validating->validated->qualified->deployed; deprecated/blocked), validation-before-deployment (qualified+active requires validated + qualification evidence + measured metric >= acceptance threshold), change-control on version/prompt/retrain changes, drift detection (measured metric < acceptance threshold -> drift_detected -> revalidation), intended-use statement + risk classification, train/test separation (Annex 22 5.3), and requires_review on predictions. Grounded in model_registry (mig 049) + ai_model_registry (mig 021) + prediction_log (mig 021) + ai_scoring_results/ai_scoring_models (mig 045). DETERMINISTIC control-surface - NOT real-model validation (Phase 2B). Closes model_id -> ai_model (completes model->inference->HITL chain with WF-MIRA-001/002). NO real LLM.",
      "canonicalization_note": null,
      "model_grounding": {
        "source": "model_registry (mig 049 qualification_status/evidence/is_active), ai_model_registry (mig 021 validation_status/model_type/hyperparameters), prediction_log (mig 021 requires_review/confidence_band), ai_scoring_results+ai_scoring_models (mig 045); EU Annex 22 5-6, FDA CSA, GMLP 6; QS-22/24",
        "load_bearing": "qualification gate (validated + evidence + metric>=threshold + change-control before active-qualified) + drift detection (drift_detected == metric<threshold) - independently recomputable per model row",
        "chain_link": "model_id -> ai_model pool (closes WF-MIRA-001.model_id; completes model->inference->HITL); tenant -> tenant pool; qualified_by -> user pool; change_control_id -> change_control pool (CRC change-control)."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 24000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-MIRA-003: model lifecycle/drift mix is URS-32-grounded; no public model-registry frequency corpus. G2 N/A; qualification/drift are deterministic rules."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 4100,
            "total": 24000,
            "pct": 0.1708,
            "min_required": 0.15,
            "by_category": {
              "negative": 1700,
              "boundary": 1200,
              "historical-failure": 600,
              "adversarial": 600
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "model_type": {
                "column": "model_type",
                "top_value": "deviation_predictor",
                "top_share": 0.2,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "dbe35d12-d4db-550a-b2ca-8bc24ee425c4",
                "top_share": 0.089,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "c282212f3e96cbe17be40afb056ca912fc79f0a33dd4344dc4aa8c6bc1b8aaa2",
            "hash_run_2": "c282212f3e96cbe17be40afb056ca912fc79f0a33dd4344dc4aa8c6bc1b8aaa2",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_qe_001_deviation",
      "workflow_id": "WF-QE-001",
      "workflow_name": "Deviation Management",
      "urs_refs": [
        "URS-16"
      ],
      "generator_id": "wf_qe_001_deviation",
      "generator_version": "1.1.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-qe-001-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "05f00966a5ea4454cde7e749ec4b462fc29e900bac536dc3efbf4a5880953175",
      "output_file": "synthetic_data/wf_qe_001_deviation_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 5000,
      "content_sha256": "6bdbf98e0a8d78dfb752e1784bed39414a311c5e647409bb4700d89062d0679d",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "574a513c6b21978ff2e4486af16d8e3e4afb6d0fd4ca2a5716a3f0fa4db306e3",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_qe_001_deviation_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1200,
      "review_sample_sha256": "88a63d5aa9d271e6818c3e5a16666c5eb54a7e5ae8bc0d9d6994674e3bcb6aad",
      "partition_map_file": "synthetic_data/wf_qe_001_deviation_v1.0.partition_map.parquet",
      "partition_map_hash": "808a86b19d0a7ed51bfc37b8397c32d31dc4797c440cd5fa8cdef2f7b5f09817",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 3500,
        "evaluation": 1000,
        "edge_case": 500
      },
      "edge_case_coverage_pct": 0.16,
      "edge_case_by_category": {
        "boundary": 260,
        "negative": 210,
        "historical-failure": 180,
        "adversarial": 150
      },
      "bias_test_results": {
        "practice": {
          "column": "practice",
          "top_value": "gmp",
          "top_share": 0.3388,
          "ceiling": 0.4,
          "passed": true
        },
        "product": {
          "column": "product_id",
          "top_value": "8f9ae082-722c-5a26-9a85-ab09c5307c2e",
          "top_share": 0.003,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-16 (DEC-16-02 state machine, DEC-16-05 severity, DEC-16-06 severity-driven closure matrix, DEC-16-08 immutability, DEC-16-09 void, DEC-16-21 critical exec co-sign, DEC-16-22 governed reopen, SoD-16-01/04/07, DEC-16-19 GenAI prohibition)",
        "lifecycle": "draft -> investigating -> closed; draft|investigating -> voided; closed -> investigating (governed reopen, append-only iteration). Closed/voided immutable (DEC-16-08).",
        "severity_matrix": "minor=2 sigs (QA reviewer + closure authority); major=3 (+practice lead); critical=4 (+executive authority, DEC-16-21)",
        "sod": "SoD-16-01 discoverer!=investigator; SoD-16-04 investigator!=closure authority; SoD-16-07 critical closure!=executive co-signer",
        "rca_capa_attachment": "severity-driven (URS-16 §16/§20): RCA_RATE minor/major/critical=0.10/0.75/0.97; CAPA conditional on RCA (P(CAPA|RCA)=0.50/0.80/0.95); attached only for investigating/closed deviations."
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "linkage_rates": {
          "source": "URS-16 §17 synthetic guidance",
          "values": {
            "batch": 0.3,
            "rca": 0.05,
            "capa": 0.08,
            "linked_change": 0.02,
            "linked_oos": 0.02
          }
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 5000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-QE-001 Deviation: practice-by-severity mix and lifecycle are URS-16-grounded engineering distributions; the Gold Standard 483/citation corpora (GS-002/005) describe inspector observations, not internal deviation severity frequency, so no like-for-like KS anchor. G2 N/A; classification/state-machine/SoD/escalation grounded in URS-16 (DEC-16-02/05/06/21/22)."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 800,
            "total": 5000,
            "pct": 0.16,
            "min_required": 0.15,
            "by_category": {
              "boundary": 260,
              "negative": 210,
              "historical-failure": 180,
              "adversarial": 150
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "practice": {
                "column": "practice",
                "top_value": "gmp",
                "top_share": 0.3388,
                "ceiling": 0.4,
                "passed": true
              },
              "product": {
                "column": "product_id",
                "top_value": "8f9ae082-722c-5a26-9a85-ab09c5307c2e",
                "top_share": 0.003,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "6bdbf98e0a8d78dfb752e1784bed39414a311c5e647409bb4700d89062d0679d",
            "hash_run_2": "6bdbf98e0a8d78dfb752e1784bed39414a311c5e647409bb4700d89062d0679d",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_qe_002_oos",
      "workflow_id": "WF-QE-002",
      "workflow_name": "OOS/OOT Investigation",
      "urs_refs": [
        "URS-15"
      ],
      "generator_id": "wf_qe_002_oos",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-qe-002-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "22222b11e60fecf26ff2c9b88e0911aa77579154aae23470cae5cf55fc6c6ca8",
      "output_file": "synthetic_data/wf_qe_002_oos_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 3000,
      "content_sha256": "ff3354355eb3c4fef9d902763810cc1b3919388f919a9f05dba015308c4c6663",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "ced4b9f34e29bf0bdb7653247435a0ce1c0f61fad404d0aaf6ed1122c174690c",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_qe_002_oos_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1560,
      "review_sample_sha256": "7bd9b976898cc08d0cde3365042813390e99836b60f807d08d2ea0fe35b9626d",
      "partition_map_file": "synthetic_data/wf_qe_002_oos_v1.0.partition_map.parquet",
      "partition_map_hash": "bae95162e6cae344b847d7fb7f1f99e290dc3de0d7708c137bb0dc934019eed2",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 2100,
        "evaluation": 600,
        "edge_case": 300
      },
      "edge_case_coverage_pct": 0.16,
      "edge_case_by_category": {
        "negative": 170,
        "boundary": 130,
        "adversarial": 100,
        "historical-failure": 80
      },
      "bias_test_results": {
        "test_name": {
          "column": "test_name",
          "top_value": "assay",
          "top_share": 0.135,
          "ceiling": 0.4,
          "passed": true
        },
        "product": {
          "column": "product_id",
          "top_value": "361b8b76-02e2-544e-88bf-ed3bc3f9e1db",
          "top_share": 0.0037,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-15 (DEC-15-02 state machine, DEC-15-04 Phase I, DEC-15-05 Phase II OOS-REQ-006 parent advance, DEC-15-06 retest USP<1010>, DEC-15-07 disposition, DEC-15-09 OOT adjudication, DEC-15-11 deterministic statistical engine, DEC-15-21 inconclusive exec co-sign, SoD-15-02/03/04, DEC-15-18 GenAI prohibition)",
        "lifecycle_oos": "opened -> phase1_in_progress -> closed_lab_error | phase2_in_progress -> retest_in_progress -> pending_disposition -> confirmed|invalidated|inconclusive -> closed_*; reopened via executive authority",
        "lifecycle_oot": "oot_open -> adjudicated {escalate_to_oos | proactive_action | false_positive}",
        "sod": "SoD-15-02 analyst!=phase1 signer; SoD-15-03 phase1 supervisor!=phase2 investigator; SoD-15-04 investigator!=disposition authority"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions (URS-15 §18 — not specified in URS).",
        "lab_error_phase1_rate": {
          "source": "FDA OOS Guidance typical",
          "values": 0.25
        },
        "retest_confirms_original_rate": {
          "source": "engineering assumption",
          "values": 0.65
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 3000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-QE-002 OOS/OOT: lifecycle/disposition mix is URS-15-grounded; no public OOS investigation-outcome corpus exists for a like-for-like KS anchor (GS recall/483 corpora are downstream artifacts, not lab investigation distributions). G2 N/A; Phase I/II separation, retest/outlier, disposition matrix, OOT control-chart rules grounded in URS-15 + FDA OOS Guidance + USP <1010>."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 480,
            "total": 3000,
            "pct": 0.16,
            "min_required": 0.15,
            "by_category": {
              "negative": 170,
              "boundary": 130,
              "adversarial": 100,
              "historical-failure": 80
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "test_name": {
                "column": "test_name",
                "top_value": "assay",
                "top_share": 0.135,
                "ceiling": 0.4,
                "passed": true
              },
              "product": {
                "column": "product_id",
                "top_value": "361b8b76-02e2-544e-88bf-ed3bc3f9e1db",
                "top_share": 0.0037,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "ff3354355eb3c4fef9d902763810cc1b3919388f919a9f05dba015308c4c6663",
            "hash_run_2": "ff3354355eb3c4fef9d902763810cc1b3919388f919a9f05dba015308c4c6663",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_qe_004_complaint",
      "workflow_id": "WF-QE-004",
      "workflow_name": "Complaint Handling",
      "urs_refs": [
        "URS-14"
      ],
      "generator_id": "wf_qe_004_complaint",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-qe-004-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "a7bb29c9f99081615602b719b8ae509ef9593861c04877a6b9b9141d0369aeb5",
      "output_file": "synthetic_data/wf_qe_004_complaint_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 3000,
      "content_sha256": "c390c3c4c50fa0bc41fc73936943486665165fb064d396345cf56d94e8400dc8",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "559b896b8444df92a52b9da27b86fe71fd7b458afd370153df71e190127566d0",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_qe_004_complaint_v1.0.review_sample.xlsx",
      "review_sample_row_count": 950,
      "review_sample_sha256": "f8d32b651f9572cee386eeb72ccab6dbd19d9ac44cece923305262ad891ef66b",
      "partition_map_file": "synthetic_data/wf_qe_004_complaint_v1.0.partition_map.parquet",
      "partition_map_hash": "41340de3b10891c835c1f9e2f915a9e7dfe2ca24b47337ef08343d46cd7b65fc",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 2100,
        "evaluation": 600,
        "edge_case": 300
      },
      "edge_case_coverage_pct": 0.15,
      "edge_case_by_category": {
        "negative": 190,
        "boundary": 110,
        "adversarial": 90,
        "historical-failure": 60
      },
      "bias_test_results": {
        "category": {
          "column": "category",
          "top_value": "quality_defect",
          "top_share": 0.31,
          "ceiling": 0.4,
          "passed": true
        },
        "product": {
          "column": "product_id",
          "top_value": "43d1e08e-f9ff-5c4a-8f0b-df8e3e0ff49a",
          "top_share": 0.004,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-14 (DEC-14-02 state machine, DEC-14-06/07 AE+PV reportability, DEC-14-10 FAR CMP-006, DEC-14-12 recall CMP-007, DEC-14-13 recall matrix, DEC-14-15 6 closure prerequisites, DEC-14-21 recall exec co-sign, DEC-14-25 PII, SoD-14-01/02/07, DEC-14-18 GenAI prohibition)",
        "lifecycle": "received -> triaged -> {rejected_invalid | under_investigation | pending_response}; under_investigation -> field_action_pending -> pending_response -> response_issued -> closed; closed -> reopened (executive co-sign)",
        "escalation": "disposition=field_action_required -> FAR (CMP-006, complaint_id linkage required); recall class_i/ii/iii requires executive co-sign for all classes (DEC-14-21)",
        "sod": "SoD-14-01 intake!=triage; SoD-14-02 creator/triage!=investigator; SoD-14-07 closure authority!=creator"
      },
      "assumption_register": {
        "note": "Declared engineering assumptions.",
        "batch_link_rate": {
          "source": "engineering assumption",
          "values": 0.45
        },
        "pv_reportable_serious_cosign": {
          "source": "URS-14 SoD-14-03",
          "values": "serious/unexpected AE requires PV lead co-sign"
        }
      },
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 3000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-QE-004 Complaint: category/escalation mix is URS-14-grounded engineering distribution; GS-004 recalls is a downstream FDA enforcement corpus (already seeds MD-003), not a complaint-intake distribution for a like-for-like KS anchor. G2 N/A; lifecycle, escalation (FAR/recall), PV reportability, closure prerequisites and SoD grounded in URS-14."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 450,
            "total": 3000,
            "pct": 0.15,
            "min_required": 0.15,
            "by_category": {
              "negative": 190,
              "boundary": 110,
              "adversarial": 90,
              "historical-failure": 60
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "category": {
                "column": "category",
                "top_value": "quality_defect",
                "top_share": 0.31,
                "ceiling": 0.4,
                "passed": true
              },
              "product": {
                "column": "product_id",
                "top_value": "43d1e08e-f9ff-5c4a-8f0b-df8e3e0ff49a",
                "top_share": 0.004,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "c390c3c4c50fa0bc41fc73936943486665165fb064d396345cf56d94e8400dc8",
            "hash_run_2": "c390c3c4c50fa0bc41fc73936943486665165fb064d396345cf56d94e8400dc8",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_qe_007_recall",
      "workflow_id": "WF-QE-007",
      "workflow_name": "Recall Lifecycle / Distribution Execution",
      "urs_refs": [
        "URS-34",
        "URS-14",
        "URS-30"
      ],
      "generator_id": "wf_qe_007_recall",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-qe-007-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "a90224c06113e0893dcad91408d3b0eb3bd588aa9b581b282c2996a759d4c2e3",
      "output_file": "synthetic_data/wf_qe_007_recall_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 500000,
      "content_sha256": "a16f12dde8a2ba75ddbd80efa166e6fcdfe5350a3ffb0ed292c389fc1c36e67a",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "a4772424795bf7d6fc0bb9064bee6f2e5629b5f5b0dc82533f69e5f59acde036",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_qe_007_recall_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2100,
      "review_sample_sha256": "9cb4b1ff8ff1136e8d716f3930a10e843c63364ecafa0f8e4a0605e8385ce70b",
      "partition_map_file": null,
      "partition_map_hash": "aeaaae4f9551e0f277b432fbceca3587c574e646225788dfcb6749bac9995c4b",
      "large_dataset": true,
      "data_artifact_committed": false,
      "regeneration_note": "Heavy data artifact (.parquet + partition_map) is gitignored. Regenerate bit-identical from generator + locked seed; verify against content_sha256. Generator+seed is the ALCOA Original (Plan §4).",
      "partition_split": {
        "training": 350000,
        "evaluation": 100000,
        "edge_case": 50000
      },
      "edge_case_coverage_pct": 0.18,
      "edge_case_by_category": {
        "boundary": 42000,
        "negative": 23000,
        "adversarial": 14000,
        "historical-failure": 11000
      },
      "bias_test_results": {
        "product": {
          "column": "product_id",
          "top_value": "747aed61-6de7-5b1e-bcc1-72241c57bd2b",
          "top_share": 0.04,
          "ceiling": 0.4,
          "passed": true
        },
        "consignee": {
          "column": "consignee_id",
          "top_value": "8d35abe6-9e0b-3aee-571e-f9fb271cdb73",
          "top_share": 0.0002,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [
        {
          "gold_standard_id": "GS-004",
          "file": "gold_standard/GS-004_s006_fda_recalls.csv",
          "what": "openFDA Drug Enforcement Recalls Class I/II/III frequency",
          "seeds": "recall_classification",
          "gate": "G2 chi-square goodness-of-fit"
        }
      ],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "URS-34 (DEC-34-13 recall_class enum + lifecycle opened->notification_dispatched->progressing->closed, DEC-34-12/BR-34-11 reconciliation mismatch -> deviation, DEC-34-15 incomplete recovery -> finding, DEC-34-17 bound e-sig, DEC-34-19 coordinator authority, DEC-34-18 AI advisory-only); URS-14 complaint->recall escalation; URS-30 notification cascade (consignee grain)",
        "classification": "FDA 21 CFR Part 7: class_i (serious, 24h notify) / class_ii (moderate) / class_iii (minor). Seeded from GS-004.",
        "reconciliation": "units_returned vs units_distributed; mismatch -> URS-16 deviation; Class I incomplete recovery -> URS-21 finding (DEC-34-15)",
        "chain_link": "complaint-sourced recalls link complaint_id -> WF-QE-004 (recall-escalated complaints); product_id -> MD-002; batch_id -> MFG-004."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema_vectorized",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 500000,
            "failures": 0,
            "by_column": {},
            "extra_columns": [],
            "missing_required": []
          }
        },
        "G2": {
          "gate": "G2_categorical",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "chi2_stat": 0.26411483253588514,
            "p_value": 0.8762906802944481,
            "alpha": 0.05,
            "categories": 3,
            "df": 2,
            "n_observed": 100,
            "anchor_total": 200,
            "min_detectable_effect_w": 0.31,
            "effect_band_detectable": "medium",
            "power_note": "PASS = not inconsistent with the Gold Standard at n=100 (detects Cohen's w >= 0.31 at 80% power); demonstrates consistency, not high-confidence identity."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 90000,
            "total": 500000,
            "pct": 0.18,
            "min_required": 0.15,
            "by_category": {
              "boundary": 42000,
              "negative": 23000,
              "adversarial": 14000,
              "historical-failure": 11000
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "product": {
                "column": "product_id",
                "top_value": "747aed61-6de7-5b1e-bcc1-72241c57bd2b",
                "top_share": 0.04,
                "ceiling": 0.4,
                "passed": true
              },
              "consignee": {
                "column": "consignee_id",
                "top_value": "8d35abe6-9e0b-3aee-571e-f9fb271cdb73",
                "top_share": 0.0002,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "a16f12dde8a2ba75ddbd80efa166e6fcdfe5350a3ffb0ed292c389fc1c36e67a",
            "hash_run_2": "a16f12dde8a2ba75ddbd80efa166e6fcdfe5350a3ffb0ed292c389fc1c36e67a",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_qe_lc_lifecycle",
      "workflow_id": "WF-QE-LC",
      "workflow_name": "Quality-Event Lifecycle Transition Events",
      "urs_refs": [
        "URS-16",
        "URS-15",
        "URS-18",
        "URS-17",
        "URS-13",
        "URS-21",
        "URS-19"
      ],
      "generator_id": "wf_qe_lc_lifecycle",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-qe-lc-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "477a016a560ac65fc8f32a7c0e9d59aec3e869a1dca6cff495ba7fe29844d8da",
      "output_file": "synthetic_data/wf_qe_lc_lifecycle_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 86036,
      "content_sha256": "3c9495a8181fd2232809b3ab355cfe94a9f1de0f74a232ad7c74cf86ac5c5bae",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "35ebe6ee4bcfa0dd6096af6ca3925471f11e55e3a905ff36e14fd0579f634435",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_qe_lc_lifecycle_v1.0.review_sample.xlsx",
      "review_sample_row_count": 1450,
      "review_sample_sha256": "8bc95fc2bd9028c122fb03258109b2f24c5307d284380570dd1ae6c37400fb9b",
      "partition_map_file": "synthetic_data/wf_qe_lc_lifecycle_v1.0.partition_map.parquet",
      "partition_map_hash": "1bdc70745c0bd7e6f09d068a4030d97c178cd541708e1b8e0a1e4de1ac389bac",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 60225,
        "evaluation": 17207,
        "edge_case": 8604
      },
      "edge_case_coverage_pct": 0.1863,
      "edge_case_by_category": {
        "boundary": 5456,
        "adversarial": 3934,
        "negative": 3921,
        "historical-failure": 2717
      },
      "bias_test_results": {
        "entity_type": {
          "column": "entity_type",
          "top_value": "change_control",
          "top_share": 0.3001,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "19f64474-765d-543a-9dd9-cd72e3599782",
          "top_share": 0.0078,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": null,
      "canonicalization_note": null,
      "model_grounding": {
        "source": "Append-only *_lifecycle_events tables across URS-16/15/18/17/13/21/19 (from_state,to_state,actor,signature,reason,at_timestamp). FSM edge sets per the lifecycle state machines defined in each module (DEC-16-02, DEC-15-02, DEC-18-02, DEC-17-02, DEC-13-06.1, DEC-21-02, DEC-19-02).",
        "transition_legality": "G9 uses the per-entity FSM_EDGES spec; every emitted (from,to) must be a legal edge. Emitted edges are asserted to be a subset of the spec edge set (non-tautological).",
        "reopen": "Governed reopen modelled as a legal closed-state -> reopen-target edge for entities whose snapshot reopen state equals the reopen target (deviation/oos/change/finding/risk)."
      },
      "assumption_register": null,
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 86036,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_statistical",
          "status": "N/A",
          "passed": true,
          "applicable": false,
          "detail": {
            "rationale": "WF-QE-LC: transition events are a structural projection of the locked snapshots' FSM paths; no external Gold-Standard transition-frequency corpus exists for a like-for-like anchor. G2 N/A."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 16028,
            "total": 86036,
            "pct": 0.1863,
            "min_required": 0.15,
            "by_category": {
              "boundary": 5456,
              "adversarial": 3934,
              "negative": 3921,
              "historical-failure": 2717
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "entity_type": {
                "column": "entity_type",
                "top_value": "change_control",
                "top_share": 0.3001,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "19f64474-765d-543a-9dd9-cd72e3599782",
                "top_share": 0.0078,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "3c9495a8181fd2232809b3ab355cfe94a9f1de0f74a232ad7c74cf86ac5c5bae",
            "hash_run_2": "3c9495a8181fd2232809b3ab355cfe94a9f1de0f74a232ad7c74cf86ac5c5bae",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    },
    {
      "dataset_id": "wf_reg_001_intelligence",
      "workflow_id": "WF-REG-001",
      "workflow_name": "Regulatory Intelligence",
      "urs_refs": [
        "URS-27",
        "URS-12"
      ],
      "generator_id": "wf_reg_001_intelligence",
      "generator_version": "1.0.0",
      "generator_type": "rule_based+statistical",
      "seed_value": "vrx-reg-001-seed-2026",
      "generation_timestamp_utc": "2026-06-08T00:00:00+00:00",
      "schema_version": "1.0.0",
      "schema_hash_sha256": "83446e2384f4778444f1030c3bb49177eff976c8b9b9cabcdbf2899c86abcc73",
      "output_file": "synthetic_data/wf_reg_001_intelligence_v1.0.parquet",
      "output_format": "parquet",
      "output_row_count": 30000,
      "content_sha256": "9430fa68ae9dce42971a8cd4263c5e61f61cd2a305f7a335a509cabc08f5fe1e",
      "content_sha256_note": "CANONICAL G6 determinism anchor — portable across toolchains (audit F2).",
      "output_sha256": "4a00e86163bf0a2d45a8c08effd0bcc114492622428acf1472b9e882c5267842",
      "output_sha256_note": "Environment-bound Parquet-file integrity hash; reproducible only within the recorded build_environment, not across pyarrow versions (audit F2).",
      "build_environment": {
        "python": "3.13.12",
        "implementation": "CPython",
        "libraries": {
          "numpy": "2.4.6",
          "pandas": "3.0.3",
          "pyarrow": "24.0.0",
          "scipy": "1.17.1",
          "Faker": "40.21.0",
          "jsonschema": "4.26.0"
        }
      },
      "review_sample_file": "synthetic_data/wf_reg_001_intelligence_v1.0.review_sample.xlsx",
      "review_sample_row_count": 2400,
      "review_sample_sha256": "2717acf8c95e37dcf2716b3de8c518d50185c36f294fe21dc916a2e2b8c8281c",
      "partition_map_file": "synthetic_data/wf_reg_001_intelligence_v1.0.partition_map.parquet",
      "partition_map_hash": "be6eec140f98ef9eec420458f0e3ec1b70ed3c86202e38525d4ce080d9a16b6a",
      "large_dataset": false,
      "data_artifact_committed": true,
      "regeneration_note": null,
      "partition_split": {
        "training": 21000,
        "evaluation": 6000,
        "edge_case": 3000
      },
      "edge_case_coverage_pct": 0.1733,
      "edge_case_by_category": {
        "negative": 2100,
        "boundary": 1550,
        "adversarial": 800,
        "historical-failure": 750
      },
      "bias_test_results": {
        "source_type": {
          "column": "source_type",
          "top_value": "recall_enforcement",
          "top_share": 0.3086,
          "ceiling": 0.4,
          "passed": true
        },
        "tenant": {
          "column": "tenant_id",
          "top_value": "750e3804-3316-5833-a166-681d6be0c2c4",
          "top_share": 0.0864,
          "ceiling": 0.4,
          "passed": true
        }
      },
      "gold_standard_seed_corpus": [],
      "intended_use": "Engineering verification of the Regulatory Intelligence control surface (URS-27): regulatory feed-item ingestion -> triage -> impact assessment -> {change_raised|monitoring|dismissed} lifecycle; AI advisory relevance/classification that is NEVER written to GxP fields and CANNOT finalize (ai_advisory_flag + ai_cannot_finalize + HITL); RAG-source governance (freshness/citation/no real-domain URLs). LIMITED-RISK AI per EU AI Act / Annex 22. Grounded in the regulatory module (mig 038 regulatory_feed_items/submissions/commitments). The AI-advisory fields (ai_relevance_score, ai_suggested_impact, ai_request_id, human_accept_status, ai_advisory_flag) are a documented ENGINEERED EXTENSION beyond the current product schema (the Phase-1D plan specifies them; the module today has only relevance_score) - see assumption_register. Reuses the MIRA canonical keys. NO real LLM.",
      "canonicalization_note": null,
      "model_grounding": {
        "source": "regulatory module mig 038 (regulatory_feed_items relevance_score, regulatory_submissions status FSM, regulatory_commitments); pipeline-spec.md:59-62 lifecycle FSM, :10-14 advisory-only RAG governance; CLAUDE.md QS-21/QS-23; PHASE_1D_BUILD_PLAN.md REG-001 (30k, 5th real G2)",
        "load_bearing": "(1) real G2 chi-square recall_classification vs GS-004; (2) AI-cannot-finalize advisory boundary (CC raised only after human acceptance; advisory never in a GxP field) - recomputable; (3) lifecycle FSM validity",
        "chain_link": "tenant; assigned_to/human_decided_by -> user; impacted_product_id -> MD-002; impacted_site_id -> MD-001; linked_change_control_id -> CRC change_control (draft); linked_recall_id -> QE-007 recall; ai_request_id -> ai_request namespace (REG-001 range); hitl_decision_id -> hitl_decision namespace; ai_model_config_id -> MIRA-003 ai_model (40)."
      },
      "assumption_register": [
        "AI-advisory fields (ai_relevance_score, ai_suggested_impact, ai_classification_confidence, ai_request_id, human_accept_status, human_decided_by, ai_advisory_flag) are an ENGINEERED EXTENSION beyond the current product schema (mig 038 has only relevance_score); the Phase-1D plan specifies them per QS-21 / Annex 22.",
        "Layer-1 (public document) vs Layer-2 (tenant projection) is a LOGICAL partition within one table, not separate tables (the module uses regulatory_feed_items).",
        "content_hash dedup key + jurisdiction column are plan-specified, not in the current module schema.",
        "REG-001 AI inferences use a DISJOINT ai_request namespace range (60000+) - distinct regulatory-classification inferences sharing the canonical ai_request table, NOT membership in MIRA-001's 0..59999 sample."
      ],
      "gates": {
        "G1": {
          "gate": "G1_schema",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "rows": 30000,
            "failures": 0,
            "sample": []
          }
        },
        "G2": {
          "gate": "G2_categorical",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "chi2_stat": 1.5251281447004117,
            "p_value": 0.4664688324709444,
            "alpha": 0.05,
            "categories": 3,
            "df": 2,
            "n_observed": 9258,
            "anchor_total": 200,
            "min_detectable_effect_w": 0.032,
            "effect_band_detectable": "small",
            "power_note": "PASS = not inconsistent with the Gold Standard at n=9258 (detects Cohen's w >= 0.032 at 80% power); demonstrates consistency, not high-confidence identity."
          }
        },
        "G3": {
          "gate": "G3_edge_cases",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "edge_rows": 5200,
            "total": 30000,
            "pct": 0.1733,
            "min_required": 0.15,
            "by_category": {
              "negative": 2100,
              "boundary": 1550,
              "adversarial": 800,
              "historical-failure": 750
            }
          }
        },
        "G4": {
          "gate": "G4_bias",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "axes": {
              "source_type": {
                "column": "source_type",
                "top_value": "recall_enforcement",
                "top_share": 0.3086,
                "ceiling": 0.4,
                "passed": true
              },
              "tenant": {
                "column": "tenant_id",
                "top_value": "750e3804-3316-5833-a166-681d6be0c2c4",
                "top_share": 0.0864,
                "ceiling": 0.4,
                "passed": true
              }
            }
          }
        },
        "G5": {
          "gate": "G5_provenance",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "missing_columns": [],
            "unpopulated": {},
            "expected_columns": 11
          }
        },
        "G6": {
          "gate": "G6_determinism",
          "status": "PASS",
          "passed": true,
          "applicable": true,
          "detail": {
            "hash_run_1": "9430fa68ae9dce42971a8cd4263c5e61f61cd2a305f7a335a509cabc08f5fe1e",
            "hash_run_2": "9430fa68ae9dce42971a8cd4263c5e61f61cd2a305f7a335a509cabc08f5fe1e",
            "match": true
          }
        }
      },
      "classification": "ENGINEERING_VERIFICATION",
      "advisory_label": "Synthetic data — engineering test fixture per ARCH-AI-001 AC-2. Not Gold Standard.",
      "provenance_columns": [
        "synthetic_record_id",
        "generator_id",
        "generator_version",
        "seed_value",
        "generation_timestamp_utc",
        "schema_hash_sha256",
        "is_edge_case",
        "edge_case_category",
        "partition_assignment",
        "partition_assignment_hash",
        "advisory_label"
      ]
    }
  ]
}
