Synthetic healthcare data is artificially generated patient, clinical, and claims data that mirrors the statistical patterns of real records without containing actual protected health information (PHI). Healthcare and health-tech teams use it in HIPAA-context environments for EHR/EMR testing, claims-processing tests, FHIR/HL7 interoperability testing, and AI model training on clinical data, provided the generation process is validated against re-identification risk.

Is synthetic patient data considered PHI under HIPAA?

Properly generated synthetic patient data that contains no real patient identifiers is generally not treated as protected health information under HIPAA. HIPAA's definition of PHI turns on whether information can identify an individual, alone or combined with other data a covered entity holds — and data built to a specification, rather than derived by lightly obscuring real records, doesn't carry that identifying link by construction. That's the same premise that makes synthetic data useful for privacy-sensitive testing generally. Still, "synthetic" isn't automatically synonymous with "compliant": the generation process itself has to be validated against re-identification risk, because a poorly built model can leak recognizable patterns from whatever it was trained on.

What makes synthetic patient data defensible in practice comes down to three things:

  • No real identifiers anywhere in the output. Names, medical record numbers, dates of birth, and other direct identifiers are never copied from real records — they're generated fresh, with no link back to an actual patient.
  • Validated against re-identification risk. The dataset is tested to confirm that no combination of quasi-identifiers — age, ZIP code, a rare diagnosis, a visit pattern — could single out a real person. Even well-built synthetic data still carries some re-identification risk, which is why validating against it matters.
  • A documented generation process. Compliance and legal teams need to see how the data was produced — from scratch, from a schema, or modeled on a real source — because that documentation is what a HIPAA expert-determination review or an internal audit will ask for.
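
The quasi-identifier check in the second point can be sketched as a small-group count. This is a minimal sketch, assuming a k-anonymity-style heuristic, which is an illustrative choice rather than a method this article prescribes, and not a substitute for a HIPAA expert determination. The column names (age_band, zip3, diagnosis), the sample rows, and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical quasi-identifier columns; a real review would choose its own.
QUASI_IDENTIFIERS = ("age_band", "zip3", "diagnosis")


def group_sizes(rows, keys=QUASI_IDENTIFIERS):
    """Count how many rows share each quasi-identifier combination."""
    return Counter(tuple(row[key] for key in keys) for row in rows)


def risky_combinations(rows, k, keys=QUASI_IDENTIFIERS):
    """Return combinations shared by fewer than k rows, sorted for stable output."""
    return sorted(combo for combo, n in group_sizes(rows, keys).items() if n < k)


# Illustrative synthetic rows, not drawn from any real dataset.
synthetic_rows = [
    {"age_band": "40-49", "zip3": "021", "diagnosis": "E11"},
    {"age_band": "40-49", "zip3": "021", "diagnosis": "E11"},
    {"age_band": "40-49", "zip3": "021", "diagnosis": "E11"},
    {"age_band": "70-79", "zip3": "946", "diagnosis": "G12"},  # lone rare-diagnosis row
]

# Combinations that appear only once are candidates for manual review.
flagged = risky_combinations(synthetic_rows, k=2)
```

A unique combination in synthetic output is only a flag for review. Whether it corresponds to a real person depends on comparing against the source data, which this sketch does not attempt.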

Quality and compliance overlap here more than most teams expect. A synthetic dataset that's statistically thin (too uniform, missing the correlations real clinical data has) is both a poor test double and a red flag in a privacy review, since unnatural uniformity is easy to spot and hard to defend. That's why synthetic patient data for HIPAA-context environments should be measured with the same fidelity and privacy metrics used to validate any synthetic dataset, not assumed compliant because a vendor calls it "synthetic." Get all three right (no real identifiers, validated privacy and fidelity, a documented process) and you have HIPAA-compliant synthetic data: a defensible substitute for production PHI across the use cases below.

Building synthetic EHR data and patient-record test environments

One of the most common healthcare synthetic-data needs is realistic synthetic EHR data: a stand-in patient-record environment that a development team can throw real load at, break intentionally, and never worry about exposing an actual patient. Tonic Fabricate generates that kind of environment from scratch, with patient, encounter, diagnosis, and claims tables built so that referential integrity holds across every linked table: a synthetic patient's encounters, diagnoses, and billing records all reference the same patient ID the way they would in a real EHR. Fabricate can build this purely from a schema and a set of rules, or it can connect to a real source system and model new records on its structure and distributions. Either way, the result is new synthetic records rather than copies of real patient data, so production PHI isn't carried forward into the test environment.
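
To make "referential integrity across every linked table" concrete, here is a minimal sketch in plain Python. It is not Fabricate's API or output format. The table layout, ID formats, and the build_tables and orphaned_rows helpers are all assumptions made for illustration.

```python
import random

# A few sample ICD-10-CM codes used purely as placeholder values.
CODES = ("E11.9", "I10", "J45.909")


def build_tables(n_patients, seed=0):
    """Generate linked patient, encounter, diagnosis, and claims rows from scratch."""
    rng = random.Random(seed)
    patients = [{"patient_id": f"P{i:04d}"} for i in range(n_patients)]
    encounters, diagnoses, claims = [], [], []
    for patient in patients:
        pid = patient["patient_id"]
        for _ in range(rng.randint(1, 3)):
            enc_id = f"E{len(encounters):05d}"
            encounters.append({"encounter_id": enc_id, "patient_id": pid})
            diagnoses.append(
                {"encounter_id": enc_id, "patient_id": pid, "code": rng.choice(CODES)}
            )
            claims.append(
                {
                    "claim_id": f"C{len(claims):05d}",
                    "encounter_id": enc_id,
                    "patient_id": pid,
                    "billed": rng.randint(50, 5000),
                }
            )
    return {
        "patients": patients,
        "encounters": encounters,
        "diagnoses": diagnoses,
        "claims": claims,
    }


def orphaned_rows(tables):
    """Return (table, encounter_id) pairs whose foreign keys do not line up."""
    patient_ids = {p["patient_id"] for p in tables["patients"]}
    encounter_owner = {e["encounter_id"]: e["patient_id"] for e in tables["encounters"]}
    problems = []
    for enc in tables["encounters"]:
        if enc["patient_id"] not in patient_ids:
            problems.append(("encounters", enc["encounter_id"]))
    for name in ("diagnoses", "claims"):
        for row in tables[name]:
            # The child row must point at an encounter owned by the same patient.
            if encounter_owner.get(row["encounter_id"]) != row["patient_id"]:
                problems.append((name, row["encounter_id"]))
    return problems


tables = build_tables(25)
problems = orphaned_rows(tables)  # an empty list means every link resolves
```

A check like orphaned_rows is worth running against any generated environment, whatever tool produced it, before pointing an application at the data.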

Structured tables are only half the picture, though. A realistic EHR test environment also has to carry the unstructured layer that real clinical systems generate constantly — physician notes, discharge summaries, nursing documentation — and that's where the sensitive information tends to concentrate, scattered through free text rather than sitting in a labeled column. Tonic Textual complements Fabricate here, applying the same de-identification and synthesis principles to that free-text content: it detects the PHI embedded in clinical notes and replaces it with realistic synthetic substitutes, so a note reads like a real discharge summary without describing a real patient.

The Tonic Advantage: Fabricate generates synthetic patient, encounter, diagnosis, and claims data that preserves referential integrity across every linked table — a synthetic patient's records stay consistent from admission through billing — without putting production PHI at risk.

Pairing the two means a complete, HIPAA-compliant synthetic data environment — structured tables and the clinical narrative that surrounds them — rather than a database that looks realistic until someone opens a note field. The same discipline applies more broadly across synthetic data for software testing and QA: realistic volume, referential consistency, and edge-case coverage that production data can't reliably supply on demand. Healthcare just raises the stakes, because the alternative to synthetic test data is scrubbing or restricting access to real PHI, both of which slow teams down in ways a well-built synthetic environment doesn't.

Testing claims processing and payer systems without production PHI

Claims data is arguably more sensitive to expose in a lower environment than clinical records, because a single claim links a patient's diagnosis, treatment, cost, and payer relationship together — a concentration of PHI and financial detail that makes a claims database one of the highest-value production sources to protect. Testing claims processing and payer systems well means generating data that exercises the logic those systems actually run, not just plausible-looking rows.

That typically covers a handful of recurring test scenarios:

  1. Adjudication logic — claims that should be approved, partially approved, or denied under specific coverage rules, so the adjudication engine's decision logic gets exercised across its real branches, not just the happy path.
  2. Denial and appeal flows — claims that trigger a denial reason code, followed by the appeal and resubmission data needed to test how the system handles a multi-step dispute.
  3. Coordination of benefits — claims involving more than one payer, testing whether the system correctly determines primary versus secondary responsibility and splits payment accordingly.
  4. Load testing at claim volume — enough synthetic claims, generated quickly, to test system performance under the volume a payer or clearinghouse actually processes, without waiting for real claims to accumulate or risking a production data pull.
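
The first scenario, adjudication logic, can be sketched as synthetic claims paired with the decision each one should produce. This is a minimal sketch, assuming a toy rule set. The procedure names, allowed amounts, reason strings, and the adjudicate function are hypothetical stand-ins for whatever engine is under test, not real coverage rules or standard reason codes.

```python
from dataclasses import dataclass

# Hypothetical procedure -> allowed amount table standing in for coverage rules.
ALLOWED_AMOUNTS = {"OFFICE_VISIT": 150.00, "LAB_PANEL": 40.00}


@dataclass(frozen=True)
class Claim:
    claim_id: str
    procedure: str
    billed: float
    member_active: bool = True


def adjudicate(claim):
    """Toy engine: return (decision, paid_amount, reason)."""
    if not claim.member_active:
        return ("denied", 0.0, "MEMBER_INACTIVE")
    allowed = ALLOWED_AMOUNTS.get(claim.procedure)
    if allowed is None:
        return ("denied", 0.0, "NOT_COVERED")
    if claim.billed > allowed:
        return ("partial", allowed, "EXCEEDS_ALLOWED")
    return ("approved", claim.billed, "OK")


# One synthetic claim per branch, paired with the decision it should produce.
SCENARIOS = [
    (Claim("C1", "OFFICE_VISIT", 120.00), "approved"),
    (Claim("C2", "OFFICE_VISIT", 200.00), "partial"),
    (Claim("C3", "UNLISTED", 80.00), "denied"),
    (Claim("C4", "LAB_PANEL", 40.00, member_active=False), "denied"),
]


def failed_scenarios(scenarios=SCENARIOS):
    """Return claim IDs whose decision differs from the expected one."""
    return [c.claim_id for c, expected in scenarios if adjudicate(c)[0] != expected]
```

The design point is that each generated claim carries its expected outcome, so the dataset doubles as a test oracle instead of being a pile of plausible rows.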

Generating this kind of test data means understanding the workflow it supports, not just the schema it fills. Tonic.ai's healthcare data de-identification and synthesis solutions cover this workflow. Claims and payer testing is one of the more common reasons healthcare and health-tech teams turn to synthetic data in the first place, because a real claims database is too sensitive to copy into a lower environment.

Interoperability testing: FHIR, HL7, and C-CDA data exchange

Interoperability testing needs synthetic data that's realistic in two dimensions at once: the clinical content has to make sense, and the data has to be correctly structured against whatever exchange standard the integration test is targeting. A record that's syntactically valid but clinically nonsensical — a lab result dated before the encounter that ordered it — will pass a schema validator and still fail to catch real integration bugs, because it never exercises the logic that depends on the data making sense.
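
The lab-result example above is the kind of defect a schema validator misses and a small plausibility check catches. The following is a minimal sketch, assuming records are plain dictionaries. The field names (encounter_id, start, result_id, resulted_at) and the function name are hypothetical.

```python
from datetime import datetime


def results_before_encounter(encounters, lab_results):
    """Return IDs of lab results timestamped before their encounter began."""
    started = {e["encounter_id"]: e["start"] for e in encounters}
    return [
        r["result_id"]
        for r in lab_results
        if r["encounter_id"] in started
        and r["resulted_at"] < started[r["encounter_id"]]
    ]


# Illustrative synthetic records.
encounters = [{"encounter_id": "E1", "start": datetime(2024, 3, 2, 9, 0)}]
lab_results = [
    {"result_id": "R1", "encounter_id": "E1", "resulted_at": datetime(2024, 3, 2, 11, 30)},
    # Structurally valid, clinically impossible: resulted a day before the visit.
    {"result_id": "R2", "encounter_id": "E1", "resulted_at": datetime(2024, 3, 1, 8, 0)},
]

implausible = results_before_encounter(encounters, lab_results)
```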

The standards that come up most often in healthcare interoperability testing each need their own treatment:

  • FHIR (Fast Healthcare Interoperability Resources) — the modern standard for exchanging structured healthcare data via APIs, built around discrete resources (Patient, Observation, Encounter, and similar) that reference each other. Testing a FHIR integration means generating resources that are both individually valid and correctly cross-referenced.
  • HL7 v2 — the older messaging standard still widely used for real-time clinical events, such as admissions, lab results, and orders, between hospital systems. HL7 v2 messages are pipe-delimited and less strictly typed than FHIR, so test data needs to match the specific message structure a given interface expects.
  • C-CDA (Consolidated Clinical Document Architecture) — the standard for clinical document exchange, such as a discharge summary or continuity-of-care document, structured as XML with defined sections for problems, medications, and allergies.
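
For the FHIR case, "correctly cross-referenced" can be checked mechanically. The following is a minimal sketch, assuming resources are held as Python dictionaries and that every reference uses the relative "Type/id" form. Absolute URLs, urn:uuid references, and contained resources are deliberately out of scope. The sample resources are trimmed to the fields the check needs and would not pass a full FHIR validator as written; the function names are hypothetical.

```python
def find_references(node):
    """Yield every string stored under a 'reference' key, at any depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "reference" and isinstance(value, str):
                yield value
            else:
                yield from find_references(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_references(item)


def dangling_references(resources):
    """Return (source, target) pairs where the target is not in the resource set."""
    known = {f'{r["resourceType"]}/{r["id"]}' for r in resources}
    return sorted(
        (f'{r["resourceType"]}/{r["id"]}', ref)
        for r in resources
        for ref in find_references(r)
        if ref not in known
    )


# Trimmed, illustrative resources; required fields such as status are omitted.
resources = [
    {"resourceType": "Patient", "id": "p1"},
    {"resourceType": "Encounter", "id": "e1", "subject": {"reference": "Patient/p1"}},
    {
        "resourceType": "Observation",
        "id": "o1",
        "subject": {"reference": "Patient/p1"},
        "encounter": {"reference": "Encounter/e2"},  # no such Encounter in the set
    },
]

dangling = dangling_references(resources)
```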

Generating synthetic data that's correctly formed against these standards, not just plausible-looking clinical content, is what makes an interoperability test mean anything. A test environment built on data that skips this structure will pass in staging and then fail against a real trading partner's system — exactly the failure mode interoperability testing exists to catch before it reaches production.
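
For HL7 v2, a structural check on a synthetic message might look like the following. This is a minimal sketch, assuming the default delimiters (a production parser should read them from MSH-1 and MSH-2) and a trimmed message that omits fields a real interface may require. The sample message content and the function names are hypothetical.

```python
SEGMENT_SEPARATOR = "\r"  # HL7 v2 segments are terminated by a carriage return
FIELD_SEPARATOR = "|"     # assumed default; really declared in MSH-1


def parse_segments(message):
    """Split a message into segments, each a list of pipe-delimited fields."""
    return [
        segment.split(FIELD_SEPARATOR)
        for segment in message.split(SEGMENT_SEPARATOR)
        if segment
    ]


def message_type(message):
    """Return MSH-9. MSH-1 is the separator itself, so MSH-9 sits at index 8."""
    return parse_segments(message)[0][8]


def missing_segments(message, required):
    """Return the required segment names that the message does not contain."""
    present = {fields[0] for fields in parse_segments(message)}
    return [name for name in required if name not in present]


# Illustrative synthetic lab-result message, trimmed for brevity.
sample = SEGMENT_SEPARATOR.join(
    [
        r"MSH|^~\&|LAB|HOSP|EHR|HOSP|20240101120000||ORU^R01|MSG0001|P|2.5",
        "PID|1||SYN0001^^^HOSP^MR||TESTPATIENT^SYNTHETIC",
        "OBR|1|||GLU^Glucose",
        "OBX|1|NM|GLU^Glucose||98|mg/dL",
    ]
)

gaps = missing_segments(sample, ("MSH", "PID", "OBR", "OBX"))
```

Which segments count as required depends on the interface specification agreed with the trading partner, so the tuple passed to missing_segments is an assumption to replace per interface.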

Training and fine-tuning AI models on safe healthcare data

Training or fine-tuning an AI model on healthcare data runs into the same problem clinical documentation always does: the most useful signal — physician reasoning, patient history, diagnostic nuance — lives in unstructured text that's full of PHI. Tonic Textual addresses this directly, using proprietary NER models to detect PHI in clinical notes, discharge summaries, and similar free text, then either redacting those values or replacing them with realistic synthetic substitutes so the text stays usable for training without exposing a real patient.
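
The difference between redacting and synthesizing can be shown on a toy note. This is a minimal sketch, assuming two regular expressions as a stand-in detector. It does not represent Textual's NER models or API, and regexes alone would miss most PHI in real clinical text. The patterns, substitute formats, and function names are hypothetical.

```python
import re

# Stand-in detector: two simple patterns in place of a trained NER model.
PATTERNS = {
    "MRN": re.compile(r"\bMRN[:\s]*\d{6,}\b"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

# Hypothetical generators for replacement values; n is a running counter.
SUBSTITUTES = {
    "MRN": lambda n: f"MRN: {9000000 + n}",
    "DATE": lambda n: f"2001-01-{n:02d}",
}


def redact(text):
    """Replace each detected value with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def synthesize(text):
    """Replace each detected value with a substitute, reused for repeat values."""
    mapping = {}

    def replacer(label):
        def _replace(match):
            original = match.group(0)
            if original not in mapping:
                mapping[original] = SUBSTITUTES[label](len(mapping) + 1)
            return mapping[original]

        return _replace

    for label, pattern in PATTERNS.items():
        text = pattern.sub(replacer(label), text)
    return text


# Illustrative note text, written for this example.
note = "Pt seen 2024-03-02, MRN: 1234567. Follow-up 2024-03-02 confirmed."
redacted = redact(note)
synthetic = synthesize(note)
```

Reusing the same substitute for a repeated value keeps the note internally consistent, which is part of what makes synthesized text more useful for training than a run of redaction labels.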

That capability supports a range of AI and ML use cases in healthcare:

  • Diagnostic and predictive model training — models that flag risk or predict outcomes from clinical history need training data with intact clinical detail, which is exactly what synthesis, rather than blanket redaction, preserves.
  • LLM fine-tuning on clinical notes — adapting a general-purpose language model to read and generate clinical documentation for a specific specialty or care setting.
  • Rare-cohort augmentation for clinical trials — when a condition or patient cohort is too small to train on directly, de-identified real examples can seed additional synthetic examples that expand the training set without multiplying exposure of the original sensitive records.

That last use case is also where Textual and Fabricate pair up: de-identify the sensitive clinical text with Textual, then point Fabricate at that de-identified set as a model for generating more of it. It's the same augmentation pattern used across the broader landscape of synthetic data for machine learning and AI, applied to the specific case of clinical text that's both sensitive and scarce.

Wellthy, a healthcare company, reported a 50% reduction in flagged care team actions, along with enhanced AI development capabilities and streamlined workflow productivity, after applying synthesis-based de-identification to its clinical-text workflows — a concrete example of what safely unlocking clinical text for AI development can look like in practice, not just in a test environment.

The Tonic Advantage: Textual detects PHI in clinical notes with proprietary NER models, then redacts or synthesizes that text — replacing a real name or diagnosis detail with a realistic substitute rather than blacking it out — so the note stays usable for AI model training instead of becoming a string of redaction marks.