ALCOA Principles: An Accountability Map for Data Integrity
Dejan Murko
At a glance
- ALCOA is the data-integrity framework: data should be Attributable, Legible, Contemporaneous, Original, and Accurate. ALCOA+ adds Complete, Consistent, Enduring, and Available; ALCOA++ adds Traceable.
- This guide treats ALCOA not as a glossary but as an accountability map: for each attribute, who owns it on a trial and the specific failure it prevents.
- Data integrity in clinical trials is everyone’s job, but in different places: sponsor, principal investigator, coordinator (CRC), data manager, and vendors each own particular attributes.
- It matters because trial data drives decisions about patient safety and a product’s future. Broken integrity means unreliable conclusions and inspection findings.
- Most integrity breaches map cleanly to a violated attribute (a back-dated form breaks Contemporaneous; an unattributable entry breaks Attributable). Naming the attribute names the fix.
If you keep hearing “ALCOA” in audits and want more than a definition, this is the page. Most pages on the topic define each attribute and then stop, leaving a flat list. The questions that actually matter on a running trial are “who owns this attribute?” and “what does breaking it look like?” This guide answers both, attribute by attribute, and clears up the ALCOA versus ALCOA+ versus ALCOA++ confusion along the way.
It stays at the level of principles. For how an audit trail technically works, see the audit-trail guide; for the US electronic-records rule, the 21 CFR Part 11 guides; for validation, the CSV guides. Here the focus is the framework, the ownership, and the stakes.
What ALCOA means (and why it exists)
ALCOA is an acronym for the core attributes that make data trustworthy for regulatory purposes. The MHRA’s data-integrity guidance states it plainly: ALCOA stands for Attributable, Legible, Contemporaneous, Original, and Accurate, with the “+” referring to Complete, Consistent, Enduring, and Available (Data Integrity Guidance and Definitions). The FDA’s data-integrity guidance uses the same base set: attributable, legible, contemporaneously recorded, original or a true copy, and accurate (Data Integrity and CGMP Q&A). The framework exists because regulators need a shared, checkable definition of “data you can trust,” and ALCOA is it.
The five original attributes, defined plainly
- Attributable. You can tell who recorded the data and when. Every entry traces to a person.
- Legible. The data is readable and permanent, and stays so over time.
- Contemporaneous. The data is recorded at the time the activity happened, not reconstructed later.
- Original. It is the original record (or a verified true copy), not a transcription of unknown fidelity.
- Accurate. It is correct, truthful, and free of errors that change its meaning.
ALCOA+ and ALCOA++: what the extensions added and why
ALCOA was, in the MHRA’s framing, historically regarded as defining the attributes of data suitable for regulatory purposes, and the extensions make explicit what was always implied across the data lifecycle.
The + four (Complete, Consistent, Enduring, Available)
- Complete. All the data is there, including repeats, reanalyses, and metadata, with nothing quietly dropped.
- Consistent. The data is internally consistent, with events in expected sequence and no contradictions.
- Enduring. It lasts, recorded on durable media and preserved for its full retention period.
- Available. It can be retrieved and accessed when needed, including by inspectors.
The MHRA frames these as ensuring the data is complete, consistent, enduring, and available throughout the data lifecycle (Data Integrity Guidance and Definitions).
The ++ traceable attribute, and version confusion to avoid
ALCOA++ adds Traceable: you can follow the data through every transformation, from original capture through processing to final report, with the audit trail and metadata that connect them. A note on versioning: the MHRA observes that it prefers to talk about ALCOA rather than “ALCOA+”, because the additional attributes were always part of trustworthy data, not a separate, later standard. The practical takeaway is not to get lost in “+” versus “++”: treat the full set of attributes as one coherent definition of data integrity rather than competing standards you must reconcile.
ALCOA on paper and on electronic records
The attributes are the same whether the record is on paper or in a system; what changes is how each is satisfied and where it breaks. The framework was written to apply across both media, and most trials run a mix.
- On paper, Attributable comes from a handwritten initial and date, Contemporaneous from recording at the moment rather than copying up from a sticky note later, and Original from keeping the first-written page rather than a retyped fair copy. The classic paper failures are illegible entries, late back-filling, and lost originals.
- On electronic records, the same attributes are delivered by controls instead of penmanship: a unique login and audit trail carry Attributable and Contemporaneous, system validation and access control protect Original and Accurate, and backup and retention policy carry Enduring and Available. The failure modes shift accordingly, to shared logins, disabled audit trails, and unvalidated systems.
The practical point is not that electronic is safer than paper or the reverse; it is that each attribute has a different concrete control on each medium, and an inspector checks the control that fits the record in front of them.
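As a rough illustration of how an electronic control can carry Attributable and Contemporaneous, the sketch below checks a single audit-trail entry. Everything in it is an assumption made for illustration: the `AuditEntry` fields, the `SHARED_LOGINS` set, and the 24-hour `MAX_DELAY` tolerance are hypothetical and do not come from any regulation or specific system.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical audit-trail entry; field names are illustrative only."""
    user_id: str            # login that made the entry
    event_time: datetime    # when the activity happened
    recorded_at: datetime   # when the system stored the entry
    value: str

# Assumed examples of generic accounts that cannot identify one person.
SHARED_LOGINS = {"site01", "admin"}
# Assumed tolerance for late entry; a real limit would come from the trial's SOPs.
MAX_DELAY = timedelta(hours=24)

def check_entry(entry: AuditEntry) -> list[str]:
    """Return the ALCOA attributes this entry appears to violate."""
    findings = []
    if not entry.user_id or entry.user_id in SHARED_LOGINS:
        findings.append("Attributable")
    if entry.recorded_at - entry.event_time > MAX_DELAY:
        findings.append("Contemporaneous")
    return findings

visit = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
late_shared = AuditEntry("site01", visit, visit + timedelta(days=3), "BP 120/80")
print(check_entry(late_shared))
```

The same two questions apply on paper (is there an initial and date, and was the entry made at the time); only the evidence being checked differs.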
ALCOA, attribute by attribute, in a clinical trial
Here is the accountability map: each attribute, a primary owner, and the trial failure it prevents.
| Attribute | Primary owner(s) | Trial failure it prevents |
|---|---|---|
| Attributable | CRC, investigator, data manager | An entry no one can be tied to; shared logins masking who did what |
| Legible | CRC, data manager | Illegible or impermanent records that cannot be read or do not last |
| Contemporaneous | CRC, investigator | Back-dated or reconstructed-from-memory case report forms |
| Original | Investigator, data manager | Transcriptions passed off as source; lost originals |
| Accurate | CRC, investigator, data manager | Wrong values that change a result or a safety signal |
| Complete | Data manager, sponsor | Dropped repeats, missing metadata, selectively omitted data |
| Consistent | Data manager, sponsor | Contradictory or out-of-sequence records |
| Enduring | Sponsor, data manager, vendor | Data lost before its retention period ends |
| Available | Sponsor, vendor | Records that cannot be produced for an inspector |
| Traceable | Data manager, vendor | A result that cannot be tied back to its source through the audit trail |
The point of the map is that “data integrity” is not one person’s job; it is a distributed responsibility where each attribute has a natural owner, and a breach usually traces to a specific owner failing a specific attribute.
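For teams that want the map in a form they can query, here is a minimal sketch that encodes the table above as a Python dictionary. The names `OWNERS` and `attributes_owned_by` are hypothetical, and the role labels are simply the ones used in the table.

```python
# Accountability map transcribed from the table: attribute -> primary owner(s).
OWNERS = {
    "Attributable":    ("CRC", "investigator", "data manager"),
    "Legible":         ("CRC", "data manager"),
    "Contemporaneous": ("CRC", "investigator"),
    "Original":        ("investigator", "data manager"),
    "Accurate":        ("CRC", "investigator", "data manager"),
    "Complete":        ("data manager", "sponsor"),
    "Consistent":      ("data manager", "sponsor"),
    "Enduring":        ("sponsor", "data manager", "vendor"),
    "Available":       ("sponsor", "vendor"),
    "Traceable":       ("data manager", "vendor"),
}

def attributes_owned_by(role: str) -> list[str]:
    """List the attributes for which the given role is a primary owner."""
    return [attr for attr, owners in OWNERS.items() if role in owners]

print(attributes_owned_by("vendor"))   # ['Enduring', 'Available', 'Traceable']
print(OWNERS["Contemporaneous"])       # ('CRC', 'investigator')
```

Reading the map in both directions (attribute to owners, role to attributes) is what turns a flat list into an ownership check.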
Who is responsible for data integrity in clinical trials
Responsibility is shared but not vague:
- Sponsor. Ultimately accountable for the integrity of the trial data, including oversight of vendors and the systems that hold the data. The sponsor owns the lifecycle attributes (Complete, Enduring, Available) most strongly.
- Principal investigator (PI). Responsible for the integrity of data generated at the site and for proper documentation and oversight of site staff.
- Clinical research coordinator (CRC). On the front line of Attributable, Contemporaneous, and Accurate: the person who records data as activities happen.
- Data manager. Owns the integrity of the data as it is processed, cleaned, and transformed: Consistent, Complete, and Traceable in particular.
- Vendors. Whoever runs the systems (EDC, eTMF, and others) is responsible for the technical controls that keep data Enduring, Available, and Traceable, under the sponsor’s oversight.
No single role can deliver data integrity alone; it is the sum of each owner doing their part, with the sponsor accountable for the whole.
Why data integrity matters in clinical trials
Three stakes, in plain terms:
- Patient safety. Trial data informs decisions about whether a treatment is safe and effective. Corrupted data can hide a safety signal or manufacture a false one.
- Valid conclusions. The entire purpose of a trial is reliable results. Data that is not trustworthy makes the conclusions, and the resources spent reaching them, worthless.
- Inspection findings. Regulators test data integrity directly, and breaches are among the most serious findings a trial can receive, with consequences up to rejected submissions.
Common data-integrity breaches mapped to ALCOA
Most real breaches are an attribute violation in disguise:
- Back-dated or late case report forms violate Contemporaneous.
- Shared logins or unsigned entries violate Attributable.
- Transcribed values with the original discarded violate Original.
- Selectively reported results or dropped repeats violate Complete.
- Values that do not match source violate Accurate.
- Records lost or unretrievable at inspection violate Enduring or Available.
- A result that cannot be tied back to its source violates Traceable.
Naming the violated attribute is the first step to fixing it, because it points straight at the owner and the control that failed.
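One way to apply that step to a batch of findings is a simple lookup, sketched below. The breach labels and the names `BREACH_TO_ATTRIBUTES` and `tally_violations` are hypothetical; the mapping itself just restates the list above.

```python
from collections import Counter

# Breach label -> violated attribute(s), restating the list above.
BREACH_TO_ATTRIBUTES = {
    "back-dated form": ("Contemporaneous",),
    "shared login": ("Attributable",),
    "unsigned entry": ("Attributable",),
    "original discarded after transcription": ("Original",),
    "selectively reported result": ("Complete",),
    "dropped repeat": ("Complete",),
    "value does not match source": ("Accurate",),
    "record unretrievable at inspection": ("Enduring", "Available"),
    "result cannot be tied to source": ("Traceable",),
}

def tally_violations(breaches: list[str]) -> Counter:
    """Count violated attributes across findings; unknown labels go to 'Unclassified'."""
    tally = Counter()
    for breach in breaches:
        for attribute in BREACH_TO_ATTRIBUTES.get(breach, ("Unclassified",)):
            tally[attribute] += 1
    return tally

findings = ["shared login", "unsigned entry", "back-dated form"]
print(tally_violations(findings))
```

A tally that clusters on one attribute suggests where to look first: the owner and control behind that attribute.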
Frequently asked questions
What does ALCOA stand for? Attributable, Legible, Contemporaneous, Original, and Accurate, the five core attributes of trustworthy data.
What is the difference between ALCOA, ALCOA+, and ALCOA++? ALCOA is the original five. ALCOA+ adds Complete, Consistent, Enduring, and Available. ALCOA++ adds Traceable. Many regulators treat these as one coherent definition rather than separate standards.
Who is responsible for data integrity in clinical trials? Everyone, in different places: the sponsor is ultimately accountable (and owns vendor oversight), the PI owns site data, the CRC owns front-line recording, the data manager owns data processing, and vendors own the technical controls.
Why is data integrity important in clinical trials? Because trial data drives decisions about patient safety and a product’s future, and breaches lead to invalid conclusions and serious inspection findings.
How do common breaches map to ALCOA? Each typically violates one attribute: back-dating breaks Contemporaneous, shared logins break Attributable, lost originals break Original, dropped data breaks Complete, and so on.
The bottom line
ALCOA, with its + and ++ extensions, is the shared definition of trustworthy data: Attributable, Legible, Contemporaneous, Original, Accurate, plus Complete, Consistent, Enduring, Available, and Traceable. Treat it not as an acronym to memorize but as an accountability map: each attribute has an owner and prevents a specific failure. Know which attribute a risk threatens and who owns it, and data integrity becomes a managed responsibility rather than an audit surprise.
Dejan Murko
Dejan is the co-founder of Mayet, building software for biotech and pharma teams.