AI doesn’t exist without data.

That makes tracking, classifying, and organizing it key to AI governance and security.

Two new standards help companies tackle this problem:

-> HITRUST’s AI Security Assessment with Certification

-> ISO/IEC 42001:2023

What HITRUST requires:

Baseline Unique IDs (BUIDs) 07.07aAISecOrganizational.4-5 state that, for data used:

-> to train, fine-tune, test, and validate AI models

-> in retrieval-augmented generation (RAG)

Organizations must:

-> Maintain a catalog of trusted data sources

-> Inventory data used, including at least:

--- Provenance

--- Sensitivity
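
To make the HITRUST ask concrete, here is a minimal sketch of what a data inventory with provenance and sensitivity could look like. Everything in it is an assumption for illustration: the field names, the sensitivity levels, the TRUSTED_SOURCES catalog, and the sample datasets are hypothetical and do not come from HITRUST.

```python
from dataclasses import dataclass
from enum import Enum


class Sensitivity(Enum):
    # Hypothetical classification levels; use your organization's own scheme.
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"


class Use(Enum):
    # The data uses named in the post.
    TRAIN = "train"
    FINE_TUNE = "fine-tune"
    TEST = "test"
    VALIDATE = "validate"
    RAG = "rag"


@dataclass(frozen=True)
class DatasetRecord:
    name: str
    source: str               # should appear in the trusted-source catalog
    provenance: str           # where the data came from and how it was obtained
    sensitivity: Sensitivity
    uses: frozenset           # set of Use values


# Hypothetical catalog of trusted data sources.
TRUSTED_SOURCES = {"internal-crm", "licensed-vendor-feed"}


def untrusted(inventory):
    """Return inventory records whose source is not in the trusted catalog."""
    return [r for r in inventory if r.source not in TRUSTED_SOURCES]


inventory = [
    DatasetRecord(
        name="support-tickets",
        source="internal-crm",
        provenance="exported from the CRM by the data team",
        sensitivity=Sensitivity.CONFIDENTIAL,
        uses=frozenset({Use.FINE_TUNE}),
    ),
    DatasetRecord(
        name="forum-scrape",
        source="public-web",
        provenance="scraped by a contractor",
        sensitivity=Sensitivity.PUBLIC,
        uses=frozenset({Use.RAG}),
    ),
]

flagged = untrusted(inventory)
print([r.name for r in flagged])
```

Keeping the catalog and the inventory as separate structures means you can flag any dataset that slipped in from a source nobody approved.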

ISO 42001 doesn’t strictly REQUIRE any of this, because its Annex A controls are optional. But if you adopt them, these controls call for tracking:

A.4.3: Data resources

This includes information about:

-> Retention

-> Intended use

-> Update/modification

-> Quality (duplicating A.7.4 in my opinion)

-> Provenance (duplicative as well, of A.7.3 and A.7.5)

of “data resources utilized for the AI system.”

It doesn’t specifically say “training,” so this can also cover data the AI system processes, not just data used to build it.

A.7.2: Data for development/enhancement of AI systems

This requirement focuses on data’s:

-> Privacy and security implications (duplicates A.7.3)

-> Potential security and safety threats

-> Accuracy/integrity (duplicates A.7.4)

-> Transparency and explainability

-> Representativeness

A.7.3: Acquisition of data

A broad control you can summarize as “data governance.” It requires noting:

-> Sources

-> Categories

-> Quantities

-> Demographics/biases

-> Data rights/ownership

-> Privacy and security requirements

A.7.4: Quality of data

ISO/IEC 25024:2015 (the relevant reference) defines data quality as the degree to which the data’s

-> characteristics satisfy

-> stated and implied needs

-> when used under specified conditions.

A.7.5: Data provenance

This is information about data’s:

-> update

-> creation

-> validation

-> abstraction

-> transcription

-> transfer of control
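
One way to picture provenance tracking is as an append-only event log per dataset. The sketch below is only an illustration: the ProvenanceLog class, its method names, and the sample entries are hypothetical, and the event types simply mirror the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class EventType(Enum):
    # Mirrors the provenance categories listed above.
    CREATION = "creation"
    UPDATE = "update"
    TRANSCRIPTION = "transcription"
    ABSTRACTION = "abstraction"
    VALIDATION = "validation"
    TRANSFER_OF_CONTROL = "transfer of control"


@dataclass(frozen=True)
class ProvenanceEvent:
    dataset: str
    event: EventType
    actor: str
    at: datetime
    note: str = ""


class ProvenanceLog:
    """Append-only log of provenance events (hypothetical helper)."""

    def __init__(self):
        self._events = []

    def record(self, dataset, event, actor, note="", at=None):
        # Default to the current UTC time when no timestamp is supplied.
        entry = ProvenanceEvent(dataset, event, actor, at or datetime.now(timezone.utc), note)
        self._events.append(entry)
        return entry

    def history(self, dataset):
        """Return one dataset's events in chronological order."""
        return sorted((e for e in self._events if e.dataset == dataset), key=lambda e: e.at)


log = ProvenanceLog()
log.record("support-tickets", EventType.VALIDATION, "qa-team",
           at=datetime(2024, 3, 1, tzinfo=timezone.utc))
log.record("support-tickets", EventType.CREATION, "crm-export-job",
           at=datetime(2024, 1, 1, tzinfo=timezone.utc))
log.record("forum-scrape", EventType.TRANSFER_OF_CONTROL, "contractor",
           at=datetime(2024, 2, 1, tzinfo=timezone.utc))

history = log.history("support-tickets")
print([e.event.value for e in history])
```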

A.7.6: Data preparation

This control requires documenting granular steps in the model training process like:

-> Encoding

-> Data cleaning

-> Normalization
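
As a rough illustration of what documenting these steps could look like, the sketch below runs cleaning, encoding, and normalization in sequence and records each one as it goes. The function names, the sample rows, and the log format are all assumptions made up for this example, not anything ISO 42001 prescribes.

```python
def clean(rows):
    """Drop rows with missing values."""
    return [r for r in rows if r["age"] is not None and r["plan"] is not None]


def encode(rows):
    """Map the categorical 'plan' field to integers."""
    codes = {"free": 0, "pro": 1}  # hypothetical categories
    return [{**r, "plan": codes[r["plan"]]} for r in rows]


def normalize(rows):
    """Min-max scale 'age' to the range [0, 1]."""
    if not rows:
        return rows
    ages = [r["age"] for r in rows]
    lo, hi = min(ages), max(ages)
    span = (hi - lo) or 1  # avoid dividing by zero when all ages match
    return [{**r, "age": (r["age"] - lo) / span} for r in rows]


def prepare(rows, steps):
    """Apply each step in order and document what it did."""
    log = []
    for step in steps:
        rows_in = len(rows)
        rows = step(rows)
        log.append({
            "step": step.__name__,
            "description": step.__doc__,
            "rows_in": rows_in,
            "rows_out": len(rows),
        })
    return rows, log


raw = [
    {"age": 20, "plan": "free"},
    {"age": None, "plan": "pro"},
    {"age": 40, "plan": "pro"},
]

prepared, prep_log = prepare(raw, [clean, encode, normalize])
for entry in prep_log:
    print(entry)
```

Because the log is produced by the same loop that runs the steps, the documentation can't drift from what was actually done to the data.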

The verdict?

ISO 42001's Annex A controls place much heavier demands on data management than HITRUST's requirements do.

This makes sense because 42001 is an AI governance standard, while HITRUST’s certification is a security-focused one.

There is understandably a lot of overlap, though.

Are you considering ISO 42001 or HITRUST certification (or both)?

Jan 9, 2025 at 4:55 PM