Control
Data Classification Before AI Use
Classifying data before AI use defines what may be shared, transformed, summarized, retained, or logged.
data security classification privacy
What it constrains
This control prevents teams from treating all context as harmless prompt material. It sets a boundary for the AI workflow before data enters a model, retrieval system, connector, or log store.
Implementation
- Define simple classes such as public, internal, confidential, and restricted.
- Map each approved AI tool and workflow to allowed classes.
- Provide examples that match real tasks: summarizing contracts, drafting emails, analyzing logs, reviewing code, or querying customer records.
- Define who can approve exceptions and how long they last.
- Review whether prompts, outputs, embeddings, and logs inherit the classification of the source data they are derived from.
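The mapping of tools to allowed classes, together with time-limited exceptions, can be sketched as a small policy check. This is a minimal illustration, not a definitive implementation: the tool names, the ceilings assigned to them, the `PolicyException` record, and the `is_allowed` helper are all hypothetical and would be replaced by the organization's own approved tool register and exception process.

```python
from dataclasses import dataclass
from datetime import date
from enum import IntEnum


class DataClass(IntEnum):
    """Ordered so that a higher value means more sensitive data."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical tool names and ceilings (assumption, not from the control text).
TOOL_CEILING = {
    "public-chat-assistant": DataClass.PUBLIC,
    "enterprise-code-review": DataClass.INTERNAL,
    "contract-summarizer": DataClass.CONFIDENTIAL,
}


@dataclass(frozen=True)
class PolicyException:
    """A recorded exception: who approved it and when it lapses."""
    tool: str
    data_class: DataClass
    approver: str
    expires: date


def is_allowed(tool, data_class, today, exceptions=()):
    """Return True if the tool may receive data of this class on this date."""
    ceiling = TOOL_CEILING.get(tool)
    if ceiling is None:
        return False  # tools missing from the register are denied by default
    if data_class <= ceiling:
        return True
    # Above the ceiling: allowed only under an unexpired exception for this tool.
    return any(
        e.tool == tool and e.data_class >= data_class and today <= e.expires
        for e in exceptions
    )
```

Denying unlisted tools by default is a design assumption in this sketch; it keeps the check aligned with the idea that only approved tools carry an allowed class at all.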
Owner
Data governance, privacy, security architecture, and the process owner should share the decision. The process owner understands how the data is used; security and privacy understand the exposure.
Evidence
- Classification policy linked to AI use cases.
- Approved tool entries with data limits.
- Examples employees can apply without having to interpret ambiguous wording.
- Exception records and review dates.
- Redaction or prevention mechanisms where needed.
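One possible form of a redaction mechanism is a pattern-based filter applied before text reaches an AI tool. The sketch below is illustrative only: the two patterns (email addresses and a US-style `NNN-NN-NNNN` identifier) and the `redact` helper are assumptions chosen for the example, and pattern matching alone will miss many kinds of sensitive data.

```python
import re

# Hypothetical patterns (assumption); a real deployment needs a reviewed set.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace each match with a bracketed label and count the replacements."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts
```

The returned counts could be kept alongside exception records as evidence that the mechanism ran, without storing the redacted values themselves.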
Common errors
- Classifying source data but ignoring prompts and outputs.
- Allowing confidential data because the model is “internal” without checking retention and access.
- Writing categories that employees cannot apply to real work.
- Reviewing only the model and not retrieval, connectors, exports, or logs.
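The first and last errors in this list share a cause: derived artifacts are left unclassified. A minimal sketch of one way to avoid that is to have every derived artifact inherit the highest classification among its sources. The `DataClass` ordering, the artifact names, and both helper functions are hypothetical, and treating unknown provenance as restricted is an assumption of this example rather than a stated rule.

```python
from enum import IntEnum


class DataClass(IntEnum):
    """Ordered so that a higher value means more sensitive data."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Artifact types a workflow may produce from source data (illustrative list).
DERIVED_ARTIFACTS = ("prompt", "output", "embedding", "log")


def inherited_class(source_classes):
    """A derived artifact is at least as sensitive as its most sensitive source."""
    # Assumption: with no known sources, fall back to the most restrictive class.
    return max(source_classes, default=DataClass.RESTRICTED)


def label_artifacts(source_classes):
    """Assign the inherited class to every derived artifact type."""
    level = inherited_class(list(source_classes))
    return {artifact: level for artifact in DERIVED_ARTIFACTS}
```

For example, a prompt built from one public and one confidential document would be labeled confidential, and so would the resulting output, embedding, and log entry.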
Related risks
- Sensitive Data Disclosure
- Shadow AI
- Prompt Injection