This page is part of the ForgeSDLC knowledge base — an AI-assisted, human-directed methodology for taking product work from concept to production. For the core operating model and vocabulary, see Forge SDLC overview and What is ForgeSDLC?.

CRISP-DM: Cross-Industry Standard Process for Data Mining

Purpose: Project-agnostic reference for CRISP-DM, one of the most widely adopted process models for data mining and applied machine learning. Use it to structure discovery, modeling, and handoff without tying the blueprint to a specific tool stack.

Audience: Teams following DATA-SCIENCE.md and approaches/README.md.

Overview

CRISP-DM (Cross-Industry Standard Process for Data Mining) is a process model, not a single algorithm or product. It organizes work from business intent through deployed models with explicit feedback loops. Its strength is shared vocabulary across business, analytics, and engineering: everyone can anchor discussions in the same six phases.

The process is iterative by design. Evaluation often sends you back to data preparation or modeling; deployment feedback revisits business understanding. Treat phase boundaries as decision gates, not linear milestones.

Six phases at a glance

Phase	Objectives	Key activities	Typical outputs	Typical share of project time*
Business Understanding	Align analytics with business goals and constraints	Stakeholder interviews; problem framing; success criteria; risk and scope	Problem statement, success metrics, project plan, feasibility note	~15–20%
Data Understanding	Know what data exists and whether it can support the goals	Initial collection; EDA; quality checks; hypothesis sketches	Data dictionary, EDA report, quality assessment, gap analysis	~15–20%
Data Preparation	Produce analysis-ready datasets and reproducible transforms	Selection, cleaning, integration, feature construction, formatting	Clean datasets, feature specs, transformation code/pipelines	~25–35%
Modeling	Fit candidate models and compare approaches	Technique selection; design of experiments; train/validation protocol; tuning	Trained models, experiment logs, comparison tables	~10–20%
Evaluation	Judge whether results meet business and technical bars	Metric review vs criteria; error analysis; cost/benefit; deploy decision	Evaluation report, go/no-go, documented limitations	~10–15%
Deployment	Put the solution into operation and sustain value	Rollout plan; monitoring; maintenance; documentation; knowledge transfer	Deployed artifact, runbooks, monitoring dashboards, final report	~5–15%

*Percentages are indicative shares of total project effort and are not meant to sum to exactly 100%; highly regulated or messy data domains shift time toward understanding and preparation.

Phase deep dives

Business Understanding

Sub-task	What to do	Output
Problem framing	Translate business language into an analytical or ML problem	Formal problem statement
Success criteria	Define measurable business and model-level success	KPIs, acceptance thresholds
Data mining goals	Specify modeling type (e.g. classification, forecasting) and constraints	Technical goal doc
Project plan	Resources, timeline, data access, ethics/compliance checkpoints	Plan with milestones

Data Understanding

Sub-task	What to do	Output
Initial collection	Acquire or catalog sources; document lineage where possible	Inventory, access agreements
EDA	Univariate/multivariate summaries; visualizations; segment checks	EDA notebook or report
Data quality	Missingness, duplicates, outliers, label noise	Quality scorecard
Early insights	Hypotheses that inform preparation and modeling	Short insight memo
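
The data quality checks in the table above can be sketched in a few lines. This is a minimal illustration, not a prescribed scorecard format: the function name `quality_scorecard`, the record layout (a list of dicts), and the use of Tukey's 1.5 × IQR rule for outliers are all assumptions made for the example. It also assumes field values are hashable and that the numeric field has at least two observed values.

```python
from collections import Counter
from statistics import quantiles


def quality_scorecard(rows, numeric_field):
    """Summarize missingness, duplicate rows, and IQR outliers for a list of dicts."""
    n = len(rows)
    fields = sorted({key for row in rows for key in row})
    # Share of rows where each field is absent or None.
    missing_rate = {
        f: sum(1 for row in rows if row.get(f) is None) / n for f in fields
    }
    # A row counts as a duplicate when an identical row appeared earlier.
    counts = Counter(tuple((f, row.get(f)) for f in fields) for row in rows)
    duplicate_rows = sum(c - 1 for c in counts.values())
    # Tukey's rule (an assumed choice): flag values beyond 1.5 * IQR from the quartiles.
    values = [row[numeric_field] for row in rows if row.get(numeric_field) is not None]
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return {
        "rows": n,
        "missing_rate": missing_rate,
        "duplicate_rows": duplicate_rows,
        "outliers": outliers,
    }


# Hypothetical sample records, for illustration only.
sample = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": 12.0},
    {"id": 2, "amount": 12.0},
    {"id": 3, "amount": None},
    {"id": 4, "amount": 11.0},
    {"id": 5, "amount": 500.0},
    {"id": 6, "amount": 9.0},
    {"id": 7, "amount": 13.0},
]
card = quality_scorecard(sample, "amount")
print(card)
```

Label noise is not covered here; it usually needs a second labeling pass or domain review rather than a purely statistical check.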

Data Preparation

Sub-task	What to do	Output
Selection	Choose rows, columns, time windows aligned to the target	Subset definitions
Cleaning	Imputation, deduplication, outlier handling (with rationale)	Cleaning rules
Feature construction	Aggregations, ratios, encodings, domain features	Feature definitions
Integration	Joins across sources; consistent keys and grain	Integrated tables
Formatting	Types, schemas, splits compatible with modeling tools	Model-ready datasets
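
One way to keep cleaning rules reproducible and leakage-free is to separate learning the parameters of a transform from applying it. The sketch below assumes mean imputation and standardization on a single numeric column; `fit_preparation` and `apply_preparation` are hypothetical names, not part of any particular library.

```python
from statistics import mean, pstdev


def fit_preparation(train_values):
    """Learn imputation and scaling parameters from training values only."""
    observed = [v for v in train_values if v is not None]
    center = mean(observed)
    spread = pstdev(observed) or 1.0  # fall back to 1.0 for constant columns
    return {"fill": center, "center": center, "spread": spread}


def apply_preparation(values, params):
    """Impute missing values, then standardize, using previously fitted parameters."""
    filled = [params["fill"] if v is None else v for v in values]
    return [(v - params["center"]) / params["spread"] for v in filled]


# Hypothetical values: parameters come from the training rows only.
train_values = [2.0, 4.0, None, 6.0]
holdout_values = [None, 10.0]
params = fit_preparation(train_values)
prepared_train = apply_preparation(train_values, params)
prepared_holdout = apply_preparation(holdout_values, params)
print(params, prepared_holdout)
```

Because the held-out rows never influence `params`, the same fitted dictionary can be stored with the model artifact and reused at serving time.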

Modeling

Sub-task	What to do	Output
Technique selection	Match algorithms to data size, interpretability, latency	Shortlist with rationale
Train/test split	Split so that no information leaks from test to train: by time for temporal data, by group for repeated entities, stratified for imbalanced labels	Split protocol
Hyperparameter tuning	Search strategy; validation discipline	Tuned configs
Model building	Train final candidates; track seeds and code versions	Model artifacts + metadata
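
The leakage rules for splitting can be made concrete with two small helpers. This is a sketch under stated assumptions: rows are dicts, timestamps compare correctly with `<` (ISO date strings do), and group membership is decided by a stable hash so that the assignment is reproducible across runs. The names `time_split` and `group_split` are hypothetical, and `test_fraction` is only approximate because whole groups move together.

```python
import hashlib


def time_split(rows, time_key, cutoff):
    """Train on rows strictly before the cutoff; test on the cutoff and later."""
    train = [r for r in rows if r[time_key] < cutoff]
    test = [r for r in rows if r[time_key] >= cutoff]
    return train, test


def group_split(rows, group_key, test_fraction=0.2):
    """Assign whole groups to one side, using a stable hash of the group id."""
    train, test = [], []
    for r in rows:
        digest = hashlib.sha256(str(r[group_key]).encode("utf-8")).hexdigest()
        bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
        (test if bucket < test_fraction else train).append(r)
    return train, test


# Hypothetical events: several rows per customer, ordered by day.
events = [
    {"customer": f"c{i % 7}", "day": f"2024-{1 + i % 6:02d}-15", "label": i % 2}
    for i in range(42)
]
early, late = time_split(events, "day", "2024-04-01")
grp_train, grp_test = group_split(events, "customer", test_fraction=0.3)
print(len(early), len(late), len(grp_train), len(grp_test))
```

Stratification is not shown; for imbalanced labels it would be applied within whichever of these two schemes the data requires.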

Evaluation

Sub-task	What to do	Output
Business criteria check	Compare metrics to thresholds from Business Understanding	Pass/fail vs criteria
Model review	Error analysis, robustness spot checks, fairness slices	Review notes
Deployment decision	Go, iterate, or stop — with documented trade-offs	Decision record
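
The business criteria check lends itself to a mechanical comparison that feeds the decision record. The sketch below assumes each criterion is a direction (`"min"` or `"max"`) plus a threshold; the metric names, thresholds, and the `check_criteria` function are hypothetical examples, not values taken from any project.

```python
def check_criteria(metrics, criteria):
    """Compare observed metrics with thresholds; return per-metric results and an overall verdict."""
    results = {}
    for name, (direction, threshold) in criteria.items():
        value = metrics.get(name)
        if value is None:
            passed = False  # a missing metric cannot satisfy a criterion
        elif direction == "min":
            passed = value >= threshold
        elif direction == "max":
            passed = value <= threshold
        else:
            raise ValueError(f"unknown direction: {direction!r}")
        results[name] = {"value": value, "threshold": threshold, "passed": passed}
    return results, all(r["passed"] for r in results.values())


# Hypothetical criteria agreed during Business Understanding.
criteria = {
    "recall": ("min", 0.80),
    "p95_latency_ms": ("max", 200),
}
observed = {"recall": 0.84, "p95_latency_ms": 250}
results, go = check_criteria(observed, criteria)
for name, r in results.items():
    print(name, "PASS" if r["passed"] else "FAIL", r["value"], "vs", r["threshold"])
print("go" if go else "iterate or stop")
```

A failing check does not decide between iterating and stopping; that trade-off still belongs in the documented decision.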

Deployment

Sub-task	What to do	Output
Deployment plan	Rollout, rollback, ownership	Runbook
Monitoring	Data drift, performance, latency, business KPIs	Dashboards, alerts
Maintenance	Retrain triggers, data refresh, dependency updates	Ops checklist
Final report	Lessons learned, limitations, handover	Closing documentation
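
As one possible way to monitor data drift, the sketch below computes a population stability index (PSI) between a baseline sample and a current sample of one numeric feature. PSI is a swapped-in example technique, not something CRISP-DM prescribes. The bin edges, the floor applied to empty bins, and the alert threshold are assumptions that should be calibrated per feature; both samples are assumed to be non-empty and the edges sorted in ascending order.

```python
import math


def bin_shares(values, edges, floor=1e-6):
    """Share of values per bin; bins are (-inf, e0), [e0, e1), ..., [e_last, inf)."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        # The bin index equals the number of edges at or below the value.
        counts[sum(1 for e in edges if v >= e)] += 1
    total = len(values)
    # Floor empty bins so the logarithm below stays defined.
    return [max(c / total, floor) for c in counts]


def population_stability_index(baseline, current, edges):
    """Sum over bins of (actual - expected) * ln(actual / expected)."""
    expected = bin_shares(baseline, edges)
    actual = bin_shares(current, edges)
    return sum((a - e) * math.log(a / e) for a, e in zip(actual, expected))


# Hypothetical feature samples and bin edges.
baseline = [1, 2, 3, 4, 5, 6, 7, 8]
current = [5, 6, 7, 8, 9, 9, 9, 9]
edges = [3, 6]
ALERT_THRESHOLD = 0.25  # hypothetical value; calibrate before relying on it

psi = population_stability_index(baseline, current, edges)
print(f"PSI={psi:.3f}", "ALERT" if psi > ALERT_THRESHOLD else "ok")
```

A score like this can back a dashboard panel or a retrain trigger, but it only covers input drift; performance and business KPIs need their own checks.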

CRISP-DM vs other methodologies

Dimension	CRISP-DM	TDSP (Microsoft)	SEMMA (SAS)	KDD Process
Origin / style	Cross-industry standard; phase-oriented	Team roles, agile sprints, code structure	Sample, Explore, Modify, Model, Assess	Research-oriented knowledge discovery
Business linkage	Strong explicit phase	Strong (business metrics, backlog)	Weaker; more data-centric	Strong on problem understanding
Iteration	Core to the model	Agile loops + phases	Implicit in Assess	Multiple loops
Typical audience	General analytics / ML	Enterprise Azure/data teams	SAS-centric shops	Academia / R&D framing

Use CRISP-DM when you want a neutral, widely recognized scaffold; pair it with MLOps when production automation is required (see mlops.md).

CRISP-DM mapped to SDLC (high level)

The letters A–F denote generic SDLC stages: A charter/requirements, B design, C build/integrate, D implementation detail, E test/verify, F release/operate. Adjust them to your organization's naming.

Business Understanding → A/B (intent, scope, success definition).
Data Understanding + Data Preparation → C/D (data products, pipelines, integration).
Modeling → D (training code, experiments).
Evaluation → E (verification against criteria).
Deployment → F (release, monitoring, operations).

Adapting CRISP-DM for modern ML

Deep learning: Modeling and Evaluation expand, with longer training cycles, heavier hardware dependency, and a need for baseline and ablation discipline. Data Preparation often includes large-scale labeling and augmentation policies.
MLOps: Deployment is not a one-time step; it includes pipelines, registries, and continuous monitoring. Feedback loops are automated (for example, retrain triggers) rather than relying only on manual project reviews.
Feature stores and real-time serving: Data Preparation and Deployment overlap, because features must stay consistent between offline training and online serving and must be point-in-time correct. Classic CRISP-DM documents assume batch analytics, so these concerns need to be added explicitly.
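
Point-in-time correctness can be illustrated with an "as of" lookup: when building a training row for an event, use only the feature value that was known at the event time. The sketch below is a minimal in-memory version with hypothetical names (`build_history`, `value_as_of`). It assumes feature snapshots arrive as `(entity, timestamp, value)` tuples, and that a snapshot stamped exactly at the event time counts as known, which is a design choice to revisit if snapshots can lag.

```python
from bisect import bisect_right


def build_history(feature_rows):
    """Index feature snapshots per entity, sorted by timestamp."""
    history = {}
    for entity, ts, value in sorted(feature_rows, key=lambda r: (r[0], r[1])):
        times, values = history.setdefault(entity, ([], []))
        times.append(ts)
        values.append(value)
    return history


def value_as_of(history, entity, event_ts):
    """Latest feature value recorded at or before event_ts, or None if there is none."""
    times, values = history.get(entity, ([], []))
    idx = bisect_right(times, event_ts)
    return values[idx - 1] if idx else None


# Hypothetical snapshots: (entity, timestamp, value).
snapshots = [
    ("u1", 9, 30.0),
    ("u1", 1, 10.0),
    ("u2", 4, 7.0),
    ("u1", 5, 20.0),
]
history = build_history(snapshots)
print(value_as_of(history, "u1", 6))  # uses the snapshot at ts=5, not the later one at ts=9
```

Joining on the latest value regardless of time would instead leak the `ts=9` snapshot into an event at `ts=6`, which is the offline/online inconsistency described above.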

Anti-patterns

Anti-pattern	Why it hurts	Better habit
Skipping Business Understanding	Optimizes the wrong metric; shelf-ware models	Lock success criteria before heavy modeling
Data leakage in preparation	Inflated offline metrics; production failure	Time-safe splits, group splits, pipeline encapsulation
Single evaluation metric	Hidden failure modes (e.g. good AUC but poor recall on the minority class)	Multi-metric dashboards + slice analysis
Treating Deployment as “IT handoff”	No monitoring owner; silent drift	Shared SLOs and runbooks from day one
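
The single-metric anti-pattern is easy to demonstrate with slice analysis. The sketch below computes recall per slice alongside the overall figure; the slice names, the records, and the `recall_by_slice` function are hypothetical, and labels are assumed to be binary with 1 as the positive class.

```python
from collections import defaultdict


def recall_by_slice(records):
    """Recall (true positives / actual positives) per slice, plus overall under the key 'ALL'."""
    hits = defaultdict(int)
    positives = defaultdict(int)
    for slice_name, y_true, y_pred in records:
        if y_true == 1:
            for key in (slice_name, "ALL"):
                positives[key] += 1
                hits[key] += int(y_pred == 1)
    return {key: hits[key] / positives[key] for key in positives}


# Hypothetical predictions: (slice, true label, predicted label).
records = (
    [("majority", 1, 1)] * 7
    + [("majority", 1, 0)]
    + [("majority", 0, 0)] * 5
    + [("minority", 1, 0)] * 2
    + [("minority", 0, 0)]
)
by_slice = recall_by_slice(records)
print(by_slice)
```

In this made-up data the overall recall looks acceptable while the minority slice misses every positive case, which is the failure mode a single aggregate metric hides.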

External references

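CRISP-DM 1.0: Step-by-step data mining guide (Chapman et al., 2000), published by the CRISP-DM consortium (NCR, SPSS, DaimlerChrysler); SPSS is now part of IBM. Widely cited as the standard process document.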
Witten, Frank, Hall, and Pal — Data Mining: Practical Machine Learning Tools and Techniques — practical grounding for phases and evaluation.

Keep project-specific model documentation in docs/product/ and experiment logs in docs/development/, not in this file.
