Assessments

Assessments are the platform’s control plane for AI red teaming with DreadAIRT. An assessment is a named container that organizes attack runs against an AI system and aggregates their results into analytics, findings, and compliance reports.

What an assessment is

An assessment answers one question: how vulnerable is this AI system to adversarial attacks?

You provide:

a target system to probe (defined via the SDK)
one or more attack strategies (TAP, GOAT, Crescendo, PAIR, and others)
goals describing what the attacks should attempt

The SDK executes attack runs locally, emitting telemetry as OpenTelemetry spans. Those spans flow to ClickHouse, and the API materializes analytics from the trace data on demand.

An assessment belongs to a project within a workspace and accumulates results across multiple attack runs over time.

Key concepts

Concept	Definition
Assessment	A named, project-scoped container for a red teaming campaign
Attack Run	A single execution of an attack strategy (e.g., one TAP run with a specific goal)
Trial	An individual attempt within an attack run — one conversation or prompt exchange with the target
ASR	Attack Success Rate — the fraction of trials that achieved the stated goal
Risk Score	A composite metric combining ASR, severity, and attack effectiveness
Transform	An adversarial technique applied to prompts (encoding, persuasion, injection, etc.)
Compliance Tag	A mapping from attack results to security framework categories (OWASP Top 10 for LLM Applications, MITRE ATLAS, NIST AI Risk Management Framework, Google SAIF)
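As a rough illustration of how ASR follows from trial outcomes, the sketch below computes it over a list of trials. The `Trial` dataclass and its field names are hypothetical stand-ins for this example, not the SDK's own types.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    """One attempt against the target (hypothetical shape, not the SDK's type)."""
    attack_name: str
    score: float      # scorer output, assumed here to lie in [0, 1]
    succeeded: bool   # True if the trial achieved the stated goal


def attack_success_rate(trials: list[Trial]) -> float:
    """Fraction of trials that achieved the goal; 0.0 when there are no trials."""
    if not trials:
        return 0.0
    return sum(t.succeeded for t in trials) / len(trials)


trials = [
    Trial("tap", 0.9, True),
    Trial("tap", 0.2, False),
    Trial("crescendo", 0.7, True),
    Trial("crescendo", 0.1, False),
]
print(attack_success_rate(trials))  # 0.5
```

Returning 0.0 for an empty list is a choice made for this sketch; the platform may report an assessment with no trials differently.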

Execution flow

The assessment lifecycle spans the SDK and the platform:

SDK: the dreadnode.airt module launches attacks against a target and emits structured telemetry as OpenTelemetry spans
ClickHouse: spans are ingested and stored for high-volume querying
API: materializes analytics from the ClickHouse traces when requested, and persists assessment metadata and reports in Postgres

Analytics and reporting

The platform provides several levels of analysis, all derived from trace data in ClickHouse:

Assessment-level:

Aggregated trace statistics (total attacks, trials, ASR, risk scores)
Per-attack span breakdowns with success rates and severity
Individual trial spans with filtering by attack name, minimum score, and jailbreak status
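The trial filters and the per-attack breakdown can be pictured with a small client-side sketch. The record keys (`attack_name`, `score`, `is_jailbreak`) are assumed for illustration, and the sketch counts a jailbreak as a success; the platform computes these views server-side from ClickHouse, and its actual schema may differ.

```python
from collections import defaultdict
from typing import Optional


def filter_trials(trials, attack_name: Optional[str] = None,
                  min_score: Optional[float] = None,
                  jailbreak: Optional[bool] = None):
    """Return the trial records that match every filter that was supplied."""
    kept = []
    for t in trials:
        if attack_name is not None and t["attack_name"] != attack_name:
            continue
        if min_score is not None and t["score"] < min_score:
            continue
        if jailbreak is not None and t["is_jailbreak"] != jailbreak:
            continue
        kept.append(t)
    return kept


def per_attack_success_rate(trials):
    """Group trials by attack name and compute the success rate of each group."""
    counts = defaultdict(lambda: [0, 0])  # attack_name -> [successes, total]
    for t in trials:
        counts[t["attack_name"]][0] += int(t["is_jailbreak"])
        counts[t["attack_name"]][1] += 1
    return {name: s / n for name, (s, n) in counts.items()}


# Hypothetical trial records for illustration only.
sample = [
    {"attack_name": "tap", "score": 0.9, "is_jailbreak": True},
    {"attack_name": "tap", "score": 0.3, "is_jailbreak": False},
    {"attack_name": "pair", "score": 0.8, "is_jailbreak": True},
]
print(filter_trials(sample, attack_name="tap", min_score=0.5))
print(per_attack_success_rate(sample))
```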

Project-level:

Cross-assessment findings with severity, category, and attack name filtering
Executive summary with risk trends, compliance posture, and top vulnerabilities
Automated report generation combining findings across all assessments in a project
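To make the idea of cross-assessment findings concrete, here is a minimal sketch that flattens findings from several assessments, filters them, and ranks the most severe first. The severity scale, the finding keys, and the sample data are all assumptions made for the example rather than the platform's schema.

```python
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}  # assumed scale


def project_findings(assessments, min_severity="low", category=None):
    """Flatten findings from every assessment, filter them, and rank worst first."""
    floor = SEVERITY_ORDER[min_severity]
    rows = [
        {**finding, "assessment": name}
        for name, findings in assessments.items()
        for finding in findings
        if SEVERITY_ORDER[finding["severity"]] >= floor
        and (category is None or finding["category"] == category)
    ]
    return sorted(rows, key=lambda f: SEVERITY_ORDER[f["severity"]], reverse=True)


# Hypothetical findings keyed by assessment name.
assessments = {
    "chatbot-q3": [
        {"attack_name": "tap", "severity": "high", "category": "prompt_injection"},
        {"attack_name": "pair", "severity": "low", "category": "data_leakage"},
    ],
    "agent-q3": [
        {"attack_name": "crescendo", "severity": "critical", "category": "prompt_injection"},
    ],
}
top = project_findings(assessments, min_severity="high")
print([(f["assessment"], f["attack_name"]) for f in top])
```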

Compliance mapping: results are tagged against the following industry security frameworks:

OWASP Top 10 for LLM Applications
MITRE ATLAS
NIST AI Risk Management Framework
Google Secure AI Framework (SAIF)

Reports

Reports are generated from assessment or project data and persisted for later retrieval. A report captures a point-in-time snapshot of findings, risk scores, and compliance posture. Reports can be generated for an individual assessment or for a whole project, in which case they consolidate findings across all of the project's assessments.

API surface

Assessment resources are workspace-scoped:

POST /api/v1/org/{org}/ws/{workspace}/airt/assessments
GET /api/v1/org/{org}/ws/{workspace}/airt/assessments
GET /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}
PATCH /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}
DELETE /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}
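A minimal sketch of calling these routes from Python's standard library follows. The host name, the `name` payload field, and the bearer-token header are assumptions for illustration, since the request schema and auth scheme are not described here. The request is prepared but not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://platform.example.com"  # placeholder host, not a real deployment


def assessments_url(org: str, workspace: str, assessment_id: str = "") -> str:
    """Build the collection URL, or the item URL when assessment_id is given."""
    url = (
        f"{BASE_URL}/api/v1/org/{quote(org, safe='')}"
        f"/ws/{quote(workspace, safe='')}/airt/assessments"
    )
    return f"{url}/{quote(assessment_id, safe='')}" if assessment_id else url


def create_assessment_request(org: str, workspace: str, name: str, token: str) -> Request:
    """Prepare (but do not send) a POST that creates an assessment."""
    body = json.dumps({"name": name}).encode("utf-8")  # payload fields are assumed
    return Request(
        assessments_url(org, workspace),
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # auth scheme is assumed
        },
    )


req = create_assessment_request("acme", "main", "chatbot-q3", token="<api-token>")
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; the same URL helper covers the GET, PATCH, and DELETE routes by supplying an `assessment_id` and a different `method`.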

Analytics and traces for an assessment:

GET /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}/analytics
GET /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}/traces
GET /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}/traces/attacks
GET /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}/traces/trials
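The trials route is where filtering by attack name, minimum score, and jailbreak status would apply. The sketch below builds such a path with a query string; the parameter names (`attack_name`, `min_score`, `jailbreak`) are hypothetical, so check the API reference for the names the platform accepts.

```python
from urllib.parse import urlencode


def trials_path(org: str, workspace: str, assessment_id: str, **filters) -> str:
    """Build the trials path, appending only the filters that were supplied."""
    path = (
        f"/api/v1/org/{org}/ws/{workspace}"
        f"/airt/assessments/{assessment_id}/traces/trials"
    )
    params = {
        # Render booleans as lowercase "true"/"false" (an assumed convention).
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in filters.items()
        if value is not None
    }
    return f"{path}?{urlencode(params)}" if params else path


print(trials_path("acme", "main", "a1", attack_name="tap", min_score=0.5, jailbreak=True))
```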

Reports for an assessment:

POST /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}/reports
GET /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}/reports
GET /api/v1/org/{org}/ws/{workspace}/airt/assessments/{assessment_id}/reports/{report_id}

Project-level aggregation:

GET /api/v1/org/{org}/ws/{workspace}/airt/projects/{project}/summary
GET /api/v1/org/{org}/ws/{workspace}/airt/projects/{project}/findings
POST /api/v1/org/{org}/ws/{workspace}/airt/projects/{project}/reports/generate