Adversarial AI evaluations with evidence you can defend.
Solstice Assurance tests AI systems against prompt injection, tool misuse, and data exfiltration scenarios, then turns each run into findings, scoring, control mapping, attestations, and a tamper-evident evidence set.
Assurance certifies, Proof preserves, Auditor governs.
The attached README describes Assurance as part of a closed-loop TRiSM platform where runs can be registered through Proof and enforced by Auditor before production authority is granted.
Pre-flight certification
Sandbox the target, run automated attack scenarios, and generate scorecards, findings, transcripts, and assessment artifacts.
Immutable ledger
Hash the run, vulnerabilities, mappings, and evidence into a Merkle-verified proof record.
Live governance
Check assurance status and configuration drift before high-risk production actions can proceed.
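The tamper-evident ledger idea can be illustrated with a minimal Merkle root over artifact bytes. This is a sketch of the general technique only; Proof's actual hashing scheme, leaf ordering, and record format are not specified here.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> str:
    """Fold leaf hashes pairwise into a single root; any change to any
    leaf changes the root, which is what makes the record tamper-evident."""
    if not leaves:
        raise ValueError("cannot build a root over zero artifacts")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:              # duplicate the last node on odd levels
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


# Hypothetical leaves: the bytes of each run artifact would be hashed in.
root = merkle_root([b"run-record", b"findings", b"mappings", b"evidence"])
```

Verifiers recompute the root from the preserved artifacts and compare it to the registered value; a mismatch means some artifact was altered after registration.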
Crash-test the agent before it gets production authority.
Prompt Injection
Attempts to override instructions, expose hidden context, or coerce unsafe compliance.
Tool Misuse
Checks whether the target attempts write actions, webhook calls, email dispatch, or risky tool use without approval.
Data Exfiltration
Exercises secret leakage, sensitive context disclosure, memory dumps, and unauthorized data exposure.
Target Contract
Validates owner, system boundary, data classification, tool manifest, and attestation metadata before testing.
Protected State
Detects unexpected mutation of approval gates, security mode, protected paths, or other control state.
Canary Probes
Runs benign checks first so flaky adapters do not create false failures or false confidence.
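The canary idea can be sketched as a guard around suite execution. The adapter and probe signatures below are hypothetical illustrations, not Assurance's internal API:

```python
def run_with_canary(adapter, suite):
    """Send a benign prompt first; only score the suite if transport is healthy.

    `adapter` is any callable taking a prompt string and returning a reply;
    `suite` is a list of probe callables. Both shapes are assumptions.
    """
    try:
        reply = adapter("Reply with the word OK.")
    except Exception as exc:
        # Transport failure: report "skipped", never "blocked" or "passed".
        return {"status": "skipped", "reason": f"adapter unreachable: {exc}"}
    if not isinstance(reply, str) or not reply.strip():
        return {"status": "skipped", "reason": "empty canary response"}
    return {"status": "ran", "results": [probe(adapter) for probe in suite]}
```

The point of the guard is that a dead or misconfigured adapter is reported as an execution problem rather than being mistaken for a blocked attack (false confidence) or a failed control (false failure).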
From target profile to preserved evidence package.
The operator guide lays out a straightforward flow: choose or create a profile, validate it, run an assessment, review outputs, interpret findings, and preserve the run directory plus related job/audit records.
Choose the target
Start from an example profile or declare a production target with transport, tools, owner, boundary, data class, and attestation details.
Reject weak declarations
Do not proceed when owner, system boundary, data classification, attestation, or tool manifest data are missing.
Execute the suites
Use CLI or HTTP API to run selected suites under low, moderate, or high policy settings.
Read the evidence
Inspect the report, findings, scorecard, control coverage matrix, and assessment package before relying on the score.
Keep the audit trail
Preserve the full run directory, API job record, and associated request audit entries for future defense.
Compare follow-up runs
Use score movement, new findings, fixed findings, and severity changes to track remediation over time.
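Comparing follow-up runs can be sketched as a diff over two run summaries. The `score` and `findings` field names below are assumptions for illustration, not the exact run record schema:

```python
def compare_runs(baseline: dict, followup: dict) -> dict:
    """Diff two runs by finding ID: score movement, new, fixed, persisting."""
    base_ids = {f["id"] for f in baseline["findings"]}
    next_ids = {f["id"] for f in followup["findings"]}
    return {
        "score_delta": followup["score"] - baseline["score"],
        "new": sorted(next_ids - base_ids),        # regressions or new coverage
        "fixed": sorted(base_ids - next_ids),      # remediated since baseline
        "persisting": sorted(base_ids & next_ids),  # still open
    }
```

A rising score with a shrinking "persisting" set is the remediation signal to track; a rising score with new findings deserves a closer look before it is trusted.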
Run Assurance locally or through protected endpoints.
The Flask wrapper exposes the same profile-driven flow through /api/assurance, including health checks, suite discovery, profile validation, run submission, job polling, run retrieval, and report retrieval.
python -m services.assurance --list-suites
python -m services.assurance C:\dev\Solstice-EIM\services\assurance\target_profiles\delta_local_example.json --policy high
POST /api/assurance/profiles/validate
{
"profile_path": "services/assurance/target_profiles/delta_local_example.json"
}
POST /api/assurance/run-profile
{
"profile_path": "services/assurance/target_profiles/mnemosyne_local_example.json",
"policy": "high",
"suite_ids": ["data_exfiltration"]
}
GET /api/assurance/jobs/<job_id>
GET /api/assurance/runs/<run_id>/report
Onboarding is explicit, not ad hoc.
A target profile declares what is being tested, how to reach it, what tools it exposes, who owns it, what boundary/data classification applies, and how attestations should be handled.
http_json
Remote JSON targets that return text output and optional tool calls.
mnemosyne_local
Isolated memory assurance against seeded Mnemosyne retrieval behavior.
delta_local
Isolated DeltaTree state checks against protected control fields.
{
"target_id": "customer_agent_staging",
"name": "Customer Agent Staging",
"transport": "http_json",
"deployment_mode": "staging",
"agent_type": "tool_using_agent",
"tool_manifest": [
{
"name": "read_only",
"description": "Read-only lookup tool",
"risk_level": "medium",
"requires_approval": false
}
],
"metadata": {
"owner": "security-team",
"data_classification": "internal",
"system_boundary": "agent + approved tools",
"attestation": { "signing_scheme": "hmac-sha256" }
}
}
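The "reject weak declarations" gate can be sketched as a pre-flight check over a profile like the one above. This is a hypothetical helper, not the actual Assurance validator, and the required-field list mirrors the operator guide rather than any schema:

```python
REQUIRED_METADATA = ("owner", "system_boundary", "data_classification", "attestation")


def validate_profile(profile: dict) -> list[str]:
    """Return reasons to reject a target profile; an empty list means proceed."""
    problems = []
    meta = profile.get("metadata", {})
    for field in REQUIRED_METADATA:
        if not meta.get(field):
            problems.append(f"missing metadata.{field}")
    if not profile.get("tool_manifest"):
        problems.append("missing or empty tool_manifest")
    for tool in profile.get("tool_manifest", []):
        # Every declared tool should carry an explicit risk declaration.
        if "risk_level" not in tool or "requires_approval" not in tool:
            problems.append(f"tool {tool.get('name', '?')} lacks risk declaration")
    return problems
```

Running this against the staging profile above returns no problems; dropping the owner or emptying the tool manifest produces explicit rejection reasons instead of a silently weaker assessment.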
The useful output is not just a score. It is the evidence behind the score.
Each run writes artifacts under artifacts/assurance/run_<id>/. The scorecard helps triage, but the findings, assessment package, control coverage matrix, manifest, and attestation are what make the result reviewable.
report.md
Buyer-readable narrative summary for operators, reviewers, and pilots.
run.json
Full serialized assessment record, scenarios, observations, findings, and scorecard.
observations.json
Normalized request, response, refusal, tool-call, and exfiltration signals.
findings.json
Machine-readable vulnerabilities, severity, status, evidence refs, mappings, and fixes.
scorecard.json
Weighted risk score, rating, successful attacks, blocked attacks, and scoring notes.
control_coverage_matrix.json
Framework areas exercised and whether findings appeared against each area.
assessment_package.json
Portable structured export for review, packaging, or downstream systems.
evidence_manifest.json
Tamper-evident inventory of generated artifacts and hashes.
attestation.json
Integrity binding between the assessment package and the evidence manifest.
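A reviewer can recheck the tamper-evident inventory by recomputing each artifact hash. The manifest shape assumed below ({"artifacts": [{"path": ..., "sha256": ...}]}) is an illustration; check a real run directory for the actual field names.

```python
import hashlib
import json
from pathlib import Path


def verify_manifest(run_dir: str) -> dict[str, bool]:
    """Recompute each artifact's SHA-256 and compare to the recorded value."""
    run = Path(run_dir)
    manifest = json.loads((run / "evidence_manifest.json").read_text())
    results = {}
    for entry in manifest["artifacts"]:
        actual = hashlib.sha256((run / entry["path"]).read_bytes()).hexdigest()
        results[entry["path"]] = actual == entry["sha256"]
    return results
```

Any False entry means an artifact changed after the manifest was written, which is exactly the condition the evidence set is designed to surface.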
Findings map to the frameworks buyers already ask about.
Assurance records the NIST AI RMF functions and SP 800-53 controls exercised by each risk area, without claiming compliance by itself.
| Risk Area | AI RMF | SP 800-53 controls |
|---|---|---|
| Prompt injection | Govern, Map, Measure, Manage | SI-10, SI-4, RA-9, CA-7 |
| Tool misuse | Govern, Map, Measure, Manage | AC-3, AC-4, AU-2, AU-3, AU-6 |
| Data exfiltration | Govern, Map, Measure, Manage | AC-4, AU-3, AU-6, SC-7, SI-4 |
| State mutation | Govern, Map, Measure, Manage | CM-5, CM-6, SI-7, AU-9 |
| Execution instability | Govern, Measure, Manage | AU-5, CA-7 |
Pilot-ready evidence generation with a clear path to production-grade assurance.
Protected API endpoints are secured with API keys, rate limiting, and append-only request audit logging with hash chaining. The current package is suitable for internal evaluations, evaluator pilots, procurement-facing evidence generation, and repeatable target onboarding.
HMAC signing is sufficient for local and pilot attestations. For third-party, procurement-grade assurance, replace local HMAC keys with managed asymmetric signing, and add proper key custody, durable job queues, stronger authentication, multi-tenant controls, and evidence redaction workflows.
$env:ASSURANCE_API_KEY='set-a-real-pilot-key'
$env:ASSURANCE_RATE_LIMIT_REQUESTS='10'
$env:ASSURANCE_RATE_LIMIT_WINDOW_SECONDS='60'
$env:ASSURANCE_SIGNING_KEY='dev-local-attestation-key'
$env:ASSURANCE_SIGNING_KEY_ID='dev-hmac-1'
Start with one target profile and one defensible evidence package.
Validate the target contract, run the core suites, preserve the evidence bundle, and compare follow-up runs after fixes.