🔎 Evidence browser

Browse the skill radar

Search by skill, publisher, category, or trust summary — then use the runtime filters to find cards with live test evidence. The two main lanes are baseline safety checks first and deeper follow-on functionality checks after that.

Security GitHub Weather Trusted only Higher-confidence picks Lower-friction local candidates Review-first installs Sandbox-tested Fresh runtime Stale runtime Hall of Shame Stronger evidence imports Needs review Clear filters

All categories awesome-index · 5367 catalog-only · 5367 coding-agents-and-ides · 1200 web-and-frontend-development · 924 devops-and-cloud · 392 search-and-research · 352 browser-and-automation · 320 productivity-and-tasks · 204 ai-and-llms · 184 cli-utilities · 180

✨ Quick picks

Security GitHub Weather Trusted only Higher-confidence picks Lower-friction local candidates Review-first installs Sandbox-tested Fresh runtime Stale runtime Hall of Shame Stronger evidence imports Needs review Clear filters

🏷 Categories · coding-agents-and-ides

All categories awesome-index · 5367 catalog-only · 5367 coding-agents-and-ides · 1200 web-and-frontend-development · 924 devops-and-cloud · 392 search-and-research · 352 browser-and-automation · 320 productivity-and-tasks · 204 ai-and-llms · 184 cli-utilities · 180

🧾 Evidence level: source-scanned means local source evidence; catalog-only means thinner metadata-first coverage.

🧪 Runtime status: cards can show only the baseline safety lane or the deeper follow-on functionality lane, depending on how far the skill got.

📏 Depth cue: tells you whether the evidence stops at baseline checks, includes follow-on functionality checks, or includes richer fixture/example proof.

⏱ Freshness cue: tells you whether the latest runtime evidence is from the last 24 hours, the last 7 days, or is older and therefore less current.

🩺 Failure confidence: distinguishes a first seen failure from a repeated failure or a regression after an earlier pass, so not every red row means the same thing.

Results

Showing 3 of 1155 skills in the browsable catalog view · evidence: source-scanned · category: coding-agents-and-ides · sort: score

page evidence snapshotruntime-passed: 1 runtime-failed: 0 source-scanned: 3 fresh <24h: 0 manual review: 0

This snapshot is for the current page of results, not the whole filtered universe.

Browse hint: slices with zero failures plus some source-scanned or reviewed entries deserve more attention first; fresh runtime evidence helps too, because old clean receipts can still hide current drift.

symbiont

jaschadub · vsource-scanned

AI-native agent runtime with typestate-enforced ORGA reasoning loop, Cedar policy authorization, knowledge bridge, zero-trust security, multi-tier sandboxing, webhook verification, markdown memory, skill scanning, metrics, scheduling, and a declarative DSL

High Riskconfidence: source evidencesource-scanned

Take: Potentially suspicious implementation signals detected: eval(, password.

Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

aade-api-monitor

satoshistackalotto · vsource-scanned

Real-time monitoring of Greek AADE tax authority systems — tracks deadlines, rate changes, and compliance updates. File-based, OpenClaw-native.

High Riskfollow-on functionality checks passed · 6/6confidence: source evidence

Runtime receipts + what passed2026-03-14 21:15 UTC

functionality-v2evidence depth: includes fixture-backed checkstested recently: within 7 dayspassedoutput 102 Bartifacts 0worker oc-sandboxsource stage: cache hitsuite 1903 msbaseline-v3 8/8

RatioDaemon on this skillAade Api Monitor looks aimed at aade api monitor. Functionality-v2 currently passes, the trust label is High Risk, and setup looks advanced.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: curl |, sudo , password.

Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

sys-guard-linux-remediator

kiaraho · vsource-scanned

Host-based Linux incident response and remediation skill focused on precise threat detection, forensic-safe data collection, firewall control (iptables/nftables), integrity validation, and controlled remediation while preserving system stability.

High Riskconfidence: source evidencesource-scanned

Take: Potentially suspicious implementation signals detected: rm -rf, sudo , password.

Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

« First ← Prev 1 48 49Page 49 / 49