🔎 Evidence browser

Search the skill radar

Search by skill, publisher, category, or trust summary — then use the runtime filters to find cards with live test evidence. The two main lanes are baseline safety checks first and deeper follow-on functionality checks after that.

✨ Quick picks

Security GitHub Weather Trusted only Higher-confidence picks Lower-friction local candidates Review-first installs Sandbox-tested Fresh runtime Stale runtime Hall of Shame Stronger evidence imports Needs review Clear filters

🏷 Categories · web-and-frontend-development

All categories awesome-index · 5367 catalog-only · 5367 coding-agents-and-ides · 1200 web-and-frontend-development · 924 devops-and-cloud · 392 search-and-research · 352 browser-and-automation · 320 productivity-and-tasks · 204 ai-and-llms · 184 cli-utilities · 180

🧾 Evidence level: source-scanned means local source evidence; catalog-only means thinner metadata-first coverage.

🧪 Runtime status: cards can show only the baseline safety lane or the deeper follow-on functionality lane, depending on how far the skill got.

📏 Depth cue: tells you whether the evidence stops at baseline checks, includes follow-on functionality checks, or includes richer fixture/example proof.

⏱ Freshness cue: tells you whether the latest runtime evidence is from the last 24 hours, the last 7 days, or is older and therefore less current.
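A minimal sketch of how a freshness cue like this could be derived from a receipt timestamp. The function name `freshness_cue`, the bucket labels, and the inclusive boundaries are assumptions for illustration; the page only names the three windows.

```python
from datetime import datetime, timedelta, timezone


def freshness_cue(tested_at: datetime, now: datetime) -> str:
    """Bucket a runtime receipt by its age (labels are illustrative)."""
    age = now - tested_at
    if age <= timedelta(hours=24):
        return "within 24 hours"
    if age <= timedelta(days=7):
        return "within 7 days"
    return "older"


# Hypothetical "now"; the receipt timestamps are taken from the results on this page.
now = datetime(2026, 3, 16, 0, 0, tzinfo=timezone.utc)
print(freshness_cue(datetime(2026, 3, 15, 21, 30, tzinfo=timezone.utc), now))  # within 24 hours
print(freshness_cue(datetime(2026, 3, 14, 15, 0, tzinfo=timezone.utc), now))   # within 7 days
```

Passing `now` explicitly keeps the bucketing deterministic, which makes it easy to check against stored receipts.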

🩺 Failure confidence: distinguishes a first seen failure from a repeated failure or a regression after an earlier pass, so not every red row means the same thing.
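One way to read the three failure states is as a classification over a skill's run history. The sketch below is an assumption about how that could work, not the site's actual logic: `failure_confidence`, its labels, and the rule that a repeated failure takes priority over a regression are all hypothetical.

```python
def failure_confidence(history: list[bool]) -> str:
    """Classify the latest run from a pass/fail history, oldest first.

    True means the run passed. Labels and precedence are illustrative.
    """
    if not history or history[-1]:
        return "not failing"
    earlier = history[:-1]
    if earlier and not earlier[-1]:
        # The previous run failed too.
        return "repeated failure"
    if any(earlier):
        # The previous run passed, so this failure is new.
        return "regression after earlier pass"
    return "first seen failure"


print(failure_confidence([False]))               # first seen failure
print(failure_confidence([True, True, False]))   # regression after earlier pass
print(failure_confidence([True, False, False]))  # repeated failure
```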

Results

Showing 5 of 5 results for “security” · runtime: passed · category: web-and-frontend-development · sort: relevance

Page evidence snapshot: runtime-passed: 3 · runtime-failed: 2 · source-scanned: 5 · fresh <24h: 2 · manual review: 0

This snapshot is for the current page of results, not the whole filtered universe.
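As an illustration, a per-page snapshot like this can be computed by tallying only the cards on the current page. The `Card` fields and the `page_snapshot` helper are hypothetical names; the card values mirror the five results listed on this page.

```python
from dataclasses import dataclass


@dataclass
class Card:
    name: str
    runtime: str            # "passed" or "failed"
    evidence: str           # e.g. "source-scanned" or "catalog-only"
    fresh_24h: bool         # latest runtime evidence is under 24 hours old
    manual_review: bool = False


def page_snapshot(cards: list[Card]) -> dict[str, int]:
    """Count evidence signals for the cards on one page only."""
    return {
        "runtime-passed": sum(c.runtime == "passed" for c in cards),
        "runtime-failed": sum(c.runtime == "failed" for c in cards),
        "source-scanned": sum(c.evidence == "source-scanned" for c in cards),
        "fresh <24h": sum(c.fresh_24h for c in cards),
        "manual review": sum(c.manual_review for c in cards),
    }


page = [
    Card("nyx-archive-skill-security-protocol", "passed", "source-scanned", False),
    Card("sys-updater", "failed", "source-scanned", True),
    Card("safe-backup", "failed", "source-scanned", True),
    Card("firebase-auth-setup", "passed", "source-scanned", False),
    Card("m365-spam-manager", "passed", "source-scanned", False),
]
print(page_snapshot(page))
```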

Browse hint: slices with zero failures plus some source-scanned or reviewed entries deserve more attention first; fresh runtime evidence helps too, because old clean receipts can still hide current drift.
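The browse hint can be read as a sort order over result slices. The sketch below is one hedged interpretation: the `Slice` fields, the `browse_priority` key, and the tie-breaking order are assumptions, not the site's ranking.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    label: str
    failed: int
    source_scanned: int
    reviewed: int
    fresh_24h: int


def browse_priority(s: Slice) -> tuple[int, int, int]:
    """Sort key: smaller tuples deserve attention first."""
    clean = s.failed == 0
    has_evidence = (s.source_scanned + s.reviewed) > 0
    # Zero-failure slices with some evidence first, then fewer failures,
    # then more fresh runtime evidence.
    return (0 if clean and has_evidence else 1, s.failed, -s.fresh_24h)


slices = [
    Slice("a", failed=2, source_scanned=5, reviewed=0, fresh_24h=2),
    Slice("b", failed=0, source_scanned=3, reviewed=0, fresh_24h=0),
    Slice("c", failed=0, source_scanned=3, reviewed=1, fresh_24h=1),
]
ranked = sorted(slices, key=browse_priority)
print([s.label for s in ranked])  # ['c', 'b', 'a']
```

Slice "c" outranks "b" only because of its fresh evidence, which matches the hint that old clean receipts can still hide current drift.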

nyx-archive-skill-security-protocol

nyxur42 · source-scanned

Teach your AI agent to think about security. A reasoning methodology for vetting skills before installation — red/green flag heuristics, 4-phase audit protocol, post-install verification. No scripts, no dependencies. Just judgment. Built on fallibilism (being wrong about a skill's safety is recoverable; being overconfident is not) and relational security (you and your human decide together on edge cases — trust is built through transparency, not just detection).

Use Caution · follow-on functionality checks passed · 5/5 · confidence: source evidence

Runtime receipts + what passed · 2026-03-14 15:00 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 7 days · passed · output 80 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 1677 ms · baseline-v3 8/8

RatioDaemon on this skill: Nyx Archive Skill Security Protocol sits in the "teach your AI agent to think about security" lane. Functionality-v2 currently passes, the trust label is High Risk, and setup looks advanced.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: password.

Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

sys-updater

spiceman161 · source-scanned

Production-safe Ubuntu maintenance orchestrator: runs daily apt security updates, tracks non-security updates across apt/npm/pnpm/brew with quarantine + auto-review, applies only approved updates, rotates logs/state, and generates clear 09:00 MSK Telegram reports (including what was actually installed).

High Risk · follow-on functionality checks failed · 6/7 · confidence: source evidence

Runtime receipts + what failed · 2026-03-15 21:30 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · passed, runtime_failed · output 99 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3162 ms · baseline-v3 8/8

🕵️ expected proof signal was missing · 🚫 skill exited with an error

RatioDaemon muttered: sys-updater made it to runtime and then fell apart on contact, which is not ideal for a skill asking to be trusted. 6/7 functionality-v2 checks passed before the stumble. The Python help is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: sudo, password.

Decision cue: Review first — functionality-v2 already found trouble.

safe-backup

hacksing · source-scanned

Backup OpenClaw state directory and workspace with security best practices.

High Risk · follow-on functionality checks failed · 5/6 · confidence: source evidence

Runtime receipts + what failed · 2026-03-15 16:15 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · passed, runtime_failed · output 227 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 1976 ms · baseline-v3 8/8

🕵️ expected proof signal was missing · 🚫 skill exited with an error

RatioDaemon muttered: The runtime lane gave safe-backup a chance to act normal. It declined: it made it to runtime and then fell apart on contact. 5/6 functionality-v2 checks passed before the stumble. The shell syntax is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: rm -rf, password.

Decision cue: Review first — functionality-v2 already found trouble.

firebase-auth-setup

guifav · source-scanned

Configures Firebase Authentication — providers, security rules, custom claims, and React auth hooks

High Risk · follow-on functionality checks passed · 5/5 · confidence: source evidence

Runtime receipts + what passed · 2026-03-14 11:45 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 7 days · passed · output 80 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 1584 ms · baseline-v3 8/8

RatioDaemon on this skill: Firebase Auth Setup is trying to handle firebase auth setup. Functionality-v2 currently passes, the trust label is High Risk, and setup looks advanced.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: password.

Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

m365-spam-manager

tradmangh · source-scanned

Microsoft 365 spam folder manager for Outlook/Exchange mailboxes. Automatically analyzes junk/spam emails, calculates a suspicious score based on structural patterns (missing unsubscribe links, poor language, suspicious domains, wrong character sets, etc.), and helps clean up the junk folder. Supports review mode (default) where user approves each action, and automatic mode for batch processing. Works with shared mailboxes via --mailbox flag. Related keywords: Outlook, Exchange Online, spam filter, junk email, phishing, email security. **Token cost:** ~500-1.5k tokens per use.

High Risk · follow-on functionality checks passed · 9/9 · confidence: source evidence

Runtime receipts + what passed · 2026-03-14 17:00 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 7 days · passed · output 175 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 2878 ms · baseline-v3 8/8

RatioDaemon on this skill: M365 Spam Manager looks aimed at m365 spam manager. Functionality-v2 currently passes, the trust label is High Risk, and setup looks advanced.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: password.

Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.