🔎 Evidence browser

Search the skill radar

Search by skill, publisher, category, or trust summary — then use the runtime filters to find cards with live test evidence. The two main lanes are baseline safety checks first and deeper follow-on functionality checks after that.

Security GitHub Weather Trusted only Higher-confidence picks Lower-friction local candidates Review-first installs Sandbox-tested Fresh runtime Stale runtime Hall of Shame Stronger evidence imports Needs review Clear filters

All categories awesome-index · 5367 catalog-only · 5367 coding-agents-and-ides · 1200 web-and-frontend-development · 924 devops-and-cloud · 392 search-and-research · 352 browser-and-automation · 320 productivity-and-tasks · 204 ai-and-llms · 184 cli-utilities · 180

✨ Quick picks

Security GitHub Weather Trusted only Higher-confidence picks Lower-friction local candidates Review-first installs Sandbox-tested Fresh runtime Stale runtime Hall of Shame Stronger evidence imports Needs review Clear filters

🏷 Categories

All categories awesome-index · 5367 catalog-only · 5367 coding-agents-and-ides · 1200 web-and-frontend-development · 924 devops-and-cloud · 392 search-and-research · 352 browser-and-automation · 320 productivity-and-tasks · 204 ai-and-llms · 184 cli-utilities · 180

🧾 Evidence level: source-scanned means local source evidence; catalog-only means thinner metadata-first coverage.

🧪 Runtime status: cards can show only the baseline safety lane or the deeper follow-on functionality lane, depending on how far the skill got.

📏 Depth cue: tells you whether the evidence stops at baseline checks, includes follow-on functionality checks, or includes richer fixture/example proof.

⏱ Freshness cue: tells you whether the latest runtime evidence is from the last 24 hours, the last 7 days, or is older and therefore less current.

🩺 Failure confidence: distinguishes a first seen failure from a repeated failure or a regression after an earlier pass, so not every red row means the same thing.

Results

Showing 2 of 2 results for “spiceman161” · runtime: passed · freshness: fresh · sort: relevance

page evidence snapshotruntime-passed: 1 runtime-failed: 1 source-scanned: 2 fresh <24h: 2 manual review: 0

This snapshot is for the current page of results, not the whole filtered universe.

Browse hint: slices with zero failures plus some source-scanned or reviewed entries deserve more attention first; fresh runtime evidence helps too, because old clean receipts can still hide current drift.

playwright-browser-automation

spiceman161 · vsource-scanned

Browser automation using Playwright API directly. Navigate websites, interact with elements, extract data, take screenshots, generate PDFs, record videos, and automate complex workflows. More reliable than MCP approach.

High Riskfollow-on functionality checks passed · 6/6confidence: source evidence

Runtime receipts + what passed2026-03-16 07:15 UTC

functionality-v2evidence depth: follow-on functionality checkstested recently: within 24 hourspassedoutput 99 Bartifacts 0worker oc-sandboxsource stage: cache hitsuite 2087 msbaseline-v3 8/8

RatioDaemon muttered: playwright-browser-automation cleared baseline-v3 without trying anything cute.6/6 functionality-v2 checks passed. Pleasantly boring.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: eval(, sudo , password.

Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

sys-updater

spiceman161 · vsource-scanned

Production-safe Ubuntu maintenance orchestrator: runs daily apt security updates, tracks non-security updates across apt/npm/pnpm/brew with quarantine + auto-review, applies only approved updates, rotates logs/state, and generates clear 09:00 MSK Telegram reports (including what was actually installed).

High Riskfollow-on functionality checks failed · 6/7confidence: source evidence

Runtime receipts + what failed2026-03-15 21:30 UTC

functionality-v2evidence depth: follow-on functionality checkstested recently: within 24 hoursfirst failed run seen for this lanepassed, runtime_failedoutput 99 Bartifacts 0worker oc-sandboxsource stage: cache hitsuite 3162 msbaseline-v3 8/8

🕵️ expected proof signal was missing🚫 skill exited with an error

RatioDaemon muttered: sys-updater made it to runtime and then fell apart on contact, which is not ideal for a skill asking to be trusted.6/7 functionality-v2 checks passed before the stumble. The python help is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: sudo , password.

Decision cue: Review first — functionality-v2 already found trouble.