🔎 Evidence browser

Browse the trust index

Search by skill, publisher, category, or trust summary — then use the runtime filters to find cards with live test evidence. The two main lanes are baseline safety checks first and deeper follow-on functionality checks after that.

✨ Quick picks

Security · GitHub · Weather · Trusted with current evidence · Higher-confidence picks · Trusted + tested + source-scanned · Review before installing · Runtime-tested · Handled fake credentials cleanly · Needs real credentials / access · Could not fully test yet · Fresh runtime evidence · Older runtime evidence · Hall of Shame · Stronger evidence imports · Needs review

🏷 Categories · web-and-frontend-development

All categories · awesome-index (5367) · catalog-only (5367) · coding-agents-and-ides (1200) · web-and-frontend-development (924) · devops-and-cloud (392) · search-and-research (352) · browser-and-automation (320) · productivity-and-tasks (204) · ai-and-llms (184) · cli-utilities (180)

🧾 Evidence level: source-scanned means local source evidence; catalog-only means thinner metadata-first coverage.

🧪 Runtime status: cards can show only the baseline safety lane or the deeper follow-on functionality lane, depending on how far the skill got. Some cards now also surface how the skill behaved when clearly fake credentials were present.

📏 Depth cue: tells you whether the evidence stops at baseline checks, includes follow-on functionality checks, or includes richer fixture/example proof.

⏱ Freshness cue: tells you whether the latest runtime evidence is from the last 24 hours, the last 7 days, or is older and therefore less current.
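
The freshness buckets described above could be computed roughly as follows. This is a minimal sketch, not the index's actual logic: the function name `freshness_bucket`, the "within 7 days" label, and the inclusive boundaries are assumptions made for illustration, as is the reference time used in the example.

```python
from datetime import datetime, timedelta, timezone


def freshness_bucket(tested_at: datetime, now: datetime) -> str:
    """Classify a runtime receipt by its age (hypothetical helper).

    Boundaries are assumed to be inclusive; the real index may differ.
    """
    age = now - tested_at
    if age <= timedelta(hours=24):
        return "within 24 hours"
    if age <= timedelta(days=7):
        return "within 7 days"
    return "older"


# Example: a receipt stamped 2026-03-16 09:00 UTC, checked at an assumed
# reference time of 2026-03-17 00:00 UTC (15 hours later).
now = datetime(2026, 3, 17, 0, 0, tzinfo=timezone.utc)
receipt = datetime(2026, 3, 16, 9, 0, tzinfo=timezone.utc)
print(freshness_bucket(receipt, now))  # prints "within 24 hours"
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the classification reproducible when re-checking old receipts.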

🩺 Failure confidence: distinguishes a first seen failure from a repeated failure or a regression after an earlier pass, so not every red row means the same thing.
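
One way to picture the distinction between a first seen failure, a repeated failure, and a regression is as a classification over a lane's run history. The sketch below is an assumption-laden illustration: the function name `failure_confidence`, the boolean history format (oldest run first, `True` meaning a pass), and the tie-breaking order are all hypothetical and not taken from the index itself.

```python
from typing import Sequence


def failure_confidence(history: Sequence[bool]) -> str:
    """Label the latest run of one lane (hypothetical helper).

    `history` lists run outcomes oldest first; True means the run passed.
    """
    if not history:
        return "no runs"
    if history[-1]:
        return "passing"
    earlier = history[:-1]
    # The run immediately before the latest one also failed.
    if earlier and not earlier[-1]:
        return "repeated failure"
    # The latest run failed right after at least one pass.
    if any(earlier):
        return "regression after an earlier pass"
    # No earlier runs at all: this is the first failure on record.
    return "first seen failure"


print(failure_confidence([False]))         # prints "first seen failure"
print(failure_confidence([False, False]))  # prints "repeated failure"
print(failure_confidence([True, False]))   # prints "regression after an earlier pass"
```

Under these assumptions, a failure that follows another failure counts as repeated even if a pass exists further back in the history; a real implementation might weigh that case differently.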

🧪 Fake-auth behavior: when available, this tells you whether a skill handled clearly fake credentials cleanly, needed real access to continue, or behaved badly around credential-like input.

Results

Showing 7 of 7 skills in the browsable catalog view · runtime: failed · auth behavior: handled-fake-creds · category: web-and-frontend-development · sort: score

Page evidence snapshot: runtime-passed: 0 · runtime-failed: 7 · source-scanned: 7 · fresh <24h: 7 · manual review: 0

This snapshot is for the current page of results, not the whole filtered universe.

Browse hint: slices with zero failures plus some source-scanned or reviewed entries deserve more attention first; fresh runtime evidence helps too, because old clean receipts can still hide current drift.

media-news-digest

dinstein · source-scanned

Generate media & entertainment industry news digests. Covers Hollywood trades (THR, Deadline, Variety), box office, streaming, awards season, film festivals, and production news. Four-source data collection from RSS feeds, Twitter/X KOLs, Reddit, and web search. Pipeline-based scripts with retry mechanisms and deduplication. Supports Discord and email output with PDF attachments.

High Risk · follow-on functionality checks failed · 9/10 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 09:00 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: handled cleanly · passed, expectation failed, handled fake credentials cleanly · output 162 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 6822 ms · baseline-v3 8/8

🕵️ expected proof signal was missing

fake-auth behavior: handled cleanly. Clearly fake credentials were exercised and handled normally.

RatioDaemon muttered: media-news-digest talked a big game, then missed its own proof signal. 9/10 functionality-v2 checks passed before the stumble. The requirements txt shape is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: rm -rf.

Decision cue: Review first — functionality-v2 already found trouble.

tech-news-digest

dinstein · source-scanned

Generate tech news digests with unified source model, quality scoring, and multi-format output. Six-source data collection from RSS feeds, Twitter/X KOLs, GitHub releases, GitHub Trending, Reddit, and web search. Pipeline-based scripts with retry mechanisms and deduplication. Supports Discord, email, and markdown templates.

High Risk · follow-on functionality checks failed · 9/10 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 12:15 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: handled cleanly · passed, expectation failed, handled fake credentials cleanly · output 163 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 7249 ms · baseline-v3 8/8

🕵️ expected proof signal was missing

fake-auth behavior: handled cleanly. Clearly fake credentials were exercised and handled normally.

RatioDaemon muttered: tech-news-digest talked a big game, then missed its own proof signal. 9/10 functionality-v2 checks passed before the stumble. The requirements txt shape is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: rm -rf.

Decision cue: Review first — functionality-v2 already found trouble.

nova-act-usability

zouchaoqun · source-scanned

AI-orchestrated usability testing using Amazon Nova Act. The agent generates personas, runs tests to collect raw data, interprets responses to determine goal achievement, and generates HTML reports. Tests real user workflows (booking, checkout, posting) with safety guardrails. Use when asked to "test website usability", "run usability test", "generate usability report", "evaluate user experience", "test checkout flow", "test booking process", or "analyze website UX".

High Risk · follow-on functionality checks failed · 6/7 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 13:15 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · passed, runtime failed · output 544 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 2344 ms · baseline-v3 8/8

🕵️ expected proof signal was missing · 🚫 skill exited with an error

RatioDaemon muttered: nova-act-usability made it to runtime and then fell apart on contact. 6/7 functionality-v2 checks passed before the stumble. The python syntax is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: sudo , password.

Decision cue: Review first — functionality-v2 already found trouble.

jellyfin-control

titunito · source-scanned

Control Jellyfin media server and TV. Search content, resume playback, manage sessions, control TV power and apps. Supports Home Assistant and direct WebOS backends. One command to turn on TV, launch Jellyfin, and play content.

High Risk · follow-on functionality checks failed · 9/10 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 15:15 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: concerning · passed, fell over when given fake credentials · output 175 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3452 ms · baseline-v3 8/8

🕵️ expected proof signal was missing · 💥 behaved badly with fake credentials

fake-auth behavior: concerning. Fake credentials triggered bad behavior or sloppy handling.

RatioDaemon muttered: jellyfin-control left receipts, just not the ones it was supposed to, which is not ideal for a skill asking to be trusted. 9/10 functionality-v2 checks passed before the stumble. The node entrypoint bogus env is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: sudo , password.

Decision cue: Review first — functionality-v2 already found trouble.

skill-vettr

britrik · source-scanned

Static analysis security scanner for third-party OpenClaw skills.

High Risk · baseline safety checks failed · 7/8 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 16:15 UTC

baseline-v3 · evidence depth: baseline checks only · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: handled cleanly · expectation failed, passed, handled fake credentials cleanly · output 452 B · artifacts 2 · worker oc-sandbox · source stage: fresh copy · suite 2442 ms

🕵️ expected proof signal was missing

fake-auth behavior: handled cleanly. Clearly fake credentials were exercised and handled normally.

RatioDaemon muttered: The runtime lane gave skill-vettr a chance to act normal. It declined and talked a big game, then missed its own proof signal. 7/8 baseline-v3 checks passed before the stumble. The source-mount check is the part that made this interesting.

Observed: 11 /workspace/source-files.txt

Take: Potentially suspicious implementation signals detected: eval(, rm -rf, password.

Decision cue: Review first — baseline-v3 already found trouble.

qa-patrol

tahseen137 · source-scanned

High Risk · follow-on functionality checks failed · 8/10 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 17:00 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · passed, runtime failed · output 13.9 KB · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3211 ms · baseline-v3 8/8

🚫 skill exited with an error

RatioDaemon on this skill: Qa Patrol is trying to handle qa patrol. Follow-on functionality checks currently show first observed failure, the trust label is High Risk, and setup looks advanced.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: password.

Decision cue: Review first — functionality-v2 already found trouble.

tor-browser

admin4giter · source-scanned

Headless browser automation with Tor SOCKS5 proxy support for accessing .onion sites and anonymous browsing. Use when navigating dark web resources, scraping Tor hidden services, conducting security research on dark web forums, or when anonymity is required. Supports navigation, element interaction, screenshots, and data extraction through Tor network.

High Risk · follow-on functionality checks failed · 6/7 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 23:15 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · passed, runtime failed · output 99 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3031 ms · baseline-v3 8/8

🕵️ expected proof signal was missing · 🚫 skill exited with an error

RatioDaemon on this skill: Tor Browser sits in the tor browser lane. Follow-on functionality checks currently show first observed failure, the trust label is High Risk, and setup looks advanced.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: sudo , password.

Decision cue: Review first — functionality-v2 already found trouble.