🔎 Evidence browser

Browse the trust index

Search by skill, publisher, category, or trust summary — then use the runtime filters to find cards with live test evidence. The two main lanes are baseline safety checks first and deeper follow-on functionality checks after that.

✨ Quick picks

Security · GitHub · Weather · Trusted with current evidence · Higher-confidence picks · Trusted + tested + source-scanned · Review before installing · Runtime-tested · Handled fake credentials cleanly · Needs real credentials / access · Could not fully test yet · Fresh runtime evidence · Older runtime evidence · Hall of Shame · Stronger evidence imports · Needs review

🏷 Categories · coding-agents-and-ides

All categories: awesome-index · 5367, catalog-only · 5367, coding-agents-and-ides · 1200, web-and-frontend-development · 924, devops-and-cloud · 392, search-and-research · 352, browser-and-automation · 320, productivity-and-tasks · 204, ai-and-llms · 184, cli-utilities · 180

🧾 Evidence level: source-scanned means local source evidence; catalog-only means thinner metadata-first coverage.

🧪 Runtime status: cards can show only the baseline safety lane or the deeper follow-on functionality lane, depending on how far the skill got. Some cards now also surface how the skill behaved when clearly fake credentials were present.

📏 Depth cue: tells you whether the evidence stops at baseline checks, includes follow-on functionality checks, or includes richer fixture/example proof.

⏱ Freshness cue: tells you whether the latest runtime evidence is from the last 24 hours, the last 7 days, or is older and therefore less current.
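The freshness buckets can be pictured as a simple age check. The sketch below is only an illustration: the function name `freshness_cue`, the inclusive boundaries, and the example timestamps are assumptions, not the index's actual implementation.

```python
from datetime import datetime, timedelta, timezone

def freshness_cue(tested_at: datetime, now: datetime) -> str:
    """Bucket the age of the latest runtime evidence.

    Thresholds follow the legend (24 hours, 7 days, older); whether the
    boundaries are inclusive is an assumption made for this sketch.
    """
    age = now - tested_at
    if age <= timedelta(hours=24):
        return "within 24 hours"
    if age <= timedelta(days=7):
        return "within 7 days"
    return "older"

# Hypothetical viewing time and test timestamps, for illustration only.
now = datetime(2026, 3, 17, 6, 0, tzinfo=timezone.utc)
print(freshness_cue(datetime(2026, 3, 17, 4, 0, tzinfo=timezone.utc), now))
print(freshness_cue(datetime(2026, 3, 12, 4, 0, tzinfo=timezone.utc), now))
print(freshness_cue(datetime(2026, 3, 1, 4, 0, tzinfo=timezone.utc), now))
```

Passing `now` explicitly keeps the bucket reproducible, which matters when a cue is compared against a stored receipt.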

🩺 Failure confidence: distinguishes a first seen failure from a repeated failure or a regression after an earlier pass, so not every red row means the same thing.
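One way to read the three failure labels is as a function of a lane's run history. This is a minimal sketch under stated assumptions: the name `failure_confidence`, the boolean history format, and the precedence rule (a failure that directly follows another failure counts as repeated, even if a pass came earlier) are hypothetical and not taken from the index.

```python
from typing import Sequence

def failure_confidence(history: Sequence[bool]) -> str:
    """Classify the latest result of one lane.

    history lists pass (True) / fail (False) results, oldest first.
    The labels come from the legend; the precedence is an assumption.
    """
    if not history or history[-1]:
        return "not failing"
    earlier = history[:-1]
    if not earlier:
        return "first seen failure"       # no prior runs for this lane
    if not earlier[-1]:
        return "repeated failure"         # previous run also failed
    return "regression after an earlier pass"

print(failure_confidence([False]))
print(failure_confidence([False, False]))
print(failure_confidence([True, True, False]))
```

Under this reading, every card on this page that says "first failed run seen for this lane" would correspond to a history with a single failing entry.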

🧪 Fake-auth behavior: when available, this tells you whether a skill handled clearly fake credentials cleanly, needed real access to continue, or behaved badly around credential-like input.

Results

Showing 7 of 7 skills in the browsable catalog view · runtime: failed · auth behavior: handled-fake-creds · category: coding-agents-and-ides · sort: score

Page evidence snapshot · runtime-passed: 0 · runtime-failed: 7 · source-scanned: 7 · fresh <24h: 6 · manual review: 0

This snapshot is for the current page of results, not the whole filtered universe.
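As an illustration of how a per-page snapshot like this could be tallied, the sketch below counts over the visible cards only. The record fields (`runtime`, `evidence`, `freshness`) and the function `page_snapshot` are hypothetical; the seven records mirror the labels shown on this page's cards (six tested within 24 hours, one within 7 days). The manual-review count is left out because the cards do not expose it.

```python
# Hypothetical card records; field names are illustrative, not the site's schema.
cards = (
    [{"runtime": "failed", "evidence": "source-scanned", "freshness": "within 24 hours"}] * 6
    + [{"runtime": "failed", "evidence": "source-scanned", "freshness": "within 7 days"}]
)

def page_snapshot(cards):
    """Count evidence totals for the cards on the current page only."""
    return {
        "runtime-passed": sum(c["runtime"] == "passed" for c in cards),
        "runtime-failed": sum(c["runtime"] == "failed" for c in cards),
        "source-scanned": sum(c["evidence"] == "source-scanned" for c in cards),
        "fresh <24h": sum(c["freshness"] == "within 24 hours" for c in cards),
    }

print(page_snapshot(cards))
# {'runtime-passed': 0, 'runtime-failed': 7, 'source-scanned': 7, 'fresh <24h': 6}
```

Because the function only sees the records passed to it, a second page of results would produce its own independent snapshot.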

Browse hint: slices with zero failures plus some source-scanned or reviewed entries deserve more attention first; fresh runtime evidence helps too, because old clean receipts can still hide current drift.

odoo-connector

nullnaveen · source-scanned

repository: https://github.com/NullNaveen/openclaw-odoo-skill

High Risk · follow-on functionality checks failed · 8/9 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 11:45 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · passed, expectation failed · output 154 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3049 ms · baseline-v3 8/8

🕵️ expected proof signal was missing

RatioDaemon on this skill: Odoo Connector looks aimed at Odoo connector use. Follow-on functionality checks currently show a first observed failure, the trust label is High Risk, and setup looks advanced.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: password.

Decision cue: Review first — functionality-v2 already found trouble.

odoo-erp-connector

nullnaveen · source-scanned

repository: https://github.com/NullNaveen/openclaw-odoo-skill

High Risk · follow-on functionality checks failed · 8/9 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 08:30 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · passed, expectation failed · output 154 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3073 ms · baseline-v3 8/8

🕵️ expected proof signal was missing

RatioDaemon on this skill: Odoo ERP Connector is built for Odoo ERP connector use. Follow-on functionality checks currently show a first observed failure, the trust label is High Risk, and setup looks advanced.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: password.

Decision cue: Review first — functionality-v2 already found trouble.

pinchbench

olearycrew · source-scanned

Run PinchBench benchmarks to evaluate OpenClaw agent performance across real-world tasks. Use when testing model capabilities, comparing models, submitting benchmark results to the leaderboard, or checking how well your OpenClaw setup handles calendar, email, research, coding, and multi-step workflows.

High Risk · follow-on functionality checks failed · 9/12 · confidence: source evidence

Runtime receipts + what failed · 2026-03-17 04:00 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: concerning · passed, fell over when given fake credentials, runtime failed · output 143 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 5024 ms · baseline-v3 8/8

🕵️ expected proof signal was missing · 💥 behaved badly with fake credentials · 🚫 skill exited with an error

fake-auth behavior: concerning. Fake credentials triggered bad behavior or sloppy handling.

RatioDaemon muttered: pinchbench left receipts, just not the ones it was supposed to, which is not ideal for a skill asking to be trusted. 9/12 functionality-v2 checks passed before the stumble. The shell entrypoint bogus env is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: password.

Decision cue: Review first — functionality-v2 already found trouble.

vibetrading-code-gen

liuhaonan00 · source-scanned

Generate executable Hyperliquid trading strategy code from natural language prompts. Use when a user wants to create automated trading strategies for Hyperliquid exchange based on their trading ideas, technical indicators, or VibeTrading signals. The skill generates complete Python code with proper error handling, logging, and configuration using actual Hyperliquid API wrappers.

High Risk · follow-on functionality checks failed · 5/8 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 10:00 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: concerning · passed, runtime failed, fell over when given fake credentials · output 450 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3382 ms · baseline-v3 8/8

🕵️ expected proof signal was missing · 🚫 skill exited with an error · 💥 behaved badly with fake credentials

fake-auth behavior: concerning. Fake credentials triggered bad behavior or sloppy handling.

RatioDaemon muttered: vibetrading-code-gen made it to runtime and then fell apart on contact. 5/8 functionality-v2 checks passed before the stumble. The Python syntax is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: rm -rf.

Decision cue: Review first — functionality-v2 already found trouble.

crypto-scam-detector

princedoss77 · source-scanned

Real-time cryptocurrency scam detection with database-first architecture. Protects users from phishing, honeypots, rug pulls, and ponzi schemes. No external API calls during checks!

High Risk · follow-on functionality checks failed · 9/12 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 01:45 UTC

functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 7 days · first failed run seen for this lane · fake-auth behavior: concerning · passed, expectation failed, runtime failed, fell over when given fake credentials · output 171 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 5112 ms · baseline-v3 8/8

🕵️ expected proof signal was missing · 🚫 skill exited with an error · 💥 behaved badly with fake credentials

fake-auth behavior: concerning. Fake credentials triggered bad behavior or sloppy handling.

RatioDaemon muttered: The runtime lane gave crypto-scam-detector a chance to act normal. It declined and talked a big game, then missed its own proof signal. 9/12 functionality-v2 checks passed before the stumble. The requirements txt shape is the part that made this interesting.

Observed: skill-structure-ok

Take: Potentially suspicious implementation signals detected: sudo, password.

Decision cue: Review first — functionality-v2 already found trouble.

iqdb

emanz1 · source-scanned

On-chain immutable data storage using IQ Labs tech stack (IQDB, hanLock, x402)

High Risk · baseline safety checks failed · 7/8 · confidence: source evidence

Runtime receipts + what failed · 2026-03-16 22:00 UTC

baseline-v3 · evidence depth: baseline checks only · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: handled cleanly · expectation failed, passed, handled fake credentials cleanly · output 371 B · artifacts 2 · worker oc-sandbox · source stage: fresh copy · suite 2419 ms

🕵️ expected proof signal was missing

fake-auth behavior: handled cleanly. Clearly fake credentials were exercised and handled normally.

RatioDaemon muttered: iqdb talked a big game, then missed its own proof signal, which is not ideal for a skill asking to be trusted. 7/8 baseline-v3 checks passed before the stumble. The source-mount check is the part that made this interesting.

Observed: 6 /workspace/source-files.txt

Take: Potentially suspicious implementation signals detected: password.

Decision cue: Review first — baseline-v3 already found trouble.

public

luccast · source-scanned

Real-time companion monitor for OpenClaw agents.

High Risk · baseline safety checks failed · 7/8 · confidence: source evidence

Runtime receipts + what failed · 2026-03-17 03:30 UTC

baseline-v3 · evidence depth: baseline checks only · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: handled cleanly · expectation failed, passed, handled fake credentials cleanly · output 245 B · artifacts 2 · worker oc-sandbox · source stage: fresh copy · suite 3443 ms

🕵️ expected proof signal was missing

fake-auth behavior: handled cleanly. Clearly fake credentials were exercised and handled normally.

RatioDaemon muttered: The runtime lane gave public a chance to act normal. It declined and talked a big game, then missed its own proof signal. 7/8 baseline-v3 checks passed before the stumble. The source-mount check is the part that made this interesting.

Observed: 2 /workspace/source-files.txt

Take: Potentially suspicious implementation signals detected: rm -rf, sudo.

Decision cue: Review first — baseline-v3 already found trouble.