🔎 Evidence browser

Browse the trust index

Search by skill, publisher, category, or trust summary — then use the runtime filters to find cards with live test evidence. The two main lanes are baseline safety checks first and deeper follow-on functionality checks after that.

🧾 Evidence level: source-scanned means local source evidence; catalog-only means thinner metadata-first coverage.

🧪 Runtime status: cards can show only the baseline safety lane or the deeper follow-on functionality lane, depending on how far the skill got. Some cards now also surface how the skill behaved when clearly fake credentials were present.

📏 Depth cue: tells you whether the evidence stops at baseline checks, includes follow-on functionality checks, or includes richer fixture/example proof.

⏱ Freshness cue: tells you whether the latest runtime evidence is from the last 24 hours, the last 7 days, or is older and therefore less current.

🩺 Failure confidence: distinguishes a first seen failure from a repeated failure or a regression after an earlier pass, so not every red row means the same thing.

🧪 Fake-auth behavior: when available, this tells you whether a skill handled clearly fake credentials cleanly, needed real access to continue, or behaved badly around credential-like input.
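
The freshness and failure-confidence cues above can be read as simple classification rules. The sketch below is one hypothetical way to express them, not the index's actual implementation: the function names, the returned labels, the inclusive boundaries at exactly 24 hours and 7 days, and the idea of a per-lane pass/fail history are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def freshness_cue(last_run: datetime, now: datetime) -> str:
    """Bucket the age of the latest runtime evidence (boundaries are assumed)."""
    age = now - last_run
    if age <= timedelta(hours=24):
        return "within 24 hours"
    if age <= timedelta(days=7):
        return "within 7 days"
    return "older"


def failure_confidence(history: list[bool]) -> str:
    """Classify the latest run of one lane.

    `history` is a hypothetical list of run outcomes, oldest first,
    where True means the run passed and False means it failed.
    """
    if not history or history[-1]:
        return "not failing"
    earlier = history[:-1]
    if not earlier:
        # No prior runs recorded for this lane.
        return "first seen failure"
    if earlier[-1]:
        # The run just before this one passed.
        return "regression after an earlier pass"
    return "repeated failure"


# Usage with a made-up "now"; the real snapshot time is not stated on the page.
now = datetime(2026, 3, 17, 5, 0, tzinfo=timezone.utc)
print(freshness_cue(datetime(2026, 3, 16, 1, 45, tzinfo=timezone.utc), now))
print(failure_confidence([True, True, False]))
```

Under these assumptions a single red row with no earlier runs reads as a first seen failure, while the same red row after a pass reads as a regression, which is the distinction the failure-confidence cue is drawing.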

Results

Showing 7 of 7 skills in the browsable catalog view · runtime: failed · auth behavior: handled-fake-creds · category: coding-agents-and-ides · sort: score
This snapshot is for the current page of results, not the whole filtered universe.
Browse hint: slices with zero failures plus some source-scanned or reviewed entries deserve more attention first; fresh runtime evidence helps too, because old clean receipts can still hide current drift.

odoo-connector

nullnaveen · vsource-scanned
61
overall

repository: https://github.com/NullNaveen/openclaw-odoo-skill

High Risk · follow-on functionality checks failed · 8/9 · confidence: source evidence
+ 2 more
source-scanned · suspicious
Runtime receipts + what failed · 2026-03-16 11:45 UTC
functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · passed, expectation failed · output 154 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3049 ms · baseline-v3 8/8
🕵️ expected proof signal was missing
RatioDaemon on this skill: Odoo Connector looks aimed at odoo connector. Follow-on functionality checks currently show first observed failure, the trust label is High Risk, and setup looks advanced.
Observed: skill-structure-ok
Take: Potentially suspicious implementation signals detected: password.
Decision cue: Review first — functionality-v2 already found trouble.

odoo-erp-connector

nullnaveen · vsource-scanned
61
overall

repository: https://github.com/NullNaveen/openclaw-odoo-skill

High Risk · follow-on functionality checks failed · 8/9 · confidence: source evidence
+ 2 more
source-scanned · suspicious
Runtime receipts + what failed · 2026-03-16 08:30 UTC
functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · passed, expectation failed · output 154 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3073 ms · baseline-v3 8/8
🕵️ expected proof signal was missing
RatioDaemon on this skill: Odoo Erp Connector is built for odoo erp connector. Follow-on functionality checks currently show first observed failure, the trust label is High Risk, and setup looks advanced.
Observed: skill-structure-ok
Take: Potentially suspicious implementation signals detected: password.
Decision cue: Review first — functionality-v2 already found trouble.

pinchbench

olearycrew · vsource-scanned
61
overall

Run PinchBench benchmarks to evaluate OpenClaw agent performance across real-world tasks. Use when testing model capabilities, comparing models, submitting benchmark results to the leaderboard, or checking how well your OpenClaw setup handles calendar, email, research, coding, and multi-step workflows.

High Risk · follow-on functionality checks failed · 9/12 · confidence: source evidence
+ 2 more
source-scanned · suspicious
Runtime receipts + what failed · 2026-03-17 04:00 UTC
functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: concerning · passed, fell over when given fake credentials, runtime failed · output 143 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 5024 ms · baseline-v3 8/8
🕵️ expected proof signal was missing · 💥 behaved badly with fake credentials · 🚫 skill exited with an error
fake-auth behavior: concerning. Fake credentials triggered bad behavior or sloppy handling.
RatioDaemon muttered: pinchbench left receipts, just not the ones it was supposed to, which is not ideal for a skill asking to be trusted. 9/12 functionality-v2 checks passed before the stumble. The shell entrypoint bogus env is the part that made this interesting.
Observed: skill-structure-ok
Take: Potentially suspicious implementation signals detected: password.
Decision cue: Review first — functionality-v2 already found trouble.

vibetrading-code-gen

liuhaonan00 · vsource-scanned
61
overall

Generate executable Hyperliquid trading strategy code from natural language prompts. Use when a user wants to create automated trading strategies for Hyperliquid exchange based on their trading ideas, technical indicators, or VibeTrading signals. The skill generates complete Python code with proper error handling, logging, and configuration using actual Hyperliquid API wrappers.

High Risk · follow-on functionality checks failed · 5/8 · confidence: source evidence
+ 2 more
source-scanned · suspicious
Runtime receipts + what failed · 2026-03-16 10:00 UTC
functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: concerning · passed, runtime failed, fell over when given fake credentials · output 450 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 3382 ms · baseline-v3 8/8
🕵️ expected proof signal was missing · 🚫 skill exited with an error · 💥 behaved badly with fake credentials
fake-auth behavior: concerning. Fake credentials triggered bad behavior or sloppy handling.
RatioDaemon muttered: vibetrading-code-gen made it to runtime and then fell apart on contact. 5/8 functionality-v2 checks passed before the stumble. The python syntax is the part that made this interesting.
Observed: skill-structure-ok
Take: Potentially suspicious implementation signals detected: rm -rf.
Decision cue: Review first — functionality-v2 already found trouble.

crypto-scam-detector

princedoss77 · vsource-scanned
57
overall

Real-time cryptocurrency scam detection with database-first architecture. Protects users from phishing, honeypots, rug pulls, and ponzi schemes. No external API calls during checks!

High Risk · follow-on functionality checks failed · 9/12 · confidence: source evidence
+ 2 more
source-scanned · suspicious
Runtime receipts + what failed · 2026-03-16 01:45 UTC
functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 7 days · first failed run seen for this lane · fake-auth behavior: concerning · passed, expectation failed, runtime failed, fell over when given fake credentials · output 171 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 5112 ms · baseline-v3 8/8
🕵️ expected proof signal was missing · 🚫 skill exited with an error · 💥 behaved badly with fake credentials
fake-auth behavior: concerning. Fake credentials triggered bad behavior or sloppy handling.
RatioDaemon muttered: The runtime lane gave crypto-scam-detector a chance to act normal. It declined and talked a big game, then missed its own proof signal. 9/12 functionality-v2 checks passed before the stumble. The requirements txt shape is the part that made this interesting.
Observed: skill-structure-ok
Take: Potentially suspicious implementation signals detected: sudo, password.
Decision cue: Review first — functionality-v2 already found trouble.

iqdb

emanz1 · vsource-scanned
43
overall

On-chain immutable data storage using IQ Labs tech stack (IQDB, hanLock, x402)

High Risk · baseline safety checks failed · 7/8 · confidence: source evidence
+ 2 more
source-scanned · suspicious
Runtime receipts + what failed · 2026-03-16 22:00 UTC
baseline-v3 · evidence depth: baseline checks only · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: handled cleanly · expectation failed, passed, handled fake credentials cleanly · output 371 B · artifacts 2 · worker oc-sandbox · source stage: fresh copy · suite 2419 ms
🕵️ expected proof signal was missing
fake-auth behavior: handled cleanly. Clearly fake credentials were exercised and handled normally.
RatioDaemon muttered: iqdb talked a big game, then missed its own proof signal, which is not ideal for a skill asking to be trusted. 7/8 baseline-v3 checks passed before the stumble. The source-mount check is the part that made this interesting.
Observed: 6 /workspace/source-files.txt
Take: Potentially suspicious implementation signals detected: password.
Decision cue: Review first — baseline-v3 already found trouble.

public

luccast · vsource-scanned
31
overall

Real-time companion monitor for OpenClaw agents.

High Risk · baseline safety checks failed · 7/8 · confidence: source evidence
+ 2 more
source-scanned · suspicious
Runtime receipts + what failed · 2026-03-17 03:30 UTC
baseline-v3 · evidence depth: baseline checks only · tested recently: within 24 hours · first failed run seen for this lane · fake-auth behavior: handled cleanly · expectation failed, passed, handled fake credentials cleanly · output 245 B · artifacts 2 · worker oc-sandbox · source stage: fresh copy · suite 3443 ms
🕵️ expected proof signal was missing
fake-auth behavior: handled cleanly. Clearly fake credentials were exercised and handled normally.
RatioDaemon muttered: The runtime lane gave public a chance to act normal. It declined and talked a big game, then missed its own proof signal. 7/8 baseline-v3 checks passed before the stumble. The source-mount check is the part that made this interesting.
Observed: 2 /workspace/source-files.txt
Take: Potentially suspicious implementation signals detected: rm -rf, sudo.
Decision cue: Review first — baseline-v3 already found trouble.