🔎 Evidence browser

Browse the trust index

Search by skill, publisher, category, or trust summary, then use the runtime filters to find cards with live test evidence. Evidence comes in two lanes: baseline safety checks run first, and deeper follow-on functionality checks run after them.

⚙️ Filters · 1 active
🏷 Categories · coding-agents-and-ides

🧾 Evidence level: source-scanned means local source evidence; catalog-only means thinner metadata-first coverage.

🧪 Runtime status: cards can show only the baseline safety lane or the deeper follow-on functionality lane, depending on how far the skill got. Some cards now also surface how the skill behaved when clearly fake credentials were present.

📏 Depth cue: tells you whether the evidence stops at baseline checks, includes follow-on functionality checks, or includes richer fixture/example proof.

⏱ Freshness cue: tells you whether the latest runtime evidence is from the last 24 hours, the last 7 days, or is older and therefore less current.

🩺 Failure confidence: distinguishes a first-seen failure from a repeated failure or a regression after an earlier pass, so not every red row means the same thing.

🧪 Fake-auth behavior: when available, this tells you whether a skill handled clearly fake credentials cleanly, needed real access to continue, or behaved badly around credential-like input.
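
The freshness and failure-confidence cues above can be read as simple classification rules. The following is a minimal sketch of one possible reading, not the index's actual logic: the function names, the label strings, and the rule that a failure directly after a pass counts as a regression are all assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def freshness_bucket(tested_at: datetime, now: datetime) -> str:
    """Bucket the latest runtime evidence by age (label strings are hypothetical)."""
    age = now - tested_at
    if age <= timedelta(hours=24):
        return "last-24h"
    if age <= timedelta(days=7):
        return "last-7d"
    return "older"


def failure_confidence(history: list[bool]) -> str:
    """Classify the newest result in a chronological run history (True = pass).

    Assumed rules: no earlier runs -> first-seen failure; previous run also
    failed -> repeated failure; previous run passed -> regression.
    """
    if not history or history[-1]:
        return "not-failing"
    earlier = history[:-1]
    if not earlier:
        return "first-seen-failure"
    if not earlier[-1]:
        return "repeated-failure"
    return "regression"


# Example: a receipt stamped 2026-03-14 05:00 UTC, viewed at an assumed "now".
receipt = datetime(2026, 3, 14, 5, 0, tzinfo=timezone.utc)
assumed_now = datetime(2026, 3, 15, 12, 0, tzinfo=timezone.utc)
print(freshness_bucket(receipt, assumed_now))   # last-7d (31 hours old)
print(failure_confidence([True, True, False]))  # regression
```

Under this reading, a red row tagged as a regression deserves a closer look than a first-seen failure, because the skill is known to have passed before.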

Results

Showing 24 of 1200 skills in the browsable catalog view · category: coding-agents-and-ides · sort: score
This snapshot is for the current page of results, not the whole filtered universe.
Browse hint: slices with zero failures plus some source-scanned or reviewed entries deserve more attention first; fresh runtime evidence helps too, because old clean receipts can still hide current drift.

local-first-llm

joelnishanth · vsource-scanned
57
overall

Routes LLM requests to a local model (Ollama, LM Studio, llamafile) before falling back to cloud APIs. Tracks token savings and cost avoidance in a persistent dashboard. Use when: (1) user asks to run a task with a local model first, (2) user wants to reduce cloud API costs or keep requests private, (3) user asks to see their token savings or LLM routing dashboard, (4) any request where local-vs-cloud routing should be decided automatically. Supports Ollama, LM Studio, and llamafile as local providers.

Use Caution · confidence: source evidence · source-scanned · suspicious
Take: Potentially suspicious implementation signals detected: password.
Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

lukso-agent-comms

bitcargocrew · vcatalog
57
overall

Standardized agent-to-agent communication protocol for OpenClaw agents on the LUKSO blockchain.

Insufficient Evidence · confidence: limited evidence · catalog-only · privileged capability
Take: Indexed from the community catalog. Source-aware static analysis and manual review are still pending.
Decision cue: Thin evidence slice — do not treat this card like a verified green light.

magic-wormhole

cthulhutoo · vcatalog
57
overall

Secure secret sharing for OpenClaw using magic-wormhole protocol.

Insufficient Evidence · confidence: limited evidence · catalog-only · privileged capability
Take: Indexed from the community catalog. Source-aware static analysis and manual review are still pending.
Decision cue: Thin evidence slice — do not treat this card like a verified green light.

makesoul

chengdubjut · vcatalog
57
overall

MakeSoul.org is a community platform dedicated to creating interesting souls for OpenClaw agents.

Insufficient Evidence · confidence: limited evidence · catalog-only · privileged capability
Take: Indexed from the community catalog. Source-aware static analysis and manual review are still pending.
Decision cue: Thin evidence slice — do not treat this card like a verified green light.

memento

braibaud · vsource-scanned
57
overall

Local persistent memory for OpenClaw agents. Captures conversations, extracts structured facts via LLM, and auto-recalls relevant knowledge before each turn. Privacy-first, all stored data stays local in SQLite.

High Risk · confidence: source evidence · source-scanned · suspicious
Take: Potentially suspicious implementation signals detected: sudo, password.
Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

memories-cli

charlesrhoward · vcatalog
57
overall

CLI reference and workflows for memories.sh — the persistent memory layer for AI agents.

Insufficient Evidence · confidence: limited evidence · catalog-only · privileged capability
Take: Indexed from the community catalog. Source-aware static analysis and manual review are still pending.
Decision cue: Thin evidence slice — do not treat this card like a verified green light.

mopo-texas-holdem-strategy-abc

cyberpinkman · vsource-scanned
57
overall

Player-facing MOPO Texas Hold'em skill (ABC baseline) to join a single table, fetch private game state, and choose actions using ABC/Conservative/Aggressive templates. Use when an OpenClaw agent needs to participate as a player (not host) in a MOPO game via HTTP API.

Insufficient Evidence · confidence: source evidence · source-scanned · privileged capability
Take: Source-aware scan found normal operational surface via environment, network, or shell-related references.
Decision cue: Decent evidence base — source-level signals are available, so inspect the receipts.

mulch

runeweaverstudios · vsource-scanned
57
overall

Mulch Self Improver — Let your agents grow 🌱. Captures learnings with Mulch so expertise compounds across sessions. Use when: command/tool fails, user corrects you, missing feature, API fails, knowledge was wrong, or better approach found. Run mulch prime at session start; mulch record before finishing. Benefits: better and more consistent coding, improved experience, less hallucination.

High Risk · follow-on functionality checks passed · 7/7 · confidence: source evidence · source-scanned · suspicious
Runtime receipts + what passed · 2026-03-14 05:00 UTC
functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 7 days · passed · output 462 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 2273 ms · baseline-v3 8/8
RatioDaemon muttered: mulch cleared the baseline safety checks without trying anything cute. 7/7 functionality-v2 checks passed. Pleasantly boring.
Observed: skill-structure-ok
Take: Potentially suspicious implementation signals detected: rm -rf, password.
Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

mulch-self-improving-agent

runeweaverstudios · vsource-scanned
57
overall

Mulch Self Improver — Let your agents grow 🌱. Captures learnings with Mulch so expertise compounds across sessions. Use when: command/tool fails, user corrects you, missing feature, API fails, knowledge was wrong, or better approach found. Run mulch prime at session start; mulch record before finishing. Benefits: better and more consistent coding, improved experience, less hallucination.

High Risk · follow-on functionality checks passed · 7/7 · confidence: source evidence · source-scanned · suspicious
Runtime receipts + what passed · 2026-03-14 07:45 UTC
functionality-v2 · evidence depth: follow-on functionality checks · tested recently: within 7 days · passed · output 462 B · artifacts 0 · worker oc-sandbox · source stage: cache hit · suite 2213 ms · baseline-v3 8/8
RatioDaemon muttered: mulch-self-improving-agent behaved itself under runtime pressure. 7/7 functionality-v2 checks passed. Pleasantly boring.
Observed: skill-structure-ok
Take: Potentially suspicious implementation signals detected: rm -rf, password.
Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

multi-factor-strategy

wumu2013 · vsource-scanned
57
overall

Guide users to create multi-factor stock selection strategies and generate independent YAML configuration files.

Insufficient Evidence · confidence: source evidence · source-scanned · privileged capability
Take: Source-aware scan found higher-privilege capability areas (trading), but that alone is not evidence of malicious behavior.
Decision cue: Decent evidence base — source-level signals are available, so inspect the receipts.

oc-skill

evolinkai · vcatalog
57
overall

Generate AI videos, images & music. 60+ models including Sora, Veo 3, Kling, Seedance, GPT Image, Suno v5.

Insufficient Evidence · confidence: limited evidence · catalog-only · privileged capability
Take: Indexed from the community catalog. Source-aware static analysis and manual review are still pending.
Decision cue: Thin evidence slice — do not treat this card like a verified green light.

office-to-md-v2

lkyyyy320 · vsource-scanned
57
overall

Convert office documents (PDF, DOC, DOCX, PPTX) to Markdown format. This skill uses the word-extractor library for .doc support and provides full OpenClaw integration.

High Risk · confidence: source evidence · source-scanned · suspicious
Take: Potentially suspicious implementation signals detected: rm -rf, password.
Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

ops-detection-incident-routing

embrron · vcatalog
57
overall

Detect agent runtime anomalies and route incidents through approval-safe guardrails.

Insufficient Evidence · confidence: limited evidence · catalog-only · privileged capability
Take: Indexed from the community catalog. Source-aware static analysis and manual review are still pending.
Decision cue: Thin evidence slice — do not treat this card like a verified green light.

otra-city

robin-blocks · vsource-scanned
57
overall

Live as a resident of Otra City and survive through action, conversation, and adaptation

Insufficient Evidence · confidence: source evidence · source-scanned · privileged capability
Take: Source-aware scan found higher-privilege capability areas (token), but that alone is not evidence of malicious behavior.
Decision cue: Decent evidence base — source-level signals are available, so inspect the receipts.

ouyang

ttboy · vsource-scanned
57
overall

Local RAG system for agent memory using ChromaDB and sentence-transformers. Provides semantic search over session logs, daily notes, and memory files. Use when you need persistent memory across sessions, want to search past conversations, or build agents that remember context. Commands: recall "query", index-digests, digest-sessions.

Use Caution · confidence: source evidence · source-scanned · suspicious
Take: Potentially suspicious implementation signals detected: rm -rf.
Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

overlap-check

semmyt · vsource-scanned
57
overall

Check for existing issues and PRs before creating new ones. Fires automatically when agent intends to file an issue, open a PR, or comment on a thread. Searches the target repo for duplicates and shows matches so the agent can decide whether to proceed or contribute to an existing thread instead.

Insufficient Evidence · confidence: source evidence · source-scanned · privileged capability
Take: Source-aware scan found normal operational surface via environment, network, or shell-related references.
Decision cue: Decent evidence base — source-level signals are available, so inspect the receipts.

policy-engine

joetomasone · vsource-scanned
57
overall

>

High Risk · confidence: source evidence · source-scanned · suspicious
Take: Potentially suspicious implementation signals detected: curl |, rm -rf.
Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

preqstation-preqstation

sonim1 · vsource-scanned
57
overall

Run Claude Code, Codex CLI, or Gemini CLI from natural-language OpenClaw requests for PREQSTATION work.

Insufficient Evidence · confidence: source evidence · source-scanned · privileged capability
Take: Source-aware scan found higher-privilege capability areas (token), but that alone is not evidence of malicious behavior.
Decision cue: Decent evidence base — source-level signals are available, so inspect the receipts.

pumpfun-launch

brandonhay · vcatalog
57
overall

Launch tokens on pump.fun directly from your agent.

Insufficient Evidence · confidence: limited evidence · catalog-only · privileged capability
Take: Indexed from the community catalog. Source-aware static analysis and manual review are still pending.
Decision cue: Thin evidence slice — do not treat this card like a verified green light.

rag-eval

jonathanjing · vsource-scanned
57
overall

Evaluate your RAG pipeline quality using Ragas metrics (faithfulness, answer relevancy, context precision).

Use Caution · confidence: source evidence · source-scanned · suspicious
Take: Potentially suspicious implementation signals detected: eval(.
Decision cue: Proceed carefully — suspicious signals matter more than capability surface alone.

remember-me

achals-iglu · vsource-scanned
57
overall

Remember-this trigger: memory updates + recall for preferences, goals, boundaries, prior work, decisions, dates, and todos. Use whenever user asks to remember, continue previous context, personalize behavior, or retrieve what was decided earlier.

Insufficient Evidence · confidence: source evidence · source-scanned · privileged capability
Take: Source-aware scan found normal operational surface via environment, network, or shell-related references.
Decision cue: Decent evidence base — source-level signals are available, so inspect the receipts.

resilient-coding-agent

cosformula · vcatalog
57
overall

Run long-running coding agents (Codex, Claude Code, etc.) in tmux sessions that survive orchestrator restarts.

Insufficient Evidence · confidence: limited evidence · catalog-only · privileged capability
Take: Indexed from the community catalog. Source-aware static analysis and manual review are still pending.
Decision cue: Thin evidence slice — do not treat this card like a verified green light.

sage-planning

autogame-17 · vcatalog
57
overall

This skill implements the Great Sage (大贤者) persona, a specialized mode for high-level planning, architectural.

Insufficient Evidence · confidence: limited evidence · catalog-only · privileged capability
Take: Indexed from the community catalog. Source-aware static analysis and manual review are still pending.
Decision cue: Thin evidence slice — do not treat this card like a verified green light.

seedance-video-generation-byteplus

jackycser · vsource-scanned
57
overall

Generate AI videos using BytePlus Seedance API (International). Use when the user wants to: (1) generate videos from text prompts, (2) generate videos from images (first frame, first+last frame, reference images), or (3) query/manage video generation tasks. Supports Seedance 1.5 Pro (with audio & draft mode), 1.0 Pro, 1.0 Pro Fast, and 1.0 Lite models.

Insufficient Evidence · confidence: source evidence · source-scanned · privileged capability
Take: Source-aware scan found higher-privilege capability areas (token), but that alone is not evidence of malicious behavior.
Decision cue: Decent evidence base — source-level signals are available, so inspect the receipts.