GitHub:sickn33/antigravity-awesome-skills · llm-evaluation
Trusted
This skill provides documentation about evaluating Large Language Models (LLMs), covering automated metrics, human evaluation, and A/B testing.
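As a rough illustration of the "automated metrics" category the skill covers, the sketch below computes exact-match accuracy over prediction/reference pairs. The function and its normalization rules are illustrative assumptions, not code taken from the skill itself.

```python
# Minimal sketch of one automated LLM evaluation metric: exact-match
# accuracy over (prediction, reference) pairs. The normalization rule
# (lowercase + whitespace collapse) is an illustrative assumption.

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting
    differences do not count as errors."""
    return " ".join(text.lower().split())

def exact_match_accuracy(predictions, references):
    """Fraction of predictions that match their reference exactly
    after normalization. Returns 0.0 for an empty input."""
    if not predictions:
        return 0.0
    hits = sum(
        normalize(p) == normalize(r)
        for p, r in zip(predictions, references)
    )
    return hits / len(predictions)

preds = ["Paris", "4", "blue  whale"]
refs = ["paris", "5", "Blue Whale"]
print(exact_match_accuracy(preds, refs))  # 2 of 3 match after normalization
```

In practice such metrics are only one leg of an evaluation; the skill's other topics (human evaluation, A/B testing) cover quality dimensions that string matching cannot.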
Source: Workspace import
Originally ingested from a local workspace copy.
version eb6c74cac6b1
1 finding
static analysis only
no human review yet
Automated result: Trusted
Current public label: Trusted
Because the skill is mostly documentation, automated analysis has little surface area in which to find problems; on that basis the skill is labeled Trusted.
Human review: none yet
The current public label relies solely on automation; no human has weighed in yet.
Severity mix: 1 low