Tools & Code

LynxPrompt

Self-hostable platform for managing AI IDE config files (.cursorrules, CLAUDE.md, copilot-instructions.md). Web UI, REST API, CLI, and federated blueprint marketplace for 30+ AI coding assistants.

Prompt Management and Testing

flompt

Visual AI prompt builder that decomposes prompts into 12 semantic blocks (role, context, constraints, examples, etc.) and compiles them into optimized XML. Includes a browser extension for ChatGPT/Claude/Gemini and an MCP server for Claude Code agents. Free and open source.

Prompt Management and Testing
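
The block-to-XML idea in the flompt entry can be illustrated with a minimal sketch. This is not flompt's actual output format or API: the `compile_prompt` function, the `<prompt>` root element, and the block contents are all hypothetical, and only four of the block names mentioned above are used.

```python
from xml.etree import ElementTree as ET


def compile_prompt(blocks: dict[str, str]) -> str:
    """Compile named prompt blocks into one XML string, in insertion order.

    Hypothetical helper for illustration; not part of flompt.
    """
    root = ET.Element("prompt")  # assumed root tag
    for name, text in blocks.items():
        if not text.strip():
            continue  # skip blocks the author left empty
        child = ET.SubElement(root, name)
        child.text = text.strip()  # ElementTree escapes <, >, & on output
    return ET.tostring(root, encoding="unicode")


blocks = {
    "role": "You are a senior Python reviewer.",
    "context": "The team ships a small Flask API.",
    "constraints": "Answer in under 100 words; flag anything < 3.9 compatible.",
    "examples": "",
}
xml_prompt = compile_prompt(blocks)
print(xml_prompt)
```

Keeping each block as its own element makes it easy to reorder, omit, or swap a single part of the prompt without touching the rest.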

DeepEval

7K+ stars

Open-source evaluation framework covering RAG, agents, and conversations with CI/CD integration.

LLM Evaluation Tools

Ragas

8K+ stars

RAG evaluation with knowledge-graph-based test set generation and 30+ metrics.

LLM Evaluation Tools

LangSmith

LangChain's platform for debugging, testing, evaluating, and monitoring LLM applications.

LLM Evaluation Tools

Langfuse

7K+ stars

Open-source LLM observability with tracing, prompt management, and human annotation.

LLM Evaluation Tools

Braintrust

End-to-end AI evaluation platform, SOC2 Type II certified.

LLM Evaluation Tools

Arize AI / Phoenix

Real-time LLM monitoring with drift detection and tracing.

LLM Evaluation Tools

TruLens

Evaluating and explaining LLM apps; tracks hallucinations, relevance, groundedness.

LLM Evaluation Tools

InspectAI

Framework from the UK AISI, purpose-built for evaluating agents against benchmarks.

LLM Evaluation Tools

Opik

Evaluate, test, and ship LLM applications across dev and production lifecycles.

LLM Evaluation Tools

EvalView

CLI tool for testing multi-step AI agents with YAML test cases, regression detection, and production monitoring.

LLM Evaluation Tools

LangChain / LangGraph

100K+ / 10K+ stars

Most widely adopted LLM app framework; LangGraph adds graph-based multi-step agent workflows.

Agent Frameworks

CrewAI

44K+ stars

Role-playing AI agent orchestration with 700+ integrations.

Agent Frameworks

AutoGen (AG2)

40K+ stars

Microsoft's multi-agent conversational framework.

Agent Frameworks

DSPy

22K+ stars

Stanford's framework for programming LLMs with automatic prompt/weight optimization.

Agent Frameworks

OpenAI Agents SDK

10K+ stars

Official agent framework with function calling, guardrails, and handoffs.

Agent Frameworks

Semantic Kernel

24K+ stars

Microsoft's AI framework powering M365 Copilot; C#, Python, Java.

Agent Frameworks

LlamaIndex

40K+ stars

Data framework for RAG and agent capabilities.

Agent Frameworks

Haystack

20K+ stars

Open-source NLP framework with pipeline architecture for RAG and agents.

Agent Frameworks

Agno (formerly Phidata)

20K+ stars

Python agent framework with microsecond instantiation.

Agent Frameworks

Smolagents

15K+ stars

Hugging Face's minimalist, code-centric agent framework (about 1,000 lines of code).

Agent Frameworks

Pydantic AI

8K+ stars

Type-safe agent framework using Pydantic for structured validation.

Agent Frameworks

Mastra

20K+ stars

TypeScript AI agent framework with assistants, RAG, and observability.

Agent Frameworks

Google ADK

Agent Development Kit deeply integrated with Gemini and Google Cloud.

Agent Frameworks

Strands Agents (AWS)

Model-agnostic framework with deep AWS integrations.

Agent Frameworks

Langflow

50K+ stars

Node-based visual agent builder with drag-and-drop.

Agent Frameworks

n8n

60K+ stars

Workflow automation with AI agent capabilities and 400+ integrations.

Agent Frameworks

Dify

All-in-one backend for agentic workflows with tool-using agents and RAG.

Agent Frameworks

PraisonAI

Multi-AI Agents framework with 100+ LLM support, MCP integration, and built-in memory.

Agent Frameworks

Neurolink

Multi-provider AI agent framework unifying 12+ providers with workflow orchestration.

Agent Frameworks

Composio

Connect 100+ tools to AI agents with zero setup.

Agent Frameworks

DSPy

22K+ stars

Multiple optimizers (MIPROv2, BootstrapFewShot, COPRO) for automatic prompt tuning.

Prompt Optimization Tools

TextGrad

2K+ stars

Automatic differentiation via text (Stanford).

Prompt Optimization Tools

OPRO

Google DeepMind's optimization by prompting.

Prompt Optimization Tools

Garak (NVIDIA)

3K+ stars

LLM vulnerability scanner for hallucination, injection, and jailbreaks; billed as the "nmap for LLMs."

Red Teaming and Prompt Security

PyRIT (Microsoft)

3K+ stars

Python Risk Identification Tool for automated red-teaming.

Red Teaming and Prompt Security

DeepTeam

40+ vulnerabilities, 10+ attack methods, OWASP Top 10 support.

Red Teaming and Prompt Security

LLM Guard

2K+ stars

Security toolkit for LLM I/O validation.

Red Teaming and Prompt Security

NeMo Guardrails (NVIDIA)

5K+ stars

Programmable guardrails for conversational systems.

Red Teaming and Prompt Security

Guardrails AI

Define strict output formats (JSON schemas) to ensure system reliability.

Red Teaming and Prompt Security
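
The idea behind strict output formats can be sketched with the standard library alone. The example below does not use the Guardrails AI API; `SCHEMA`, `validate_output`, and the field names are hypothetical, and the "schema" is a simplified field-to-type map rather than a full JSON Schema.

```python
import json

# Hypothetical schema: field name -> required Python type.
SCHEMA = {"title": str, "priority": int, "tags": list}


def validate_output(raw: str, schema: dict[str, type]) -> tuple[bool, list[str]]:
    """Check that a model's raw text is a JSON object matching the schema.

    Returns (ok, errors); errors is empty when ok is True.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return False, ["top-level value must be an object"]

    errors: list[str] = []
    for field, expected in schema.items():
        if field not in data:
            errors.append(f"missing field: {field}")
            continue
        value = data[field]
        # bool is a subclass of int in Python, so reject it explicitly for int fields.
        if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            errors.append(f"{field}: expected {expected.__name__}, got {type(value).__name__}")
    return not errors, errors


good = '{"title": "Fix login", "priority": 2, "tags": ["auth"]}'
bad = '{"title": "Fix login", "priority": "high"}'
print(validate_output(good, SCHEMA))
print(validate_output(bad, SCHEMA))
```

In practice the error list would typically be fed back to the model in a retry prompt, or the response rejected before it reaches downstream code.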

Lakera

AI security platform for real-time prompt injection detection.

Red Teaming and Prompt Security

Purple Llama (Meta)

Open-source LLM safety evaluation including CyberSecEval.

Red Teaming and Prompt Security

GPTFuzz

Automated jailbreak template generation achieving >90% success rates.

Red Teaming and Prompt Security

Rebuff

Open-source tool for detection and prevention of prompt injection.

Red Teaming and Prompt Security

AgentSeal

Open-source scanner that runs 150 attack probes to test AI agents for prompt injection and extraction vulnerabilities.

Red Teaming and Prompt Security

MCP Specification

15K+ stars

The core protocol specification and SDKs.

MCP (Model Context Protocol)

MCP Reference Servers

Official implementations: fetch, filesystem, GitHub, Slack, Postgres.

MCP (Model Context Protocol)

FastMCP (Python)

5K+ stars

High-level Pythonic framework for building MCP servers.

MCP (Model Context Protocol)

GitHub MCP Server

15K+ stars

GitHub's official MCP server for repo, issue, PR, and Actions interaction.

MCP (Model Context Protocol)

Awesome MCP Servers

30K+ stars

Curated list of 10,000+ community MCP servers.

MCP (Model Context Protocol)

Context7

MCP server providing version-specific documentation to reduce code hallucination.

MCP (Model Context Protocol)

GitMCP

Creates remote MCP servers for any GitHub repo by changing the domain.

MCP (Model Context Protocol)
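
The domain swap described in the GitMCP entry can be sketched as a small URL rewrite. The entry does not name the replacement domain; the sketch assumes it is `gitmcp.io`, and `to_gitmcp_url` is a hypothetical helper, not part of GitMCP.

```python
from urllib.parse import urlsplit, urlunsplit

GITMCP_HOST = "gitmcp.io"  # assumption: the entry above does not state the domain


def to_gitmcp_url(repo_url: str) -> str:
    """Rewrite https://github.com/<owner>/<repo> to the assumed GitMCP equivalent."""
    parts = urlsplit(repo_url)
    if parts.netloc.lower() not in {"github.com", "www.github.com"}:
        raise ValueError(f"not a github.com URL: {repo_url}")
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        raise ValueError("expected a URL of the form https://github.com/<owner>/<repo>")
    # Keep only owner and repo; drop a trailing .git and any deeper path.
    owner, repo = segments[0], segments[1].removesuffix(".git")
    return urlunsplit(("https", GITMCP_HOST, f"/{owner}/{repo}", "", ""))


print(to_gitmcp_url("https://github.com/octocat/Hello-World"))
```

The resulting URL would then be registered as a remote MCP server in whichever client is in use; the exact configuration format depends on the client.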

MCP Inspector

Visual testing tool for MCP server development.

MCP (Model Context Protocol)

Claude Code

Anthropic's agentic coding CLI; understands full codebases and executes complex multi-step tasks via natural language.

Prompt Engineering Course
