Your agent failed.
Which tool broke
— and why?
Trace what your agents called. Find what broke, what's expensive, and what's unsafe. For MCP servers, get health checks, schema drift alerts, and security scanning built in.
Where LangSight fits
What question are you
trying to answer?
Use LangSight with LangWatch, Langfuse, or LangSmith — not instead of them. They evaluate model behavior. LangSight monitors the tool layer underneath.
| Question | Best tool |
|---|---|
| Did the prompt/model perform well? | LangWatch / Langfuse / LangSmith |
| Should I change prompts or eval policy? | LangWatch / Langfuse / LangSmith |
| Is my server CPU/memory healthy? | Datadog / New Relic |
| → Which tool call failed in production? | LangSight |
| → Is an MCP server unhealthy or drifting? | LangSight |
| → Is an MCP server exposed or risky? | LangSight |
| → Why did this session cost $47 instead of $3? | LangSight |
The problem
LLM quality is only
half the problem.
Teams already have ways to inspect prompts and eval scores. What they still cannot answer fast enough:
Which of 15 tools failed?
Your orchestrator calls 15 tools across 4 MCP servers. Something returned bad data. Without traces, you spend hours replaying requests — in the dark.
MCP server degraded silently
Schema changed. Latency spiked 10x. Auth expired. The agent keeps calling, gets bad data, and hallucinates. You find out from users, not alerts.
$4,200 in unexpected tool costs
A sub-agent retries geocoding-mcp 47 times per session. Nobody noticed until the invoice arrived. You need cost attribution at the tool level, not the model level.
Is this MCP server safe to run?
66% of community MCP servers have critical code smells. Tool poisoning attacks are real. You need automated scanning, not hope.
The solution
Four pillars of
runtime observability.
Action Traces
See the exact sequence of tool calls, handoffs, failures, and costs across a full agent session. Multi-agent trees reconstructed automatically from parent_span_id.
```
$ langsight sessions --id sess-f2a9b1

sess-f2a9b1 (support-agent)
├── jira-mcp/get_issue    89ms  ✓
├── postgres-mcp/query    42ms  ✓
├── → billing-agent handoff
│   ├── crm-mcp/update   120ms  ✓
│   └── slack-mcp/notify     —  ✗ timeout

Root cause: slack-mcp timed out at 14:32
```
MCP Health
Detect down, slow, stale, or changed MCP servers before they silently corrupt agent behavior. Schema drift detection catches breaking changes in minutes.
```
$ langsight mcp-health

Server          Status    Latency   Schema    Tools
snowflake-mcp   ✅ UP       142ms    Stable        8
slack-mcp       ⚠️ DEG    1,240ms    Stable        4
jira-mcp        ❌ DOWN         —         —        —
postgres-mcp    ✅ UP        31ms    Changed       5
```
MCP Security
Scan for CVEs, OWASP MCP Top 10, tool poisoning signals, weak auth, and risky configs. Run in CI with --ci to block deploys on CRITICAL findings.
```
$ langsight security-scan

CRITICAL  jira-mcp      CVE-2025-6514  Remote code execution in mcp-remote
HIGH      slack-mcp     OWASP-MCP-01   Tool description contains injection pattern
HIGH      postgres-mcp  OWASP-MCP-04   No authentication configured
```
Cost Attribution
Move from "the invoice is $4,200" to "billing-agent's geocoding MCP retries 47x per session at $0.005/call."
```
$ langsight costs --hours 24

Tool                  Calls   Failed     Cost      %
geocoding-mcp         2,340       12   $1,872  44.6%
postgres-mcp/query      890        3     $445  10.6%
claude-3.5 (LLM)        156        0     $312   7.4%
```
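Tool-level cost attribution like the table above is a straightforward aggregation over trace spans. A minimal sketch, assuming hypothetical span records (the `tool`, `cost`, and `failed` field names are illustrative, not LangSight's actual export format):

```python
from collections import defaultdict

# Hypothetical span records, mirroring what a trace exporter might emit.
spans = [
    {"tool": "geocoding-mcp", "cost": 0.005, "failed": False},
    {"tool": "geocoding-mcp", "cost": 0.005, "failed": True},
    {"tool": "postgres-mcp/query", "cost": 0.50, "failed": False},
]

# Roll spans up to one row per tool: call count, failures, total cost.
totals = defaultdict(lambda: {"calls": 0, "failed": 0, "cost": 0.0})
for span in spans:
    row = totals[span["tool"]]
    row["calls"] += 1
    row["failed"] += int(span["failed"])
    row["cost"] += span["cost"]

# Share of total spend per tool, highest first.
grand_total = sum(row["cost"] for row in totals.values())
for tool, row in sorted(totals.items(), key=lambda kv: -kv[1]["cost"]):
    pct = 100 * row["cost"] / grand_total
    print(f"{tool:<22} {row['calls']:>5} {row['failed']:>6} ${row['cost']:.3f} {pct:5.1f}%")
```

The point of attributing at the span level rather than the invoice level: a retry loop shows up as an inflated `calls` count on one tool, not as an opaque total.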
Get started
Zero to traced
in 5 minutes.
Install & discover
30 seconds
```
$ pip install langsight
$ langsight init
# Auto-discovered 4 MCP servers
```
Instrument your agent
2 lines of code
```python
from langsight.sdk import LangSightClient

client = LangSightClient(url="...")
traced = client.wrap(mcp, server_name="pg")
```
See everything
real-time
```
$ langsight sessions
$ langsight mcp-health
$ langsight security-scan
$ langsight costs --hours 24
```
And more
Built for production.
Multi-Agent Call Trees
Core: `parent_span_id` links sub-agent calls across any depth. See the path from orchestrator to leaf tool.
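Rebuilding a call tree from `parent_span_id` is an index-then-recurse pass over the span list. A minimal sketch with illustrative span records (field names are assumptions, not LangSight's schema):

```python
from collections import defaultdict

# Illustrative spans: each points at its parent via parent_span_id (None = root).
spans = [
    {"span_id": "a1", "parent_span_id": None, "name": "support-agent"},
    {"span_id": "b1", "parent_span_id": "a1", "name": "jira-mcp/get_issue"},
    {"span_id": "c1", "parent_span_id": "a1", "name": "billing-agent"},
    {"span_id": "d1", "parent_span_id": "c1", "name": "slack-mcp/notify"},
]

# Index children by parent so lookup during the walk is O(1).
children = defaultdict(list)
for s in spans:
    children[s["parent_span_id"]].append(s)

def render(parent_id=None, depth=0):
    """Depth-first walk: emit one indented line per span."""
    lines = []
    for s in children[parent_id]:
        lines.append("  " * depth + s["name"])
        lines.extend(render(s["span_id"], depth + 1))
    return lines

tree = render()
print("\n".join(tree))
```

Because only the parent pointer is needed, this works at any nesting depth and across agent handoffs, as long as sub-agents propagate the parent's span ID.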
Session Replay
v0.2: Re-execute any session against live MCP servers. Compare two runs side-by-side to see what changed.
Anomaly Detection
v0.2: Z-score analysis against a 7-day baseline. Warning at |z| ≥ 2, critical at |z| ≥ 3. No manual thresholds.
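The z-score check above is standard statistics, nothing product-specific. A minimal sketch with made-up baseline numbers (the values and the per-hour granularity are assumptions for illustration):

```python
import statistics

# Hypothetical 7-day baseline of mean latency (ms) for one tool, one sample per day.
baseline = [140, 138, 145, 150, 142, 139, 147]
current = 1240  # latest observation, e.g. a degraded slack-mcp

mean = statistics.fmean(baseline)
stdev = statistics.stdev(baseline)  # sample standard deviation
z = (current - mean) / stdev

# Thresholds from the feature description: warning at |z| >= 2, critical at |z| >= 3.
severity = "critical" if abs(z) >= 3 else "warning" if abs(z) >= 2 else "ok"
print(f"z={z:.1f} -> {severity}")
```

The appeal of z-scores over fixed thresholds: a tool whose normal latency is 1,200 ms doesn't page you, while a tool that jumps from 140 ms to 1,240 ms does.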
Agent SLO Tracking
v0.2: Define `success_rate` and `latency_p99` targets per agent. Get alerted before you breach them.
AI Root Cause Analysis
4 LLMs: `langsight investigate` sends evidence to Claude, GPT-4o, Gemini, or Ollama and returns remediation steps.
Prometheus Metrics
v0.2: Native `/metrics` endpoint. Plug it into your existing Grafana stack. Request counts, latencies, SSE connections.
Integrations
Drop into any framework.
Use alongside Langfuse, LangWatch, or LangSmith
They trace the LLM reasoning layer (what the model decided). LangSight traces the action layer (what the agent called, what failed, what it cost). Different questions, same agent.
Your data. Your infra.
No vendor dependency.
Self-host on your own infrastructure. No data ever leaves your network. No paid tiers. No gated features. No usage limits.
Your data stays yours
PostgreSQL + ClickHouse via `docker compose up`. Both fully under your control. No telemetry phoning home.
No vendor lock-in
BSL 1.1 — converts to Apache 2.0 after 4 years. Fork it, modify it, embed it. The only restriction: don't resell it as a hosted service.
5-minute setup
One script generates secrets, starts 5 containers, seeds demo data. You're looking at traces before your coffee is ready.
Own the runtime layer
of your agent systems.
If your agents depend on MCP, LangSight keeps that dependency observable, reliable, and secure.
Trace what broke. Find what's expensive. Scan what's unsafe.