Your agent failed.
Which tool broke
— and why?
Trace what your agents called. Find what broke, what's expensive, and what's unsafe. For MCP servers, get health checks, schema drift alerts, and security scanning built in.
Where LangSight fits
What question are you
trying to answer?
Use LangSight with LangWatch, Langfuse, or LangSmith — not instead of them. They evaluate model behavior. LangSight monitors the tool layer underneath.
| Question | Best tool |
|---|---|
| Did the prompt/model perform well? | LangWatch / Langfuse / LangSmith |
| Should I change prompts or eval policy? | LangWatch / Langfuse / LangSmith |
| Is my server CPU/memory healthy? | Datadog / New Relic |
| → Which tool call failed in production? | LangSight |
| → Is an MCP server unhealthy or drifting? | LangSight |
| → Is an MCP server exposed or risky? | LangSight |
| → Why did this session cost $47 instead of $3? | LangSight |
The problem
LLM quality is only
half the problem.
Teams already have ways to inspect prompts and eval scores. What they still cannot answer fast enough:
Which of 15 tools failed?
Your orchestrator calls 15 tools across 4 MCP servers. Something returned bad data. Without traces, you spend hours replaying requests — in the dark.
MCP server degraded silently
Schema changed. Latency spiked 10x. Auth expired. The agent keeps calling, gets bad data, and hallucinates. You find out from users, not alerts.
$4,200 in unexpected tool costs
A sub-agent retries geocoding-mcp 47 times per session. Nobody noticed until the invoice arrived. You need cost attribution at the tool level, not the model level.
Is this MCP server safe to run?
66% of community MCP servers have critical code smells. Tool poisoning attacks are real. You need automated scanning, not hope.
The solution
Four pillars of
runtime observability.
Action Traces
See the exact sequence of tool calls, handoffs, failures, and costs across a full agent session. Multi-agent trees reconstructed automatically from parent_span_id.
```
$ langsight sessions --id sess-f2a9b1

sess-f2a9b1 (support-agent)
├── jira-mcp/get_issue    89ms  ✓
├── postgres-mcp/query    42ms  ✓
├── → billing-agent handoff
│   ├── crm-mcp/update   120ms  ✓
│   └── slack-mcp/notify     —  ✗ timeout

Root cause: slack-mcp timed out at 14:32
```
MCP Health
Detect down, slow, stale, or changed MCP servers before they silently corrupt agent behavior. Schema drift detection catches breaking changes in minutes.
```
$ langsight mcp-health

Server          Status    Latency   Schema    Tools
snowflake-mcp   ✅ UP       142ms    Stable        8
slack-mcp       ⚠️ DEG    1,240ms    Stable        4
jira-mcp        ❌ DOWN         —         —        —
postgres-mcp    ✅ UP        31ms    Changed       5
```
MCP Security
Scan for CVEs, OWASP MCP Top 10, tool poisoning signals, weak auth, and risky configs. Run in CI with --ci to block deploys on CRITICAL findings.
```
$ langsight security-scan

CRITICAL  jira-mcp      CVE-2025-6514  Remote code execution in mcp-remote
HIGH      slack-mcp     OWASP-MCP-01   Tool description contains injection pattern
HIGH      postgres-mcp  OWASP-MCP-04   No authentication configured
```
Cost Attribution
Move from "the invoice is $4,200" to "billing-agent's geocoding MCP retries 47x per session at $0.005/call."
```
$ langsight costs --hours 24

Tool                  Calls   Failed     Cost      %
geocoding-mcp         2,340       12   $1,872  44.6%
postgres-mcp/query      890        3     $445  10.6%
claude-3.5 (LLM)        156        0     $312   7.4%
```
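Tool-level cost attribution like the table above is a straightforward aggregation over trace spans. A minimal sketch, assuming hypothetical span records (the `tool`, `cost`, and `failed` field names are illustrative, not LangSight's actual export format):

```python
from collections import defaultdict

# Hypothetical span records, mirroring what a trace exporter might emit.
spans = [
    {"tool": "geocoding-mcp", "cost": 0.005, "failed": False},
    {"tool": "geocoding-mcp", "cost": 0.005, "failed": True},
    {"tool": "postgres-mcp/query", "cost": 0.50, "failed": False},
]

# Roll spans up to one row per tool: call count, failures, total cost.
totals = defaultdict(lambda: {"calls": 0, "failed": 0, "cost": 0.0})
for span in spans:
    row = totals[span["tool"]]
    row["calls"] += 1
    row["failed"] += int(span["failed"])
    row["cost"] += span["cost"]

# Share of total spend per tool, highest first.
grand_total = sum(row["cost"] for row in totals.values())
for tool, row in sorted(totals.items(), key=lambda kv: -kv[1]["cost"]):
    pct = 100 * row["cost"] / grand_total
    print(f"{tool:<22} {row['calls']:>5} {row['failed']:>6} ${row['cost']:.3f} {pct:5.1f}%")
```

The point of attributing at the span level rather than the invoice level: a retry loop shows up as an inflated `calls` count on one tool, not as an opaque total.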
Get started
Zero to traced
in 5 minutes.
Install & discover
30 seconds
```
$ pip install langsight
$ langsight init
# Auto-discovered 4 MCP servers
```
Instrument your agent
2 lines of code
```python
from langsight.sdk import LangSightClient

client = LangSightClient(url="...")
traced = client.wrap(mcp, server_name="pg")
```
See everything
real-time
```
$ langsight sessions
$ langsight mcp-health
$ langsight security-scan
$ langsight costs --hours 24
```
And more
Built for production.
Multi-Agent Call Trees
Core: `parent_span_id` links sub-agent calls across any depth. See the path from orchestrator to leaf tool.
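Rebuilding a call tree from `parent_span_id` is an index-then-recurse pass over the span list. A minimal sketch with illustrative span records (field names are assumptions, not LangSight's schema):

```python
from collections import defaultdict

# Illustrative spans: each points at its parent via parent_span_id (None = root).
spans = [
    {"span_id": "a1", "parent_span_id": None, "name": "support-agent"},
    {"span_id": "b1", "parent_span_id": "a1", "name": "jira-mcp/get_issue"},
    {"span_id": "c1", "parent_span_id": "a1", "name": "billing-agent"},
    {"span_id": "d1", "parent_span_id": "c1", "name": "slack-mcp/notify"},
]

# Index children by parent so lookup during the walk is O(1).
children = defaultdict(list)
for s in spans:
    children[s["parent_span_id"]].append(s)

def render(parent_id=None, depth=0):
    """Depth-first walk: emit one indented line per span."""
    lines = []
    for s in children[parent_id]:
        lines.append("  " * depth + s["name"])
        lines.extend(render(s["span_id"], depth + 1))
    return lines

tree = render()
print("\n".join(tree))
```

Because only the parent pointer is needed, this works at any nesting depth and across agent handoffs, as long as sub-agents propagate the parent's span ID.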
Session Replay
v0.2: Re-execute any session against live MCP servers. Compare two runs side-by-side to see what changed.
Anomaly Detection
v0.2: Z-score analysis against a 7-day baseline. Warning at |z| ≥ 2, critical at |z| ≥ 3. No manual thresholds.
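The z-score check above is standard statistics, nothing product-specific. A minimal sketch with made-up baseline numbers (the values and the per-hour granularity are assumptions for illustration):

```python
import statistics

# Hypothetical 7-day baseline of mean latency (ms) for one tool, one sample per day.
baseline = [140, 138, 145, 150, 142, 139, 147]
current = 1240  # latest observation, e.g. a degraded slack-mcp

mean = statistics.fmean(baseline)
stdev = statistics.stdev(baseline)  # sample standard deviation
z = (current - mean) / stdev

# Thresholds from the feature description: warning at |z| >= 2, critical at |z| >= 3.
severity = "critical" if abs(z) >= 3 else "warning" if abs(z) >= 2 else "ok"
print(f"z={z:.1f} -> {severity}")
```

The appeal of z-scores over fixed thresholds: a tool whose normal latency is 1,200 ms doesn't page you, while a tool that jumps from 140 ms to 1,240 ms does.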
Agent SLO Tracking
v0.2: Define `success_rate` and `latency_p99` targets per agent. Get alerted before you breach them.
AI Root Cause Analysis
4 LLMs: `langsight investigate` sends evidence to Claude, GPT-4o, Gemini, or Ollama and returns remediation steps.
Prometheus Metrics
v0.2: Native `/metrics` endpoint. Plug it into your existing Grafana stack. Request counts, latencies, SSE connections.
Integrations
Drop into any framework.
Use alongside Langfuse, LangWatch, or LangSmith
They trace the LLM reasoning layer (what the model decided). LangSight traces the action layer (what the agent called, what failed, what it cost). Different questions, same agent.
Your data. Your infra.
No vendor dependency.
Self-host on your own infrastructure. No data ever leaves your network. No paid tiers. No gated features. No usage limits.
Your data stays yours
PostgreSQL + ClickHouse via `docker compose up`. Both fully under your control. No telemetry phoning home.
No vendor lock-in
BSL 1.1 — converts to Apache 2.0 after 4 years. Fork it, modify it, embed it. The only restriction: don't resell it as a hosted service.
5-minute setup
One script generates secrets, starts 5 containers, seeds demo data. You're looking at traces before your coffee is ready.
Own the runtime layer
of your agent systems.
If your agents depend on MCP, LangSight keeps that dependency observable, reliable, and secure.
Trace what broke. Find what's expensive. Scan what's unsafe.