Open-source LLM engineering platform for traces, evals, and prompt management

Langfuse is an open-source product for LLM application observability. It ingests traces and spans from your stack, supports datasets and prompt versioning workflows, and can be used through the managed Langfuse Cloud or self-hosted. It integrates with popular Python and JavaScript SDKs and with frameworks that emit OpenTelemetry-compatible telemetry, so teams can debug agent loops, compare prompt iterations, and monitor production quality metrics without building a custom analytics pipeline from scratch.

Category: Developer Tools

Pricing: Open source + hosted plans

Platforms: Web / API / Self-hosted

observability, llmops, tracing

Use cases

Debugging tool-heavy agent runs where failures occur deep in a call chain
Tracking latency and token usage across routes and model versions
Building eval sets from production traces for regression testing before rollout
Comparing prompt edits with consistent datasets rather than anecdotal chat checks
Giving platform teams a shared view of LLM behavior in staging and production
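
As a rough illustration of the latency and token tracking use case above, the sketch below groups call records by route and model and reports call count, mean latency, and total tokens. It uses plain Python only; `SpanRecord`, `summarize`, and the field names are hypothetical and do not reflect Langfuse's actual schema or SDK.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class SpanRecord:
    """Hypothetical, simplified record of one LLM call (not Langfuse's schema)."""
    route: str
    model: str
    latency_ms: float
    total_tokens: int


def summarize(spans):
    """Group spans by (route, model); report call count, mean latency, total tokens."""
    groups = defaultdict(list)
    for span in spans:
        groups[(span.route, span.model)].append(span)
    return {
        key: {
            "calls": len(items),
            "mean_latency_ms": mean(s.latency_ms for s in items),
            "total_tokens": sum(s.total_tokens for s in items),
        }
        for key, items in groups.items()
    }


# Made-up sample data for illustration only.
spans = [
    SpanRecord("/chat", "model-a", 800.0, 1200),
    SpanRecord("/chat", "model-a", 1200.0, 1800),
    SpanRecord("/chat", "model-b", 500.0, 900),
]
summary = summarize(spans)
print(summary[("/chat", "model-a")])
```

An observability platform typically does this kind of grouping for you in its dashboards; the sketch only shows the shape of the aggregation.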

Key features

Trace and session views for multi-step LLM and agent workflows
Prompt management with versioning and side-by-side comparisons
Datasets and scoring workflows for offline evaluation and regression checks
SDK integrations for Python and JavaScript ecosystems
Self-hosting option alongside Langfuse Cloud for teams with data residency requirements

Who Is It For?

ML and platform engineers operating LLM services
Product teams shipping agentic features who need production visibility
Developers self-hosting models or gateways who want trace storage under their control

Frequently Asked Questions

Is Langfuse the same as a generic APM? No. It is specialized for LLM workloads: traces include prompts, completions, tool calls, and scores rather than only HTTP timings, though it can sit alongside traditional APM.
Can I run Langfuse on my own infrastructure? Yes. Langfuse documents self-hosted deployment patterns in addition to its managed cloud offering.
Does it replace automated evaluation? No. It helps you collect data and run eval workflows; you still define tasks, judges, or heuristics appropriate to your product.

3 Indexed items

Postgres MCP

Developer Tools | Free / Open Source

pg-mcp-server is a Model Context Protocol (MCP) server that bridges AI agents and PostgreSQL databases. It exposes schema metadata (tables, columns, indexes, foreign keys) as MCP resources, and lets agents execute read-only SQL queries or transactional writes. It suits developers who want Claude, Cursor, or other LLM-powered tools to answer questions about a live database without writing SQL by hand. It supports connection string configuration, SSL modes, and row-level security awareness.

Google Antigravity

AI Coding | Free (public preview)

Google Antigravity is an agentic development platform announced on the Google Developers Blog (November 2025). It pairs a familiar AI-assisted editor with a Manager Surface where developers spawn and observe agents working asynchronously across editor, terminal, and browser. Agents produce Artifacts (such as task lists, implementation plans, screenshots, and browser recordings) for review instead of relying only on raw tool logs. The public preview is offered at no cost for individuals on macOS, Windows, and Linux, with model choice including Gemini 3 Pro plus third-party models such as Claude Sonnet 4.5 and OpenAI GPT-OSS, as described by Google.

MemGPT

AI Agents | Free

MemGPT is an open-source framework that enables large language models to maintain persistent memory across conversations, similar to how operating systems manage a memory hierarchy. It works around context window limits by moving information between memory tiers: a limited in-context tier and larger external storage. The system is particularly useful for building chatbots and agents that need long-term memory and continuous learning.
