A curated collection of tools and resources for performance engineering, covering observability and performance testing.

Awesome Performance Engineering

The discipline that ensures systems deliver fast, reliable, and cost-efficient experiences at any scale, combining observability and performance testing.

Indicators: ⭐ Widely adopted · 🟢 Active · 🔵 Cloud-native · 🟠 Commercial · 🚀 High performance

Observability

Metrics Collection & Time-Series Storage

  • Prometheus - ⭐🟢🔵 Pull-based cloud-native metrics platform with dimensional data model and PromQL query language.
  • VictoriaMetrics - ⭐🟢🚀 High-performance, cost-efficient Prometheus-compatible TSDB with high-cardinality and long-retention support.
  • Thanos - ⭐🟢🔵 Long-term storage, global query view, and high availability layer for Prometheus via sidecar architecture.
  • Mimir - ⭐🟢🔵🚀 Horizontally scalable, multi-tenant Prometheus-compatible TSDB from Grafana Labs.
  • InfluxDB - 🟢🟠 Purpose-built time-series database with high write throughput and a Rust-based engine (v3).
  • Grafana Alloy - ⭐🟢🔵 OpenTelemetry-native telemetry collector supporting metrics, logs, traces, and profiles.
  • Telegraf - 🟢 Plugin-driven agent for collecting and reporting metrics with 300+ input plugins.
  • StatsD - Lightweight, UDP-based metrics aggregation daemon with broad application support.
  • Netdata - ⭐🟢🚀 Real-time per-second monitoring with built-in anomaly detection and zero-configuration agent.

Distributed Tracing

  • OpenTelemetry - ⭐🟢🔵 Open standard for distributed tracing, metrics, and logs with language-specific SDKs and auto-instrumentation.
  • Jaeger - ⭐🟢🔵 CNCF graduated distributed tracing backend and UI, originally from Uber.
  • Grafana Tempo - ⭐🟢🔵 High-scale tracing backend requiring only object storage, with native Grafana integration.
  • Zipkin - 🟢 Pioneering distributed tracing system (Twitter, 2012) with a simple architecture.
  • Apache SkyWalking - ⭐🟢🔵 Observability platform with bytecode-injection-based tracing, popular in the Java ecosystem.
  • SigNoz - 🟢🔵 Open-source OpenTelemetry-native observability platform with unified metrics, traces, and logs.
  • Pinpoint - Bytecode-instrumentation-based APM and tracing for Java and PHP with zero-code-change approach.

Log Management & Log Pipelines

  • Grafana Loki - ⭐🟢🔵 Label-based log aggregation that indexes metadata instead of content for cost-efficient storage at scale.
  • Fluent Bit - ⭐🟢🔵🚀 Lightweight, high-performance log processor and forwarder for edge and containerized environments.
  • Fluentd - 🟢🔵 CNCF graduated unified logging layer with 1000+ plugins for complex routing.
  • Elasticsearch - ⭐🟢🟠 Distributed search and analytics engine with powerful full-text search capabilities.
  • OpenSearch - 🟢🔵 Community-driven, Apache-2.0-licensed fork of Elasticsearch, started by AWS and now governed under the Linux Foundation.
  • Logstash - Flexible log ingestion and transformation pipeline, part of the Elastic Stack.
  • Graylog - 🟢🟠 Centralized log management with built-in alerting and dashboards.
  • rsyslog - 🟢🚀 High-performance system logging daemon handling millions of messages per second.

Observability Pipelines & Telemetry Processing

  • OpenTelemetry Collector - ⭐🟢🔵 Standard telemetry processing pipeline with receivers, processors, and exporters for any signal.
  • Vector - 🟢🚀 End-to-end observability data routing and transformation with programmable VRL transforms.
  • Logstash - ETL-style processing for observability data with powerful filter plugins.
  • Cribl Stream - 🟠🚀 Commercial observability pipeline for routing, reducing, and enriching telemetry data.

Visualization & Dashboards

  • Grafana - ⭐🟢 Open-source observability dashboard platform supporting 100+ data sources with alerting and annotations.
  • Kibana - 🟢🟠 Visualization and log exploration for Elasticsearch data.
  • OpenSearch Dashboards - 🟢🔵 Open-source fork of Kibana for OpenSearch.
  • Apache Superset - 🟢 SQL-first analytics and dashboarding platform for ad-hoc data exploration.
  • Perses - 🟢🔵 CNCF sandbox dashboards-as-code project with native PromQL and TraceQL support.

Profiling & Continuous Performance Analysis

  • Parca - ⭐🟢🔵 eBPF-based continuous profiling platform with zero-instrumentation and differential flame graphs (CNCF sandbox).
  • Grafana Pyroscope - ⭐🟢🔵 Continuous profiling with flame graph visualization and multi-language support.
  • async-profiler - 🟢🚀 Low-overhead JVM sampling profiler capturing CPU, allocation, and lock contention profiles.
  • perf - 🚀 Linux kernel performance analysis tool with hardware counters, tracepoints, and sampling.
  • bpftrace - 🟢🚀 High-level tracing language for Linux eBPF with dynamic kernel and user-space tracing.
  • bcc (BPF Compiler Collection) - 🟢🚀 Toolkit for creating eBPF-based tracing programs with dozens of ready-to-use tools.
  • Grafana Beyla - 🟢🔵🚀 eBPF-based zero-code auto-instrumentation generating RED metrics and distributed traces.
  • Perfetto - 🟢 System-wide tracing and profiling toolkit from Google for Android, Chrome, and general system analysis.

Alerting & Incident Response

  • Alertmanager - ⭐🟢 Prometheus-native alert handling with grouping, silencing, inhibition, and routing.
  • Grafana OnCall - 🟢🔵 Open-source on-call management and alert routing with native Grafana integration.
  • Keep - 🟢🔵 Open-source alert management platform consolidating alerts from multiple sources.
  • Alerta - 🟢 Unified alert correlation and management across multiple monitoring systems.
  • PagerDuty - 🟠 Industry-standard incident response and on-call management platform.
  • Opsgenie - 🟠 Alerting and escalation platform, part of the Atlassian suite.
  • Rootly - 🟠 AI-assisted incident management with automated timelines and postmortem generation.

Observability Platforms (Integrated)

  • Datadog - 🟠 SaaS observability platform with AI-powered anomaly detection and root-cause analysis.
  • Dynatrace - 🟠 AI-driven observability with automatic topology discovery and root-cause analysis (Davis AI).
  • New Relic - 🟠 Developer-centric observability platform with NRQL query language and a generous free tier.
  • Splunk Observability - 🟠 Observability built on Splunk's machine data analytics platform.
  • Elastic Observability - 🟠 Observability solution built on the Elastic Stack with self-managed and cloud options.
  • Honeycomb - 🟠 Observability platform for high-cardinality event data with BubbleUp automated correlation.
  • Grafana Cloud - 🟠 Managed Grafana stack (Mimir, Loki, Tempo, Pyroscope) with a generous free tier.
  • Instana (IBM) - 🟠 Automatic infrastructure and application discovery with real-time observability.
  • AppDynamics (Splunk/Cisco) - 🟠 Enterprise APM with business transaction monitoring and code-level diagnostics.
  • Chronosphere - 🟠 Cloud-native observability platform focused on metrics at scale with cost control.
  • Lightstep / ServiceNow Cloud Observability - 🟠 OpenTelemetry-native observability platform, now part of ServiceNow.
  • Sematext - 🟢🟠 SaaS observability platform with OpenTelemetry-native support and topology discovery.

Monitoring Suites (Operations-Oriented)

  • Zabbix - 🟢 Enterprise-grade monitoring platform with agent-based and agentless monitoring.
  • Nagios - 🟢 Pioneering open-source check-based monitoring with an enormous plugin ecosystem.
  • Icinga - 🟢 Modern evolution of Nagios with improved APIs, configuration management, and scalability.
  • Checkmk - 🟢🟠 Infrastructure and application monitoring with auto-discovery for large environments.

Service Mesh Observability

  • Kiali - 🟢🔵 Observability console for Istio with topology visualization and traffic flow analysis.
  • Linkerd Viz - 🟢🔵 Built-in telemetry and dashboard for Linkerd service mesh.
  • Hubble - 🟢🔵🚀 eBPF-powered network observability for Cilium with L3/L4/L7 flow visibility.

Database Observability

Real User Monitoring (RUM) & Frontend Observability

  • Sentry - 🟢 Error tracking and performance monitoring with session replay and Web Vitals.
  • Grafana Faro - 🟢🔵 Open-source frontend observability SDK capturing errors, performance, and user events.
  • OpenTelemetry Browser SDK - 🟢 OTel instrumentation for web applications capturing page loads and resource timings.
  • LogRocket - 🟠 Session replay combined with frontend performance monitoring.

AI-Augmented Observability

  • Dynatrace Davis AI - 🟠 Deterministic and causal AI for topology-aware automatic root-cause analysis.
  • Datadog Watchdog - 🟠 ML-driven anomaly detection across metrics, logs, and APM data.
  • Moogsoft - 🟠 AIOps platform for alert correlation, noise reduction, and incident clustering.
  • New Relic AI - 🟠 Applied intelligence with anomaly detection, incident correlation, and natural-language querying.
  • Honeycomb BubbleUp - 🟠 Automated outlier correlation across high-cardinality dimensions.
  • Coroot - 🟢🔵 Open-source eBPF-powered observability with automated service map discovery.

SLO Management

  • Sloth - 🟢🔵 SLO generation for Prometheus with YAML definitions and multi-window multi-burn-rate alerts.
  • Pyrra - 🟢🔵 Kubernetes-native SLO management generating Prometheus recording rules and alerts.
  • OpenSLO - 🟢 Open, vendor-neutral specification for defining SLOs as code.
  • Nobl9 - 🟠 Enterprise SLO platform with unified tracking and error budget management.
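The multi-window multi-burn-rate alerts mentioned for Sloth can be illustrated with a minimal sketch. It is not the rule set that Sloth or Pyrra generates. The function names are hypothetical, and the 14.4 threshold is an assumption based on the commonly cited pairing of a 1-hour and a 5-minute window for a 30-day SLO period.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how fast the error budget is being spent.

    A burn rate of 1.0 means the budget would be exhausted exactly at the
    end of the SLO period; higher values exhaust it sooner.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be between 0 and 1 (exclusive)")
    error_budget = 1.0 - slo_target
    return error_ratio / error_budget


def should_page(long_window_ratio: float,
                short_window_ratio: float,
                slo_target: float,
                threshold: float = 14.4) -> bool:
    """Fire only if BOTH windows burn faster than the threshold.

    The long window (assumed 1h) shows the problem is significant; the
    short window (assumed 5m) shows it is still happening. The default of
    14.4 is an assumption: at that rate, 2% of a 30-day budget is spent
    in one hour (14.4 * 1h / 720h = 0.02).
    """
    return (burn_rate(long_window_ratio, slo_target) > threshold
            and burn_rate(short_window_ratio, slo_target) > threshold)


if __name__ == "__main__":
    # 99.9% SLO, 2% of requests failing in both windows: burn rate is about 20.
    print(should_page(0.02, 0.02, slo_target=0.999))
    # Errors have already stopped in the short window: no page.
    print(should_page(0.02, 0.001, slo_target=0.999))
```

Requiring both windows keeps the alert from firing on a brief spike and lets it reset quickly once the error rate recovers.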

Synthetic Monitoring

  • Checkly - 🟢🔵 Monitoring as code for APIs and browsers with Playwright-based synthetic checks.
  • Grafana Synthetic Monitoring - 🟢🔵 Probe-based multi-location synthetic monitoring integrated into Grafana Cloud.
  • Uptime Kuma - ⭐🟢 Self-hosted monitoring tool with HTTP, TCP, DNS, and keyword checks.
  • Sematext - 🟢🟠 Playwright-based synthetic checks with CI/CD integration and SSL monitoring.

Legacy & Historical

  • Graphite - Pioneering time-series storage and graphing system with Whisper backend and Carbon collector.
  • Redash - SQL-first data visualization and collaboration connecting to many data sources.

Performance Testing

Load & Stress Testing

  • k6 - ⭐🟢🔵 Modern load testing tool with JavaScript ES6 scripting and native Prometheus/Grafana integration.
  • Gatling - ⭐🟢🚀 High-performance load testing framework with Scala/Java/Kotlin DSL and detailed HTML reports.
  • Locust - ⭐🟢 Python-based load testing framework defining user behavior in plain Python code.
  • Apache JMeter - ⭐🟢 Load testing tool with GUI and extensive protocol support (HTTP, JDBC, JMS, LDAP, SOAP).
  • Artillery - 🟢🔵 Node.js-based load testing toolkit with YAML scenarios supporting HTTP, WebSocket, and Socket.io.
  • NBomber - 🟢 Load testing framework for .NET with C#/F# scripting.
  • Tsung - 🚀 Erlang-based distributed load testing tool handling massive concurrent connections across multiple protocols.
  • GoReplay (gor) - 🟢🚀 Capture and replay production HTTP traffic for load testing with real traffic patterns.
  • Anteon (formerly Ddosify) - 🔵 eBPF-based Kubernetes performance testing platform with distributed load generation.
  • NeoLoad - 🟠 Enterprise performance testing platform with codeless and as-code options.
  • LoadRunner / OpenText - 🟠 Enterprise performance testing platform with broad protocol support.

HTTP Benchmarking & Micro-Benchmarking

  • wrk2 - 🚀 Constant-throughput HTTP benchmarking with accurate latency histograms that avoids coordinated omission.
  • wrk - 🚀 HTTP benchmarking tool with Lua scripting for quick relative performance comparisons.
  • Vegeta - 🟢🚀 HTTP load testing tool with constant request rate mode and built-in plotting.
  • hey - 🟢 Simple HTTP load generator written as a replacement for ApacheBench (ab).
  • oha - 🟢🚀 Rust-based HTTP load generator with real-time TUI.
  • bombardier - 🟢🚀 Fast, cross-platform HTTP benchmarking tool with detailed latency reporting.
  • Hyperfoil - 🟢🔵🚀 Distributed benchmarking framework designed to avoid coordinated omission.
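Coordinated omission, which wrk2 and Hyperfoil are designed to avoid, is easier to see with numbers. The sketch below is a simplified simulation, not the implementation of either tool. It assumes a single-connection client that intends to send one request per fixed interval, and the function name `measure` is hypothetical. It compares latency timed from the actual send against latency timed from the intended send.

```python
from typing import List, Tuple


def measure(service_times: List[float], interval: float) -> Tuple[List[float], List[float]]:
    """Simulate a client that waits for each response before sending the next.

    Returns (naive, corrected) latency lists in seconds:
    - naive: timed from when the request was actually sent
    - corrected: timed from when the request SHOULD have been sent
    """
    naive, corrected = [], []
    clock = 0.0  # time at which the connection becomes free again
    for i, service in enumerate(service_times):
        intended_start = i * interval
        # A slow response delays every request queued behind it.
        actual_start = max(clock, intended_start)
        finish = actual_start + service
        naive.append(finish - actual_start)
        corrected.append(finish - intended_start)
        clock = finish
    return naive, corrected


if __name__ == "__main__":
    # One 1-second stall in an otherwise 10 ms service, one request per 100 ms.
    naive, corrected = measure([0.01, 1.0, 0.01, 0.01], interval=0.1)
    print("naive:    ", [round(x, 2) for x in naive])
    print("corrected:", [round(x, 2) for x in corrected])
```

In the naive view only one request looks slow. In the corrected view the requests delayed behind the stall also show high latency, which is closer to what real users arriving at a steady rate would experience.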

API Testing & Contract Testing

  • Hurl - 🟢 Plain-text HTTP request runner for API testing in CI with assertions and chaining.
  • Postman - ⭐🟢🟠 API development and testing platform with Newman CLI for CI/CD integration.
  • REST-assured - 🟢 Java DSL for testing REST APIs with fluent syntax and JUnit/TestNG integration.
  • Karate - 🟢 BDD-style API testing framework combining API testing, mocking, and performance testing.
  • Step CI - 🟢 Open-source YAML-based API testing and monitoring framework for CI/CD.
  • Pact - 🟢 Contract testing framework ensuring provider-consumer compatibility for HTTP APIs and messaging.
  • Dredd - API testing tool that validates implementations against OpenAPI and API Blueprint specifications.

gRPC & Protocol-Specific Testing

  • ghz - 🟢🚀 gRPC benchmarking and load testing tool supporting unary and streaming RPCs.
  • k6 + xk6-grpc - 🟢🔵 k6 extension for scriptable gRPC load testing scenarios.
  • k6 + xk6-kafka - 🟢🔵 k6 extension for Apache Kafka load testing at scale.
  • kafka-producer-perf-test / kafka-consumer-perf-test - 🟢 Built-in Kafka benchmarking tools for producer and consumer throughput.
  • RabbitMQ PerfTest - 🟢 Official RabbitMQ benchmarking tool for throughput and latency measurement.
  • k6 + xk6-websockets - 🟢🔵 WebSocket support for k6 for testing real-time and bidirectional protocols.

Browser & Frontend Performance

  • Lighthouse - ⭐🟢 Google's auditing tool for performance, accessibility, and SEO with actionable scores.
  • WebPageTest - ⭐🟢 Web performance analysis with filmstrip views, waterfall charts, and multi-location testing.
  • Playwright - ⭐🟢 Browser automation framework with built-in performance timing APIs for Chromium, Firefox, and WebKit.
  • sitespeed.io - 🟢 Open-source web performance monitoring integrating Lighthouse, WebPageTest, and Grafana dashboards.
  • Puppeteer - 🟢 Chrome DevTools Protocol API enabling programmatic access to performance traces and network interception.
  • Yellow Lab Tools - 🟢 Frontend code quality and performance auditing for JavaScript, CSS, and rendering issues.
  • SpeedCurve - 🟠 Continuous frontend performance monitoring with Core Web Vitals tracking and competitive benchmarking.

Service Virtualization & Mocking

  • WireMock - ⭐🟢🔵 HTTP mock server with request matching, stateful behavior, response templating, and fault injection.
  • Mountebank - 🟢 Multi-protocol service virtualization supporting HTTP, HTTPS, TCP, and SMTP.
  • Hoverfly - 🟢🔵 Lightweight service virtualization with capture-and-replay mode for API simulation.
  • MockServer - 🟢 HTTP/HTTPS mock server with expectation-based matching and callback actions.
  • Microcks - 🟢🔵 Kubernetes-native API mocking and testing importing OpenAPI, AsyncAPI, gRPC, and GraphQL contracts.

Synthetic Data Generation

  • Faker - ⭐🟢 Realistic fake data generation for JavaScript/TypeScript with massive locale support.
  • DataFaker - 🟢 Modern Java data generation library with expression-based generation.
  • Mimesis - 🟢🚀 High-performance fake data generator for Python with strong locale support.
  • Neosync - 🔵 Open-source platform for anonymizing production data and generating synthetic datasets.

Database Performance Testing & Benchmarking

  • HammerDB - ⭐🟢 Open-source database benchmarking tool supporting TPC-C and TPC-H workloads across major databases.
  • sysbench - ⭐🟢🚀 Scriptable multi-threaded benchmark tool for OLTP, CPU, memory, and I/O tests.
  • pgbench - 🟢 PostgreSQL built-in benchmarking tool with custom scripts for workload simulation.
  • YCSB (Yahoo! Cloud Serving Benchmark) - ⭐🟢 Framework for benchmarking NoSQL and NewSQL databases with standard workloads.
  • BenchBase (formerly OLTPBench) - 🟢 Multi-DBMS benchmarking framework supporting TPC-C, TPC-H, and YCSB workloads.
  • mysqlslap - MySQL built-in load emulation client for quick benchmarks.

System & Infrastructure Benchmarking

  • fio - ⭐🟢🚀 Reference I/O benchmarking tool with configurable workloads and multiple engines (libaio, io_uring).
  • stress-ng - 🟢🚀 System stress testing tool with 300+ methods covering CPU, memory, I/O, and network.
  • Phoronix Test Suite - 🟢 Comprehensive benchmarking platform with 500+ test profiles and result comparison.
  • iperf3 - ⭐🟢🚀 Network bandwidth measurement tool for TCP/UDP throughput testing.

Chaos Engineering & Fault Injection

  • Litmus - ⭐🟢🔵 CNCF incubating Kubernetes chaos engineering platform with extensive experiment library.
  • Chaos Mesh - ⭐🟢🔵 CNCF incubating Kubernetes-native chaos platform with pod, network, and I/O fault injection.
  • Gremlin - 🟠 Enterprise chaos engineering platform with managed experiments and safety controls.
  • Chaos Monkey - ⭐🟢 Netflix's pioneering chaos tool that randomly terminates instances in production.
  • Pumba - 🟢🔵 Chaos testing for Docker containers with network delay and packet loss injection.
  • Steadybit - 🟠🔵 Enterprise reliability platform combining chaos engineering with resilience validation.
  • AWS Fault Injection Service - 🟠🔵 Managed fault injection for AWS resources with native service integration.

Network Simulation & Traffic Shaping

  • tc (Traffic Control) - Linux kernel traffic shaping with netem qdisc for network emulation.
  • Comcast - CLI tool for simulating bad network conditions wrapping tc/pfctl.
  • Clumsy - 🟢 Windows network condition simulator for packet drop, lag, throttle, and reordering.

CI/CD Integration & Performance Gates

  • Gatling Enterprise - 🟠 Managed Gatling execution with CI/CD integrations and historical comparison.
  • Lighthouse CI - 🟢 Run Lighthouse in CI with performance budgets, baseline comparison, and trend tracking.
  • Taurus - 🟢 YAML-based automation wrapper for JMeter, Gatling, and Locust with unified reporting.

Results Analysis & Reporting

  • k6 HTML Report - 🟢 Standalone HTML report generator for k6 test results.
  • HdrHistogram - 🟢🚀 High Dynamic Range Histogram for accurate latency measurement capturing the full distribution.
  • Gatling Reports - 🟢 Built-in HTML reports with percentile distributions and response time series.
  • Apache JMeter Dashboard - 🟢 Built-in HTML dashboard generating APDEX scores and response time distributions.
  • Taurus Reporting - 🟢 Unified reporting across multiple load testing engines with BlazeMeter integration.
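For readers unfamiliar with the APDEX score mentioned above, here is a minimal sketch of the standard Apdex formula. It assumes the conventional definition in which responses up to a threshold T are "satisfied" and responses up to 4T are "tolerating". Individual tools may let you configure the tolerated threshold separately, and the function name `apdex` is hypothetical.

```python
from typing import Sequence


def apdex(response_times: Sequence[float], threshold: float) -> float:
    """Compute Apdex = (satisfied + tolerating / 2) / total.

    satisfied:  response time <= T
    tolerating: T < response time <= 4T (assumed conventional multiplier)
    frustrated: everything slower; counts only toward the total
    """
    if not response_times:
        raise ValueError("response_times must not be empty")
    satisfied = sum(1 for t in response_times if t <= threshold)
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


if __name__ == "__main__":
    # T = 0.5 s: two satisfied, two tolerating, one frustrated.
    samples = [0.1, 0.4, 0.6, 1.5, 3.0]
    print(round(apdex(samples, threshold=0.5), 2))
```

The score ranges from 0 (all users frustrated) to 1 (all users satisfied). It is a coarse summary, so it is usually read alongside the percentile distributions these reports also provide.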

Cloud Provider Services

  • Azure App Testing - 🟠🔵 Microsoft's managed load testing service supporting JMeter and Locust with multi-region simulation.
  • AWS Distributed Load Testing - 🟠🔵 Distributed load testing architecture on AWS via CloudFormation supporting JMeter, k6, and Locust.

Developer-Centric Platforms

  • Grafana k6 Cloud - 🟠 Managed k6 execution with multi-region load zones and real-time Grafana visualization.
  • OctoPerf - 🟠 SaaS performance testing platform built on JMeter with distributed load generation.

Enterprise Platforms

  • BlazeMeter - 🟠 Cloud performance testing platform supporting JMeter, Gatling, Locust, Selenium, and Playwright.

Tools & Integrations
