
OpenTelemetry in Production: Distributed Tracing, Metrics, and Logs for Microservices

OpenTelemetry is the industry standard for observability instrumentation. This guide covers auto-instrumentation for Python and Node.js, configuring the OTel Collector, sending data to Jaeger, Prometheus, and Grafana Loki, correlating traces with logs, and production sampling strategies.

Daniel Park

AI/ML Engineer focused on practical applications of machine learning in DevOps and cloud operations.

March 28, 2026
24 min read

Why OpenTelemetry Matters

In a microservices architecture, a single user request might touch 10-20 services. When that request fails or is slow, identifying which service is responsible — and exactly why — is the central challenge of operations. Logs tell you what happened in one service. Distributed tracing tells you the full story across all services, with timing.

OpenTelemetry (OTel) is the CNCF standard for collecting observability data: traces, metrics, and logs. It replaces a fragmented ecosystem of vendor-specific agents (Jaeger client, StatsD, Fluentd) with a single, vendor-neutral SDK and wire protocol (OTLP). You instrument once, send anywhere.

OpenTelemetry Architecture

Your Services (Python/Node/Go)
    |-- OTel SDK (auto-instrumentation)
    |   |-- Traces (spans)
    |   |-- Metrics (counters, gauges, histograms)
    |   |-- Logs (structured, with trace_id correlation)
    |
    +--> OTel Collector (sidecar or daemonset)
             |-- Receivers: OTLP gRPC/HTTP, Jaeger, Zipkin, Prometheus
             |-- Processors: batch, sampling, attribute transformation
             |-- Exporters:
                  |-- Traces  --> Jaeger / Tempo / Honeycomb
                  |-- Metrics --> Prometheus / Datadog
                  |-- Logs    --> Loki / Elasticsearch
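
Between services, trace context travels in the W3C `traceparent` HTTP header, which the OTel SDKs inject and extract automatically — you never build it by hand. A minimal sketch of the format, purely to illustrate what the SDKs propagate (not SDK code):

```python
# traceparent format: version-trace_id-parent_span_id-flags
# e.g. 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01

def build_traceparent(trace_id: int, span_id: int, sampled: bool = True) -> str:
    """Render a traceparent header from a 128-bit trace ID and 64-bit span ID."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id:032x}-{span_id:016x}-{flags}"

def parse_traceparent(header: str) -> dict:
    """Split a traceparent header back into its four fields."""
    version, trace_id, span_id, flags = header.split("-")
    return {
        "version": version,
        "trace_id": int(trace_id, 16),
        "span_id": int(span_id, 16),
        "sampled": flags == "01",
    }
```

Every auto-instrumented HTTP client adds this header on outgoing requests, and every instrumented server reads it, which is what stitches spans from different services into one trace.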

Auto-Instrumentation for Python (FastAPI)

# Install OTel Python packages
pip install opentelemetry-distro opentelemetry-exporter-otlp
opentelemetry-bootstrap -a install  # Auto-installs framework instrumentation

# This installs:
# opentelemetry-instrumentation-fastapi
# opentelemetry-instrumentation-sqlalchemy
# opentelemetry-instrumentation-redis
# opentelemetry-instrumentation-httpx
# opentelemetry-instrumentation-celery
# ... (all detected frameworks)
# app/telemetry.py
from opentelemetry import trace, metrics
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.semconv.resource import ResourceAttributes

def setup_telemetry(service_name: str, service_version: str = "1.0.0"):
    resource = Resource.create({
        ResourceAttributes.SERVICE_NAME: service_name,
        ResourceAttributes.SERVICE_VERSION: service_version,
        ResourceAttributes.DEPLOYMENT_ENVIRONMENT: "production",
    })

    # Traces
    otlp_trace_exporter = OTLPSpanExporter(
        endpoint="http://otel-collector:4317",  # gRPC
        insecure=True
    )
    tracer_provider = TracerProvider(resource=resource)
    tracer_provider.add_span_processor(
        BatchSpanProcessor(
            otlp_trace_exporter,
            max_queue_size=2048,
            max_export_batch_size=512,
            export_timeout_millis=30000,
        )
    )
    trace.set_tracer_provider(tracer_provider)

    # Metrics
    otlp_metric_exporter = OTLPMetricExporter(
        endpoint="http://otel-collector:4317",
        insecure=True
    )
    metric_reader = PeriodicExportingMetricReader(
        otlp_metric_exporter,
        export_interval_millis=10000  # 10 second intervals
    )
    meter_provider = MeterProvider(resource=resource, metric_readers=[metric_reader])
    metrics.set_meter_provider(meter_provider)


# main.py
from fastapi import FastAPI
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
from opentelemetry.instrumentation.sqlalchemy import SQLAlchemyInstrumentor
from opentelemetry.instrumentation.redis import RedisInstrumentor
from app.telemetry import setup_telemetry

setup_telemetry(service_name="order-service")

app = FastAPI()

# Auto-instrument FastAPI (captures all routes, request/response, status codes)
FastAPIInstrumentor.instrument_app(app)

# Auto-instrument SQLAlchemy (one span per query; statements are captured
# without bound parameter values by default, and enable_commenter appends
# trace context to the SQL as a comment for DB-side correlation)
SQLAlchemyInstrumentor().instrument(enable_commenter=True)

# Auto-instrument Redis
RedisInstrumentor().instrument()
# Custom spans for business logic
from opentelemetry import trace
from opentelemetry import metrics

tracer = trace.get_tracer(__name__)
meter = metrics.get_meter(__name__)

# Custom metrics
order_counter = meter.create_counter(
    "orders.created",
    description="Total orders created",
    unit="1"
)
order_value = meter.create_histogram(
    "orders.value",
    description="Order value in USD",
    unit="USD"
)
payment_duration = meter.create_histogram(
    "payment.processing.duration",
    description="Payment gateway latency",
    unit="ms"
)

async def process_order(order_data: dict):
    with tracer.start_as_current_span("process_order") as span:
        span.set_attribute("order.id", order_data["id"])
        span.set_attribute("order.total", order_data["total"])
        span.set_attribute("customer.id", order_data["customer_id"])

        # Nested span for payment
        with tracer.start_as_current_span("payment.charge") as payment_span:
            payment_span.set_attribute("payment.method", order_data["payment_method"])
            payment_span.set_attribute("payment.amount", order_data["total"])
            
            import time
            start = time.time()
            result = await charge_payment(order_data)
            duration_ms = (time.time() - start) * 1000
            
            payment_span.set_attribute("payment.status", result["status"])
            payment_duration.record(duration_ms, {"method": order_data["payment_method"]})

        # Record business metrics
        order_counter.add(1, {
            "payment_method": order_data["payment_method"],
            "region": order_data["region"]
        })
        order_value.record(order_data["total"], {
            "currency": order_data["currency"]
        })

Auto-Instrumentation for Node.js

npm install @opentelemetry/sdk-node \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/exporter-trace-otlp-grpc \
  @opentelemetry/exporter-metrics-otlp-grpc
// instrumentation.js — Load BEFORE your app
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { OTLPMetricExporter } = require('@opentelemetry/exporter-metrics-otlp-grpc');
const { PeriodicExportingMetricReader } = require('@opentelemetry/sdk-metrics');
const { Resource } = require('@opentelemetry/resources');
const { SEMRESATTRS_SERVICE_NAME } = require('@opentelemetry/semantic-conventions');

const sdk = new NodeSDK({
  resource: new Resource({
    [SEMRESATTRS_SERVICE_NAME]: 'api-gateway',
    'service.version': process.env.SERVICE_VERSION || '1.0.0',
    'deployment.environment': process.env.NODE_ENV || 'production',
  }),
  traceExporter: new OTLPTraceExporter({
    url: 'grpc://otel-collector:4317',
  }),
  metricReader: new PeriodicExportingMetricReader({
    exporter: new OTLPMetricExporter({
      url: 'grpc://otel-collector:4317',
    }),
    exportIntervalMillis: 10000,
  }),
  instrumentations: [
    getNodeAutoInstrumentations({
      '@opentelemetry/instrumentation-http': {
        // Filter out health check noise
        ignoreIncomingRequestHook: (req) => req.url === '/health',
      },
      '@opentelemetry/instrumentation-fs': {
        enabled: false,  // Disable noisy fs instrumentation
      },
    }),
  ],
});

sdk.start();
process.on('SIGTERM', () => sdk.shutdown());

// package.json
// "scripts": {
//   "start": "node --require ./instrumentation.js server.js"
// }

OTel Collector Configuration

# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  
  # Scrape Prometheus metrics from services that don't use OTLP
  prometheus:
    config:
      scrape_configs:
        - job_name: 'legacy-service'
          static_configs:
            - targets: ['legacy-service:9090']

processors:
  # Batch for efficiency
  batch:
    send_batch_size: 1000
    timeout: 10s
    send_batch_max_size: 2000
  
  # Memory limit to prevent OOM
  memory_limiter:
    check_interval: 1s
    limit_percentage: 75
    spike_limit_percentage: 30
  
  # Tail-based sampling: only keep interesting traces
  tail_sampling:
    decision_wait: 10s
    num_traces: 50000
    policies:
      - name: errors-policy
        type: status_code
        status_code:
          status_codes: [ERROR]  # Always keep error traces
      
      - name: slow-traces
        type: latency
        latency:
          threshold_ms: 1000  # Keep traces > 1 second
      
      - name: probabilistic-sample
        type: probabilistic
        probabilistic:
          sampling_percentage: 5  # 5% of healthy fast traces
      
      - name: always-sample-payment
        type: string_attribute
        string_attribute:
          key: "service.name"
          values: ["payment-service"]
          # Always sample payment service (critical path)
  
  # Enrich spans with Kubernetes metadata
  k8sattributes:
    auth_type: serviceAccount
    passthrough: false
    filter:
      node_from_env_var: KUBE_NODE_NAME
    extract:
      metadata:
        - k8s.pod.name
        - k8s.pod.uid
        - k8s.deployment.name
        - k8s.namespace.name
        - k8s.node.name
  
  # Transform attributes
  transform:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          # Redact PII from HTTP URLs
          - replace_pattern(attributes["http.url"], "token=[^&]+", "token=REDACTED")
          - replace_pattern(attributes["http.url"], "password=[^&]+", "password=REDACTED")

exporters:
  # Traces to Jaeger
  otlp/jaeger:
    endpoint: jaeger:4317
    tls:
      insecure: true
  
  # Metrics to Prometheus (pull-based)
  prometheus:
    endpoint: 0.0.0.0:8889
    namespace: otel
  
  # Logs to Loki (recent collector-contrib versions removed the exporter-level
  # `labels` block; Loki labels are instead selected via the
  # `loki.resource.labels` / `loki.attribute.labels` hint attributes,
  # typically set with a resource or attributes processor)
  loki:
    endpoint: http://loki:3100/loki/api/v1/push

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, k8sattributes, tail_sampling, batch]
      exporters: [otlp/jaeger]
    
    metrics:
      receivers: [otlp, prometheus]
      processors: [memory_limiter, batch]
      exporters: [prometheus]
    
    logs:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [loki]
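
One caveat with this pipeline: tail-based sampling only works if every span of a trace reaches the same collector instance, which a per-node DaemonSet cannot guarantee once a trace crosses nodes. The usual fix is a two-tier layout — agent collectors forward spans keyed by trace ID to a small stateful sampling tier using the `loadbalancing` exporter. A hedged sketch of the agent-tier config (the `otel-sampling` headless Service name is an assumption for this example):

```yaml
# Agent tier (DaemonSet): no tail_sampling here — route spans by trace ID
# so each trace lands whole on one instance of the sampling tier.
exporters:
  loadbalancing:
    routing_key: traceID
    protocol:
      otlp:
        tls:
          insecure: true
    resolver:
      k8s:
        service: otel-sampling.monitoring   # headless Service of the sampling tier

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [loadbalancing]
```

The sampling tier then runs the `tail_sampling` processor from the config above and exports to Jaeger.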

Kubernetes Deployment

# otel-collector-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      serviceAccountName: otel-collector
      containers:
        - name: collector
          image: otel/opentelemetry-collector-contrib:0.107.0
          args:
            - "--config=/conf/otel-collector-config.yaml"
          ports:
            - containerPort: 4317   # OTLP gRPC
            - containerPort: 4318   # OTLP HTTP
            - containerPort: 8889   # Prometheus metrics
          env:
            - name: KUBE_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
          resources:
            requests:
              cpu: 200m
              memory: 400Mi
            limits:
              cpu: 1000m
              memory: 1Gi
          volumeMounts:
            - name: config
              mountPath: /conf
      volumes:
        - name: config
          configMap:
            name: otel-collector-config
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: otel-collector
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch"]

Correlating Traces with Logs

# Python: inject trace context into log records
import logging
from opentelemetry import trace

class TraceContextFilter(logging.Filter):
    """Injects trace_id and span_id into every log record."""
    
    def filter(self, record):
        span = trace.get_current_span()
        if span.is_recording():
            ctx = span.get_span_context()
            record.trace_id = format(ctx.trace_id, '032x')
            record.span_id = format(ctx.span_id, '016x')
        else:
            # A trace_id is 128 bits (32 hex chars); a span_id is 64 bits (16)
            record.trace_id = "0" * 32
            record.span_id = "0" * 16
        return True

# Configure structured logging with trace context. structlog bypasses stdlib
# logging filters, so it needs its own processor to inject the IDs (the
# logging.Filter above covers stdlib loggers).
import structlog
from opentelemetry import trace

def add_trace_context(logger, method_name, event_dict):
    """structlog processor: copy trace_id/span_id from the active span."""
    span = trace.get_current_span()
    if span.is_recording():
        ctx = span.get_span_context()
        event_dict["trace_id"] = format(ctx.trace_id, "032x")
        event_dict["span_id"] = format(ctx.span_id, "016x")
    return event_dict

structlog.configure(
    processors=[
        structlog.stdlib.add_log_level,
        structlog.stdlib.add_logger_name,
        add_trace_context,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer()
    ]
)

logger = structlog.get_logger()

# When you log, trace_id is automatically included:
# {"timestamp": "2026-03-28T10:00:00Z", "level": "info",
#  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
#  "span_id": "00f067aa0ba902b7",
#  "event": "order created", "order_id": "ORD-123"}
# Grafana: Link traces to logs using trace_id
# In Grafana datasource configuration for Loki:

# grafana/provisioning/datasources/loki.yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    url: http://loki:3100
    jsonData:
      derivedFields:
        - datasourceUid: jaeger
          matcherRegex: '"trace_id":"(\w+)"'
          name: TraceID
          url: '${__value.raw}' 
          # Click trace_id in logs to jump to Jaeger trace!
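
The reverse direction works too: from a trace in Jaeger, copy the trace ID and pull every log line it produced across all services with one Loki query (the trace ID below is illustrative):

```logql
{service_name="order-service"} | json | trace_id="4bf92f3577b34da6a3ce929d0e0e4736"
```

The `json` parser extracts `trace_id` from the structured log body, so it doesn't need to be a Loki label — which keeps label cardinality low.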

Grafana Dashboard for OTel Data

{
  "panels": [
    {
      "title": "Request Rate by Service",
      "type": "timeseries",
      "targets": [{
        "expr": "sum(rate(http_server_request_duration_seconds_count[5m])) by (service_name)",
        "legendFormat": "{{service_name}}"
      }]
    },
    {
      "title": "P99 Latency by Service",
      "type": "timeseries",
      "targets": [{
        "expr": "histogram_quantile(0.99, sum(rate(http_server_request_duration_seconds_bucket[5m])) by (service_name, le))",
        "legendFormat": "{{service_name}} p99"
      }]
    },
    {
      "title": "Error Rate by Service",
      "type": "stat",
      "targets": [{
        "expr": "sum(rate(http_server_request_duration_seconds_count{http_response_status_code=~'5..'}[5m])) by (service_name) / sum(rate(http_server_request_duration_seconds_count[5m])) by (service_name)",
        "legendFormat": "{{service_name}}"
      }]
    }
  ]
}
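
The queries above assume the SDKs export HTTP server metrics. The collector can also derive the same rate/error/duration metrics directly from spans via the `spanmetrics` connector, so services that only emit traces still get dashboard panels. A minimal sketch to merge into the collector config (bucket choices are an example, not a recommendation):

```yaml
connectors:
  spanmetrics:
    histogram:
      explicit:
        buckets: [5ms, 25ms, 100ms, 500ms, 1s, 5s]

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [otlp/jaeger, spanmetrics]   # connector consumes the trace stream...
    metrics/spanmetrics:
      receivers: [spanmetrics]                # ...and re-emits it as metrics
      exporters: [prometheus]
```

This is also why sampling strategy matters for metrics accuracy: span-derived metrics computed after tail sampling are biased toward errors and slow requests, so compute them before the sampling stage if you need true rates.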

Production Sampling Strategy

Sampling Strategy Decision Tree:

Traffic: 10,000 req/s (average 10 spans per trace)
  - At 100% sampling: 864M traces/day ≈ 8.6B spans/day = expensive storage
  
Recommended approach:

1. Head-based sampling (at SDK level):
   - Development: 100%
   - Staging: 50%
   - Production: 10% default

2. Tail-based sampling (at Collector level):
   - Always keep: all errors (HTTP 5xx, exceptions)
   - Always keep: all slow traces (> 1 second)
   - Always keep: all payment/checkout traces
   - Keep 5%: normal healthy fast traces
   - Result: ~15-20% overall — captures all interesting data

3. Storage estimate with tail sampling:
   - 10,000 req/s × 20% = 2,000 traces/s
   - Average 10 spans/trace = 20,000 spans/s
   - 1KB per span = 20MB/s = 1.7TB/day — still high
   
4. Add span cardinality control:
   - Limit custom attributes to bounded values
   - Don't trace health checks
   - Don't store raw request/response bodies
   - Result: ~200-500 bytes/span = 350-860GB/day ✓

5. Retention policy:
   - Keep last 7 days in Tempo/Jaeger
   - Archive to S3 for 30 days (cheap cold storage)
   - Delete after 30 days
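
The arithmetic above is worth scripting so capacity estimates stay honest as traffic grows. A small helper using the same assumptions as this section (86,400 seconds/day, GB = 10^9 bytes):

```python
def trace_storage_gb_per_day(req_per_s: float, sampled_fraction: float,
                             spans_per_trace: float, bytes_per_span: float) -> float:
    """Estimate daily span storage in GB for a given sampling setup."""
    spans_per_day = req_per_s * sampled_fraction * spans_per_trace * 86_400
    return spans_per_day * bytes_per_span / 1e9

# 10,000 req/s, 20% kept after tail sampling, 10 spans/trace, 1 KB/span
print(round(trace_storage_gb_per_day(10_000, 0.20, 10, 1_000)))  # 1728 (≈1.7 TB/day)

# Same traffic after cardinality control at ~300 bytes/span
print(round(trace_storage_gb_per_day(10_000, 0.20, 10, 300)))    # 518 (≈0.5 TB/day)
```

Rerun it whenever traffic, sampling percentage, or span size assumptions change — storage surprises in tracing backends almost always trace back to one of those three inputs.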

Alerting on Trace Data

# Prometheus rules for OTel metrics
groups:
  - name: otel.rules
    rules:
      - alert: ServiceErrorRateHigh
        expr: |
          (
            sum(rate(http_server_request_duration_seconds_count{http_response_status_code=~"5.."}[5m])) by (service_name)
            /
            sum(rate(http_server_request_duration_seconds_count[5m])) by (service_name)
          ) > 0.05
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Service {{ $labels.service_name }} error rate > 5%"
          runbook: "https://wiki/runbooks/service-error-rate"

      - alert: ServiceLatencyHigh
        expr: |
          histogram_quantile(0.99,
            sum(rate(http_server_request_duration_seconds_bucket[5m])) by (service_name, le)
          ) > 2.0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.service_name }} P99 latency > 2s"

      - alert: OTelCollectorDropping
        expr: |
          rate(otelcol_processor_dropped_spans_total[5m]) > 0
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: "OTel Collector dropping spans — check memory limits"

Conclusion

OpenTelemetry transforms microservices debugging from guesswork to forensics. With traces, you can reconstruct the exact path of any request across every service, database call, and cache hit. With correlated logs, you jump from a trace span directly to the relevant log lines. With metrics derived from trace data, you never need to maintain separate instrumentation for latency histograms and error rates.

The investment is the initial instrumentation work — typically 1-2 days per service for auto-instrumentation, plus custom spans for critical business logic. The return is permanent: every incident investigation that previously took hours of log searching becomes minutes of trace navigation. For a team running 10+ microservices, OpenTelemetry pays for itself on the first major production incident.
