LiveTemplate Observability Guide

Overview

LiveTemplate provides production-ready observability through two complementary systems:

Architecture

┌─────────────────────────────────────────────────────────────┐
│                     Application Code                         │
└──────────┬──────────────────────────┬───────────────────────┘
           ↓                          ↓
┌────────────────────┐   ┌───────────────────────┐
│   log/slog         │   │   observe.Metrics     │  ← Operational counters/gauges
│   (structured logs)│   │   PrometheusExporter  │  ← /metrics endpoint
└──────────┬─────────┘   └───────────┬───────────┘
           ↓                          ↓
┌────────────────────┐   ┌───────────────────────┐
│   slog.Handler     │   │   Prometheus scraper   │
│   (JSON/Text)      │   │   or slog emission     │
└──────────┬─────────┘   └───────────────────────┘
           ↓
    stdout/stderr/file
           ↓
    Log aggregation system
    (e.g., Loki, CloudWatch,
     Datadog, etc.)

Structured Logging

LiveTemplate uses Go's standard log/slog package directly for all structured logging. No wrapper is needed.

Configuration:

import "log/slog"

// Development: human-readable text logs
slog.SetDefault(slog.New(slog.NewTextHandler(os.Stdout, &slog.HandlerOptions{
    Level: slog.LevelDebug,
})))

// Production: structured JSON logs
slog.SetDefault(slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
    Level: slog.LevelInfo,
})))

All LiveTemplate components log using slog.Info(), slog.Warn(), slog.Error(), and slog.Debug() with structured attributes. Configure the default logger at application startup to control output format and level.

Metrics

LiveTemplate automatically tracks operational metrics internally. These metrics are exposed via the public MetricsHandler() method on any LiveTemplate handler.

Prometheus Export

tmpl := livetemplate.Must(livetemplate.New("myapp",
    livetemplate.WithDevMode(false),
))
handler := tmpl.Handle(controller, livetemplate.AsState(&State{}))

mux := http.NewServeMux()
mux.Handle("/", handler)
mux.Handle("/metrics", handler.MetricsHandler()) // Prometheus text format

Available Metrics

Counters:

Gauges:

Summaries (with quantiles p50/p90/p95/p99):

Log Output Formats

Development (Text Handler)

time=2025-10-31T12:34:56.789Z level=INFO msg=template_parsed template=todos.html duration_ms=5
time=2025-10-31T12:34:56.790Z level=DEBUG msg=tree_built data_type=*main.TodoState duration_ms=2
time=2025-10-31T12:34:56.791Z level=DEBUG msg=tree_diffed changes=3 duration_ms=1
time=2025-10-31T12:34:56.792Z level=DEBUG msg=rendered format=html bytes=1024 duration_ms=3
time=2025-10-31T12:34:56.793Z level=INFO msg=action_received action=increment store=counter

Production (JSON Handler)

{"time":"2025-10-31T12:34:56.789Z","level":"INFO","msg":"template_parsed","template":"todos.html","duration_ms":5}
{"time":"2025-10-31T12:34:56.790Z","level":"DEBUG","msg":"tree_built","data_type":"*main.TodoState","duration_ms":2}
{"time":"2025-10-31T12:34:56.791Z","level":"DEBUG","msg":"tree_diffed","changes":3,"duration_ms":1}
{"time":"2025-10-31T12:34:56.792Z","level":"DEBUG","msg":"rendered","format":"html","bytes":1024,"duration_ms":3}
{"time":"2025-10-31T12:34:56.793Z","level":"INFO","msg":"action_received","action":"increment","store":"counter"}

Integration Example

package main

import (
    "log/slog"
    "net/http"
    "os"

    "github.com/livetemplate/livetemplate"
)

func main() {
    // Configure structured logging (production: JSON, dev: Text)
    slog.SetDefault(slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
        Level: slog.LevelInfo,
    })))

    tmpl := livetemplate.Must(livetemplate.New("myapp"))
    handler := tmpl.Handle(controller, livetemplate.AsState(&State{}))

    mux := http.NewServeMux()
    mux.Handle("/", handler)
    mux.Handle("/metrics", handler.MetricsHandler()) // Prometheus endpoint

    http.ListenAndServe(":8080", mux)
}

Note: The internal/observe package is internal to the library and cannot be imported by external applications (per Go's internal/ visibility rules). Use the public MetricsHandler() API shown above for Prometheus export, and configure log/slog at application startup for structured logging.

Log Levels

Recommendation:

Metric Collection Best Practices

1. Percentiles over Averages

Metrics use histograms with p50/p95/p99 percentiles instead of averages because:

2. Emission Frequency

// Low traffic (<100 req/sec): emit every 60s
go metrics.EmitPeriodically(60 * time.Second)

// Medium traffic (100-1000 req/sec): emit every 30s
go metrics.EmitPeriodically(30 * time.Second)

// High traffic (>1000 req/sec): emit every 10s
go metrics.EmitPeriodically(10 * time.Second)

3. Metric Cardinality

Good (low cardinality):

slog.Info("Action received",
    slog.String("action", "increment"),
    slog.String("store", "counter"))

Bad (high cardinality):

slog.Info("Action received",
    slog.String("action", "increment"),
    slog.String("user_id", userID))  // DO NOT use user IDs, session IDs, etc. in metric labels

High-cardinality fields (user IDs, session IDs) should only appear in individual log events, not in metric labels.

Alerting Patterns

Key Metrics to Monitor

# High error rate
alerts:
  - name: HighErrorRate
    condition: error_logs_per_minute > 10
    severity: warning

  - name: CriticalErrorRate
    condition: error_logs_per_minute > 50
    severity: critical

# Slow template execution
  - name: SlowTemplateExecution
    condition: template_duration_p95 > 100  # ms
    severity: warning

  - name: VerySlowTemplateExecution
    condition: template_duration_p99 > 500  # ms
    severity: critical

# WebSocket connection churn
  - name: HighConnectionChurn
    condition: websocket_disconnected_per_minute > 100
    severity: warning

# Broadcast failures
  - name: BroadcastFailures
    condition: broadcast_errors_per_minute > 5
    severity: critical

Log Aggregation Integration

Loki (Grafana)

# Count errors by component
sum by (component) (count_over_time({app="livetemplate",level="ERROR"}[5m]))

# p95 template duration
quantile_over_time(0.95, {app="livetemplate",msg="template_parsed"} | json | unwrap duration_ms [5m])

# Active connections over time
avg_over_time({app="livetemplate",msg="metrics"} | json | unwrap active_connections [1m])

CloudWatch Logs Insights

-- Error count by component
fields @timestamp, component, error
| filter level = "ERROR"
| stats count() by component

-- p95 template duration
fields @timestamp, duration_ms
| filter msg = "template_parsed"
| stats pct(duration_ms, 95) as p95

-- Active connections
fields @timestamp, active_connections
| filter msg = "metrics"
| stats avg(active_connections) by bin(1m)

Datadog

# Error rate
sum:livetemplate.errors{*}.as_count()

# Template duration p95
avg:livetemplate.template.duration{*} by {template}

# Active connections
avg:livetemplate.connections.active{*}

Performance Overhead

The observability system is designed for minimal overhead:

Total overhead: <0.1% of request processing time for typical workloads.

Testing with Observability

func TestWithObservability(t *testing.T) {
    // Create test logger that captures output
    var buf bytes.Buffer
    slog.SetDefault(slog.New(slog.NewJSONHandler(&buf, &slog.HandlerOptions{
        Level: slog.LevelDebug,
    })))

    // Trigger operations that log
    slog.Info("template_parsed",
        slog.String("template", "test.html"),
        slog.Duration("duration", time.Millisecond))

    // Verify log output
    output := buf.String()
    if !strings.Contains(output, "template_parsed") {
        t.Error("expected template_parsed log")
    }
}

Request-ID Correlation

LiveTemplate does not include built-in request-ID middleware. For request tracing and correlation, use a standard middleware from your HTTP router or an OpenTelemetry instrumentation library:

import "go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp"

// Wrap your handler with OpenTelemetry instrumentation
mux.Handle("/", otelhttp.NewHandler(handler, "livetemplate"))

Alternatively, add a simple request-ID middleware:

// import "github.com/google/uuid"

type ctxKey struct{}

func RequestIDMiddleware(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        id := r.Header.Get("X-Request-ID")
        if id == "" {
            id = uuid.NewString()
        }
        w.Header().Set("X-Request-ID", id)
        ctx := context.WithValue(r.Context(), ctxKey{}, id)
        slog.InfoContext(ctx, "request", slog.String("request_id", id), slog.String("path", r.URL.Path))
        next.ServeHTTP(w, r.WithContext(ctx))
    })
}

Future Enhancements

Connecting...