
Modules: Observability

Observability utilities — telemetry, metrics, and tracing (@broblox/observability). Status: Implemented (~244 tests).

Purpose

  • Provide production observability: structured telemetry events, counters/gauges/histograms, and distributed tracing spans.
  • Ship data to the dashboard pipeline via configurable sinks.
  • Reusable across games — each game configures its own event schemas.

Public API

Telemetry

  • Telemetry: emit({ category, event, level, data }) for structured event logging.
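
As a rough illustration of the emit call shape, the sketch below uses a small in-memory stand-in. Only the four field names come from the entry above; the `TelemetrySketch` class, the level values, the `subscribe` signature, and the sample event are assumptions made for illustration, not the package's actual API.

```typescript
// Hypothetical event shape: field names follow the emit() entry above,
// but the allowed level values are assumed.
type Level = "debug" | "info" | "warn" | "error";

interface TelemetryEvent {
  category: string;
  event: string;
  level: Level;
  data: Record<string, unknown>;
}

type Listener = (e: TelemetryEvent) => void;

// Minimal in-memory stand-in, not the package's implementation.
class TelemetrySketch {
  private listeners: Listener[] = [];

  // Registers a listener and returns a function that removes it again.
  subscribe(listener: Listener): () => void {
    this.listeners.push(listener);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  // Delivers the event to every current listener.
  emit(e: TelemetryEvent): void {
    for (const listener of this.listeners) listener(e);
  }
}

const telemetry = new TelemetrySketch();
const received: TelemetryEvent[] = [];
const unsubscribe = telemetry.subscribe((e) => received.push(e));

telemetry.emit({
  category: "economy",
  event: "purchase_completed",
  level: "info",
  data: { itemId: "sword_01", price: 250 },
});

unsubscribe();

// No listener is attached any more, so this event is not recorded.
telemetry.emit({ category: "economy", event: "ignored", level: "debug", data: {} });
```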

Metrics

  • Counter, Gauge, Histogram — metric primitives.
  • CommonMetrics — shared metrics container for standard game metrics.
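
To make the difference between the three primitives concrete, here is a minimal sketch. The class names match the list above, but every method name (`inc`, `set`, `observe`, `value`, `buckets`) and the bucket layout are assumptions; the real classes may look different.

```typescript
// Counter: a value that only ever goes up (e.g. total joins).
class Counter {
  private total = 0;
  inc(by = 1): void {
    if (by < 0) throw new Error("counters only increase");
    this.total += by;
  }
  value(): number {
    return this.total;
  }
}

// Gauge: a value that can move in either direction (e.g. players online).
class Gauge {
  private current = 0;
  set(v: number): void {
    this.current = v;
  }
  value(): number {
    return this.current;
  }
}

// Histogram: counts observations per bucket. `bounds` are inclusive upper
// bounds in ascending order; one extra bucket catches everything larger.
class Histogram {
  private readonly bounds: number[];
  private readonly counts: number[];

  constructor(bounds: number[]) {
    this.bounds = bounds;
    this.counts = bounds.map(() => 0).concat(0);
  }

  observe(v: number): void {
    const found = this.bounds.findIndex((b) => v <= b);
    const i = found === -1 ? this.bounds.length : found;
    this.counts[i] = (this.counts[i] ?? 0) + 1;
  }

  buckets(): number[] {
    return [...this.counts];
  }
}

const joins = new Counter();
joins.inc();
joins.inc(2);

const online = new Gauge();
online.set(17);

const latencyMs = new Histogram([50, 100, 250]);
for (const ms of [12, 75, 75, 400]) latencyMs.observe(ms);
```

In this sketch the four latency samples land in the buckets for up to 50 ms (one), up to 100 ms (two), and the overflow bucket (one).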

Distributed tracing

  • Span: start(), setTag(k, v), setError(err), end().
  • CorrelationContext — propagate request/session IDs across async boundaries.
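
The span lifecycle can be sketched as follows. Only the four method names come from the list above; the `SpanSketch` class, the chaining style, the injectable clock, and the duration returned from `end()` are assumptions chosen to keep the example deterministic.

```typescript
// Hypothetical span: start(), setTag(), setError(), end() mirror the list
// above, everything else is illustrative.
class SpanSketch {
  readonly name: string;
  readonly tags: Record<string, string> = {};
  error?: string;
  private startedAt?: number;
  private readonly now: () => number;

  // `now` is injectable so tests can use a fake clock.
  constructor(name: string, now: () => number = Date.now) {
    this.name = name;
    this.now = now;
  }

  start(): this {
    this.startedAt = this.now();
    return this;
  }

  setTag(k: string, v: string): this {
    this.tags[k] = v;
    return this;
  }

  // Stores a printable form of whatever was thrown.
  setError(err: unknown): this {
    this.error = err instanceof Error ? err.message : String(err);
    return this;
  }

  // Returns the elapsed time in the clock's units (milliseconds here).
  end(): number {
    if (this.startedAt === undefined) throw new Error("span was never started");
    return this.now() - this.startedAt;
  }
}

// Fake clock that advances 5 ms on every read.
let tick = 1000;
const clock = () => (tick += 5);

const span = new SpanSketch("inventory.load", clock).start().setTag("shard", "7");
try {
  throw new Error("datastore timeout");
} catch (err) {
  span.setError(err);
}
const durationMs = span.end();
```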

Service factory

  • createObservabilityService(config) — wires up all telemetry, metrics, and tracing.
  • ObservabilityServiceConfig — sink URLs, sample rates, batch sizes.

Dependencies

  • @broblox/core (service lifecycle, logging).
  • @broblox/shared-types (branded IDs, Result type).
  • @broblox/analytics — player behavior analytics (events, funnels, sessions, retention).

Data ownership

Observability owns no player profile data. All event data is ephemeral or shipped externally.

Trust & security

  • Telemetry runs server-side. Client-sent telemetry is not accepted without validation.
  • PII fields are stripped before shipping. Player IDs are hashed in analytics events.
  • Rate limiting on HTTP sinks prevents floods.
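
One way the strip-and-hash step could look is sketched below. The deny-list of PII field names, the `playerId` key, and the `sanitize` helper are all hypothetical; the hash function is injected so the sketch stays dependency-free, and the placeholder used here is not a real hash.

```typescript
type EventData = Record<string, unknown>;

// Assumed deny-list; the actual set of PII fields is not specified here.
const PII_FIELDS = new Set<string>(["email", "ipAddress", "displayName"]);

// Drops PII fields and replaces the player ID with its hashed form.
function sanitize(data: EventData, hashId: (id: string) => string): EventData {
  const out: EventData = {};
  for (const key of Object.keys(data)) {
    if (PII_FIELDS.has(key)) continue; // stripped, never shipped
    const value = data[key];
    out[key] = key === "playerId" ? hashId(String(value)) : value;
  }
  return out;
}

// Placeholder only: a real deployment would pass a salted cryptographic hash.
const placeholderHash = (id: string) => `h(${id})`;

const clean = sanitize(
  { playerId: "123456", email: "a@example.com", level: 12 },
  placeholderHash,
);
```

Passing the hash function in keeps the stripping logic testable on its own and leaves the choice of algorithm and salt to the caller.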

Configuration

  • observability.enabled — global kill-switch.
  • observability.sampleRate — event sampling (0.0–1.0).
  • observability.batchSize — events per HTTP batch.
  • observability.sinkUrl — dashboard API endpoint.
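
A hedged sketch of how these four settings might fit together is shown below. The interface name, the `validate` and `shouldRecord` helpers, and the example URL are assumptions; only the setting names and the 0.0–1.0 range come from the list above.

```typescript
// Hypothetical typed view of the four settings listed above.
interface ObservabilityConfigSketch {
  enabled: boolean; // global kill-switch
  sampleRate: number; // fraction of events to keep, 0.0 to 1.0
  batchSize: number; // events per HTTP batch
  sinkUrl: string; // dashboard API endpoint
}

const config: ObservabilityConfigSketch = {
  enabled: true,
  sampleRate: 0.25,
  batchSize: 50,
  sinkUrl: "https://dashboard.example.com/api/telemetry", // placeholder URL
};

// Returns a list of human-readable problems; empty means the config is usable.
function validate(cfg: ObservabilityConfigSketch): string[] {
  const problems: string[] = [];
  if (cfg.sampleRate < 0 || cfg.sampleRate > 1) {
    problems.push("sampleRate must be between 0.0 and 1.0");
  }
  if (!Number.isInteger(cfg.batchSize) || cfg.batchSize < 1) {
    problems.push("batchSize must be a positive integer");
  }
  return problems;
}

// Sampling decision: the kill-switch wins, then a uniform draw in [0, 1)
// is compared against sampleRate. `random` is injectable for testing.
function shouldRecord(
  cfg: ObservabilityConfigSketch,
  random: () => number = Math.random,
): boolean {
  if (!cfg.enabled) return false;
  return random() < cfg.sampleRate;
}
```

With this comparison a sampleRate of 1.0 keeps every event and 0.0 keeps none, because the draw is always at least 0 and strictly below 1.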

HTTP Sink

The HTTP sink batches telemetry events and sends them to the dashboard's /api/telemetry endpoint. Key behaviors:

  • Batching: Events are buffered and flushed when the batch reaches maxBatchSize or the flushIntervalSec timer elapses.
  • Error handling: Flush failures are logged via warn(...). Failed batches are dropped, not retried, to avoid backpressure.
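
The batching and drop-on-failure behavior described above can be sketched as follows. The class name, constructor shape, and `pending` helper are assumptions; the transport is injected so no particular HTTP client is implied, and the timer that would call `flush()` every flushIntervalSec seconds is left out.

```typescript
type Send<T> = (batch: T[]) => Promise<void>;

// Illustrative sink, not the package's implementation.
class BatchingSinkSketch<T> {
  private buffer: T[] = [];
  private readonly maxBatchSize: number;
  private readonly send: Send<T>;
  private readonly warn: (msg: string) => void;

  constructor(maxBatchSize: number, send: Send<T>, warn: (msg: string) => void) {
    this.maxBatchSize = maxBatchSize;
    this.send = send;
    this.warn = warn;
  }

  // Buffers the event and flushes once the batch is full.
  push(event: T): Promise<void> {
    this.buffer.push(event);
    return this.buffer.length >= this.maxBatchSize ? this.flush() : Promise.resolve();
  }

  // A periodic timer would also call this so small batches still go out.
  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = []; // swap first so new events never join a failed batch
    try {
      await this.send(batch);
    } catch (err) {
      // Dropped, not retried: the batch is simply not put back in the buffer.
      this.warn(`telemetry flush failed, dropping ${batch.length} events: ${String(err)}`);
    }
  }

  pending(): number {
    return this.buffer.length;
  }
}

const sent: string[][] = [];
const warnings: string[] = [];
const sink = new BatchingSinkSketch<string>(
  2,
  (batch) => {
    sent.push(batch);
    return Promise.resolve();
  },
  (msg) => warnings.push(msg),
);

void sink.push("a");
void sink.push("b"); // reaches maxBatchSize and triggers a flush
void sink.push("c"); // stays buffered until the next flush
```

Swapping the buffer before sending is one simple way to get the drop-on-failure behavior: a failed batch is never re-queued, so a slow or failing dashboard cannot make the buffer grow without bound.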

BigInt Considerations

Roblox IDs (universe, place, player) are BigInt in the database. When telemetry data reaches the Next.js dashboard:

  • Player IDs must be converted to string before React Server Component serialization.
  • Aggregate counts use $queryRaw with COUNT(DISTINCT ...) for DB-side computation instead of fetching rows and counting client-side.
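
The string conversion in the first point can be sketched like this. The row and DTO shapes and the `toDto` helper are hypothetical; the underlying facts are that `JSON.stringify` throws on `bigint` values and that IDs above 2^53 lose precision if converted to `number`, so a string is the safe wire format.

```typescript
// Hypothetical row shape, as a BigInt-aware database client might return it.
interface PlayerRow {
  playerId: bigint;
  universeId: bigint;
  sessions: number;
}

// Plain-object version that is safe to serialize and pass to components.
interface PlayerRowDto {
  playerId: string;
  universeId: string;
  sessions: number;
}

function toDto(row: PlayerRow): PlayerRowDto {
  return {
    playerId: row.playerId.toString(),
    universeId: row.universeId.toString(),
    sessions: row.sessions,
  };
}

// 2^53 + 1 cannot be represented exactly as a JavaScript number.
const row: PlayerRow = {
  playerId: BigInt("9007199254740993"),
  universeId: BigInt("4242"),
  sessions: 3,
};

const dto = toDto(row);
const json = JSON.stringify(dto); // succeeds because no bigint values remain
```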

Observability (meta)

The package dogfoods itself: initialization and sink errors are emitted as telemetry events.

Testing

~244 unit tests covering Telemetry emit/subscribe, Metrics counters/gauges/histograms, Span lifecycle, CorrelationContext propagation, all analytics trackers, and HTTP sink batching. Highest test count of any package.