Loki, Promtail, Grafana, Fluent Bit, Elasticsearch, MinIO, Docker Compose, Python

Centralized Logging Platform

Production centralized logging on the PLG stack (Promtail, Loki, Grafana), with an EFK alternative, featuring log parsing pipelines, PII masking, multi-tenancy, and log-based alerting.


Overview

This is a production-grade centralized logging platform that collects, parses, stores, and queries logs from multiple sources across an infrastructure. It provides two complete deployment options, the lightweight PLG stack (Promtail + Loki + Grafana) and the feature-rich EFK stack (Elasticsearch + Fluent Bit + Kibana), with documented trade-offs for each approach.

It is designed for compliance-aware environments, with PII masking, multi-tenant isolation, configurable retention policies, and log-based alerting that fires on log patterns in real time.

Key Features

Log Collection

  • Promtail as primary log shipper with 6 scrape configs: systemd journal, syslog, auth.log, Docker JSON logs, Nginx access/error, application JSON
  • Fluent Bit as alternative shipper with richer parsing, Lua scripting, and Kubernetes metadata enrichment
  • File-based and journal-based collection covering the common Linux log sources (systemd journal, syslog, auth.log)

Log Parsing Pipelines

  • Nginx access logs: regex parsing that extracts status, bytes, response time, and user agent, plus GeoIP enrichment
  • JSON application logs: unpacking, timestamp extraction, and label creation from structured fields
  • Multiline stack traces: Java/Python stack trace aggregation using the multiline stage
  • Logfmt parsing: key=value log format parsing with label extraction
  • PII masking: regex-based masking of emails, credit cards, phone numbers, and SSNs before ingestion
  • Noise reduction: dropping health check logs, debug-level noise, and known noisy patterns
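
As a rough illustration of the PII masking step, the sketch below applies regex substitutions to a log line before it would be shipped. It is written in Python for readability and is not the project's Promtail configuration (in Promtail this kind of masking is typically expressed with the `replace` pipeline stage). The patterns, placeholder tokens, and the `mask_pii` name are assumptions chosen for the example, and phone numbers are left out for brevity.

```python
import re

# Illustrative patterns only (assumed, not taken from the project);
# real deployments need patterns tuned to their own data.
PII_PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "<EMAIL>"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<SSN>"),
    (re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"), "<CARD>"),
]


def mask_pii(line: str) -> str:
    """Replace every match of each pattern with its placeholder token."""
    for pattern, placeholder in PII_PATTERNS:
        line = pattern.sub(placeholder, line)
    return line


sample = "user=alice@example.com ssn=123-45-6789 card=4111 1111 1111 1111"
print(mask_pii(sample))  # user=<EMAIL> ssn=<SSN> card=<CARD>
```

Masking before ingestion means the raw values never reach storage, at the cost of not being able to recover them later for debugging.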

Log-Based Alerting

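  • High 5xx rate (matches any 5xx status that follows the quoted request, as in Nginx's default combined log format)
    CODE
    rate({job="nginx"} |~ `" 5\d{2} ` [5m]) > 10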
  • Auth brute force
    CODE
    rate({job="auth"} |= "Failed password" [10m]) > 5
  • Application error spike
    CODE
    sum(rate({job="app"} |= "ERROR" [5m])) > 50
  • SSH failed logins — Alert on suspicious SSH activity patterns
  • Recording rules from logs — Pre-compute error rates from log streams for faster dashboard queries
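
LogQL's `rate()` over a log range returns matching entries per second, so the thresholds in these rules are per-second values. The Python sketch below is a simplified model of that calculation, not Loki's implementation; `rate_per_second` and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def rate_per_second(timestamps, now, window):
    """Count entries in (now - window, now] and divide by the window length in seconds."""
    start = now - window
    count = sum(1 for ts in timestamps if start < ts <= now)
    return count / window.total_seconds()


# Example: one matching line per second for ten minutes, evaluated over 5 minutes.
now = datetime(2024, 1, 1, 12, 0, 0)
matches = [now - timedelta(seconds=i) for i in range(600)]
print(rate_per_second(matches, now, timedelta(minutes=5)))  # 1.0
```

Under this model, a threshold of `> 10` over `[5m]` needs more than 3,000 matching lines in the window. If the intent is "N events within the window", `count_over_time` may express it more directly.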

Multi-Tenancy & Retention

  • X-Scope-OrgID tenant isolation for secure multi-team log access
  • Chunk lifecycle management — configurable retention with hot/warm/cold phases
  • S3-compatible storage via MinIO for cost-effective long-term log retention
  • Configurable retention per tenant with different policies for different log types
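
Tenant isolation in Loki relies on the `X-Scope-OrgID` header sent with each request. As a minimal sketch, the Python helper below assembles a push request for Loki's `/loki/api/v1/push` endpoint without sending it. The function name, base URL, tenant ID, and labels are placeholders, not values from this project.

```python
import json
import time


def build_push_request(base_url, tenant_id, labels, lines):
    """Return (url, headers, body) for a Loki push scoped to one tenant."""
    now_ns = str(time.time_ns())  # Loki expects nanosecond timestamps as strings
    payload = {
        "streams": [
            {"stream": labels, "values": [[now_ns, line] for line in lines]}
        ]
    }
    headers = {
        "Content-Type": "application/json",
        "X-Scope-OrgID": tenant_id,  # selects the tenant the logs are written to
    }
    return f"{base_url}/loki/api/v1/push", headers, json.dumps(payload).encode("utf-8")


# Hypothetical values for illustration.
url, headers, body = build_push_request(
    "http://loki:3100", "team-a", {"job": "app"}, ["user logged in"]
)
print(url, headers["X-Scope-OrgID"])
```

Queries sent with a different `X-Scope-OrgID` value would not see these entries, which is what gives each team its own view of the logs.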

Architecture

CODE
Log Sources          Shipper Layer        Storage + Query
─────────────        ──────────────       ──────────────
Linux syslog   ──┐
Docker logs    ──┤              ┌──────────┐              ┌──────────┐
Nginx logs     ──┼── Promtail ──┤          │              │  MinIO   │
App JSON logs  ──┤   (parse +   │   Loki   │──── chunks ──┤  (S3)    │
Auth logs      ──┤    label)    │          │              └──────────┘
                 │              └────┬─────┘
                 │                   │
                 └── Fluent Bit ─────┤
                     (alternative)   │
                                ┌──────────┐
                                │ Grafana  │
                                │ (LogQL)  │
                                └──────────┘

Technical Implementation

Loki Configuration

Loki runs in simple scalable deployment mode with the TSDB index store and S3-compatible chunk storage (MinIO). The ingester batches and compresses log chunks before flushing them to object storage, while the querier supports LogQL for both log queries and metric extraction from logs.

Promtail Pipeline Design

Each log source has a dedicated pipeline configuration:

  1. Scrape — Read from file path, journal, or Docker socket
  2. Parse — Extract fields using regex, JSON, logfmt, or pattern stages
  3. Label — Create Loki labels from parsed fields (keeping cardinality low)
  4. Transform — Mask PII, drop noisy entries, normalize log levels
  5. Push — Send to Loki with tenant header
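
The parse, label, and transform steps above can be sketched in Python for a single Nginx access log line. This is an illustration of the flow rather than the project's Promtail configuration: the regex assumes Nginx's default combined log format, and `process_line`, `HEALTH_PATHS`, and the `/healthz` path are hypothetical names.

```python
import re

# Assumes the default "combined" format:
# addr - user [time] "METHOD path proto" status bytes "referer" "agent"
NGINX_RE = re.compile(
    r'(?P<remote_addr>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<bytes>\d+)'
)
HEALTH_PATHS = {"/healthz"}  # hypothetical health check endpoint


def process_line(line):
    """Return (labels, line) for lines to ship, or None for dropped lines."""
    labels = {"job": "nginx"}
    match = NGINX_RE.match(line)
    if match is None:
        return labels, line  # unparsed lines are kept with the base label only
    fields = match.groupdict()
    if fields["path"] in HEALTH_PATHS:
        return None  # drop noisy health checks
    # Label by status class (2xx, 4xx, 5xx) rather than path or client IP,
    # so the number of distinct label values stays small.
    labels["status_class"] = fields["status"][0] + "xx"
    return labels, line


sample = '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /api/users HTTP/1.1" 502 157 "-" "curl/8.0"'
print(process_line(sample)[0])  # {'job': 'nginx', 'status_class': '5xx'}
```

High-cardinality values such as the path or client address stay in the log line, where they can still be filtered at query time.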

EFK Alternative

For environments needing full-text search, aggregations, and Kibana's rich visualization:

  • Elasticsearch with ILM policies (hot → warm → cold → delete)
  • Fluent Bit with parsers, filters, and Lua scripts
  • Kibana for exploration and dashboard creation

CI/CD Validation

A GitHub Actions pipeline validates:

  • YAML lint on all config files
  • Loki config validation
  • Promtail config validation
  • Parsing pipeline unit tests against sample log entries
  • PII masking verification tests
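
A parsing test of the kind listed above might look like the following pytest-style sketch. The `parse_logfmt` helper and the sample lines are assumptions made for the example; the project's actual tests and parser are not shown in this write-up.

```python
import shlex


def parse_logfmt(line):
    """Parse key=value pairs; double-quoted values may contain spaces.

    Uses shlex for tokenizing, so a line with an unbalanced quote
    raises ValueError in this simplified version.
    """
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip bare words that have no "="
            fields[key] = value
    return fields


def test_parse_logfmt_extracts_fields():
    sample = 'level=error msg="db timeout" duration=1.5s'
    assert parse_logfmt(sample) == {
        "level": "error",
        "msg": "db timeout",
        "duration": "1.5s",
    }


def test_parse_logfmt_ignores_bare_words():
    assert parse_logfmt("starting level=info") == {"level": "info"}
```

Running such tests against a folder of sample log entries in CI catches regressions in a parser before a config change reaches the shippers.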

Deployment

Bash
cp .env.example .env
docker compose -f docker/docker-compose.plg.yml up -d
# Seed demo logs:
./scripts/seed-logs.sh

EFK Stack (Alternative)

Bash
docker compose -f docker/docker-compose.efk.yml up -d

Impact

  • 6 log parsing pipelines covering Nginx, JSON, logfmt, and multiline stack trace formats, plus PII masking and noise reduction
  • PII masking for GDPR/compliance — emails, credit cards, SSNs redacted before storage
  • Multi-tenant isolation with per-tenant retention policies
  • Log-based alerting that fires on patterns, not just metrics
  • 2 complete stack options with documented trade-off analysis
  • Realistic log generators for demo and testing out of the box

Future Plans

  • Add Grafana Alloy as next-generation collector (replacing Promtail)
  • Integrate OpenTelemetry for unified logs + metrics + traces
  • Add Kafka as buffer layer for high-throughput environments
  • Implement log sampling for high-volume debug logs
  • Add anomaly detection on log volume and error patterns