The only Kubernetes log agent with intelligent error context capture, rule-based alerting, and 9 pluggable storage backends.
## The Problem: Finding the Needle in the Log Haystack
Every SRE knows the pain: an alert fires at 3 AM, and you're digging through gigabytes of logs trying to understand what happened before the error. Traditional log solutions either capture everything (expensive) or miss crucial context (frustrating).
What if your log agent was smart enough to capture only what matters—the error AND the context around it—and alert you instantly?
## Introducing Logsnare
Logsnare is an open-source Kubernetes log monitoring agent that solves this problem with intelligent error-aware context capture and rule-based alerting. Instead of blindly forwarding all logs, Logsnare:
- 🔍 **Detects errors** – using configurable regex patterns
- ⏪ **Captures context** – logs BEFORE and AFTER the error
- 🚨 **Alerts intelligently** – route different errors to different teams
- 💾 **Stores smartly** – choose from 9 storage backends
- ⚡ **Scales efficiently** – handles 500+ pods with minimal resources
## Key Features
### 🎯 Smart Error Detection
Logsnare uses regex and string-based pattern matching to detect errors across multiple languages and frameworks:
```yaml
errorPatterns:
  - "ERROR"
  - "Exception"
  - "FATAL"
  - "panic:"
  - "Traceback"
  - "OOMKilled"
  - "CrashLoopBackOff"
```

These patterns are fully customizable.
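As a rough sketch of what this matching amounts to (Logsnare itself is configured via YAML; the function name below is made up for illustration):

```python
import re

# Default patterns from the config above; each is tried as a regex,
# which also covers plain substring matches like "ERROR".
ERROR_PATTERNS = ["ERROR", "Exception", "FATAL", "panic:",
                  "Traceback", "OOMKilled", "CrashLoopBackOff"]

def is_error_line(line, patterns=ERROR_PATTERNS):
    """Return True if the log line matches any configured error pattern."""
    return any(re.search(p, line) for p in patterns)

print(is_error_line("2024-01-01 ERROR failed to connect"))  # True
print(is_error_line("2024-01-01 INFO request served"))      # False
```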
### 🚨 Rule-Based Alerting (NEW)
Route different error patterns to different teams with customizable thresholds:
```yaml
alerting:
  enabled: true
  rules:
    # Critical errors → on-call team immediately
    - name: "critical-errors"
      patterns: ["CRITICAL", "FATAL", "OOMKilled", "panic:"]
      threshold:
        count: 1            # alert on the FIRST occurrence
        windowSeconds: 60
      email:
        enabled: true
        toAddresses: ["oncall@company.com"]

    # Java exceptions → backend team (after 2 occurrences)
    - name: "java-exceptions"
      patterns: ["NullPointerException", "OutOfMemoryError"]
      threshold:
        count: 2            # alert after 2 occurrences
        windowSeconds: 300
      email:
        enabled: true
        toAddresses: ["backend-team@company.com"]

    # Python errors → data team
    - name: "python-errors"
      patterns: ["Traceback", "TypeError", "ValueError"]
      threshold:
        count: 1
      email:
        enabled: true
        toAddresses: ["data-team@company.com"]
```
**Why rule-based alerting matters:**
- 🎯 **Reduce alert fatigue** – only alert relevant teams
- ⏱️ **Smart thresholds** – distinguish between flaky tests and real outages
- 📧 **Multiple channels** – email and webhooks (Slack, PagerDuty, etc.)
- 📋 **Context included** – alerts contain the actual error with surrounding logs
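The `count`/`windowSeconds` thresholds behave like a sliding-window counter. A minimal sketch of that logic in Python (class and method names here are hypothetical, not Logsnare's actual API):

```python
import time
from collections import defaultdict, deque

class ThresholdTracker:
    """Sliding-window occurrence counter, for illustration only."""

    def __init__(self, count, window_seconds):
        self.count = count            # fire after this many matches...
        self.window = window_seconds  # ...within this many seconds
        self.hits = defaultdict(deque)

    def record(self, rule, now=None):
        """Record one matching log line; return True when the rule should fire."""
        now = time.time() if now is None else now
        q = self.hits[rule]
        q.append(now)
        # Evict hits that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) >= self.count

tracker = ThresholdTracker(count=2, window_seconds=300)
print(tracker.record("java-exceptions", now=0))    # False: first occurrence
print(tracker.record("java-exceptions", now=100))  # True: second within 300s
```

This is why a rule with `count: 1` pages on the first hit, while `count: 2, windowSeconds: 300` tolerates a lone transient exception.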
### 📦 9 Storage Backends
One agent, any storage destination. A sample of the supported backends:
| Backend | Use Case |
|---|---|
| PostgreSQL | Relational queries, SQL analysis |
| MongoDB | Flexible document storage |
| Elasticsearch | Full-text search, Kibana dashboards |
| Azure Log Analytics | Azure ecosystem, KQL queries |
| AWS CloudWatch | AWS ecosystem, CloudWatch Insights |
| GCP Cloud Logging | Google Cloud ecosystem |
### 📏 Context Capture Window
Configure how much context to capture around errors:
```yaml
captureWindow:
  bufferDurationMinutes: 2   # minutes of logs buffered BEFORE the error
  captureAfterMinutes: 2     # minutes of logs captured AFTER the error
```
This means when an error occurs, you get the full story—not just the error line.
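Conceptually, this works like a rolling buffer that is flushed with extra trailing lines whenever an error appears. A simplified, line-count sketch of the idea (the real agent buffers by time, and these names are illustrative only):

```python
from collections import deque

def capture_context(lines, is_error, before=3, after=2):
    """Yield each error line together with `before` prior and `after` following lines."""
    buffer = deque(maxlen=before)  # rolling buffer of recent lines
    captures, current, remaining_after = [], None, 0
    for line in lines:
        if remaining_after > 0:
            # Still collecting the "after" context of an earlier error.
            current.append(line)
            remaining_after -= 1
            if remaining_after == 0:
                captures.append(current)
                current = None
        elif is_error(line):
            # Error seen: snapshot the "before" buffer plus the error line.
            current = list(buffer) + [line]
            remaining_after = after
        buffer.append(line)
    if current is not None:  # stream ended mid-capture
        captures.append(current)
    return captures

logs = ["a", "b", "c", "ERROR boom", "d", "e", "f"]
print(capture_context(logs, lambda l: "ERROR" in l))
# → [['a', 'b', 'c', 'ERROR boom', 'd', 'e']]
```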
## Quick Start
Deploy Logsnare in under 2 minutes:
```bash
# Clone the repository
git clone https://github.com/DevOpsArts/logsnare.git
cd logsnare

# Install with Helm (PostgreSQL backend + alerting)
helm install logsnare-engine ./charts/logsnare-engine \
  --namespace logsnare \
  --create-namespace \
  --set storage.type=postgresql \
  --set connections.postgresql.host=your-db-host \
  --set connections.postgresql.username=logsnare \
  --set connections.postgresql.password=YOUR_PASSWORD \
  --set alerting.enabled=true \
  --set alerting.email.smtpHost=smtp.company.com
```
## Architecture
```
┌─────────────────────────────────────────────┐
│ Kubernetes Cluster                          │
│                                             │
│  ┌─────┐  ┌─────┐  ┌─────┐                  │
│  │Pod A│  │Pod B│  │Pod C│  ← Monitored     │
│  └──┬──┘  └──┬──┘  └──┬──┘                  │
│     └────────┼────────┘                     │
│              ▼                              │
│   ┌────────────────────┐                    │
│   │  Logsnare-Engine   │                    │
│   │ • Error Detection  │                    │
│   │ • Context Capture  │                    │
│   │ • Rule-Based Alert │ ──► 📧 Email       │
│   │ • Rolling Buffer   │ ──► 🔗 Webhook     │
│   └─────────┬──────────┘                    │
└─────────────┼───────────────────────────────┘
              ▼
     ┌────────────────┐
     │ Storage Backend│
     │ (Your Choice)  │
     └────────────────┘
```
## Production-Ready Features
- 🔒 **Security**: Non-root container, read-only filesystem, seccomp profiles
- 🔄 **High Availability**: Leader election for multi-replica deployments
- 📈 **Scalability**: ThreadPoolExecutor handles 500+ pods efficiently
- 🔐 **SSL/TLS**: Secure connections to all database backends
- 🛡️ **Network Policies**: Built-in network isolation templates
- 🚨 **Smart Alerting**: Rule-based routing with thresholds and cooldowns
## Get Started Today
Logsnare is open-source and free to use. Check out the resources below:
- 📦 **GitHub**: github.com/DevOpsArts/logsnare
- 📚 **Documentation**: Wiki Documentation
- 🚨 **Alerting Guide**: Alerting Configuration
- 🐳 **Docker Image**: devopsart1/logsnare-engine
Have questions or feedback? Drop a comment below or open an issue on GitHub!
Tags: Kubernetes, DevOps, SRE, Logging, Monitoring, Alerting, Azure, AWS, GCP, Helm, Open Source