API Monitoring and Alerting Dashboard
About the Project
Built a scalable dashboard on Linux querying 50K+ logs/day from ClickHouse with sub-second latency, reducing issue detection time by 80%.
Key Features
- High-Performance Querying: Sub-second latency when querying 50K+ logs per day from ClickHouse
- Intelligent Anomaly Detection: NLP-based classification for anomaly alerts, reducing developer debugging time by over 30%
- Enterprise Security: Integrated RBAC for role-based access control
- Automated Alerting: Achieved 99.5% alert accuracy with automated alert system
- Cloud Infrastructure: AWS pipelines for reliable deployment and scaling
Tech Stack
- Backend: Python
- Visualization: Dash, Plotly
- Observability: OpenTelemetry
- Database: ClickHouse
- Cloud: AWS
- OS: Linux
Impact
- 80% faster issue detection time
- 30% reduction in developer debugging time
- 99.5% alert accuracy for automated alerts
- Handles 50K+ logs/day with sub-second query performance