We are looking for a Senior DevOps Observability Engineer who is passionate about building, operating, and evolving modern observability platforms at scale. In this role, you will be a key technical authority for monitoring, logging, and alerting systems used in production environments supporting European (French) operations.
You will work hands-on with Prometheus, Grafana, and Loki, taking ownership of observability architecture, advanced dashboards, automation, and complex incident troubleshooting. This is an excellent opportunity for a senior engineer who enjoys combining deep technical expertise, automation, and operational excellence.
Key Responsibilities
- Design, build, and maintain high‑quality dashboards in Grafana that provide clear, actionable insights for engineering and operations teams.
- Develop, customize, and integrate Prometheus exporters tailored to application, infrastructure, and business requirements.
- Act as Level 3 (L3) support for complex monitoring and observability incidents, including root cause analysis and post‑incident improvements.
- Own and continuously improve the observability stack:
  - Metrics: Prometheus
  - Visualization: Grafana
  - Logs: Loki
- Automate the deployment, upgrade, and day‑to‑day operation of observability components and exporters using Ansible.
- Improve alerting quality by eliminating noisy, non‑actionable alerts and raising the overall signal‑to‑noise ratio.
- Collaborate closely with DevOps, SRE, and Platform teams to embed observability into system design.
- Proactively identify scalability, performance, and reliability risks within monitoring and logging platforms.
- Ensure high availability, resilience, and performance of observability services in production environments.







