Site Reliability Engineer (SRE)
Halian | Managed Services, Recruitment and Contract Staffing · Abu Dhabi
Job description
About the role
We are looking for a Site Reliability Engineer (SRE) to ensure the resilience, performance, and production readiness of our cloud-based AI systems. The role combines engineering, automation, and monitoring to keep critical services reliable and secure.
Key responsibilities
- Implement resilience and chaos engineering practices.
- Build automated testing frameworks for AI workloads.
- Define SLIs/SLOs and set up comprehensive monitoring.
- Automate security and compliance validation.
- Support and maintain reliable cloud infrastructure.
Required profile
- Strong Python and automation background.
- Experience with Azure or AWS cloud platforms.
- Understanding of AI/ML evaluation processes.
- Experience with monitoring tools and CI/CD pipelines.
- Familiarity with infrastructure as code using Terraform.
Required skills
- Python
- Azure
- AWS
- Terraform
- CI/CD pipelines
- Monitoring tools
- Chaos engineering
- Automated testing frameworks
- Security automation
- Compliance validation
Published 5 days ago
Expires 1 month from now