Principal Site Reliability Engineer
Core42 · Abu Dhabi
Job description
About the role
Core42 is looking for a Principal Site Reliability Engineer to lead the design and evolution of its globally distributed infrastructure that powers AI and private‑cloud workloads. This senior technical leader will shape platform strategy, drive automation, and ensure the reliability of high‑performance, GPU‑intensive systems.
Key responsibilities
- Define and execute the long‑term roadmap for infrastructure, CI/CD, and Kubernetes platforms.
- Design scalable, distributed systems for AI/ML and HPC workloads.
- Implement AI‑driven automation, self‑healing workflows, and predictive AIOps capabilities.
- Architect high‑performance, multi‑tenant Kubernetes environments with GPU support.
- Build observability platforms, set SLOs/SLIs, and lead root‑cause analysis.
- Act as the escalation point for complex incidents and mentor SRE/DevOps teams.
- Collaborate with product, engineering, and senior leadership to align reliability with business goals.
Required profile
- 10+ years of experience in Site Reliability Engineering, Platform Engineering, or Systems Architecture.
- Proven track record designing and operating large‑scale distributed systems.
- Deep expertise with Kubernetes (EKS, GKE, or bare‑metal) and GPU‑intensive workloads.
- Strong programming skills in Python, Go, or Rust.
- Extensive experience with Terraform, Helm, and infrastructure‑as‑code practices.
Required skills
- Kubernetes
- EKS / GKE / bare‑metal clusters
- GPU and HPC workload orchestration
- Python, Go, Rust
- Terraform
- Helm
- CI/CD pipelines
- Observability (metrics, logs, tracing)
- SLO / SLI definition
- AIOps and automation frameworks