ETH Zürich
DevOps Engineer
📍 Lugano
Role and Responsibilities
Design, deploy, automate, and operate scalable infrastructure and cloud-native platform services. Contribute to Kubernetes-based AI/ML and HPC platforms, including CI/CD, GitOps, observability, security, and operational tooling. Collaborate with researchers and engineers to support complex workflows, troubleshoot production environments, and improve reliability and performance. Contribute to platform engineering, automation, and developer productivity initiatives across evolving systems and services.
Team / Description
The Swiss National Supercomputing Centre (CSCS) develops and operates a high-performance computing and data research infrastructure that supports world-class science in Switzerland. Its user laboratory is available to domestic and international researchers in academia, industry, and the business sector. The centre is operated by ETH Zurich and has offices at its data centre in Lugano and in Zurich.
Qualifications and Skills
Linux systems engineering and software development (e.g., Python, Bash)
Containers, Kubernetes, CI/CD, GitOps, and Infrastructure as Code (e.g., Terraform, Helm, Ansible, ArgoCD)
Distributed systems concepts, APIs, scalability, observability, identity and access management, and security
AI/ML platforms and supporting infrastructure services
HPC systems, GPU clusters, and large-scale infrastructure environments
Platform engineering and developer productivity tooling
Secure or confidentiality-sensitive operational environments
Curious, hands-on, and eager to understand systems inside-out
Strong engineering mindset and problem-solving attitude
Comfortable learning new technologies and working across disciplines
Effective communicator and collaborative team player
Experience supporting research or scientific computing environments
Familiarity with HPC systems and services
Exposure to GPU clusters and accelerated computing
Experience with SRE practices or on-call operations
Advanced Linux security knowledge
Ability to leverage AI tools for increased productivity