ETH Zürich

Systems Engineer – Platform Automation

📍 Lugano

Role and Responsibilities

  • Investigating, troubleshooting, and debugging platform services and infrastructure resources

  • Developing, maintaining, and supporting tools and pipelines to support platforms on a geographically redundant infrastructure

  • Developing automations to provision, test, deploy, and monitor resources to support the needs of HPC and AI platforms

  • Supporting, documenting, and sharing knowledge of tools and procedures

Team / Description

The Swiss National Supercomputing Centre (CSCS) develops and operates a high-performance computing and data research infrastructure that supports world-class science in Switzerland. Its user laboratory is available to domestic and international researchers in academia, industry, and the business sector. The centre is operated by ETH Zurich and has offices at its data centre in Lugano and in Zurich.

Qualifications and Skills

  • You should have a bachelor’s or higher degree in computer engineering, computer science, a relevant technical field, or equivalent practical experience.

  • Experience in the deployment of HPC, AI, or Cloud infrastructures

  • Management of HPC/AI services to maximize utilization of compute, storage, and high-speed network components

  • Working knowledge of automation tools and frameworks, including CI/CD processes and their ecosystem

  • Linux administration skills

  • Experience with version control systems and CI/CD tools such as ArgoCD is preferred

  • Experience with debugging of microservices running on Kubernetes is preferred

  • Experience with performance monitoring and diagnostic tools for HPC/AI hardware is preferred

  • Experience with Infrastructure as Code tools such as Terraform and Ansible is preferred

  • Experience with working in self-organized teams is preferred

  • Familiarity with Agile methodology is preferred

  • Experience with test-driven development is a plus