HPC & GPFS DevOps Engineer
Salt Lake City , Utah
Hybrid
Direct Hire
$170k - $250k
Job Title: Senior DevOps HPC Engineer
Location: Hybrid -Salt Lake City, UT - 2-3 days a week
Overview: We’re seeking a Senior DevOps HPC Engineer to design and manage high-performance computing (HPC) infrastructure, focusing on storage and AI model deployment. This role requires expertise in HPC, GPFS, and scientific workloads, with a strong emphasis on storage management and disaster recovery.
Responsibilities:
- Design and maintain HPC infrastructure and storage solutions, with expertise in GPFS.
- Implement backup, disaster recovery, and lifecycle management policies.
- Work in a GCP environment to build scalable cloud-based HPC solutions.
- Develop tools for monitoring and managing HPC systems.
- Provide mentorship and troubleshoot complex infrastructure and storage issues.
- Apply DevOps practices, focusing on HPC and storage needs.
Requirements:
- Senior-level experience with HPC and GPFS.
- Proficiency in Python for automation and scripting.
- Experience in backup policies and disaster recovery for storage.
- Familiarity with AI model deployment on HPC systems.