Motion Recruitment | Jobspring | Workbridge

Senior SRE / AWS / Observability

Irvine, California

Onsite

Full Time

$150k - $180k

This company is internationally recognized for delivering high-quality networking solutions and smart home innovations. With a presence in more than 170 countries, they are dedicated to enhancing everyday life through faster, more dependable connectivity. Known for their customer-first approach and commitment to excellence, they continue to grow their influence in both residential and commercial markets.
They are currently seeking a Senior Site Reliability Engineer to join their team on-site at their Irvine location. This role offers the opportunity to work on mission-critical cloud and microservices infrastructure, focusing on system reliability, automation, and performance optimization. You will play a vital role in driving observability, improving scalability, ensuring compliance, and supporting global product deployments within a dynamic and collaborative technical environment.

Required Skills & Experience
  • Bachelor's degree in Computer Science, Information Systems, or a similar technical discipline.
  • A minimum of five years’ experience working in Site Reliability Engineering or a closely related field.
  • Strong coding and scripting abilities using languages such as Java, Python, Bash, or PowerShell.
  • Proven experience in SRE, DevOps practices, cloud platform management, and security implementation.

What You Will Be Doing
  • Act as a technical authority in deploying and maintaining microservices within cloud-native Kubernetes environments.
  • Conduct performance and resiliency testing (e.g., load and chaos testing) to validate system robustness under various conditions.
  • Implement end-to-end observability across distributed services hosted on platforms such as AWS, Azure, Google Cloud, and Oracle Cloud.
  • Coordinate disaster recovery strategies, ensuring readiness through close collaboration with infrastructure and application teams.
  • Diagnose and mitigate operational issues stemming from system resource limitations, such as CPU/memory constraints or inefficient auto-scaling configurations.
  • Develop automation tools and scripts using languages such as Python, Go, or Bash to enhance operational efficiency.
  • Define service-level indicators, objectives, and agreements (SLIs, SLOs, SLAs) in partnership with development teams to align technical performance with business expectations.

The Offer
You will receive the following benefits:
  • Medical, Dental, and Vision Insurance
  • 401(k) Retirement Savings Plan
  • Free Snacks, Drinks, and Catered Lunch
  • Free Gym Membership

Applicants must be authorized to work in the US on a full-time basis, now and in the future.

Posted by: Alyssa Valles