Director of SRE /Windows/ Azure
Greensboro, North Carolina
Open to Remote
Full Time
$175k - $195k
Job Title: Director of Site Reliability Engineering (SRE)
Location: Remote (North Carolina residents preferred)
Type: Full-Time
About Us: We are a leading insurance company committed to providing exceptional services and products to our customers. As we continue to innovate and grow, we are seeking a Director of Site Reliability Engineering (SRE) with strong Windows expertise to lead our efforts in ensuring the reliability, scalability, and performance of our systems.
Job Description:
As the Director of Site Reliability Engineering (SRE), you will lead a team of SREs responsible for maintaining the reliability and availability of our cloud-based systems and services. You will develop and implement SRE best practices, with a strong emphasis on Windows technologies- Azure Cloud is a plus, and play a key role in driving operational excellence across the organization. This role is 70% leadership and 30% hands on
Key Responsibilities:
Location: Remote (North Carolina residents preferred)
Type: Full-Time
About Us: We are a leading insurance company committed to providing exceptional services and products to our customers. As we continue to innovate and grow, we are seeking a Director of Site Reliability Engineering (SRE) with strong Windows expertise to lead our efforts in ensuring the reliability, scalability, and performance of our systems.
Job Description:
As the Director of Site Reliability Engineering (SRE), you will lead a team of SREs responsible for maintaining the reliability and availability of our cloud-based systems and services. You will develop and implement SRE best practices, with a strong emphasis on Windows technologies- Azure Cloud is a plus, and play a key role in driving operational excellence across the organization. This role is 70% leadership and 30% hands on
Key Responsibilities:
- Leadership & Strategy:
- Lead, mentor, and grow a team of SREs, fostering a culture of continuous improvement and innovation.
- Develop and implement SRE strategies to enhance system reliability, performance, and scalability.
- Collaborate with engineering, product, and operations teams to align SRE practices with business goals.
- Azure Expertise:
- Drive the adoption and optimization of Azure cloud services, ensuring best practices for reliability, security, and cost-efficiency.
- Oversee the design, implementation, and management of Azure-based infrastructure and services, including monitoring, automation, and disaster recovery solutions.
- Provide technical guidance on Azure architecture, deployment, and scaling strategies.
- Operational Excellence:
- Establish and monitor SLOs, SLIs, and SLAs to ensure systems meet reliability and performance targets.
- Develop and enforce incident management processes, including root cause analysis and post-mortem reviews.
- Lead automation initiatives to reduce manual efforts and enhance system resilience and scalability.
- Collaboration & Communication:
- Serve as a key stakeholder in technical discussions, advocating for SRE principles and best practices.
- Communicate complex technical concepts to both technical and non-technical stakeholders, ensuring organizational alignment.
- Collaborate with security teams to ensure compliance with industry standards and regulations.
- Experience:
- 10+ years of experience in IT operations, software engineering, or a related field, with at least 5 years in a leadership role.
- Proven experience leading SRE or DevOps teams in a cloud-native environment, with a strong focus on Windows
- Plus: experience with Azure cloud services, including Azure DevOps, Azure Monitor, Azure Automation, and ARM templates.
- Technical Skills:
- Proficiency in scripting and automation languages such as Python, PowerShell, or Bash.
- Strong understanding of CI/CD pipelines, infrastructure as code (IaC), and configuration management tools.
- Familiarity with containerization technologies (Docker, Kubernetes) and microservices architecture.
- Knowledge of networking, security, and compliance best practices in cloud environments.
- Leadership & Communication:
- Demonstrated ability to lead, inspire, and develop high-performing technical teams.
- Excellent communication and collaboration skills, with the ability to influence and drive change across all levels of the organization.
- Strong problem-solving skills and a proactive approach to identifying and addressing potential issues before they impact customers.
- Certifications in Azure (e.g., Azure Solutions Architect, Azure DevOps Engineer).
- Experience with other cloud platforms (e.g., AWS, Google Cloud) and hybrid cloud environments.
- Background in software development with an understanding of modern development practices.