Key Responsibilities:
1. Leadership and Strategy:
- Develop and execute the SRE strategy to support company goals.
- Lead and mentor the SRE team, fostering a culture of continuous improvement and innovation.
- Collaborate with other engineering and product teams to align on priorities and deliverables.
2. Infrastructure and Operations:
- Oversee the design, implementation, and maintenance of scalable, reliable infrastructure.
- Ensure the efficient operation of systems, focusing on automation, monitoring, and performance optimization.
- Implement best practices for incident response and management, including root cause analysis and post-mortems.
3. Performance and Reliability:
- Define and monitor key performance indicators (KPIs) and service-level objectives (SLOs).
- Proactively identify and mitigate potential reliability issues before they impact customers.
- Drive the adoption of robust monitoring and alerting solutions to ensure system health and performance.
4. Security and Compliance:
- Ensure systems are secure and compliant with relevant regulations and standards.
- Collaborate with the security team to implement and maintain best-in-class security practices.
5. Collaboration and Communication:
- Act as a liaison between engineering, product, and other key stakeholders.
- Communicate effectively with technical and non-technical audiences, including executive leadership.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- Minimum of 10 years of experience in software engineering, infrastructure, or operations roles.
- At least 5 years in a leadership position within a high-growth technology company.
- Deep understanding of cloud platforms (AWS, GCP, Azure), containers (Docker), and container orchestration (Kubernetes).
- Proficiency in scripting and programming languages (Python, Go, Ruby, etc.).
- Experience with CI/CD pipelines, infrastructure as code and configuration management (Terraform, Ansible), and monitoring tools (Prometheus, Grafana, Datadog).
- Strong leadership and people management skills.
- Excellent problem-solving and analytical abilities.
- Effective communication and collaboration skills.