Your Role
Infrastructure Management:
- Design, build, and maintain scalable and resilient infrastructure using cloud platforms.
- Automate infrastructure provisioning, configuration, and deployment using automation tools.
- Monitor and optimize infrastructure for performance, scalability, and cost-efficiency.
- Ensure high availability and disaster recovery of systems and applications.
CI/CD Pipeline Management:
- Design and implement Continuous Integration/Continuous Deployment (CI/CD) pipelines to automate the software development lifecycle.
- Integrate testing, security scanning, and code quality checks into the CI/CD process.
- Manage and maintain CI/CD tools (e.g., Jenkins, GitLab CI, CircleCI).
- Collaborate with development teams to ensure smooth code integration and deployment.
Automation and Scripting:
- Develop scripts and automation tools to streamline repetitive tasks and reduce manual intervention.
- Implement infrastructure as code (IaC) practices to manage environments and configurations.
- Automate monitoring, alerting, and incident response processes.
Monitoring and Performance Optimization:
- Set up and maintain monitoring and logging systems to track system performance and detect issues.
- Analyze performance data and implement optimizations to improve system reliability and efficiency.
- Conduct root cause analysis for incidents and outages and implement preventative measures.
Security and Compliance:
- Implement security best practices across infrastructure, including access controls, encryption, and vulnerability management.
- Work with security teams to ensure compliance with industry standards and regulations (e.g., GDPR).
- Conduct regular security audits and risk assessments to identify and mitigate vulnerabilities.
Collaboration and Mentorship:
- Collaborate with cross-functional teams, including developers, QA, and IT operations, to align on goals and deliverables.
- Provide technical guidance and mentorship to junior DevOps engineers and other team members.
- Lead the adoption of DevOps best practices and continuous improvement initiatives.
Incident Management and Troubleshooting:
- Lead incident response efforts, including triaging, troubleshooting, and resolving critical issues.
- Develop and maintain incident response procedures, including post-incident reviews and documentation.
- Ensure timely communication and updates during incidents to stakeholders.
Your Profile
- At least a bachelor's degree in Computer Science, Software Engineering, or equivalent experience.
- At least 7 years of IT experience in a DevOps environment, with a thorough understanding of DevOps frameworks.
- A solid understanding of general networking (including TCP/IP, networking/security, switches/routing, HTTP/HTTPS, SSL/encryption, load balancing and SSH).
- A good understanding of DevOps and automation concepts.
- Strong analytical, troubleshooting, and problem-solving skills.
- Strong programming and scripting skills, such as Python, Groovy, and Linux shell scripting.
- Experience with Linux distributions such as Ubuntu, CentOS, RHEL, and Debian.
- Experience with Docker, Kubernetes, Terraform, and Ansible.
- Experience with database technologies such as PostgreSQL.
- Ability to deliver development solutions in accordance with software development life cycle methodologies such as Scrum and Agile.