What you'll be doing:
Monitor the performance and availability of systems and services: Ensure that services and platforms are monitored both reactively and proactively. Investigate and resolve incidents promptly to minimize impact.
Incident Analysis and Continuous Improvement: Analyze and report potential trends to reduce repeat incidents. Follow incident management processes and maintain accurate documentation.
Collaborate with Teams: Escalate and coordinate incidents in a timely manner to ensure quick service restoration. Communicate clearly and concisely with business/management regarding incident details, impact, and steps for timely resolution and prevention. Collaborate with cross-functional teams to resolve technical issues.
Shift work: Work on shifts to support a 24x7x365 environment, including nights and weekends.
What you'll need:
Customer Focus: Demonstrates a strong commitment to customer service, striving to resolve issues swiftly and effectively.
Technical Aptitude: Possesses a solid understanding of observability tools such as Datadog, Grafana, Elasticsearch, Kibana, and Zabbix, along with a keen willingness to learn and adapt to new technologies rapidly.
Communication Skills: Skilled in articulating technical concepts clearly and succinctly to both technical and non-technical audiences, with strong written and verbal communication skills in English.
Incident Management: Capable of assessing customer impact and managing incidents by coordinating across multiple teams to drive quick and effective resolutions.
Qualifications:
Bachelor’s degree in IT or a related field, or equivalent work experience.
Strong problem-solving and communication skills.
Ability to work on shifts to support a 24x7x365 environment, including nights and weekends.
Good command of English, both spoken and written.
Fresh graduates are welcome to apply.
Nice to have:
Experience with incident management and ticketing systems (e.g., Jira, ServiceNow) to track and manage incidents.
Familiarity with monitoring and alerting tools, such as Nagios, Zabbix, Prometheus, Grafana, or similar systems.
Familiarity with Linux/Unix and Windows operating systems, including command-line troubleshooting.
Knowledge of scripting or automation using languages like Python, Bash, or PowerShell is a plus.
Understanding of ITIL framework and incident management best practices.
Full-time
Singapore