Job Description:
Collaborate with other teams to ensure seamless operation and maintenance of systems.
Develop, implement, and manage operational processes and systems to ensure efficiency and customer satisfaction.
Manage the day-to-day operations of the organization.
Continuously improve operational processes and implement best practices.
Provide guidance and support to operations personnel.
Analyze and respond to system alerts, ensuring quick resolution of issues.
Monitor and evaluate operational performance and recommend improvements.
Lead a team in a 24/7 operation environment, ensuring continuous system monitoring and management.
Ensure compliance with applicable laws and regulations.
Develop and implement policies and procedures.
Compensation & Benefits:
Flat, open and sharing culture with friendly management team; outsourcing company with product mindset;
A minimum of 14 days of paid leave per year for all employees after probation;
Men’s Day, Women’s Day, Children’s Day, Mid-Autumn Festival and other benefits under the provisions of the company;
Nice & modern working space with young, dynamic & friendly colleagues and free coffee, tea, drinks;
Performance bonus, 13th-month salary, public holiday bonuses (2/9, 30/4, 1/5, 1/1); bonus for Excellent Employee and Excellent Team;
Yearly company trip and year-end party, quarterly team building and weekly team meals; English-Japanese Club, Sports Clubs;
Saturdays and Sundays off; overtime pay at 150%, 200%, or 300% as per labor law;
Social insurance, health insurance, unemployment insurance and Bao Viet care insurance;
Work performance review twice a year (in April and October);
Performance bonus paid in the project's token;
Training courses and opportunities to work with technical gurus who have built and operated world-class applications with millions of users. This is a good opportunity for recent graduates to learn cutting-edge technologies and how to build scalable systems from scratch;
1 day of remote work per month; a flexitime allowance of 90-180 minutes per month for employees;
1 hour of paid leave per day for women with children under 12 months old.
Requirements:
Must have:
Proficiency in English.
Thorough understanding of system monitoring metrics.
Experience with system monitoring tools such as New Relic, Datadog, Prometheus, etc.
In-depth knowledge of SLA, SLO, and error budget concepts.
Ability to understand and propose operational workflows, escalation processes, and incident response processes.
Proficient in monitoring infrastructure, databases, and applications.
Prior experience as a team lead in system operation projects for clients.
At least 4 years of experience in system operation management.
Strong knowledge of Linux.
Nice to have:
Programming skills or proficiency in writing Bash scripts.
Experience with both on-premises and cloud provider system operations.
Experience in planning and scheduling shifts for team members.