Job Description
Location of other offices: Japan, USA, New Zealand
Bespokify Vietnam is a subsidiary of Japan’s largest apparel-focused eCommerce website. Over the years, we have built a diverse range of technologies and competencies that we are now looking to license to other businesses, from large enterprises to smaller ventures. At Bespokify, you are always our top priority, and we strive to make your work as satisfying as possible. Join our vibrant and passionate team to gain your own unique experience!
We are looking for a highly skilled Senior Cloud DevOps Engineer to join our dynamic team. As a DevOps engineer, you will play a critical role in ensuring the reliability, availability, and performance of our systems and applications. You will collaborate closely with development, operations, and product teams to design and maintain scalable and resilient infrastructure. The ideal candidate is passionate about automation, monitoring, and resolving complex technical issues to enhance the overall user experience.
Key Responsibilities:
Collaboration: Work closely with development and operations teams to bridge the gap between software development and IT operations. Foster a culture of collaboration and shared responsibility.
Documentation: Create and maintain detailed documentation related to system architecture, configurations, and procedures. Ensure knowledge transfer within the team.
Monitoring and Alerting: Implement comprehensive monitoring and alerting solutions. Set up appropriate thresholds and notifications to proactively identify and address potential issues.
Security: Collaborate with security teams to implement and maintain security best practices. Ensure systems are compliant with security policies and standards.
Automation: Develop automation tools and scripts to streamline deployment, monitoring, and operational tasks. Automate repetitive tasks to improve efficiency and reduce manual intervention.
Capacity Planning: Plan and forecast system capacity based on growth projections and usage patterns. Scale infrastructure to accommodate increasing demands while optimizing costs.
Incident Management: Respond to and resolve incidents related to system outages, performance degradation, and other service interruptions. Conduct root cause analysis to prevent recurring issues.
Performance Optimization: Monitor system performance and proactively identify bottlenecks. Optimize infrastructure components for maximum speed and scalability. Collaborate with developers to improve application performance.
System Reliability: Ensure the reliability and availability of our applications and services by designing, deploying, and maintaining robust, scalable, and highly available infrastructure.