Responsibilities:
Incident Response: respond to incidents and troubleshoot issues in production environments, ensuring minimal downtime.
Environment Setup and Management: create and manage development, testing, and production environments, ensuring they mirror each other for accurate testing and deployment.
Performance Optimization: optimize system performance and resource usage, ensuring scalability and reliability of applications.
Collaboration: work closely with development, QC, and operations teams to ensure smooth integration and deployment of applications.
Security: ensure security best practices are integrated into the CI/CD pipelines and infrastructure.
Continuous Integration/Continuous Deployment (CI/CD): set up and maintain CI/CD pipelines to streamline code integration, testing, and deployment processes.
Automating Processes: develop and implement automation for software builds, testing, deployment, and infrastructure management.
Infrastructure Management: manage and provision infrastructure using Infrastructure as Code (IaC) tools like Terraform or Ansible.
Cost Optimization: reduce expenses by right-sizing resources and using cost-efficient cloud options, ensuring efficient operations while maintaining performance and scalability.
Improvement: continuously evaluate and improve the DevOps processes, tools, and methodologies to enhance efficiency, speed, and quality.
Monitoring and Logging: implement and maintain monitoring tools to track system performance and set up logging to troubleshoot and analyse issues.
Deployment and Release Management: develop and improve deployment and release processes to ensure smooth, error-free deployments. Implement canary releases and A/B testing strategies.
Preferred Skills and Qualifications:
At least 6 years of experience as a DevOps engineer.
Working experience with containerization and orchestration technologies (e.g., Docker, Kubernetes, Rancher).
Working experience with GitOps concepts and tools such as ArgoCD and Jenkins.
Strong problem-solving skills and the ability to analyse complex systems.
Hands-on experience with monitoring and logging tools (e.g., Prometheus, Grafana, ELK stack) and distributed tracing tools (e.g., Jaeger, OpenTelemetry).
Strong knowledge of Linux/Unix systems and networking concepts.
Familiarity with data engineering workflows and tools is a plus.
Experience with cloud platforms (e.g., AWS, Azure, Google Cloud) and infrastructure-as-code tools (e.g., Terraform, Ansible, CloudFormation).
Proficiency in at least one programming language (e.g., Python, Go, Java) and experience with scripting languages (e.g., Bash, PowerShell).
Bachelor’s degree in Computer Science, Engineering, or a related field.