International award-winning executive recruitment expert Monroe Consulting Group is recruiting on behalf of a global IT consulting & services company. As the business continues to grow, our client is seeking a Site Reliability Engineer. The job is based in Ho Chi Minh City, Vietnam.
The dedicated Site Reliability Engineer will be responsible for monitoring computer systems and building alerts for the various operational issues those systems can experience, with a focus on Java applications.
- Maintain systems and troubleshoot system issues.
- Identify bottlenecks in various Java applications and implement performance improvements.
- Identify and analyze user requirements.
- Prioritize, assign, and execute tasks throughout the software development life cycle.
- Develop, configure, and deploy tools for cloud-based systems and services.
- Containerize new and legacy applications.
- Maintain awareness of new and emerging technologies.
- Support development and operations teams.
- Enhance, modify or debug developer code as needed.
- Experience with Jenkins for CI/CD pipeline creation and CI/CD automation
- Proficiency in supporting a 24×7 critical operation
- Experience implementing Kubernetes on Google Cloud
- Experience with a cloud computing platform and the associated automation patterns it provides, preferably GCP
- Deep understanding of an object-oriented language, preferably the latest version of Java
- Proficient in a modern scripting language like Go or Python
- Proficient in production systems design, including high availability, disaster recovery, performance, efficiency, and security
- Proficient in monitoring across user, application performance, system, log, and time-series data, and in dashboarding
- Familiarity with open-source concepts and tools such as Prometheus, Grafana, and ELK. Knowledge of APM fundamentals, or experience with tools such as New Relic or AppDynamics, is a plus
- Proficient in a modern infrastructure automation toolkit such as Terraform/Helm
- Proficient in a Linux- or Unix-based environment
- Deep understanding of modern microservice-based architectures and operations
- Experience in destructive testing methodologies and tools such as Chaos Monkey
- Experience in defensive coding practices and patterns for high availability