Carlson Wagonlit Travel is currently seeking a Site Reliability Engineer to join our team. The Site Reliability Engineer is responsible for building and improving the infrastructure for development and production environments, as well as ensuring the systems are stable, performant, and secure.
Carlson Wagonlit Travel is looking for talented and enthusiastic people who want to realize their professional ambitions while delivering the highest levels of expertise and service to our customers. As a global leader in business travel management, we offer exciting opportunities in different areas around the world. If you share our commitment to excellence and customer care and enjoy professional challenges, we would like to hear from you.
Learn about us and start your journey.
* Analytical thinking, problem-solving capability, and in-depth understanding of software deployment and operations.
* Instrument key parts of the infrastructure.
* Guide software development teams to instrument their code to generate relevant metrics for performance and quality.
* Implement a cost-effective monitoring solution for all environments.
* Monitor and measure system availability, latency, and overall health.
* Update system configurations and provide recommendations on how to automate and help scale systems based on performance metrics.
* Define infrastructure-as-code.
* Provide technical support including issue investigation and analysis for production alerts. Fulfill tasks that aid monitoring of production health.
* Effectively communicate (written, verbal) issues and solutions in a clear, consistent manner through appropriate methods (voice calls, email, instant messaging, and ticketing systems).
* Bachelor of Science in Computer Science, Computer Engineering, or equivalent experience
* 3+ years of experience in software development and operations
* Experience working in a Continuous Integration / Continuous Deployment environment
* Experience with test automation
* Knowledge of best practices related to security, performance, and disaster recovery
* Experience with datacenter monitoring, service oriented systems, and microservices
* Experience with cloud-based applications (e.g., Amazon AWS)
* Experience with container orchestration (Kubernetes)
* Experience setting up, tuning, and maintaining Grafana, Prometheus or other TSDB, Jaeger, Kafka, Elasticsearch, or other distributed systems
* Experience working in an agile environment (Scrum or Kanban)
* Experience working in the Travel Industry domain
* Experience working in a start-up company
* Experience working in a geographically distributed or multicultural team