In the Site Reliability Engineering team, part of Common Technology Application Operations, we are responsible for supporting the organization with engineering solutions. We work towards simplification, harmonization and automation, along with building and maintaining scalable and reliable services. We need you to help us design and build Infrastructure as Code solutions across the bank. Part of our daily work is to use and maintain the toolset for development, continuous integration/delivery and application operations. We also spend time continuously investigating new technologies, and we take pride in building new solutions used across the bank. Our working methodology is a “3 – Angle” model: we deliver new solutions through our “Factory”, we operate our tools in production through our Service and Self-Service Desks, and we provide “hands-on” support and “onboarding” to our platform.
What you’ll be doing:
- Ensure high availability, resiliency, scalability and capacity utilization of the applications while working closely with the development teams; enhance the applications’ monitoring and self-healing capabilities; and work closely with the rest of the SRE team to improve the applications from an operational perspective, achieving automation and efficiency
- Perform deployments of new releases using CI/CD pipelines and work on improving our CI/CD capabilities, as well as keep the KMS up to date, consistent and of high quality
- Work on eliminating “toil” in operations tasks, provide daily and proactive support for the business-critical applications of the new technical foundation in production, and support the troubleshooting, incident and problem management processes
- Perform “on-call” duty outside office hours, as the resiliency and business criticality classification of the applications requires 24/7 support (the estimated “on-call” ratio is one week in every five, and the compensation is very competitive)
- Work closely with business and development teams on application roadmaps; define SLAs, SLOs, SLIs, MTTR and Error Budgets, and automatically monitor and report on them; and become an ambassador and influencer for the SRE Model, Adoption and Tenets
- Make continuous service improvement a habit
Our team is spread across Denmark, Finland and Poland, supporting both the application development and application operations teams across Common Technology. The role is based in Gdańsk.
Who you are:
- You are an energetic, innovative and service-minded engineer
- You think outside the box
- You take pleasure in automation and enjoy challenges
- You are a self-starter: ambitious, technology-savvy and curious
Your experience and background:
- You hold a master’s degree in Computer Science, Software Engineering, Computer Engineering or Information Technology and have at least 3 years’ work experience as an SRE or DevOps engineer, preferably at a large organization, supporting the design and operation of business-critical applications
- Experience in supporting clients, as well as in troubleshooting, incident and problem management, is desired; experience working with Scrum and/or Kanban is an advantage
- Experience with at least one of the following: TypeScript, Angular, Python, Go or shell scripting, Oracle and/or PostgreSQL, as well as Linux, the Infrastructure as Code approach (Terraform) and cloud infrastructure (e.g. Azure, Google Cloud, AWS)
- Experience with at least four of the below technologies/tools:
  - containerization (Docker, OpenShift)
  - NoSQL databases (MongoDB)
  - event streaming platforms (Kafka)
  - automation tooling (Ansible, Salt)
  - service discovery (Consul)
  - continuous integration/delivery (Bitbucket, Artifactory, Bamboo, TeamCity, Jenkins)
  - log aggregation (Splunk, ELK)
- Experience working with APIs to achieve automation and interoperability; experience in setting up and testing failover instances, complemented with relevant documentation (TRRDs), data modelling and user guides; and experience in application monitoring (Prometheus/Grafana) and application self-healing best practices
- Experience with distributed systems and microservices, with Service Level Management and reporting (SLAs, SLOs, SLIs, Error Budgets), and with operational readiness assessments would be an asset
As a benefit you will receive:
- Opportunities to increase your knowledge and skills (external and internal training, Udemy)
- Stable employment conditions in an international environment and a cafeteria benefits plan (life insurance, retirement program, sport card, private health care, cinema tickets, annual profit sharing)
- Agile project approach
- Challenging tasks? Yes, with enough time to add automation so we don’t do them over and over again
- On-site and remote work capability (split model)
If this sounds like you, get in touch!
Next steps
Submit your application no later than 30.09.2019.