Site Reliability Engineer
Length: 1.5 Years
Location: Greater Vancouver, BC (Remote at the moment)
Responsibilities
The Site Reliability Engineer is responsible for deploying, supporting, and optimizing highly scalable, fault-tolerant cloud and hybrid systems. The SRE II will maintain competency in multiple functional areas and collaborate with team members across them. These areas include software development, debugging and troubleshooting code, DevOps practices, networking, and security, along with an understanding of business processes. The SRE will also use their knowledge of software development, systems engineering, and operational best practices to promote the vision and execution path for building non-functional requirements into product deliverables.
This role will work closely with DevOps teams and business and technology stakeholders to deliver services quickly and effectively to end users. This position will also promote a respectful and collaborative culture. The Site Reliability Engineer is accountable to the DevOps Director. The incumbent has no assigned subordinates but provides mentoring and guidance to junior team members.
Responsibilities:
- Ensures the reliability and sustainability of business applications by optimizing, monitoring, and automating deployment, support, and sustainment activities
- Provides oversight and guidance on application sustainability and maintainability to software development teams from planning through implementation and operations
- Promotes best practices for supporting applications, and develops and maintains runbooks for assigned systems
- Assists with the development and implementation of operational checklists, including systems handover criteria, and performs systems reviews with DevOps teams
- Develops and monitors Service Level Indicators, Service Level Objectives, and Service Level Agreements to ensure the sustainability of business applications
- Acts as DevOps liaison on any Service Level Agreement issues identified for DevOps and Cloud applications
- Liaises with Release and QA Leads to maintain awareness of production deployments to ensure the stability and maintainability of supported systems
- Maintains close relationships with DevOps teams, business users, architects, Cloud Operations and Infrastructure teams
- Automates any repeatable support processes and metrics
- Identifies and removes system bottlenecks as part of a shift-left testing and deployment practice, and troubleshoots performance, reliability, and stability issues
- Makes support metrics visible using tools such as VSTS dashboards, and integrates metrics into reports and dashboards with tools such as SharePoint, Excel, or Power BI
- Manages the support development lifecycle using tools such as VSTS, creating reports and integrating with tools such as SharePoint or Power BI
- Utilizes tools such as Visual Studio for code development, testing, and code review
Qualifications:
- Undergraduate degree in Computer Science or STEM (Science, Technology, Engineering, Math)
- Minimum 4 years of equivalent work experience in IT, including at least 2 years of relevant experience with agile methodologies, Cloud and DevOps environments, and continuous IT process improvement
- Scaled Agile Framework (SAFe) training and certification such as SAFe Practitioner, or willingness to take the course and obtain certification within 2 months of commencement of work
- Working knowledge of Microsoft Azure DevOps and/or similar continuous integration and continuous deployment technologies such as Team Foundation Server, Jenkins CI, GitHub, and Artifactory, specifically related to software build, unit testing, and deployment
- Working knowledge of infrastructure configuration management and automation tools such as Chef, Puppet, Salt, Ansible, and Terraform
- Specialist knowledge of cloud monitoring and measurement tools for infrastructure, applications, logging, APM, and user interface experience
- Working knowledge of Microsoft Azure Resource Manager (ARM) templates and JSON scripting for automated deployments
- Working knowledge of Microsoft ARM IaaS and PaaS architectures
- Working knowledge of API architecture and hybrid cloud integration patterns
- Working knowledge of networking protocols and technologies such as routing, DNS, and network peering
- Working knowledge of developing and monitoring SLOs and SLAs
- Must reside in BC during the contract period even when working remotely
Annex Consulting Group is a full-service IT and management consulting firm, specializing in staff augmentation contracting, permanent staffing, and outsourced solutions. Candidates must be legally entitled to work in the location advertised.
Not interested but know someone who is a fit for this role? Check out the award-winning Annex Referral Program. Leaders in IT. Advisors in Business. Partners in Solutions.