* Do you want to work on something that has an immediate impact on your customers and the business?
* Do you want to work on projects that will allow you to keep up to date with technology?
* Do you thrive in an environment where there is always something to investigate or automate?
Amazon Web Services is looking for a Senior Systems Engineer to join our CloudFront and Route 53 Operation Engineering team. Our projects include massively scalable distributed systems that provide inexpensive, reliable, global distribution. This is an opportunity to join a world-class team that is at the forefront of creating the next major computing platform. As a member of this AWS team you will help create the environment that will set the pattern for a generation to come. You should be somebody who enjoys solving problems, is customer-centric, and feels strongly not only about operations but also about running systems and software in the real world. You must enjoy a close-knit team environment of shared responsibility. The ideal candidate will have strong experience in distributed systems and in Linux/Unix design, networking, implementation, and engineering.
As a senior member of the team, you will be guiding team members from multiple disciplines to provide solutions that meet cost and availability needs for our customers.
Amazon is an Equal Opportunity-Affirmative Action Employer - Minority / Female / Disability / Veteran / Gender Identity / Sexual Orientation.
•BS Computer Science or other technical degree or related experience
•1 year of experience in 24x7 online internet computing environments
•Extensive experience scripting in some of the common languages used by Systems Engineers (Perl, Python, shell, Ruby, or similar)
•5+ years of solid *NIX system engineering experience
** For more information on Amazon Web Services, please visit http://aws.amazon.com **
Preferred Qualifications
You should have or be most of the following:
•Experience running and maintaining a 24x7 Internet-oriented production environment, preferably across multiple data centers and at least hundreds of machines.
•Thorough understanding of fundamental internet protocols such as DNS, HTTP, and TCP.
•Demonstrable expertise in specifying, designing, and/or implementing system health and performance monitoring tools and software management tools for 24x7 environments.
•A solid grasp of networking fundamentals, preferably including hands-on experience with Cisco or Juniper routers and switches.
•Familiarity with the challenges surrounding efficient operations and failure mode analysis in large, complex distributed systems.
•Experience with configuration management systems such as CFEngine, Chef, or Puppet.
You will be expected to deliver on these kinds of things in the first six to twelve months on the job:
•Through participation in all phases of the development of a large distributed system, provide hardware, manageability, operability, and performance perspectives on all aspects of CloudFront, Route 53, and potentially their dependencies.
•Define and/or refine hardware requirements and selected designs, balancing raw up-front dollar cost with operability and TCO, from the data center infrastructure up.
•Specify and participate in the development and delivery of operability-related features such as system health monitoring, diagnostics, repair, and other self-healing automation.
•Develop new or extend existing automation and system management tools and processes that reduce manual effort and increase overall efficiency.
•Adapt and improve operations management systems and processes to accommodate rapid and increasing growth in systems and traffic.
•Participate in the design and execution of production acceptance tests and new hardware evaluations.
•Maintain fleet inventory management, including producing, maintaining, and evolving capacity plans for various components.
•Monitor the health of the fleet, automating system health checks, maintenance tasks, and reporting as needed.
•Perform various system maintenance tasks (your hands get dirty here), including configuration of new edge locations down to the machine level.
•Manage directly assigned tasks and on-call duties gracefully.