S&P Global Sr. Lead Site Reliability Engineer in Centennial, Colorado
Site Reliability Engineering (SRE) is an engineering discipline that draws from software and systems engineering to define, measure and achieve reliability objectives. SRE embraces DevOps philosophies and leverages custom code, automation, tooling, support processes and service management frameworks to achieve reliability objectives. The SRE mindset considers reliability a first-class feature of any service and prioritizes engineering and automation over manual intervention.
Open to all office locations, Remote
12 (for internal use only)
S&P Global's Site Reliability Engineering teams are responsible for keeping our products and services available to customers and employees located around the world. We achieve this through software, system and process engineering to maintain service level objectives, limit human intervention and minimize the level of effort associated with support (a.k.a. "toil"). SRE teams at S&P Global are generally responsible for availability, latency, performance, efficiency, change management, monitoring, emergency response and capacity planning of our products and services.
Our SRE teams value:
As engineers and consumers of services, we deeply value the quality of our users' experience. We recognize that a solution is only as good as the quality of service it provides.
Passion for coding and automation:
We leverage technology to improve reliability and make our lives easier. We are experienced problem-solvers and are proficient in scripting and programming languages. We look for people who enjoy problem-solving, writing code and exploring automation.
We compulsively search for the underlying cause of issues and ways to improve reliability.
We value honesty and transparency over placing blame. We promote a blameless culture throughout the organization.
You have 10 years of experience in software or systems engineering.
You have experience monitoring, supporting and tuning a production application stack.
You value your time and have experience with scripting and automation frameworks.
You want to support full-stack solutions, including applications, servers, networks, data pipelines and data platforms.
You have excellent troubleshooting skills.
You demonstrate an objective, data-driven approach to problem-solving.
You demonstrate excellent collaboration and communication skills.
You take a practical and iterative approach to improvement, making small changes and testing for effect.
You have experience working across silos in a change-controlled environment.
You have experience working with a globally-distributed workforce.
You have experience with cloud hosting technologies (e.g., AWS, Azure, Google).
You may have some experience with containerization platforms (Docker, Kubernetes).
Develop, maintain and report on Service Level Objectives (SLOs).
Develop and support monitoring and automation to defend SLOs.
Resolve incidents (outages and service disruptions), including participation in on-call rotations.
Perform root cause analysis and formal postmortem write-ups for service disruptions.
Perform capacity planning to assure future reliability and efficiency as utilization grows.
Develop and test disaster recovery plans.
Implement changes and support releases in a controlled environment.
Develop and maintain runbooks, share knowledge and cross-train members of SRE and Development teams.
Consult with Development teams during service design and in advance of releases.
Conduct production readiness reviews to ensure services meet SRE onboarding requirements.
Education/Certifications
Bachelor's degree or higher in computer science, math, engineering or related disciplines.
AWS technical certifications helpful.
S&P Global Market Intelligence
At S&P Global Market Intelligence, we know that not all information is important; some of it is vital. Accurate, deep and insightful. We integrate financial and industry data, research and news into tools that help track performance, generate alpha, identify investment ideas, understand competitive and industry dynamics, perform valuation and assess credit risk. Investment professionals, government agencies, corporations and universities globally can gain the intelligence essential to making business and financial decisions with conviction.
S&P Global Market Intelligence is a division of S&P Global (NYSE: SPGI), which provides essential intelligence for individuals, companies and governments to make decisions with confidence. For more information, visit www.spglobal.com/marketintelligence.
S&P Global is an equal opportunity employer committed to making all employment decisions without regard to race/ethnicity, gender, pregnancy, gender identity or expression, color, creed, religion, national origin, age, disability, marital status (including domestic partnerships and civil unions), sexual orientation, military veteran status, unemployment status, or any other basis prohibited by federal, state or local law. Only electronic job submissions will be considered for employment.
If you need an accommodation during the application process due to a disability, please send an email to EEO.Compliance@spglobal.com and your request will be forwarded to the appropriate person.
The EEO is the Law Poster http://www.dol.gov/ofccp/regs/compliance/posters/pdf/eeopost.pdf describes discrimination protections under federal law.