Sr. Site Reliability Engineer Job at Addison Group, Austin, TX

  • Addison Group
  • Austin, TX

Job Description

In this position, you will be a vital member of our Site Reliability Engineering (SRE) team, responsible for improving incident response, advancing problem management, identifying automation opportunities, and managing observability tools. You'll work closely with Platform and Value Stream teams to strengthen system resiliency, champion a culture of Site Reliability Engineering, and support our transition from on-premises to cloud infrastructure.

Responsibilities & Qualifications

Ideal candidates will: 

  • Lead positive change with clear, collaborative leadership and measurable project outcomes. 
  • Solve challenges independently while offering solutions-focused guidance to peers. 
  • Empower team growth by sharing knowledge transparently and providing constructive feedback. 
  • Foster a culture of diversity of thought, mutual trust, and accountability. 

What you’ll do: 

  • Take ownership of key projects, driving efforts to improve efficiency, enable self-service, and automate manual processes. 
  • Manage initiatives from discovery through planning, scheduling, and execution using Agile Scrum methodologies. 
  • Lead high-stakes production incidents as a Senior Incident Commander, ensuring rapid resolution, clear communication, and poise under pressure. 
  • Facilitate post-incident retrospectives, transforming technical learnings into actionable improvements. 
  • Architect, implement, and maintain cutting-edge observability systems to ensure proactive incident detection and resolution. 
  • Build and manage integrations across systems to streamline monitoring, alerting, and health reporting. 
  • Define and execute strategies for system availability, performance, and reliability, aligning with organizational goals. 
  • Collaborate with stakeholders to establish Service Level Objectives (SLOs) and design strategies for managing breaches. 
  • Mentor and guide team members, setting high standards for technical excellence and operational discipline. 
  • Offer candid, constructive feedback to improve processes, systems, and team performance. 
  • Serve as a trusted advisor, advocating for best practices in reliability engineering and driving cultural change across the organization. 

 

It is required that you have: 

  • Bachelor’s degree in a related field or equivalent education, training, or experience. 
  • At least 4 years of experience in site reliability engineering, DevOps, or a related engineering discipline (or equivalent education, training, or experience).
  • Strong leadership skills in incident management and operational excellence. 
  • Demonstrated initiative, the ability to work independently, and a record of delivering results.
  • Expertise in building and optimizing complex systems.

It would be great to also have:  

  • Expertise in ITIL practices and their application in modern IT environments. 
  • Extensive experience in operations and engineering with distributed systems. 
  • Proficiency with Git and modern CI/CD pipelines. 
  • Advanced skills in programming (Java, C#) and scripting (Python, PowerShell, Bash). 
  • Hands-on experience with automation tools (Terraform, Ansible) and infrastructure as code. 
  • Proven success in implementing monitoring, logging, and alerting solutions. 
  • Exceptional collaboration, negotiation, and presentation skills, with the ability to inspire and influence. 
  • Experience providing constructive feedback and fostering continuous improvement. 
  • A passion for achieving results, with a strong sense of accountability and teamwork.
