Site Reliability Engineer (SRE) - Azure | DevSecOps | IaC | Governance | Observability Job at Avaya, Remote

  • Avaya
  • Remote

Job Description

About Avaya

Avaya is an enterprise software leader that helps the world’s largest organizations and government agencies forge unbreakable connections.

The Avaya Infinity™ platform unifies fragmented customer experiences, connecting the channels, insights, technologies, and workflows that together create enduring customer and employee relationships.

We believe success is built through strong connections – with each other, with our work, and with our mission. At Avaya, you'll find a community that values your contributions and supports your growth every step of the way.

Learn more at

Description

We are seeking a Site Reliability Engineer (SRE) who will drive stability, reliability, and performance across our Azure and GCP-based platforms.
This role blends operational excellence, proactive incident management, and strong collaboration with DevOps, Cloud, and Security teams.

The ideal candidate will have hands-on experience with multi-cloud environments (Azure and GCP), IaC (Terraform/Ansible), CI/CD (Jenkins/GitHub Actions), and modern observability and AI-Ops systems. The engineer will also contribute to governance, cost optimization, and automation strategies that reduce toil and prevent issues before they occur. A key aspect of this role is the ability to perform deep-dive troubleshooting of application performance and errors by analyzing logs and traces in platforms like Grafana and Datadog.

This position includes 24×7 support coverage (rotational) and requires strong ownership in managing major incidents, RCA processes, and continuous service improvements.

Key Responsibilities

Reliability & Incident Management

  • Serve as a key member of the 24×7 on-call rotation, responding to and managing incidents across production and pre-production environments.
  • Lead incident bridges, coordinate root cause analysis (RCA), and ensure post-incident reviews drive systemic improvements.
  • Maintain clear communication with cross-functional teams and leadership during major incidents.

Monitoring, AI-Ops, Alerts & Prevention

  • Build, tune, and maintain observability dashboards (Azure Monitor, GCP Operations Suite, Prometheus, Grafana, Datadog, Log Analytics).
  • Perform deep-dive troubleshooting of application and service-level issues using distributed tracing and log analysis (Grafana, Datadog) to pinpoint root causes beyond infrastructure.
  • Define SLOs, SLIs, and error budgets to proactively identify and mitigate reliability risks before customer impact.
  • Integrate AI-Ops tools for anomaly detection, predictive alerting, and automated incident correlation.
  • Continuously enhance alert quality, reduce false positives, and automate runbooks for faster recovery.
  • Analyze trends to prevent recurring issues and support teams in resilience engineering.

Requirements

Required Skills & Experience

  • 5+ years in Site Reliability, DevOps, Cloud Operations, or customer support roles.
  • Demonstrated experience in application-level troubleshooting by analyzing logs and traces to identify bugs, performance bottlenecks, and error conditions.
  • Expertise in Azure and GCP cloud operations and distributed system reliability.
  • Understanding of Terraform, Ansible, and CI/CD pipelines (Jenkins, GitHub Actions).
  • Experience with observability and AI-Ops tools (Azure Monitor, GCP Operations Suite, Grafana, Prometheus, Datadog, etc.).
  • Solid grasp of incident management frameworks (P1–P3 handling, RCA, PIRs, on-call rotations).
  • Excellent analytical, troubleshooting, and communication skills.

Desired Behaviours

  • Proactive Prevention: Identifies and resolves risks before they escalate into incidents.
  • AI-Driven Mindset: Applies AI and automation to improve reliability and reduce human intervention.
  • Accountability: Owns service reliability and communicates with clarity.
  • Collaboration: Works seamlessly with platform, DevOps, and product teams.
  • Efficiency: Focuses on automation to reduce manual effort and improve MTTR.
  • Continuous Improvement: Learns from failures, iterates processes, and enhances documentation.

The pay range for this opportunity is $129,000 to $143,000, plus a performance-related bonus and benefits. This range represents the anticipated low and high end of the salary for this position. This role is also eligible to receive an annual bonus that aligns with individual and company performance. Actual salaries will vary and are based on factors such as a candidate’s qualifications, skills, and competencies.

Applicants must be currently authorized to work in the United States without the need for visa sponsorship now or in the future.

Avaya is an Equal Opportunity employer and a U.S. Federal Contractor. Our commitment to equality is a core value of Avaya. All qualified applicants and employees receive equal treatment without consideration for race, religion, sex, age, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other protected characteristic. In general, positions at Avaya require the ability to communicate and use office technology effectively. Physical requirements may vary by assigned work location. This job description is subject to change. Nothing in this job description restricts Avaya's right to alter the duties and responsibilities of this position at any time for any reason.
