Posted Mar 24, 2026

Site Reliability Engineer L5 – Live SRE

Job Description:
• Support live streaming events by focusing on cloud traffic (API Gateway, IPC between microservices).
• Prepare and execute load tests to ensure the infrastructure can handle sudden increases in API traffic.
• Implement end-to-end observability and visualize data to achieve the desired availability at scale.
• Drive continual improvement in observability, monitoring, and scalability.
• Implement, automate, and execute tests focused on live streaming delivery, and analyze their results.
• Write and review code, develop documentation, and debug complex problems.
• Coordinate and collaborate with multiple stakeholders for smooth event execution.
• Participate in an on-call rotation and work flexible hours based on event schedules.

Requirements:
• 5+ years of service reliability or operational experience running large-scale, high-performance systems and internet services, with a focus on traffic at scale.
• Knowledge of and proven experience with L4 load balancer, HTTP cache, and reverse proxy technologies.
• Expert-level knowledge of Unix or Linux systems and TCP/IP network fundamentals.
• Proficient understanding of networking principles and of transport and application protocols, especially DNS, TLS, and HTTP(S).
• Proficiency in a programming language such as Go, Python, or Rust.
• Experience with real-time and big data analytics processing technologies (Kafka, time series databases, Presto/Trino, Spark SQL, etc.).
• Ability to work in a highly collaborative environment and to communicate effectively with internal and external partners.
• Preferred: B.S. in Computer Science, Electrical Engineering, or Computer Engineering (or equivalent professional experience).

Benefits:
• Inclusion is a Netflix value and we strive to host a meaningful interview experience for all candidates.
• We are an equal-opportunity employer and celebrate diversity, recognizing that diversity builds stronger teams.