LiveKit is revolutionizing the AI landscape by providing the essential network infrastructure that powers multimodal AI interfaces, enabling seamless audio and visual interactions. Founded in 2021, LiveKit has rapidly grown to support over 3 billion calls annually, 100,000+ developers globally, and industry giants like OpenAI, Character AI, Spotify, and Meta.
You'll thrive at LiveKit if you:
obsess over crafting code that is fast, reliable, and practical for the problem
are known as the go-to person for tackling tough technical problems
work hard and can build and ship fast
can clearly explain complex technical concepts to others
are a fast learner, frequently picking up new languages and tools
The best way to impress us is with thoughtful issues and/or PRs on our GitHub repos 😊
LiveKit is on a mission to help developers create and scale real-time experiences. We are hiring a Software Engineer / Site Reliability Engineer to help manage and scale the core components of the LiveKit infrastructure. Visibility, performance, and reliability of our globally distributed architecture are critical and a top priority.
What You'll Do
Build and own the foundational infrastructure that our products run on.
Work directly on our products' Go codebase to implement SRE-related objectives.
Take a data-driven approach to quantifying system performance and reliability, and use it to drive project priorities.
Participate in the on-call rotation, including leading incident management for complex situations.
Work on automation and advanced configuration management to allow our team to manage large numbers of clusters distributed across the world running various products.
Work with infrastructure vendors when their solutions aren't meeting our real-time performance and reliability needs.
Who You Are
A balance of strengths in both software engineering and large-scale system administration.
Experience managing complex multi-region distributed systems running on top of container orchestration systems like Kubernetes.
Passionate about maintainability and keeping system complexity at bay, but able to balance this with meeting launch deadlines.
Bonus Points
Incident management training and experience being an Incident Commander.
Experience with Linux networking, overlay networks, and Kubernetes CNIs.
Low-level knowledge for troubleshooting and tuning latency-sensitive workloads.
Our Commitments to You
We offer:
A competitive salary and equity package.
Health, dental, and vision benefits.
Flexible vacation.
Remote work environment with necessary equipment provided.
Ready to Apply?
If you're excited about driving the future of AI-native communications and want to make a significant impact at a high-growth company, we'd love to hear from you.