The Elevator Pitch
We're looking for a Senior DevOps Engineer who will tackle our hardest architecture, data, and AI/ML challenges. As a core member of the R&D team, you will play a pivotal role in building our infrastructure for scalability, reliability, and efficiency, ensuring the seamless deployment and monitoring of our entire production stack, from compute to Kubernetes clusters and AI models. A significant and distinct aspect of this role involves direct engagement with our enterprise clients to implement and manage unique deployment solutions on both our cloud and theirs.
About the Role
Responsibilities
- Infrastructure Management: Manage our production environment to ensure scalability, performance, and ease of use. This includes our own cloud and components deployed on customers' clouds.
- Observability Stack: Implement and manage monitoring and observability solutions to maintain system health and allow for fast debugging of production incidents.
- AI/ML Lifecycle: Lead MLOps practices to streamline the deployment, monitoring, and management of LLMs and other AI models in production.
- Foundational Leadership: Champion DevOps best practices across the R&D team, with the potential to lead the DevOps function as the team expands.
Requirements
- 5+ years of experience working with AWS & Azure using IaC tools.
- Deep understanding of Kubernetes and containerized environments.
- Hands-on experience with monitoring and observability tools such as Grafana, Prometheus, and OpenTelemetry.
- Familiarity with AI/ML model deployment challenges and solutions, including LLMs (an advantage).