Head of MLOps (The Deployment Dynamo)
Unreal Gigs
San Francisco, California
Job Details
Full-time
Full Job Description
Are you passionate about creating seamless, scalable, and efficient pipelines that bring machine learning models from development to production? Do you have the expertise to lead the development and optimization of MLOps infrastructure that accelerates deployment, ensures reliability, and supports continuous improvement? If you’re ready to build and maintain the backbone of AI operations, our client has the ideal role for you. We’re looking for a Head of MLOps (aka The Deployment Dynamo) to lead MLOps strategy, automate processes, and implement best practices that empower data scientists and engineers to deploy AI solutions with confidence.
As our client’s Head of MLOps, you’ll work with cross-functional teams to create a robust MLOps ecosystem, from model deployment and monitoring to continuous integration and automation. You’ll manage the infrastructure that makes model deployment secure, scalable, and efficient, ensuring that models perform optimally in production and deliver valuable insights in real time. Your role will be crucial in supporting the company’s AI-driven products and fostering a culture of operational excellence in machine learning.
Key Responsibilities:
- Develop and Lead the MLOps Strategy: Define the roadmap for MLOps infrastructure that aligns with the company’s AI goals and product requirements. You’ll prioritize scalability, reliability, and security, setting the direction for a streamlined, automated ML lifecycle.
- Oversee Model Deployment and Monitoring Pipelines: Design and manage deployment pipelines that support automated, repeatable model deployments to production environments. You’ll ensure models are securely deployed and that monitoring systems detect performance issues and drift in real time.
- Implement Continuous Integration/Continuous Deployment (CI/CD) for ML Models: Build CI/CD workflows tailored to machine learning, enabling frequent updates and reliable rollbacks. You’ll establish best practices for testing, versioning, and rollback procedures to support fast, secure deployment cycles.
- Collaborate with Data Scientists and Engineers to Optimize Pipelines: Partner with data scientists and engineers to optimize data and model pipelines, addressing bottlenecks and ensuring efficient resource allocation. You’ll help establish best practices in data preprocessing, feature engineering, and model retraining.
- Ensure Model Governance and Compliance: Develop and enforce governance frameworks for model versioning, audit trails, and compliance with industry regulations. You’ll implement systems to track model lineage and ensure adherence to privacy and security standards.
- Drive Automation and Operational Efficiency: Leverage tools and platforms to automate repetitive MLOps tasks, from data preprocessing to model validation and deployment. You’ll foster an automation-first mindset that enables rapid iteration and efficient scaling.
- Stay Updated on MLOps Tools and Trends: Keep up with the latest advancements in MLOps tools, cloud services, and best practices, including containerization, distributed training, and real-time monitoring. You’ll integrate new technologies that enhance infrastructure performance and reliability.
Requirements
Required Skills:
- MLOps Expertise and Infrastructure Knowledge: Strong experience in MLOps, including model deployment, monitoring, and lifecycle management. You’re proficient with MLOps tools like MLflow, Kubeflow, SageMaker, or similar platforms.
- CI/CD and Automation for ML: Expertise in designing and implementing CI/CD workflows for machine learning, including model versioning, testing, and rollback procedures. You’re skilled in leveraging automation to streamline the ML lifecycle.
- Cloud and High-Performance Computing Knowledge: Familiarity with cloud environments (AWS, GCP, Azure) and high-performance computing for scaling ML workloads. You understand resource management and cost optimization for cloud-based deployments.
- Collaboration and Cross-Functional Alignment: Proven ability to work with data scientists, ML engineers, and IT teams to integrate MLOps practices into the development cycle. You’re skilled at balancing technical requirements with business objectives.
- Model Governance and Compliance: Knowledge of model governance, including tracking, auditing, and regulatory compliance. You know how to create governance frameworks that protect model integrity and meet industry standards.
Educational Requirements:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field. Equivalent experience in MLOps or infrastructure engineering may be considered.
- Certifications in cloud computing, data engineering, or MLOps (e.g., AWS Certified Machine Learning, Google Cloud Professional ML Engineer) are advantageous.
Experience Requirements:
- 10+ years of experience in data engineering, infrastructure, or MLOps, with a strong background in building and maintaining MLOps systems.
- 5+ years of experience in a leadership role, managing teams in MLOps, infrastructure, or high-performance computing environments.
- Experience with model monitoring, pipeline optimization, and deployment in production environments is highly desirable.
Benefits
- Health and Wellness: Comprehensive medical, dental, and vision insurance plans with low co-pays and premiums.
- Paid Time Off: Competitive vacation, sick leave, and 20 paid holidays per year.
- Work-Life Balance: Flexible work schedules and telecommuting options.
- Professional Development: Opportunities for training, certification reimbursement, and career advancement programs.
- Wellness Programs: Access to wellness programs, including gym memberships, health screenings, and mental health resources.
- Life and Disability Insurance: Life insurance and short-term/long-term disability coverage.
- Employee Assistance Program (EAP): Confidential counseling and support services for personal and professional challenges.
- Tuition Reimbursement: Financial assistance for continuing education and professional development.
- Community Engagement: Opportunities to participate in community service and volunteer activities.
- Recognition Programs: Employee recognition programs to celebrate achievements and milestones.