About the job
Job Description
Job Summary
We are seeking a Senior MLOps / AIOps Platform Engineer with deep DevSecOps expertise and hands-on experience managing enterprise-grade AI/ML platforms. This critical role focuses on building, configuring, and operationalizing secure, scalable, and reusable infrastructure and pipelines that support AI and ML initiatives across the enterprise. The ideal candidate will have a strong background in Infrastructure as Code (IaC), pipeline automation, and platform engineering, with specific experience configuring and maintaining IBM watsonx and Google Cloud Vertex AI environments.
Key Responsibilities
- Serve as a lead platform engineer responsible for provisioning, configuring, and maintaining the IBM watsonx and Google Cloud Vertex AI environments.
- Ensure platforms are production-ready, secure, cost-effective, and performant across training, inferencing, and orchestration workflows.
- Manage updates, patching, integrations, and uptime for AI/ML platform infrastructure and services.
- Collaborate with product, security, and compliance teams to ensure platform alignment with enterprise policies.
Enterprise MLOps / AIOps Enablement
- Design and implement standardized MLOps and AIOps patterns across business units, enabling consistent and scalable practices.
- Create and maintain reusable workflows for model development, deployment, and lifecycle management.
- Provide onboarding, enablement, and ongoing support to AI/ML teams adopting enterprise platforms and toolsets.
DevSecOps Integration
- Embed security into every phase of the ML lifecycle, integrating tools for scanning, policy enforcement, and vulnerability management.
- Implement guardrails, access controls, and automated compliance checks across all CI/CD and IaC processes.
- Ensure platform and model deployments meet enterprise and regulatory requirements.
Infrastructure as Code & Pipeline Automation
- Develop and maintain IaC templates using tools like Terraform, CloudFormation, and Ansible to provision AI/ML infrastructure.
- Build and optimize CI/CD pipelines for AI/ML assets, including data pipelines, model training workflows, deployment artifacts, and monitoring systems.
- Promote and enforce best practices around automation, observability, and reusability of infrastructure and workflow components.
Monitoring, Logging, and Operational Visibility
- Implement comprehensive observability for AI/ML workloads using tools like Prometheus, Grafana, Stackdriver, or Datadog.
- Develop alerts and diagnostics for system health, model drift, data integrity, and deployment anomalies.
- Define KPIs and metrics for evaluating operational health and platform usage.
Requirements
- MLOps
- AIOps
- DevSecOps
- Infrastructure as Code
- AI/ML Platforms
Qualifications
- Bachelor's or Master's degree in Computer Science, Engineering, or a related technical field.
Preferred Technologies
- MLOps
- AIOps
- DevSecOps
- Infrastructure as Code
- AI/ML Platforms
About the company
UPS is committed to providing a workplace free of discrimination, harassment, and retaliation. Explore your next opportunity at a Fortune Global 500 organization. Envision innovative possibilities, experience our rewarding culture, and work with talented teams that help you become better every day.