Lead Data Scientist, Vision & Multimodal AI
About the job
We are looking for a Lead Data Scientist, Vision & Multimodal AI, to architect and build next-generation Vision-Language Model (VLM) systems at scale.

This role requires deep expertise in:
• Architecting and implementing RLHF (Reinforcement Learning from Human Feedback) frameworks
• Training and fine-tuning open-source Vision-Language Models (VLMs)
• Deploying and scaling multimodal models to production serving millions of ...

Responsibilities

Build RLHF Frameworks:
• Design end-to-end RLHF pipelines (SFT → Reward Modeling → PPO/DPO)
• Develop scalable human feedback collection systems
• Implement preference modeling and ranking pipelines
• Optimize reward models for multimodal outputs (image + text)
• Build automated evaluation ...

Fine-Tune OSS Vision-Language Models:
• Experience working with Qwen-VL, Llama, GPT OSS
• Pretraining and instruction tuning of multimodal models
• Parameter-efficient fine-tuning (LoRA, QLoRA)
• Dataset curation and synthetic data generation
• Scaling training on multi-GPU/multi-node clusters
• Optimizing for alignment, hallucination reduction, and ...

Scalable Deployment of VLM Systems:
• Design distributed, GPU-optimized inference pipelines
• Serve models using vLLM and Triton Inference Server
• Optimize latency, throughput, and cost
• Implement batching, KV caching, quantization, and tensor parallelism
• Deploy on Kubernetes-based infrastructure
• Build monitoring for drift, performance, and ...

AI System Design:
• Architect systems combining OCR, vision encoders, LLMs, and retrieval
• Implement retrieval-augmented multimodal pipelines
• Design evaluation benchmarks for VQA, grounding, and reasoning
• Ensure model safety and ...

Leadership:
• Lead a team of ML engineers and research scientists
• Define the technical roadmap for multimodal AI
• Review model architectures and code quality
• Collaborate with product and infrastructure ...

Qualifications:
• 6+ years in ML/AI
• 2+ years working with large-scale LLM or VLM systems
• Strong hands-on experience building RLHF pipelines (not just using libraries)
• Deep PyTorch expertise
• Experience training models with more than 7B parameters
• Experience with distributed training (DeepSpeed, FSDP)
• Production-grade deployment experience handling 10k+ QPS workloads
• Strong understanding of transformer architectures
Requirements
- Architecting RLHF
- Training Vision-Language Models
- Scaling multimodal models
- PyTorch