About the job
Job Description:
Primary Skills:
• Languages & Frameworks: Python, PyTorch, Hugging Face Transformers
• Modeling: SLM/LLM fine-tuning, Transformer architectures, vLLM
• Infrastructure: On-premise GPU deployment
• MLOps: CI/CD pipelines, model observability tools
Job Summary and Roles & Responsibilities:
AI Sr Engineer (SA), 2 Positions
We are seeking an experienced and hands-on Data Scientist with around 9 to 11 years of experience in AI/ML, particularly in fine-tuning and deploying Large Language Models (LLMs) and Small Language Models (SLMs). This is a high-priority role, and we would like to engage in a technical discussion with the identified candidate prior to onboarding.
Job Description:
The selected candidate will contribute to the end-to-end lifecycle of LLM/SLM fine-tuning and deployment. The role involves working on incremental fine-tuning strategies, optimizing inference using vLLM, and supporting DevOps-based CI/CD pipelines. The ideal candidate should have practical experience in model training, infrastructure planning, and deploying models on on-premise GPU environments. A strong foundation in data preparation, model evaluation, and monitoring is essential.
Key Responsibilities:
Model Development & Fine-Tuning
• Apply incremental fine-tuning techniques to extend model checkpoints with stability and version control.
• Select and configure SLMs based on specific business requirements.
• Tune hyperparameters and set up training environments.
• Evaluate and compare models using appropriate performance metrics.
Data Preparation
• Collect, clean, tokenize, and split datasets into training, validation, and test sets.
• Ensure data quality and relevance for model training.
Training & Evaluation
• Train models using curated datasets and monitor for overfitting and performance issues.
• Validate models on unseen data and assess generalization.
• Understand and apply the appropriate use of SLMs, SLM fine-tuning, or LLMs based on business context.
Inference & Model Serving
• Use vLLM for efficient, scalable inference.
• Deploy models with optimized endpoints for performance and reliability.
Infrastructure & Deployment
• Estimate and manage compute resource requirements (CPU/GPU/memory/storage).
• Deploy models on on-premise GPU infrastructure with a focus on efficiency.
• Collaborate with engineering teams to integrate models into production systems.
DevOps & CI/CD
• Support the development of CI/CD pipelines for model training and deployment.
• Contribute to automation of retraining workflows based on model drift.
Monitoring & Maintenance
• Assist in setting up monitoring systems for real-time model performance.
• Participate in periodic retraining and maintenance to ensure model accuracy and relevance.
Technology Exploration & POCs
• Stay updated with the latest trends in AI/ML tools, frameworks, and methodologies.
• Conduct proof-of-concept (POC) projects to evaluate new technologies for potential adoption.
• Document findings and present recommendations to the team.
Required Skills & Tools:
• Languages & Frameworks: Python, PyTorch, Hugging Face Transformers
• Modeling: SLM/LLM fine-tuning, Transformer architectures, vLLM
• Infrastructure: On-premise GPU deployment
• MLOps: CI/CD pipelines, model observability tools
Requirements
- Python
- PyTorch
- Hugging Face Transformers
- SLM/LLM fine-tuning
- Transformer architectures
- vLLM
Qualifications
- 9 to 11 years of experience in AI/ML
Preferred Technologies
- Python
- PyTorch
- Hugging Face Transformers
- SLM/LLM fine-tuning
- Transformer architectures
- vLLM