Bluvin Solutions Private Limited

Senior NLP/AI Engineer

Noida Not disclosed
5 hours ago
Hybrid

About the job

About the Role

We are seeking an experienced Senior NLP/AI Engineer to lead development of cutting-edge AI systems powering translation, transliteration, ASR, OCR, LLM, and RAG products serving enterprise and government clients. This role requires expertise in building, optimizing, and deploying production-grade NLP and speech models at scale, along with strong infrastructure, model engineering, and cross-functional collaboration skills.

You will be responsible for designing innovative AI solutions, training and optimizing models, improving system performance, and ensuring high availability of AI services used across the platform, chatbot systems, and enterprise AI deployments.

Key Responsibilities

1. AI Model Development & Training

- Train, fine-tune, and deploy models across multiple domains:
  - Multilingual Neural Machine Translation (NMT)
  - Adaptive Translation Systems
  - Multilingual Transliteration models (Indian languages)
  - Speech-to-Text (ASR / Whisper / NVIDIA NeMo / Indic-ASR)
  - Text-to-Speech (TTS)
  - Large Language Models (LLMs)
  - Embedding models for RAG
- Build multilingual models supporting 20+ Indian languages.
- Perform dataset creation, preprocessing, augmentation, and large-scale training.
- Conduct model benchmarking using chrF++, BLEU, WER, CER, and custom evaluation metrics.
- Convert models to optimized inference formats (CTranslate2, Faster-Whisper, AWQ/INT4/INT8 quantization).

2. Model Optimization for Production

- Reduce model sizes through quantization and pruning.
- Improve inference speed for real-time workloads.
- Optimize GPU/CPU utilization and memory footprint for large models.
- Build scalable inference pipelines for translation, ASR, and RAG.

3. Audio & Video Processing Systems

- Develop advanced audio transcription and translation pipelines.
- Implement real-time STT systems for Indic languages.
- Build video subtitle extraction and SRT translation workflows.
- Integrate diarization, language detection, summarization, and cross-lingual translation.

4. RAG & LLM-Based Systems

- Architect multilingual Retrieval-Augmented Generation (RAG) pipelines.
- Build vector databases and embedding models.
- Implement document indexing, chunking, parsing, and hybrid retrieval.
- Integrate LLMs (Llama, Gemma, Qwen, etc.) for chatbot and voice-bot systems.

5. Infrastructure & Server Management

- Manage AI/ML servers on AWS and GCP (GPU VM provisioning, optimization).
- Reduce infrastructure cost by optimizing GPU usage, scheduling, and server consolidation.
- Implement auto-restart, monitoring, logging, and fail-safe mechanisms for all AI services.
- Deploy high-availability APIs for translation, transliteration, ASR, OCR, and chatbots.
- Familiarity with cloud-based GPU environments and troubleshooting (NVIDIA drivers).

6. Cross-Functional Ownership

- Work with Sales, Ops, and Tech teams to troubleshoot, support clients, and deliver large projects.
- Maintain detailed documentation for product flows, APIs, and model deployments.
- Handle urgent escalations, server crashes, and mission-critical deployments.
- Create internal tools and FAQs to reduce dependency on the AI team.

Required Skills & Experience

- Strong background in NLP, Speech, Deep Learning, and Generative AI.
- Experience: 4-5 years in production ML/NLP systems.
- Hands-on experience with:
  - Python, PyTorch, TensorFlow
  - Speech-to-text and text-to-speech models, open-source LLMs, Transformer architectures
  - CTranslate2, Faster-Whisper, ONNX Runtime
  - LLM inference frameworks such as vLLM and SGLang, and LLM quantization techniques
  - Vector DBs (FAISS, Pinecone)
  - Docker, FastAPI, Linux systems
  - AWS/GCP GPU infrastructure
- Expertise in multilingual NLP, especially Indian languages.
- Experience creating datasets and training models from scratch.

Requirements

  • NLP
  • Speech
  • Deep Learning
  • Generative AI
  • Python
  • PyTorch
  • TensorFlow
  • Speech to text
  • Text to speech
  • CTranslate2
  • Faster-Whisper
  • Vector DBs

Preferred Technologies

  • NLP
  • Speech
  • Deep Learning
  • Generative AI
  • Python
  • PyTorch
  • TensorFlow
  • Speech to text
  • Text to speech
  • CTranslate2
  • Faster-Whisper
  • Vector DBs
