About the job
As an AI Engineer at our company, you will be responsible for leveraging your expertise in Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) systems, and Vector Database architectures to build end-to-end AI-driven solutions. Your role will involve enhancing knowledge retrieval, automating reasoning, and delivering scalable conversational or cognitive experiences.
Key Responsibilities:
• RAG Architecture Design:
- Develop and implement Retrieval-Augmented Generation pipelines using LLMs integrated with external knowledge sources and vector stores.
• LLM Integration & Fine-Tuning:
- Fine-tune or prompt-engineer models like GPT, Llama, Falcon, Mistral, T5, or Claude.
- Optimize inference workflows for efficiency, context management, and accuracy.
• Document Processing & Chunking:
- Design intelligent text-splitting and chunking strategies for long documents.
- Build embedding generation and context retrieval pipelines.
• Vector Database Management:
- Integrate and optimize vector stores like FAISS, Pinecone, Chroma, Weaviate, Milvus, or Qdrant.
- Implement similarity search, hybrid retrieval, and ranking mechanisms.
• Python-Based AI Development:
- Build APIs and microservices using FastAPI / Flask / LangChain / LlamaIndex.
- Create reusable AI pipelines for inference, retraining, and data ingestion.
• Data Handling & Preprocessing:
- Clean, transform, and index structured and unstructured data for efficient knowledge retrieval.
• Performance Optimization & Monitoring:
- Evaluate model performance using precision, recall, BLEU, ROUGE, or RAG-specific metrics.
- Deploy and monitor models using Docker, MLflow, and cloud environments (AWS/GCP/Azure).
• Collaboration:
- Work cross-functionally with data scientists, backend engineers, and domain experts to integrate AI models into enterprise applications.
Required Skills & Tools:
• Core Skills:
- Programming: Python (mandatory); familiarity with TypeScript or Node.js is a plus
• LLM Frameworks:
- LangChain, LlamaIndex, Hugging Face Transformers
• Vector Databases:
- FAISS, Pinecone, Chroma, Weaviate, Milvus, Qdrant
• Model Types:
- OpenAI GPT, Llama 2/3, Mistral, Falcon, Claude, Gemini
• Embedding Models:
- Sentence Transformers, OpenAI Embeddings, Instructor, or Custom Models
• RAG Stack:
- Document loaders, text chunking, embedding generation, retrieval, context assembly
Requirements
- Python
- LLMs
- RAG
- Vector Databases
- Document Processing
Preferred Technologies
- Python
- LLMs
- RAG
- Vector Databases
- Document Processing