About the job
As an AI Engineer at our company, you will be responsible for leveraging your expertise in Large Language Models (LLMs), Retrieval-Augmented Generation (RAG) systems, and Vector Database architectures to build end-to-end AI-driven solutions. Your role will involve enhancing knowledge retrieval, automating reasoning, and delivering scalable conversational or cognitive experiences.

**Key Responsibilities:**
- **RAG Architecture Design:**
  - Develop and implement Retrieval-Augmented Generation pipelines using LLMs integrated with external knowledge sources and vector stores.
- **LLM Integration & Fine-Tuning:**
  - Fine-tune or prompt-engineer models like GPT, Llama, Falcon, Mistral, T5, or Claude.
  - Optimize inference workflows for efficiency, context management, and accuracy.
- **Document Processing & Chunking:**
  - Design intelligent text-splitting and chunking strategies for long documents.
  - Build embedding generation and context retrieval pipelines.
- **Vector Database Management:**
  - Integrate and optimize vector stores like FAISS, Pinecone, Chroma, Weaviate, Milvus, or Qdrant.
  - Implement similarity search, hybrid retrieval, and ranking mechanisms.
- **Python-Based AI Development:**
  - Build APIs and microservices using FastAPI / Flask / LangChain / LlamaIndex.
  - Create reusable AI pipelines for inference, retraining, and data ingestion.
- **Data Handling & Preprocessing:**
  - Clean, transform, and index structured and unstructured data for efficient knowledge retrieval.
- **Performance Optimization & Monitoring:**
  - Evaluate model performance using precision, recall, BLEU, ROUGE, or RAG-specific metrics.
  - Deploy and monitor models using Docker, MLflow, and cloud environments (AWS/GCP/Azure).
- **Collaboration:**
  - Work cross-functionally with data scientists, backend engineers, and domain experts to integrate AI models into enterprise applications.

**Required Skills & Tools:**
- **Core Skills:** Programming: Python (mandatory), familiarity with TypeScript or Node.js is a plus
- **LLM Frameworks:** LangChain, LlamaIndex, Hugging Face Transformers
- **Vector Databases:** FAISS, Pinecone, Chroma, Weaviate, Milvus, Qdrant
- **Model Types:** OpenAI GPT, Llama2/3, Mistral, Falcon, Claude, Gemini
- **Embedding Models:** Sentence Transformers, OpenAI Embeddings, Instructor, or Custom Models
- **RAG Stack:** Document loaders, text chunking, embedding generation, retrieval, context assembly
- **APIs & Deployment:** FastAPI, Flask, Docker, MLflow, Streamlit
- **Version Control:** Git, GitHub/GitLab

Requirements
- LLMs
- RAG
- Python
- FastAPI
- Flask
Preferred Technologies
- LLMs
- RAG
- Python
- FastAPI
- Flask