About the job
Why this role
We're building agentic AI for recruitment workflows: sourcing, screening, interview assistance, and offer orchestration. You'll own LLM/agent design, retrieval, evaluation, safety, and targeted traditional ML models where they outperform or complement LLMs.

What you'll do
• Hands-on AI (70-80%): design and build agent workflows (tool use, planning/looping, memory, self-critique) using multi-agent frameworks such as LangChain and LangGraph; experience with similar ecosystems like AutoGen or CrewAI is a plus.
• Retrieval & context (RAG): chunking, metadata, hybrid search, query rewriting, reranking, and context compression.
• Traditional ML: design and ship supervised/unsupervised models for ranking, matching, dedup, scoring, and risk/quality signals.
• Feature engineering, leakage control, CV strategy, imbalanced learning, and calibration.
• Model families: logistic/linear models, tree ensembles, kNN, SVMs, clustering, basic time series.
• Evaluation & quality: offline/online evals (goldens, rubrics, A/B), statistical testing, human-in-the-loop; build small, high-signal datasets.
• Safety & governance: guardrails (policy/PII/toxicity), prompt hardening, hallucination containment; bias/fairness checks for ML.
• Cost/perf optimization: model selection/routing, token budgeting, latency tuning, caching, semantic telemetry.
• Light MLOps (in collaboration): experiment tracking, model registry, reproducible training; coordinate batch/real-time inference hooks with the platform team.
• Mentorship: guide 2-3 juniors on experiments, code quality, and research synthesis.
• Collaboration: pair with full-stack/infra teams on APIs and deployment; you won't own K8s/IaC.

What you've done (must-haves)
• 8-10 years in software/AI, with a recent deep focus on LLMs/agentic systems plus delivered traditional ML projects.
• Strong Python; solid stats/ML fundamentals (bias-variance, CV, A/B testing, power, drift).
• Built multi-agent or tool-using systems with LangChain and/or LangGraph (or equivalent), including function/tool calling and planner/executor patterns.
• Delivered RAG end-to-end with vector databases (pgvector/FAISS/Pinecone/Weaviate), hybrid retrieval, and cross-encoder re-ranking.
• Trained and evaluated production ML models using scikit-learn and tree ensembles (XGBoost/LightGBM/CatBoost); tuned via grid search, Bayesian optimization, or Optuna.
• Set up LLM and ML evals (RAGAS/DeepEval/OpenAI Evals or custom) with clear task metrics and online experiments.
• Implemented guardrails, safety measures, and measurable quality gates for both LLM and ML features.
• Product sense: translate use cases into tasks and metrics; ship iteratively with evidence.
Requirements
- AI
- Python
- ML
- LLMs
- Data Engineering
Qualifications
- 8-10 years in software/AI
- Experience with LLMs/agentic systems