About the job
About the Role:
Our client, a medium-sized Financial Services & Technology company based in the US, is seeking a highly skilled Senior Machine Learning Engineer to support LLM evaluation, task design, and advanced model benchmarking. This is a short-term, part-time remote engagement with flexible hours, ideal for ML professionals with competitive modeling backgrounds and strong applied research skills.
Key Responsibilities:
- Design and develop LLM evaluation tasks, benchmarks, and assessment frameworks.
- Create datasets, prompts, scoring methods, and ground-truth structures for robust LLM evaluation.
- Build tools for model comparison, error analysis, and performance measurement.
- Conduct research-driven experimentation using modern ML/NLP frameworks.
- Document evaluation methodologies, insights, and reproducible pipelines.
Minimum Qualifications:
- 3+ years of full-time experience in machine learning model development.
- Technical degree in Computer Science, Electrical Engineering, Mathematics, Statistics, or a related field.
- Demonstrated competitive ML experience (Kaggle, DrivenData, etc.).
- Evidence of high performance in ML competitions (Kaggle medals, finalist placements, leaderboard ranks).
- Strong proficiency in Python, PyTorch/TensorFlow, and modern ML/NLP toolkits.
Preferred Qualifications:
- Experience working with LLMs, prompt engineering, or task generation.
- Background in NLP, generative models, or evaluation research.
- Familiarity with data-centric evaluation, annotation workflows, and scoring metrics.
- Prior consulting or short-term project experience.
Pay & Benefits:
- Short-term, part-time consulting with flexible hours.
- Competitive hourly or retainer-based compensation.
- Work on cutting-edge LLM evaluation and AI research problems.
- Collaborate with globally distributed technical teams.
- 100% remote and flexible work setup.
Employment Details:
- Position: Senior Machine Learning Engineer – LLM Evaluation / Task Creation
- Engagement: Independent Contractor (Remote, Short-Term)
- Industry: AI, Machine Learning, NLP Research
- Key Skills: LLM Evaluation, Python, PyTorch, NLP, ML Competitions, Task Creation
- Confidentiality & IP Protection Required.
Requirements
- Machine Learning
- LLM Evaluation
- Python
- NLP
- Task Creation
Qualifications
- Technical degree in Computer Science, Electrical Engineering, Mathematics, Statistics, or a related field
Preferred Technologies
- Machine Learning
- LLM Evaluation
- Python
- NLP
- Task Creation
Benefits
- Competitive hourly compensation
- Flexible work setup