About the job
Role Overview
You will design, train, optimize, and deploy machine learning models for on-device inference under strict latency and memory constraints.
Your focus areas:
- Small LLMs / SLMs (<1B parameters)
- Computer Vision models for real-time applications
- Model compression and quantization
- Edge deployment and performance optimization with tools like Qualcomm AI Hub
Key Responsibilities
- Train and fine-tune small LLMs for domain-specific applications.
- Use tools like Qualcomm AI Hub.
- Develop and optimize small Computer Vision models for image explanation and description, such as SmolVLM.
- Apply quantization, pruning, distillation, and LoRA/QLoRA.
- Deploy models using TensorFlow Lite, ONNX Runtime, TensorRT, or PyTorch Mobile.
- Optimize models for low-latency, low-memory environments.
- Build real-time AI pipelines for AR/VR and Android-based devices.
- Work closely with the Unity, mobile, Android native module, and backend teams on AI integration.
Required Skills
- Strong expertise in Python and in PyTorch and/or TensorFlow.
- Hands-on experience with offline edge deployment and model optimization.
- Experience with YOLO, MobileNet, EfficientNet, or similar lightweight architectures.
- Knowledge of transformer architectures and small LLM fine-tuning.
- Experience deploying models on Android, embedded systems, or edge hardware.
- Strong understanding of memory, compute, and inference optimization.
What We’re Looking For
- Someone who has deployed real on-device AI systems (not only trained models in the cloud).
- A strong performance optimization mindset.
- Someone comfortable working in R&D-driven, fast-moving environments.
Requirements
- Machine Learning
- Edge AI
- Computer Vision
- Python
- TensorFlow
- PyTorch
About the company
InfiVR builds advanced VR/AR and AI-powered enterprise solutions. We specialize in real-time, on-device intelligence for immersive training, industrial automation, and smart assistance systems.