About the job
About TalentGum
TalentGum is transforming extracurricular learning for children aged 5–14 through engaging live online courses in subjects such as music, dance, chess, and public speaking. Our mission is to build the next generation of learning intelligence: an AI-driven platform that can observe, understand, and help children improve their creative and cognitive skills through personalized, real-time insights.

Role Overview
As an AI Research Engineer, you'll design and develop machine learning systems that can understand and evaluate human performance, starting with sound (music, speech) and later expanding to vision (movement, gestures). You'll collaborate with Audio/Vision Engineers, Backend Developers, and Subject Matter Experts (SMEs) such as musicians, dancers, and coaches to convert expert intuition into measurable, scalable AI feedback.

Key Responsibilities
- Research, design, and train AI models for real-time performance evaluation across domains (music first, then dance, speech, and chess).
- Implement and optimize deep learning architectures for audio and/or visual understanding (CNNs, RNNs, Transformers).
- Work closely with Audio/Vision Engineers to build data pipelines for clean, real-time feature extraction (spectrograms, keypoints, pose sequences).
- Collaborate with SMEs to define "performance quality" metrics and label datasets.
- Develop evaluation frameworks to quantify model accuracy against expert feedback.
- Experiment with cross-modal fusion (audio + vision) for synchronized analysis in future domains like dance.
- Optimize models for low-latency inference on web and mobile devices (ONNX, TensorRT, TF Lite).
- Document research findings and prototype outcomes, and contribute to internal knowledge-sharing.

Required Skills & Experience
- 3+ years of hands-on experience in Machine Learning / Deep Learning (PyTorch, TensorFlow).
- Strong mathematical foundation in signal processing, time-series analysis, and statistics.
- Proven experience with audio or visual data: music, speech, motion, or similar perceptual domains.
- Familiarity with MIR (Music Information Retrieval) or Computer Vision tasks such as:
  - Pitch detection, beat tracking, timbre classification, speech analysis
  - Pose estimation, gesture recognition, or motion tracking
- Experience with model optimization and deployment (TorchScript, ONNX, TensorRT).
- Strong Python skills and familiarity with libraries such as NumPy, pandas, Librosa, Essentia, OpenCV, or MediaPipe.

Nice to Have
- Research or published work in audio AI, multimodal AI, or performance evaluation.
- Experience building or experimenting with real-time ML inference systems.
- Background in music, performing arts, or educational AI.
- Familiarity with cloud platforms (AWS, GCP) and CI/CD for ML (MLflow, DVC).
- Curiosity and creativity in experimenting with human-centered AI.

Requirements
- Machine Learning
- Deep Learning
- Python
- PyTorch
- TensorFlow