ClanX

Senior Data Scientist

India Not disclosed
Yesterday
Remote

About the job

Drive frontier research in speech-to-speech and multimodal AI systems to build natural, self-learning, voice-capable AI Workers.

Requirements

  • 3+ years of applied or academic experience in speech, multimodal, or LLM research
  • Bachelor’s or Master’s in Computer Science, AI, or Electrical Engineering
  • Strong in Python and scientific computing, including JupyterHub environments
  • Deep understanding of LLMs, transformer architectures, and multimodal embeddings
  • Experience in speech modeling pipelines: ASR, TTS, speech-to-speech, or audio-language models
  • Knowledge of turn-taking systems, VAD, prosody modeling, and real-time voice synthesis
  • Familiarity with self-supervised learning, contrastive learning, and agentic reinforcement (ART)
  • Skilled in dataset curation, experimental design, and model evaluation
  • Comfortable with tools like Agno, Pipecat, HuggingFace, and PyTorch
  • Exposure to LangChain, vector databases, and memory systems for agentic research
  • Strong written communication and clarity in presenting research insights
  • High research curiosity, independent ownership, and mission-driven mindset

Responsibilities

  • Research and develop direct speech-to-speech modeling using LLMs and audio encoders/decoders
  • Model and evaluate conversational turn-taking, latency, and VAD for real-time AI
  • Explore Agentic Reinforcement Training (ART) and self-learning mechanisms
  • Design memory-augmented multimodal architectures for context-aware interactions
  • Create expressive speech generation systems with emotion conditioning and speaker preservation
  • Contribute to SOTA research in multimodal learning, audio-language alignment, and agentic reasoning
  • Define long-term AI research roadmap with the Research Director
  • Collaborate with MLEs on model training and evaluation, while leading dataset and experimentation design

Interview process

  • Screening / HR round
  • Technical round(s): coding, system design, ML case studies
  • ML / research deep dive
  • Final / leadership round

Requirements

  • speech modeling
  • multimodal AI
  • LLM research
  • Python

Qualifications

  • 3+ years of applied or academic experience in speech, multimodal, or LLM research
  • Bachelor’s or Master’s in Computer Science, AI, or Electrical Engineering


About the company

GoCommotion is a fast-growing startup revolutionizing Customer Experience (CX) with AI Workers — persistent, multimodal agents capable of real-time, contextual conversations across channels. We blend LLMs, voice modeling, and agentic reinforcement to build truly autonomous AI systems.
