AI Backend Engineer

Uplers

Bengaluru • ₹ Not disclosed

10 hours ago

Hybrid

About the job

Experience: 4+ years
Salary: INR 7000000-10000000 / year (based on experience)
Expected Notice Period: 30 Days
Shift: (GMT+05:30) Asia/Kolkata (IST)
Opportunity Type: Hybrid
Placement Type: Full Time Permanent position (Payroll and Compliance to be managed by: Sedona.AI)

(*Note: This is a requirement for one of Uplers' clients - Sedona.AI)

What do you need for this opportunity?

Must have skills required: Python, Rust, Kubernetes, Terraform, Cloud, AI model deployment, ASR/TTS, VLMs

Sedona.AI is Looking for:

We're looking for an AI Backend Engineer at Sedona AI to own the end-to-end inference infrastructure powering AI experiences for millions of homes. You'll design, build, and operate the systems that serve language, audio, and vision models in production - obsessing over p99 latency, cost efficiency, and reliability at scale. If you think deeply about distributed systems, model serving, and real-time data pipelines, and want to solve problems that don't have clean answers yet - this is the role.

What You'll Work On

• Design and own the backend systems that serve AI models in production, spanning language, audio, and vision-language models
• Architect distributed serving infrastructure that handles real-world traffic reliably and efficiently
• Build and optimize real-time audio processing pipelines for cloud-based inference
• Serve vision-language and video understanding models at scale
• Push the boundaries of inference performance through smarter batching, caching, quantization, and serving strategies
• Own the full stack from API design to model serving internals
• Work directly with ML researchers and product engineers to take models from prototype to production

What We're Looking For

• 4+ years of experience in ML infrastructure, platform engineering, or large-scale backend systems
• Strong foundations in systems engineering: distributed systems, backend design, performance optimization
• Experience building and operating large-scale backend services in production
• Hands-on experience deploying AI models using frameworks such as vLLM, SGLang, Triton, or equivalent
• Real understanding of what it takes to serve models reliably, not just spin them up
• Familiarity with real-time or streaming data processing, particularly for audio pipelines
• Understanding of multimodal model architectures: audio (ASR, TTS), vision-language models (VLMs), and how they differ at inference time
• Strong expertise in Kubernetes, Terraform, and cloud platforms (AWS and/or GCP)
• Strong in Python and Rust; solid Linux fundamentals

Nice to Have

• Experience with inference optimization: KV-cache tuning, speculative decoding, dynamic batching
• Familiarity with TensorRT-LLM or hardware-specific GPU optimizations
• Experience with GPU cost modeling, multi-region inference, or traffic routing
• Experience with high-availability system design or load balancing

Requirements

Python
Rust
Kubernetes
Terraform

Preferred Technologies

Python
Rust
Kubernetes
Terraform

About the company

Our goal is to make hiring reliable, simple, and fast. Our role is to help all our talents find and apply for relevant contractual onsite opportunities and progress in their careers. We will support you with any grievances or challenges you may face during the engagement.
