Bryckel AI

Junior ML Engineer

Moradabad · Not disclosed
4 days ago
Remote

About the job

Junior ML Engineer – LLM Infrastructure & Orchestration

About Us

We are a legal AI platform that ingests entire contracts and runs long-context, multimodal LLM pipelines on AWS Bedrock (Claude) and Vertex AI (Gemini). We operate schema-constrained LLM systems: prompts define intent, and Pydantic models enforce structure, validation, and reliability across production workflows. We're hiring an ML Engineer (~1 year of experience) to own LLM orchestration, latency, and scaling for workflows already live with customers. Candidates should be available to join immediately or within 1 month. This role is production ML systems engineering, not model training.

What You'll Do

  • Build and operate end-to-end LLM pipelines for full-document analysis (100–500+ page contracts)
  • Implement schema-first LLM inference using Pydantic to produce deterministic, typed outputs
  • Own LLM orchestration logic: prompt routing, validation, retries, fallbacks, and partial re-execution
  • Optimize latency, throughput, and cost for long-context inference (batching, streaming, async execution)
  • Build and scale OCR → document parsing → LLM inference pipelines for scanned leases (Textract)
  • Develop streaming and async APIs using FastAPI
  • Manage distributed background workloads with Celery (queues, retries, idempotency, backpressure)
  • Productionize report generation (DOCX/XLSX) as deterministic pipeline outputs
  • Deploy, monitor, and scale inference workloads on AWS (Bedrock, EC2, S3, Lambda)
  • Debug production issues: timeouts, schema failures, partial extractions, cost spikes

What You'll Own Technically

  • Pydantic-based schemas for all LLM outputs
  • Prompt ↔ schema contracts and versioning
  • Validation, retry, and fallback mechanisms
  • Latency and cost optimization for long-context inference
  • Reliability of OCR + LLM pipelines at scale

Must Have

  • Strong Python and async programming fundamentals
  • ~1 year of experience working on production ML or LLM systems
  • Hands-on experience with Claude, Gemini, and AWS Bedrock
  • Experience with schema-constrained LLM outputs (Pydantic, JSON Schema, or similar)
  • Experience with OCR and document-heavy pipelines
  • Experience with Celery or distributed async job systems
  • Comfort treating LLMs as non-deterministic services requiring validation and retries
  • Individual-contributor mindset in a lean startup
  • Available to join immediately or within 1 month

Nice to Have (Strong ML Signals)

  • Experience with streaming LLM responses
  • Familiarity with long-context failure modes and truncation issues
  • Experience with LLM output evaluation or regression testing
  • Cost monitoring and optimization for LLM inference

Why Join Us

  • Work on real production ML systems, not demos
  • Own core LLM infrastructure end-to-end
  • Direct exposure to long-context, document-scale AI
  • Fully remote, fast-paced startup
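The schema-constrained pattern the role describes — treating the LLM as a non-deterministic service whose output is validated against a Pydantic model, with retries on failure — can be sketched roughly as follows. The schema fields, the `parse_with_retry` helper, and the stub LLM are all illustrative assumptions, not Bryckel's actual models or API:

```python
from pydantic import BaseModel, Field, ValidationError

# Hypothetical schema for one extracted contract clause.
class ClauseExtraction(BaseModel):
    clause_type: str
    page: int = Field(ge=1)
    text: str
    confidence: float = Field(ge=0.0, le=1.0)

def parse_with_retry(call_llm, prompt: str, max_retries: int = 3) -> ClauseExtraction:
    """Validate each LLM response against the schema; retry on failure."""
    last_err = None
    for _ in range(max_retries):
        raw = call_llm(prompt)  # returns a JSON string
        try:
            return ClauseExtraction.model_validate_json(raw)
        except ValidationError as err:
            last_err = err  # could be fed back into the next prompt
    raise RuntimeError(f"schema validation failed after {max_retries} attempts") from last_err

# Stub LLM that returns an incomplete answer once, then valid JSON:
responses = iter(['{"clause_type": "indemnity"}',
                  '{"clause_type": "indemnity", "page": 12, '
                  '"text": "Tenant shall indemnify...", "confidence": 0.91}'])
result = parse_with_retry(lambda p: next(responses), "Extract the indemnity clause.")
print(result.clause_type, result.page)  # indemnity 12
```

In production the `call_llm` stub would be a Bedrock or Vertex AI invocation, and the validation error would typically be appended to the retry prompt so the model can self-correct.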

Requirements

  • Python
  • async programming
  • Claude
  • Gemini
  • AWS Bedrock
  • Pydantic
  • OCR
  • Celery
  • distributed async job systems

Preferred Technologies

  • streaming LLM responses
  • handling long-context failure modes and truncation
  • LLM output evaluation / regression testing
  • cost monitoring and optimization for LLM inference
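"Streaming LLM responses," listed above, means yielding tokens to the caller as they arrive instead of buffering the full long-context response. A minimal stdlib-only sketch using an async generator; `fake_llm_stream` is a hypothetical stand-in for a Bedrock/Vertex streaming API:

```python
import asyncio
from typing import AsyncIterator

async def fake_llm_stream(prompt: str) -> AsyncIterator[str]:
    # Stand-in for a streaming model API: yields tokens one at a time.
    for token in ["The", " lease", " term", " is", " 24", " months", "."]:
        await asyncio.sleep(0)   # simulate network latency between chunks
        yield token

async def collect(prompt: str) -> str:
    chunks = []
    async for token in fake_llm_stream(prompt):
        chunks.append(token)     # in an API endpoint, flush each chunk to the client here
    return "".join(chunks)

text = asyncio.run(collect("Summarize the lease term."))
print(text)  # The lease term is 24 months.
```

In a FastAPI service the `async for` loop would feed a `StreamingResponse` so clients start reading output before the model finishes.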

About the company

We are a legal AI platform that ingests entire contracts and runs long-context, multimodal LLM pipelines on AWS Bedrock (Claude) and Vertex AI (Gemini).
