About the job
What You’ll Do
• Build and own AI-backed features end to end, from ideation to production, including layout logic, smart cropping, visual enhancement, outpainting, and GenAI workflows for background fills.
• Design scalable APIs that wrap vision models such as BiRefNet, YOLOv8, Grounding DINO, SAM, CLIP, and ControlNet into batch and real-time pipelines.
• Write production-grade Python code to manipulate and transform image data using NumPy, OpenCV (cv2), PIL, and PyTorch.
• Handle pixel-level transformations, from custom masks and color space conversions to geometric warps and contour ops, with speed and precision.
• Integrate your models into our production web app (AWS-based Python/Java backend) and optimize them for latency, memory, and throughput.
• Frame problems when specs are vague: you’ll help define what “good” looks like, and then build it.
• Collaborate with product, UX, and other engineers without relying on formal handoffs; you own your domain.
What You’ll Need
• 2–3 years of hands-on experience with vision and image generation models such as YOLO, Grounding DINO, SAM, CLIP, Stable Diffusion, VITON, or TryOnGAN, including inpainting and outpainting workflows built on Stable Diffusion pipelines (e.g., Diffusers, InvokeAI, or custom-built solutions).
• Strong hands-on knowledge of NumPy, OpenCV, PIL, PyTorch, and image visualization/debugging techniques.
• 1–2 years of experience working with popular LLM APIs such as OpenAI, Anthropic, and Gemini, and composing multi-modal pipelines with them.
• Solid grasp of production model integration: model loading, GPU/CPU optimization, async inference, caching, and batch processing.
• Experience solving real-world visual problems like object detection, segmentation, composition, or enhancement.
• Ability to debug and diagnose visual output errors, e.g., segmentation artifacts, off-center crops, and broken masks.
• Deep understanding of image processing in Python: array slicing, color formats, augmentation, geometric transforms, contour detection, etc.
• Experience building and deploying FastAPI services and containerizing them with Docker for AWS-based infra (ECS, EC2/GPU, Lambda).
• A customer-centric approach: you think about how your work affects end users and product experience, not just model performance.
• A commitment to high-quality deliverables: you write clean, tested code and debug edge cases until they’re truly fixed.
• The ability to frame problems from scratch and work without strict handoffs: you build from a goal, not a ticket.
Requirements
- Vision and Image Generation Models
- Python
- NumPy
- OpenCV
- FastAPI
- LLM APIs
Preferred Technologies
- Vision and Image Generation Models
- Python
- NumPy
- OpenCV
- FastAPI
- LLM APIs
About the company
Our goal is to make hiring reliable, simple, and fast. Our role is to help our talent find and apply for relevant contractual onsite opportunities and progress in their careers. We will help resolve any grievances or challenges you may face during the engagement.