About the job
Location: Remote
Team: Applied AI / Data Science

Role Overview
As a Data Scientist at blkbox.ai, you will architect, implement, and scale data-driven and LLM-based systems that generate high-performing mobile gaming ads at production level. This role blends data science, multimodal AI, generative models, and LLM automation, focusing on retrieval pipelines, embedding-based similarity search, and production-grade model operations. You will design and deploy end-to-end LLM workflows, from prompt design and model orchestration to RAG pipelines and vector database integration, enabling automated ad creation.

Key Responsibilities

LLM System Design, Data Intelligence & Workflow Engineering
• Architect LLM-powered workflows for narrative generation, scripting, voiceovers, storylines, and other ad concepts
• Build data-driven logic and model selection strategies informed by creative performance and mobile gaming outcomes
• Run structured data analysis using Python and SQL to identify creative attributes correlated with winning ad performance
• Integrate multimodal data sources into creative intelligence pipelines

LLM, RAG & Vector Retrieval
• Use LLM APIs such as OpenAI GPT and Google Gemini for scripted and generative workflows
• Build retrieval-augmented generation (RAG) pipelines for:
  • story / voiceover generation
  • gameplay retrieval
  • creative asset selection
• Work with modern embedding models (OpenAI, sentence-transformers, etc.)
• Implement vector search via Pinecone, Weaviate, Chroma or FAISS
• Experiment with prompt engineering, prompt templates, function calling, and structured prompting

Production AI, Monitoring & Optimization
• Collaborate with backend engineers to integrate LLM components into production pipelines
• Monitor and optimize LLM workflows for:
  • latency
  • token usage and cost
  • throughput
  • failure rates
  • hallucinations
• Evaluate multiple LLMs and embeddings for performance, relevance, and cost efficiency
• Implement reproducible workflows, model evaluation systems, and reliability checks at scale

Cross-Functional Collaboration
• Partner with production and creative teams to translate creative requirements into systematic generative pipelines
• Present technical recommendations clearly to product, engineering, and creative stakeholders
• Own experiments end-to-end including benchmarking, metrics, prototype automation, and deployment

Required Skills & Qualifications

Technical
• Strong proficiency in Python and SQL
• Hands-on experience with LLMs, RAG concepts, embeddings, and vector search
• Experience calling model APIs (OpenAI, Gemini, etc.)
• Familiarity with LangChain, LlamaIndex or similar frameworks is a plus
• Strong understanding of:
  • LLM evaluation frameworks
  • prompt engineering patterns
  • latency and cost optimization
  • model hallucination control
• Solid foundation in statistics, experimentation, and ML deployment workflows

Soft Skills & Behaviors
• Ability to communicate complex technical concepts to non-technical partners
• Curious, inventive, and motivated by experimentation and rapid iteration
• Adaptability in fast-moving production environments
• Strong ownership mindset and comfort working cross-functionally

Why This Role Is Unique
• You’ll be building real production LLM systems, not just prototypes
• You’ll work across creative, multimodal, and generative domains
• You’ll influence automated ad production for some of the largest mobile gaming companies in the world

Requirements
- Python
- LLMs
- Data Science
- Workflow Engineering
Preferred Technologies
- Python
- LLMs
- Data Science
- Workflow Engineering