About the job
- Define system architecture for multimodal document processing, OCR, LMMs, and data pipelines.
- Design modular, pluggable OCR evaluation platform (Tesseract, Textract, DocAI, GPT-Vision, LayoutLM, Donut, etc.).
- Architect preprocessing (OpenCV), post-processing, normalization, and document understanding layers.
- Lead design of routing engines, ensemble logic, and feedback fine-tuning loops.
- ... Oversee data lake, vector index, and metadata storage design.
- Guide cloud infrastructure (AWS/Azure/GCP) for scalable processing of millions of documents.
Requirements
- Machine Learning
- Artificial Intelligence
- OCR
- Data Pipelines
Qualifications
- Undergraduate degree
Preferred Technologies
- Machine Learning
- Artificial Intelligence
- OCR
- Data Pipelines