Senior AI Engineer (Computer Vision+LLM)

Berlin, Germany


Full Time · Senior-level / Expert · EUR 101K - 188K (est.) *

Delivery Hero

Delivery Hero - Always delivering an amazing experience.



Posted 5 hours ago

Company Description

As the world’s pioneering local delivery platform, our mission is to deliver an amazing experience: fast, easy, and to your door. We operate in 70+ countries worldwide, powered by tech, designed by people. As one of Europe’s largest tech platforms, headquartered in Berlin, Germany, Delivery Hero has been listed on the Frankfurt Stock Exchange since 2017 and is part of the MDAX stock market index. We enable creative minds to deliver solutions that create impact within our ecosystem. We move fast, take action and adapt. No matter where you're from or what you believe in, we build, we deliver, we lead. We are Delivery Hero.

Job Description

We are on the lookout for a hands-on Sr. AI Engineer to join our Vendor Data Team as we push the boundaries of content understanding in real-world scenarios. This role focuses on building AI systems that can see and understand, going beyond text to tackle complex visual inputs — from everyday photos to noisy, low-resolution images captured in uncontrolled environments.

You’ll work at the intersection of computer vision and natural language processing to build multimodal AI pipelines that extract structured insights from unstructured, image-based content. Your work will ship, not sit in research notebooks.

We don’t just build models — we build autonomous, adaptive systems that orchestrate reasoning, take action, and evaluate themselves. From OCR pipelines enhanced by LLMs to agentic workflows that make decisions and improve with feedback, your code will help turn raw visual data into usable intelligence — at scale, and in production.

And the best part? Your role has a direct impact on making food more affordable for millions of people every day.

Build and deploy multimodal AI pipelines combining OCR, computer vision, and LLMs to extract structured data from images and photos.
Design agentic AI systems that dynamically generate prompts, invoke tools, and make decisions based on visual content.
Develop evaluation frameworks for AI pipelines and agentic systems, including synthetic simulations, automated feedback loops, and performance benchmarking.
Optimize OCR and visual content parsing for challenging real-world inputs using tools like Tesseract, LayoutLM, Donut, or TrOCR.
Work closely with software engineers to seamlessly integrate AI features into our production systems.
Design and maintain scalable infrastructure for LLM inference, API serving, and orchestration.
Implement comprehensive logging, observability, and monitoring for AI applications in production to ensure reliability and performance.

Qualifications

3+ years of hands-on experience in applied AI/ML, ideally in computer vision, document/image understanding, or NLP.
Practical knowledge of image preprocessing (denoising, deblurring, segmentation) and pipeline optimization.
Experience integrating AI models into real-time or batch applications (e.g., bots, assistants, search).
Strong software engineering background with experience in APIs, microservices, and distributed systems.
Familiarity with vector databases (e.g., FAISS, Weaviate, Pinecone), embedding strategies, and retrieval-augmented generation (RAG), as well as observability tools like Langfuse, Grafana, or OpenTelemetry.
Experience designing or contributing to evaluation systems for AI agents, including prompt testing, output validation, synthetic data generation, or simulation-based evaluation.
You are a pragmatic engineer who understands what is needed to get things done in a collaborative manner. You’re a self-organizing, proactive person eager to work in a fast-paced, fault-tolerant, and agile environment: you unblock yourself, ship code quickly, and iterate to improve.

Additional Information

Familiarity with judge models, human-in-the-loop feedback, or agent safety validation
Understanding of prompt engineering for visual contexts and multimodal agents
Experience building autonomous workflows with frameworks like LangChain, AutoGen, or CrewAI
Background in delivering AI-powered features inside real-world web or mobile applications

We believe diversity and inclusion are key to creating not only an exciting product, but also an amazing customer and employee experience. Fostering this starts with hiring - therefore we do not discriminate on the basis of racial identities, religious beliefs, color, national origin, gender identities or expressions, sexual orientations, age, marital or disability statuses, or any other aspect that makes you, you. We encourage you to let us know if you need any accommodations or specific accessibility support to ensure a smooth interview experience—just include it in your application. You're welcome to share your pronouns (he/she/they) right from the start so we can address you respectfully from our first contact.


* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

