Senior AI Software Engineer
Heidelberg
Aleph Alpha
Pioneering sovereign, European AI technology to transform human-machine interaction that can find solutions for the challenges of tomorrow.
Aleph Alpha Research’s mission is to deliver category-defining AI innovation that enables open, accessible, and trustworthy deployment of GenAI in industrial applications. Our organization develops foundational models and next-generation methods that make it easy and affordable for Aleph Alpha’s customers to increase productivity in finance, administration, R&D, logistics, and manufacturing processes.
We are hiring to grow our org in Heidelberg, Germany, and are looking for well-rounded, experienced Senior AI Software Engineers with experience in DevOps/MLOps.
As a Senior AI Software Engineer in Aleph Alpha Research, you help the research teams take model and algorithm development to the next level. You own significant portions of the research infrastructure, including the pipelines related to data processing, our testing infrastructure, and engineering-heavy parts of our distributed training software. You also contribute your software engineering experience to research projects that have a significant influence on our ability to deliver novel category-defining AI capabilities.
As part of our Infrastructure and Platform Engineering Team, you maintain and develop cluster infrastructure, manage cloud infrastructure, build and manage DevOps and MLOps pipelines, implement SE best practices (CI/CD, monitoring, testing frameworks), and collaborate and co-develop with our Data Center Infrastructure Team as well as Product Teams on shared platform components or projects.
In our Data Engineering and Distributed Training Team, you engineer and optimize data processing pipelines, develop and maintain components of distributed training software, and build and maintain infrastructure and support for data-heavy tasks.
As part of our Research SE Teams, you work alongside Researchers and other SW Engineers on model and algorithm development, and collaborate on ablation studies, Proofs of Concept (POCs), and model optimizations. You create robust, maintainable codebases that support efficient transition of new technologies and R&D artifacts from research to production, and co-own efforts that aim to make parts of our code source-available to the broader research community.
Your responsibilities
Depending on your profile, you will contribute to one or more of the following areas:
Design and continuously develop the research infrastructure, and establish mechanisms that improve code quality, testing, and feature delivery
Support the development, training, and maintenance of deep learning models, in collaboration with the researchers as well as the SW/HW engineers at our distributed computation centers
Develop and optimize lower-level code for data processing, tokenization, or research projects
Contribute your software-engineering expertise to research projects (for example, in areas such as agent interfaces or data generation)
Help productize AI research innovations into real-world applications
Engage in our hiring process and mentor engineers and researchers in software development best practices
Most of our training code is written in Python, with PyTorch being our main deep learning framework. Some of our lower-level code is written in Rust.
Your profile
Basic Qualifications
5+ years of non-internship professional experience across the full software development lifecycle, including coding standards, code reviews, source control management, build processes, testing, and operational excellence
Demonstrated ability to solve complex and novel problems independently using state-of-the-art scientific approaches
Proven track record in design and architecture (e.g., design patterns, reliability, scaling) of large-scale systems
Deep expertise in at least one major programming language, and ability to independently implement complex changes to foundational systems or algorithms
Strong communication skills, with the ability to convey complex technical concepts, anticipate scientific or engineering limitations and drive consensus across teams
Bachelor’s degree in computer science, engineering, or a closely related field
Ready to relocate to Heidelberg, Germany, or otherwise come to our Heidelberg office regularly, potentially weekly
Preferred Qualifications
Demonstrated skills in integrating complex systems with cross-team collaboration to enhance solution consistency and overall impact
Proven experience designing and delivering high-performance, scalable systems into production environments
Contributions to research outputs or top-tier publications
Experience with systems programming and low-level languages such as Rust, focusing on performance and reliability
Master’s degree in computer science or related fields
We do not necessarily require prior experience in machine learning for this role, but we do value your eagerness to learn. If you have prior experience in ML, we will be particularly excited about:
Experience productizing AI research innovations into real-world applications, especially in areas such as large-scale data processing and distributed computation for foundational model training or inference.
Familiarity with popular NLP tools and frameworks such as PyTorch or HF Transformers, with knowledge of transformer architectures.
Ability to write clear proposals or publications, and demonstrated excellence in explaining research contributions to both technical and non-technical stakeholders.
Proven ability to apply advanced scientific methods to novel problems, resulting in impactful outputs such as publications or projects.
Our tenets
We believe embodying these values would make you a great fit for our team:
We own work end-to-end, from idea to production: You take responsibility for every stage of the process, ensuring that our work is complete, scalable, and of the highest quality.
We ship what matters: Your focus is on solving real problems for our customers and the research community. You prioritize delivering impactful solutions that bring value and make a difference.
We work transparently: You collaborate and share your results and insights openly with the team, partners, customers, and the broader community, including through blog posts, papers, checkpoints, and more.
We innovate by leveraging our intrinsic motivations and talents: We strive for technical depth, balance the ideas and interests of our team with our mission-backwards approach, and leverage the interdisciplinary, diverse perspectives in our teamwork.
What you can expect from us
Become part of an AI revolution!
30 days of paid vacation
Access to a variety of fitness & wellness offerings via Wellhub
Mental health support through nilo.health
Substantially subsidized company pension plan for your future security
Subsidized Germany-wide transportation ticket
Budget for additional technical equipment
Flexible working hours for better work-life balance and hybrid working model
Virtual Stock Option Plan
JobRad® Bike Lease