Machine Learning Engineer
Ramat Gan, Tel Aviv District, IL
Immunai
We partner with top biopharmaceutical companies and research institutions to discover and advance novel therapeutics.
Description
About Immunai:
Immunai is an engineering-first platform company aiming to improve therapeutic decision-making throughout the drug discovery and development process. We are mapping the immune system at unprecedented scale and granularity and applying machine learning to this massive clinico-immune database to generate novel insights into disease pathology for our partners: pharma companies and research institutes. We provide a comprehensive, end-to-end solution, from data generation and curation to therapeutics development, that continuously supports and validates the capabilities of our platform.
As drug development is becoming increasingly inefficient, our ultimate goal is to help bring breakthrough medicines to patients as quickly and successfully as possible.
Immunai is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
About the role:
We are seeking a highly skilled Machine Learning Engineer to enable a large team of deep learning engineers, data scientists and computational biologists.
We are looking for a driven candidate to expand our model observability software, lead the optimization of GPU infrastructure, and ensure efficient collaboration with data engineering. The ideal candidate will have strong development skills in Python and deep expertise in cloud infrastructure, high-performance computing, and AI workload optimization. This role involves designing robust GPU management systems, automating model performance metrics, and supporting researchers with the tools they need to train and deploy machine learning models effectively.
Location: Ramat Gan, Israel (hybrid role)
What will you do?
GPU Infrastructure Optimization:
- Design and implement strategies to maximize the efficient utilization of GPU resources across the organization.
- Develop tools and processes for GPU allocation, workload management, and performance monitoring in alignment with selected infrastructure tools.
- Monitor and fine-tune GPU performance to ensure optimal throughput for machine learning workloads.
Model Observability:
- Build and maintain a robust system for automated reporting of key model performance metrics.
- Integrate with diverse data sources to create customizable dashboards for monitoring performance across datasets.
- Set up anomaly detection systems and alerts to ensure timely identification of performance degradation.
- Enhance the existing benchmarking suite for seamless evaluation of datasets in federated data lakes.
Collaboration and Support:
- Partner with machine learning engineers, data engineers, and DevOps teams to enable researchers to efficiently train and deploy models.
- Provide technical guidance and support for effectively utilizing available infrastructure and tools.
Technology Research:
- Stay updated with the latest advancements in GPU technologies, ML infrastructure best practices, and model performance metrics.
- Evaluate and recommend new tools, technologies, and approaches to enhance the efficiency of the ML enablement platform.
Git and Code Ownership:
- Implement best practices for Git workflows, code versioning, and safe release processes.
- Foster a culture of high-quality, collaborative development within the engineering team.
Requirements
Required qualifications:
- Bachelor’s degree in Computer Science, Engineering, or a related field
- 4+ years of experience as a software engineer
- 2+ years of experience in cloud infrastructure or developer platform teams
- Hands-on experience with high-performance computing (HPC) and GPU cluster performance optimization for AI workloads
- Proficiency in Python and Git
- Strong knowledge of GPU technologies and deployment strategies
- Familiarity with GCP compute deployment options, such as Kubernetes
- Experience integrating observability tools for model performance metrics and evaluation
Preferred qualifications:
- Knowledge of federated learning and multi-dataset evaluation methodologies
- Experience in designing and scaling benchmarking frameworks
- Strong analytical and troubleshooting skills in cloud infrastructure and GPU utilization
Desired personal traits:
- You want to make an impact on humankind
- You prioritize “We” over “I”
- You enjoy getting things done and striving for excellence
- You collaborate effectively with people of diverse backgrounds and cultures
- You have a growth mindset
- You are candid, authentic, and transparent
*Please note that when you apply for a position at Immunai, your application will be processed via our recruitment platform Comeet. You can read more about how we process personal data here: https://www.immunai.com/privacy-policy/
Tags: Computer Science, Deep Learning, DevOps, Drug discovery, Engineering, GCP, Git, GPU, HPC, Kubernetes, Machine Learning, ML infrastructure, ML models, Pharma, Privacy, Python, Research
Perks/benefits: Career development, Startup environment