Principal ML Engineer (Infra/hardware)

Poland

Neurons Lab

Welcome to Neurons Lab. We support fast-growing companies in building AI solutions through close collaboration.


About the project

We're looking for an experienced ML Infrastructure Engineer who has successfully delivered large-scale ML infrastructure optimization projects. The primary focus is migrating and optimizing computer vision models from NVIDIA GPU-based infrastructure to AWS Inferentia/Trainium while improving performance and reducing cost.

Current Infrastructure:

  • ML Models: RetinaFace, OpenPose, CLIP, and other CV models

  • Hardware: A10/T4 GPUs on EKS

  • Serving: Triton Inference Server

  • Orchestration: Mix of Kubernetes and Ray
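
For orientation, the migration work centers on ahead-of-time compilation with the AWS Neuron SDK. The sketch below is a minimal, hypothetical example using a torchvision ResNet-50 as a stand-in for the actual models (RetinaFace, OpenPose, CLIP); the input shape and output file name are assumptions, not project settings.

```python
import torch
import torch_neuronx
import torchvision.models as models

# Stand-in model; the real workloads are RetinaFace, OpenPose, CLIP, etc.
model = models.resnet50(weights=None).eval()

# Example input: batch size and resolution are assumptions, not project settings
example_input = torch.rand(1, 3, 224, 224)

# Ahead-of-time compile for NeuronCores (Inferentia2/Trainium)
neuron_model = torch_neuronx.trace(model, example_input)

# The result is a TorchScript module; save it for deployment on inf2 instances
torch.jit.save(neuron_model, "resnet50_neuron.pt")
```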

Stage: Presale and Delivery

Duration: 2 months (preliminary)

Capacity: part-time (20h/week)

Areas of Responsibility

  • Technical Leadership:

    • Lead the architecture design for ML infrastructure modernization

    • Define compilation and optimization strategies for model migration

    • Establish a performance benchmarking framework (a minimal sketch appears after this list)

    • Set up monitoring and alerting for the new infrastructure

  • Performance Optimization:

    • Implement efficient model compilation pipelines for Inferentia2

    • Optimize batch processing and memory layouts

    • Fine-tune model serving configurations

    • Ensure latency requirements are met across all services

  • Cost Optimization:

    • Analyze and optimize infrastructure costs

    • Implement efficient resource allocation strategies

    • Set up cost monitoring and reporting

    • Achieve target cost reduction while maintaining performance
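
As a rough illustration of the benchmarking work referenced above, the sketch below measures per-call latency percentiles and derived throughput for an arbitrary predict callable. The warmup/iteration counts and the predict() interface are illustrative assumptions; a real framework would also cover batching, concurrency, and accuracy checks.

```python
import time
import statistics

def benchmark(predict, inputs, warmup=10, iters=100):
    """Measure per-call latency percentiles and derived throughput."""
    for _ in range(warmup):
        predict(inputs)                      # warm up caches / lazy initialization
    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        predict(inputs)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    latencies_ms.sort()
    return {
        "p50_ms": statistics.median(latencies_ms),
        "p99_ms": latencies_ms[max(0, int(0.99 * iters) - 1)],
        "throughput_rps": 1000.0 / statistics.mean(latencies_ms),
    }

# Example usage with a dummy predict function (placeholder for a real model call)
if __name__ == "__main__":
    print(benchmark(lambda x: sum(x), list(range(1000))))
```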

Skills

  • Proven track record of ML infrastructure optimization projects

  • Hands-on experience with AWS Neuron SDK and Inferentia/Trainium deployment

  • Deep expertise in PyTorch model optimization and compilation

  • Experience with high-throughput computer vision model serving

  • Production experience with both Kubernetes and Ray for ML workloads
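
To illustrate the Ray-based serving experience called for above, here is a minimal Ray Serve sketch that wraps a compiled TorchScript/Neuron artifact with micro-batching. The deployment name, replica count, batching parameters, artifact path, and request format are placeholders, not the project's actual configuration.

```python
import torch
from ray import serve

@serve.deployment(num_replicas=2)
class VisionModel:
    def __init__(self, model_path: str = "resnet50_neuron.pt"):
        # Load a previously compiled TorchScript/Neuron artifact (placeholder path)
        self.model = torch.jit.load(model_path)

    @serve.batch(max_batch_size=8, batch_wait_timeout_s=0.01)
    async def predict(self, images):
        # Ray Serve collects individual requests into a list; stack into one batch
        batch = torch.stack(images)
        outputs = self.model(batch)
        # Return one result per request, in order
        return [out for out in outputs]

    async def __call__(self, request):
        # Toy input handling: expects a JSON-encoded nested list for one image
        image = torch.tensor(await request.json(), dtype=torch.float32)
        result = await self.predict(image)
        return result.tolist()

serve.run(VisionModel.bind())
```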

Knowledge

  1. Model Optimization Expertise:

    • Deep understanding of ML model architecture optimization

    • Experience with model compilation techniques for specialized hardware (Inferentia/Trainium)

    • Proficiency in optimizing computer vision models (CNN architectures)

    • Knowledge of model serving optimization patterns

  2. Performance Optimization:

    • Advanced understanding of ML model inference optimization

    • Expertise in batch processing strategies

    • Memory layout optimization for vision models

    • Experience with pipeline parallelism implementation

    • Proficiency in latency/throughput optimization techniques

  3. Hardware Acceleration:

    • Deep knowledge of ML accelerator architectures

    • Understanding of hardware-specific optimizations

    • Experience with model compilation for specialized chips

    • Proficiency in memory access pattern optimization

  4. Performance Analysis:

    • Proficiency in ML model profiling tools

    • Experience with performance bottleneck identification

    • Knowledge of performance monitoring techniques

    • Ability to analyze and optimize inference patterns
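
As an example of the profiling work implied by point 4, the sketch below uses torch.profiler to rank operators by self CPU time on a toy model; the model, input, and sort key are illustrative assumptions, not project specifics.

```python
import torch
from torch.profiler import profile, ProfilerActivity

# Toy model and input standing in for a real CV model under investigation
model = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3), torch.nn.ReLU()).eval()
x = torch.rand(8, 3, 224, 224)

with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    with torch.no_grad():
        model(x)

# Rank operators by self CPU time to surface the most expensive kernels
print(prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10))
```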

Nice to Have:

  • Experience with Ray architecture for ML serving

  • Knowledge of distributed ML systems

  • Understanding of ML pipeline optimization

  • Experience with model quantization techniques
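
For the quantization point above, a minimal post-training dynamic quantization sketch in PyTorch is shown below; the toy model and the choice of int8 dynamic quantization are assumptions, and applicability depends on the target hardware and models.

```python
import torch
import torch.nn as nn

# Toy model; real candidates would be the linear-heavy parts of e.g. CLIP
model = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, 10)).eval()

# Post-training dynamic quantization: eligible layers (here nn.Linear) get int8 weights
quantized = torch.ao.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

print(quantized)
```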

Experience

  1. Model Optimization (4+ years):

    • Proven track record of optimizing large-scale ML inference systems

    • Successfully implemented hardware-specific model optimizations

    • Demonstrated experience with computer vision model optimization

    • Led projects achieving significant performance improvements

  2. Proven Results (Examples):

    • Successfully optimized computer vision models similar to RetinaFace/CLIP

    • Achieved significant cost reduction while maintaining performance

    • Implemented efficient batch processing strategies

    • Developed performance monitoring and optimization frameworks




Regions: Remote/Anywhere Europe
Country: Poland
