Software Engineer I - AI/ML, AWS Neuron Distributed Training
Tasks
- Develop distributed training solutions
- Extend distributed training frameworks
- Manage gradients for stability
- Optimize low-precision training
- Optimize mixed-precision training
- Profile end-to-end training performance
- Support deployment of training workloads
- Tune training pipelines
Perks/Benefits
- N/A
Skills/Tech-stack
AWS Trainium | BF16 | Deep Learning | Distributed Training | FP8 | FSDP | Gradient Management | Hugging Face | Language Models | Large Language Models | Loss Scaling | Machine Learning | Mixture of Experts | Performance Profiling | PyTorch | Python | Reinforcement Learning | Reinforcement Learning Workloads | TorchTitan | Training Optimization | Transformers