AI Infrastructure Engineer
Tasks
- Build and own end-to-end model serving infrastructure
- Collaborate with algorithms engineers on inference configuration
- Design AI inference platform architecture
- Establish MLOps best practices with CI/CD pipelines
- Integrate power data with AI inference software
- Monitor model deployment and support rollback
- Optimize GPU utilization and inference performance
- Serve AI models with fault-tolerant, low-latency systems
- Support infrastructure roadmap and build-vs-buy decisions
Perks/Benefits
- 401k match
- Dental insurance
- Health insurance
- Paid time off
- Remote work
- Travel for retreats and onsite engagements
- Vision insurance
Skills/Tech-stack
C++ | CI/CD | CUDA | Container Orchestration | Datadog | Distributed Systems | Docker | Fault-tolerant systems | GPU Acceleration | Go | Grafana | Helm | Infrastructure as Code | Kubernetes | Low Latency | MLOps | Model Serving | Monitoring | NVIDIA Triton | Observability | Prometheus | Python | Reliability | Rust | SGLang | Scalability | TensorRT | Terraform | TorchServe | vLLM
Education
N/A