AI Data Infrastructure Engineer
Tasks
- Build evaluation dataset construction pipelines with integrity and contamination controls
- Build high-throughput data loading systems
- Build ingestion systems for multimodal data
- Collaborate with ML researchers and engineers on data system requirements
- Design and operate large-scale data pipelines for AI training and evaluation
- Design storage architectures for cost, throughput, and latency
- Develop dataset versioning, lineage, and provenance tracking
- Document data systems, schemas, and operational procedures
- Drive observability of data quality, drift, and pipeline health
- Implement data cleaning, deduplication, filtering, and quality assurance
- Implement data privacy, redaction, and consent enforcement
- Implement labeling workflows, active learning, and human-in-the-loop pipelines
- Optimize cost and performance using compression, format selection, and caching
Skills/Tech-stack
Active Learning | Apache Beam | Apache Spark | CI/CD | Caching | Code Review | Compression | Data Lineage | Data Modeling | Data Privacy | Data Quality | Data Redaction | Dataset Versioning | Distributed Systems | High-Throughput Data Loading | Human-in-the-Loop | JVM | Python | Ray | Storage Architecture | Testing