AI Data Infrastructure Engineer
Tasks
- Build evaluation dataset construction pipelines with integrity controls
- Build high-throughput data loading systems to maximize GPU utilization
- Build ingestion systems for text, image, audio, video, and structured data
- Design and operate large-scale data pipelines for AI training and evaluation
- Design storage architectures across data tiers
- Develop dataset versioning, lineage, and provenance tracking
- Document data systems, schemas, and operational procedures
- Drive observability for data quality, drift, and pipeline health
- Implement data cleaning, deduplication, filtering, and quality assurance
- Implement data privacy, redaction, and consent enforcement
- Implement labeling workflows, active learning, and human-in-the-loop improvement
- Optimize cost and performance using compression, format selection, and caching
Skills/Tech-stack
Apache Beam | CI/CD | Code Review | Data Governance | Data Lineage | Data Modeling | Data Privacy | Data Quality | Data Versioning | Data Cleaning | Data Loading | Data Redaction | Dataset Reproducibility | Deduplication | Distributed Systems | GPU Utilization | High-Throughput Data Loading | Java | Observability | Provenance | Python | Ray | Scala | Spark | Storage Architecture | Testing