AI Data Infrastructure Engineer
Tasks
- Build evaluation dataset construction pipelines with integrity controls
- Build high-throughput data loading systems for accelerator training
- Build ingestion systems for multimodal data
- Design and operate large-scale AI data pipelines
- Design data storage architectures that balance cost, throughput, and latency
- Develop dataset versioning, lineage, and provenance tracking
- Document data systems, schemas, and operational procedures
- Drive observability of data quality, drift, and pipeline health
- Implement data cleaning and quality assurance
- Implement data privacy, redaction, and consent enforcement
- Implement labeling workflows and active learning pipelines
- Optimize cost and performance using compression and caching
Skills/Tech-stack
Apache Beam | CI/CD | Caching | Code review | Compression | Data Lineage | Data Modeling | Data Privacy | Data Quality | Data Versioning | Data loading | Data quality monitoring | Dataset Reproducibility | Distributed Systems | GPU Utilization | High Throughput | High Throughput Data Loading | High-throughput data | Observability | Provenance | Python | Quality monitoring | Ray | Redaction | Spark | Storage Formats | Testing