AI Data Infrastructure Engineer
Tasks
- Build evaluation dataset construction pipelines with integrity and contamination controls
- Build high-throughput data loading systems to maximize GPU utilization
- Build ingestion systems for text, image, audio, video, and structured signals
- Design and operate large-scale data pipelines for AI training and evaluation
- Design storage architectures balancing cost, throughput, and latency
- Develop dataset versioning, lineage, and provenance tracking for reproducible training
- Document data systems, schemas, and operational procedures
- Drive observability for data quality, drift, and pipeline health
- Implement data cleaning, deduplication, filtering, and quality assurance at petabyte scale
- Implement data privacy, redaction, and consent enforcement
- Implement labeling workflows, active learning, and human-in-the-loop data improvement
- Optimize cost and performance with compression, format selection, and caching
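To give a flavor of the deduplication and contamination-control tasks above, here is a minimal Python sketch. It is illustrative only and not taken from the employer's stack: the function names (`normalize`, `dedupe`, `ngrams`, `is_contaminated`), the 3-gram window, and the 0.5 overlap threshold are all assumptions chosen for the example. A real petabyte-scale pipeline would more likely run similar logic in a distributed framework.

```python
import hashlib


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def dedupe(records):
    """Exact deduplication on normalized content; keeps the first occurrence."""
    seen = set()
    unique = []
    for record in records:
        digest = hashlib.sha256(normalize(record).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(record)
    return unique


def ngrams(text: str, n: int = 3):
    """Set of word n-grams of the normalized text (hypothetical default n=3)."""
    tokens = normalize(text).split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_contaminated(eval_example: str, train_ngrams, n: int = 3,
                    threshold: float = 0.5) -> bool:
    """Flag an eval example when the share of its n-grams that also appear
    in the training data meets the (assumed) threshold."""
    grams = ngrams(eval_example, n)
    if not grams:
        return False  # too short to judge with n-grams of this size
    overlap = len(grams & train_ngrams) / len(grams)
    return overlap >= threshold


if __name__ == "__main__":
    train = dedupe(["The quick brown fox jumps over the lazy dog",
                    "the quick  brown fox jumps over the lazy dog"])
    train_grams = set().union(*(ngrams(t) for t in train))
    print(len(train))
    print(is_contaminated("quick brown fox jumps", train_grams))
```

Exact hashing only catches duplicates that are identical after normalization; near-duplicate detection (for example MinHash) would be a separate design choice.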
Perks/Benefits
- N/A
Skills/Tech-stack
Active Learning | Apache Beam | CI/CD | Caching | Code Review | Consent Management | Data Compression | Data Lineage | Data Modeling | Data Privacy | Data Provenance | Data Quality | Data Versioning | Deduplication | Distributed Systems | High-Throughput Data Loading | Human-in-the-Loop | Java | Multimodal Data | Python | Ray | Redaction | Scala | Spark | Storage Architecture | Testing
Education