AI Data Engineer
Tasks
- Build evaluation dataset construction pipelines with integrity and contamination controls
- Build high-throughput data loading systems that keep GPUs utilized during training
- Build ingestion systems for text, image, audio, video, and structured data
- Design and operate large-scale data pipelines for AI training and evaluation
- Design storage architectures for cost, throughput, and latency
- Develop dataset versioning, lineage, and provenance tracking
- Document data systems, schemas, and operational procedures
- Drive observability of data quality, drift, and pipeline health
- Implement data cleaning, deduplication, filtering, and quality assurance
- Implement data privacy, redaction, and consent enforcement
- Implement labeling workflows, active learning, and human-in-the-loop improvement
- Optimize cost and performance using compression, data formats, and caching
Skills/Tech-stack
Active Learning | Apache Beam | Apache Spark | Artificial Intelligence | CI/CD | Caching | Code review | Compression | Data Lineage | Data Modeling | Data Privacy | Data Quality | Data Versioning | Data loading | Data provenance | Data redaction | Distributed Systems | High Throughput | Human-in-the-loop | Java | Machine Learning | Python | Ray | Scala | Storage architectures | Testing