Data Corpus Engineer
Tasks
- Align and fuse multimodal data, including time synchronization and spatial alignment
- Automate data collection, cleaning, transformation, and dataset building
- Build high-performance data processing pipelines
- Build a multimodal AI data corpus pipeline
- Design data workflows for data production
- Implement AI data asset management
- Manage dataset versioning and data lineage
- Monitor data quality and standardize datasets
- Optimize dataset quality and coverage for training and simulation
- Process lidar, video, sensor, and structured data
Perks/Benefits
- N/A
Skills/Tech-stack
Apache Arrow | Apache Beam | Apache Flink | Apache Hudi | Apache NiFi | Apache Parquet | Apache Spark | Dagster | Data Governance | Data Lineage | Data Modeling | Data Quality | Data Quality Monitoring | Dataset Versioning | Delta Lake | Iceberg | Lakehouse | Python
Education
N/A