Data Corpus Engineer
Tasks
- Align and fuse multimodal data, including time synchronization and spatial alignment
- Automate data collection, cleaning, transformation, and dataset building
- Build high-performance data processing pipelines
- Build a multimodal AI data corpus pipeline
- Design data workflows for data production
- Implement AI data asset management
- Manage dataset versioning and data lineage
- Monitor data quality and standardize datasets
- Optimize dataset quality and coverage for training and simulation
- Process lidar, video, sensor, and structured data
Perks/Benefits
- N/A
Skills/Tech-stack
Apache Arrow | Apache Beam | Apache Flink | Apache Hudi | Apache NiFi | Apache Parquet | Apache Spark | Dagster | Data Governance | Data Lineage | Data Modeling | Data Quality | Data quality monitoring | Dataset versioning | Delta Lake | Iceberg | Lakehouse | Python | Quality monitoring
Education
N/A