Member of Technical Staff, Pre-training Data
Tasks
- Build large-scale web crawling pipelines
- Collaborate on corpus strategy
- Design filtering and deduplication systems
- Improve data pipeline observability and reliability
- Manage data quality and versioning
- Optimize distributed data processing
- Run data ablation experiments
Perks/Benefits
- 401k match
- Health, dental, vision insurance
- Relocation stipend
- Unlimited paid time off
- Visa sponsorship
Skills/Tech-stack
Data Deduplication | Data Filtering | Data Processing | Data Systems | Data Pipeline | Data Pipeline Optimization | Distributed Data | Distributed Data Systems | Experiment Design | Pipeline Optimization | Scalability | Software Engineering | System Reliability