Member of Technical Staff, Pre-training Data
Tasks
- Build large-scale web crawling pipelines
- Collaborate on corpus strategy
- Design filtering and deduplication systems
- Improve data pipeline observability and reliability
- Manage data quality and versioning
- Optimize distributed data processing
- Run data ablation experiments
Perks/Benefits
- 401k match
- Health, dental, vision insurance
- Relocation stipend
- Unlimited paid time off
- Visa sponsorship
Skills/Tech-stack
Data Deduplication | Data Filtering | Data Processing | Data Systems | Data Pipeline | Data Pipeline Optimization | Distributed Data | Distributed Data Systems | Experiment Design | Pipeline Optimization | Scalability | Software Engineering | System Reliability