Research Engineer – Benchmarking, Evals & Failure Analysis
Tasks
- Align evaluation systems with training and product goals
- Analyze data quality and performance trends
- Build evaluation pipelines
- Conduct failure analysis
- Create rubrics and evaluators
- Design benchmarking systems
- Develop scoring dashboards and reporting
- Run LLM evaluations and experiments
Skills/Tech-stack
API Integration | Algorithms | Benchmarking | Cloud infrastructure | Dashboards | Data Structures | Deep learning | Evaluation Pipelines | Experiment tracking | Failure analysis | LLM Evaluation | Language Models | Large Language Models | Machine Learning | NoSQL | Python | SQL | Scoring systems
Education
N/A