AI Evaluation Scientist
Tasks
- Analyze model behavior
- Assess AI model outputs
- Build evaluation scripts
- Collaborate with data scientists
- Contribute to evaluation framework development
- Design evaluation processes
- Develop benchmark datasets
- Develop test harnesses
- Document evaluation results
- Perform error analysis
- Support responsible deployment
Skills/Tech-stack
AI Evaluation | AI evaluation frameworks | Behavior Analysis | Data Analysis | Evaluation Frameworks | Evaluation metrics | Hugging Face | LangChain | Language Processing | Model behavior | Model behavior analysis | Natural Language | Natural Language Processing | PyTorch | Python | scikit-learn | Statistical Testing | Test Design