Applied AI / Evaluation Engineer
Tasks
- Build AI evaluation harnesses
- Build AI regression testing systems
- Build golden reference datasets
- Capture human review signals
- Create automated quality gates
- Define evaluation dimensions
- Detect AI drift
- Develop human-in-the-loop tooling
- Implement LLM-as-judge pipelines
- Implement agent observability tracing
- Implement continuous evaluation in CI/CD
- Instrument AI telemetry
- Monitor latency and cost
- Normalize and associate review signals with interactions
- Produce evaluation reports and quality metrics
- Track AI regression metrics
- Validate judge rationales
Skills/Tech-stack
Adversarial Testing | Agent Performance Monitoring | Bias detection | CI/CD | Drift Detection | End-to-end tracing | Evaluation Harness | Golden datasets | Human-in-the-loop | LLM-as-judge | Machine Learning | Machine learning evaluation | NLP evaluation | Natural Language Processing | Observability | Python | Quality gates | Regression testing | Statistics | Telemetry