Senior AI Engineer, Agentic Evaluation & V&V
Tasks
- Analyze system behavior and failure modes
- Build evaluation frameworks for agentic AI systems
- Create reusable SDK interfaces and adapters
- Define and apply evaluation metrics
- Design benchmark scenarios and scoring logic
- Develop simulation-based validation systems
- Ensure testing, documentation, and reproducibility
- Implement experiment harnesses
- Translate mission domain concepts into evaluation scenarios
Skills/Tech-stack
Agentic AI | Artificial Intelligence | Benchmarking | Evaluation Frameworks | Experiment tracking | LLM Agents | Language Models | Large Language Models | MLflow | Machine Learning | Metrics | Orchestration | Python | Reinforcement Learning | SDK development | Simulation | Test Harness | Trace Evaluation
Education
Bachelor of Engineering | Bachelor of Science | Master of Science