Senior AI/ML Engineer - AI Systems Evaluation
Tasks
- Build automated evaluation pipelines
- Build prompt testing and dataset generation tools
- Close the loop between evaluation insights and product improvements
- Create golden datasets and edge case suites
- Define AI quality evaluation systems
- Design evaluation architectures
- Detect regressions and enforce quality gates in CI/CD
- Implement LLM-as-judge scoring
- Instrument traces, outputs, and debugging
- Monitor model performance in production
- Track experiments and evaluate model performance
- Translate model behavior into measurable signals
Perks/Benefits
- N/A
Skills/Tech-stack
A/B Testing | Benchmarking | CI/CD | Data Pipelines | Debugging | Evaluation | Experiment tracking | LLM | LLM-as-judge | Logging | MLflow | Machine Learning | Observability | OpenTelemetry | Prompt engineering | Python | RAG | Regression testing | Retrieval-Augmented Generation | Tracing
Education
N/A