Site Reliability Engineer (AI)
Tasks
- Build and maintain the monitoring and alerting layer for AI applications and pipelines
- Collaborate with engineering teams to improve release quality and system stability
- Define and implement SLIs, alerts, and operational dashboards
- Diagnose production issues and implement fixes
- Manage incidents, including triage, coordination, root cause analysis, and prevention
- Optimize CI/CD pipelines and implement quality gates
- Standardize telemetry across systems
Perks/Benefits
- Comprehensive healthcare
- Fully remote
- International projects
- Long-term B2B contract
- Multinational environment
Skills/Tech-stack
Alerting | Azure | Azure DevOps | CI/CD | Datadog | Grafana | Incident Management | Kubernetes | Monitoring | Operational dashboards | Root Cause Analysis | SLI | Telemetry
Education
N/A