Sr. ML Platform Engineer (Hybrid)
Tasks
- Build observability solutions
- Conduct post-mortems
- Configure alerting workflows
- Debug memory leaks
- Debug resource contention
- Debug scheduling conflicts
- Develop runbooks
- Diagnose distributed systems issues
- Implement automated health checks
- Improve HPC cluster utilization
- Maintain platform reliability metrics
- Mentor engineers on debugging techniques
- Optimize GPU allocation
- Optimize Ray clusters
- Optimize SLURM job scheduling
- Optimize Spark jobs
- Optimize resource allocation
- Perform root cause analysis
- Profile performance bottlenecks
- Resolve production incidents for inference pipelines
- Resolve production incidents for training pipelines
- Troubleshoot JupyterHub spawner issues
- Troubleshoot kernel crashes
Perks/Benefits
- Employee networks
- On-call support
- Paid adoption leave
- Paid parental leave
- Professional development
- Vacation and holidays
- Volunteer opportunities
- Wellness programs
Skills/Tech-stack
AWS | Airflow | Apache Spark | CUDA | Capacity Planning | Chaos Engineering | Debugging | Distributed tracing | Docker | Google Cloud | Grafana | JupyterHub | Kubeflow | Kubernetes | Linux | Log Aggregation | MLflow | Microsoft Azure | OCI | Observability | Performance Tuning | Profiling | Prometheus | Python | Ray | Slurm | Unix
Education
N/A