Sr. ML Platform Engineer (Hybrid)
Tasks
- Build observability solutions
- Conduct post-mortems
- Configure alerting workflows
- Debug memory leaks
- Debug resource contention
- Debug scheduling conflicts
- Develop runbooks
- Diagnose distributed systems issues
- Implement automated health checks
- Improve HPC cluster utilization
- Maintain platform reliability metrics
- Mentor engineers on debugging techniques
- Optimize GPU allocation
- Optimize Ray clusters
- Optimize SLURM job scheduling
- Optimize Spark jobs
- Optimize resource allocation
- Perform root cause analysis
- Profile performance bottlenecks
- Resolve production incidents for inference pipelines
- Resolve production incidents for training pipelines
- Troubleshoot JupyterHub spawner issues
- Troubleshoot kernel crashes
Perks/Benefits
- Employee networks
- On-call support
- Paid adoption leave
- Paid parental leave
- Professional development
- Vacation and holidays
- Volunteer opportunities
- Wellness programs
Skills/Tech-stack
AWS | Airflow | Apache Spark | CUDA | Capacity Planning | Chaos Engineering | Debugging | Distributed tracing | Docker | Google Cloud | Grafana | JupyterHub | Kubeflow | Kubernetes | Linux | Log Aggregation | MLflow | Microsoft Azure | OCI | Observability | Performance Tuning | Profiling | Prometheus | Python | Ray | Slurm | Unix
Education
N/A