Principal Big Data Site Reliability Developer (US Citizenship Required) US REMOTE
Tasks
- Define platform standards, architectural direction, and operational guardrails
- Design and evolve the Ansible and Terraform automation framework
- Design platforms to behave predictably under load, failure, and change
- Drive long-term platform evolution and reliability strategy
- Eliminate operational toil through reliability and safety automation
- Establish capacity models, scaling strategies, and operational best practices
- Lead incident prevention by eliminating failure classes
- Lead platform architecture and design reviews
- Operate and evolve stateful distributed systems
- Operate and maintain Kerberized platform security
- Own end-to-end reliability, scalability, and operability of shared data platforms
- Own platform lifecycle events: upgrades, expansions, decommissioning, and recovery
- Reason about failure modes and recovery scenarios
- Serve as the ultimate escalation point for complex incidents
Perks/Benefits
- 401k matching
- Commuter benefits
- Flexible spending accounts
- Life insurance
- Long-term disability
- Paid holidays
- Paid parental leave
- Paid sick leave
- Paid time off
- Short-term disability
Skills/Tech-stack
Ansible | Apache Kafka | Apache Storm | Authentication | Authorization | Bash | Capacity Planning | HBase | HDFS | Hadoop | Infrastructure as Code | Kerberos | Linux | Networking | Observability | Python | Ruby | Terraform | YARN