Principal Site Reliability Engineer

AMER - Canada - Ontario - Toronto - University Ave

Autodesk

Autodesk is a global leader in design and make technology, with expertise across architecture, engineering, construction, design, manufacturing, and entertainment.


Job Requisition ID: 25WD85835

We are seeking a highly motivated and experienced Principal Site Reliability Engineer (SRE) to manage critical cloud infrastructure and site reliability operations for Autodesk's global Product Access journey. This pivotal role focuses on ensuring the highest reliability, availability, and performance of our AWS-hosted cloud infrastructure.

Reporting to the Engineering Manager, you will lead the design and development of resilient, scalable architecture and innovative solutions for the platform. You will independently manage and deliver end-to-end solutions while engaging with key stakeholders and partners.

Responsibilities

  • Lead the architecture, solution design, development, and maintenance of cloud infrastructure for a microservices architecture.
  • Independently manage requirement analysis, solution design, implementation, and release planning.
  • Ensure strict adherence to trust and security compliance guidelines and standards.
  • Streamline CI/CD processes, improve system reliability, and ensure infrastructure scalability and security.
  • Automate infrastructure deployment, scaling, and management using modern DevOps tools and practices.
  • Implement and maintain configuration management and infrastructure as code (IaC) using Terraform.
  • Lead Disaster Recovery (DR) strategies, failover exercises, gamedays, and periodic maintenance activities.
  • Contribute to the remediation of critical vulnerabilities (CVEs).
  • Promote and document security and best practices across all pillars of DevOps/SRE throughout system design.
  • Provide real-time operational support and collaborate across functions to resolve system, infrastructure, and CI/CD issues.
  • Participate in on-call rotations, providing critical 24x7 support for production systems.

Minimum Qualifications

  • Bachelor’s degree or higher in Computer Science, Engineering, or a related field.
  • 8+ years of progressive experience in Site Reliability Engineering, DevOps, or a similar field.
  • Proficiency in managing AWS resources and an understanding of networking and security protocols.
  • Expertise in infrastructure as code (IaC) and cloud automation tools such as Terraform, Serverless, and CloudFormation.
  • Expertise in defining and building CI/CD processes with tools like Jenkins, GitHub, and Artifactory.
  • Experience with container-based technologies like Docker and AWS ECS.
  • Experience with monitoring and logging tools such as Dynatrace, Grafana, DataDog, ELK Stack, and CloudWatch.
  • Experience in Linux Systems Administration, scripting, and troubleshooting in a production environment.
  • Proficiency in UNIX shell scripting (Bash) and programming languages such as Python, Go, Groovy, and JavaScript (Node.js).
  • Technology Stack: Java/Spring Boot, AWS (ECS Fargate, ElastiCache, Lambda, Kinesis, DynamoDB, VPC, IAM policies, API Gateway, NLB/ALB, Route 53, CloudWatch, Kibana, OpenSearch), Kafka, Golang, Node.js, Groovy, Python, Jenkins, GitHub, Jira, ServiceNow, and Splunk.

Preferred Qualifications

  • Knowledge of applying AI and ML solutions to engineering processes and/or DevOps automation.
  • Knowledge of standardized observability frameworks such as OpenTelemetry.
  • Relevant certifications (e.g., AWS Certified DevOps Engineer, AWS Site Reliability Engineer).
  • Broad knowledge of AWS, Redis, server programming, databases, and cloud architectures.
  • Broad knowledge of data streaming pipelines such as Kinesis, Firehose, and Kafka.
  • Knowledge of core Java and Spring Boot concepts, including JVM optimization.
  • Knowledge of build tools such as Gradle.
  • Strong interpersonal and communication skills to effectively collaborate in an Agile/Scrum-oriented environment.
  • Self-directed team player and independent contributor, demonstrating accountability and end-to-end ownership.


About Autodesk
Welcome to Autodesk! Amazing things are created every day with our software – from the greenest buildings and cleanest cars to the smartest factories and biggest hit movies. We help innovators turn their ideas into reality, transforming not only how things are made, but what can be made.

We take great pride in our culture here at Autodesk – our Culture Code is at the core of everything we do. Our values and ways of working help our people thrive and realize their potential, which leads to even better outcomes for our customers.

When you’re an Autodesker, you can be your whole, authentic self and do meaningful work that helps build a better future for all. Ready to shape the world and your future? Join us!

Salary transparency

Salary is one part of Autodesk’s competitive compensation package. Offers are based on the candidate’s experience and geographic location. In addition to base salaries, we also place significant emphasis on discretionary annual cash bonuses, commissions for sales roles, stock or long-term incentive cash grants, and a comprehensive benefits package.

Diversity & Belonging
We take pride in cultivating a culture of belonging and an equitable workplace where everyone can thrive. Learn more here: https://www.autodesk.com/company/diversity-and-belonging

Are you an existing contractor or consultant with Autodesk?

Please search for open jobs and apply internally (not on this external site).

