Agentic AI Ops Engineer - Serverless & CI/CD (AWS) - Rethem

India - Remote

At Rethem, we're revolutionizing the sales landscape by putting buyer outcomes at the forefront. We understand that customers buy outcomes, and our AI-driven platform empowers your sales reps to deliver those outcomes, helping them crush their quotas.

What Sets Us Apart

  • Deep AI Integration: Our platform leverages advanced AI that acts as a personal coach for your reps, adapting to your business processes to automate complex tasks and provide real-time guidance.
  • Outcome-Driven Approach: By focusing on delivering measurable outcomes, we enable your sales team to build trust and foster long-term customer relationships.
  • Market Leadership: Positioned at the cutting edge of buyer-centric sales transformation, we're leading the shift towards more meaningful and effective sales interactions.
  • Proven Expertise: Our leadership and team consist of industry veterans with a track record of driving substantial growth and innovation in sales.

Our Mission

To redefine the sales process by aligning it with buyer needs, leveraging AI to empower sales teams to deliver outcomes that drive mutual success.

Transform Your Sales Strategy with AI

Rethem turns your sales playbook into an intelligent, always-on guide that adapts in real-time. By harnessing the power of AI, we provide your team with:

  • Real-Time Coaching: Enhance performance with actionable insights during every buyer interaction.
  • Enhanced Efficiency: Automate key processes so your reps can focus on building relationships and delivering value.
  • Outcome Alignment: Ensure your offerings are perfectly aligned with customer objectives, leading to higher satisfaction and loyalty.
  • Accelerated Growth: Drive higher win rates and larger deals through a buyer-focused approach.

Vision for the Future

We envision a future where AI and human expertise collaborate seamlessly to create unparalleled sales experiences. By continuously innovating, we aim to stay at the forefront of buyer-centric sales transformation.

Join the Sales Revolution

Emerging from stealth mode, Rethem invites a select group of visionary organizations to pilot our groundbreaking platform. If you're ready to elevate your sales team, deliver exceptional customer outcomes, and empower your reps to crush their quotas, visit our website to learn more and apply.

Be Part of Our Journey

We're assembling a team of innovators passionate about reshaping the sales industry. Explore career opportunities with Re:them and help shape the future of outcome-driven, AI-powered sales.

Experience the Power of AI-Driven Sales Transformation with Re:them.

The Role

We are seeking a hands-on Agentic AI Ops Engineer who thrives at the intersection of cloud infrastructure, AI agent systems, and DevOps automation. In this role, you will build and maintain the CI/CD infrastructure for Agentic AI solutions using Terraform on AWS, while also developing, deploying, and debugging intelligent agents and their associated tools. This position is critical to ensuring scalable, traceable, and cost-effective delivery of agentic systems in production environments.

The Responsibilities

CI/CD Infrastructure for Agentic AI

  • Design, implement, and maintain CI/CD pipelines for Agentic AI applications using Terraform, AWS CodePipeline, CodeBuild, and related tools.
  • Automate deployment of multi-agent systems and associated tooling, ensuring version control, rollback strategies, and consistent environment parity across dev/test/prod.

Agent Development & Debugging

  • Collaborate with ML/NLP engineers to develop and deploy modular, tool-integrated AI agents in production.
  • Lead the effort to create debuggable agent architectures, with structured logging, standardized agent behaviors, and feedback integration loops.
  • Build agent lifecycle management tools that support quick iteration, rollback, and debugging of faulty behaviors.

Monitoring, Tracing & Reliability

  • Implement end-to-end observability for agents and tools, including runtime performance metrics, tool invocation traces, and latency/accuracy tracking.
  • Design dashboards and alerting mechanisms to capture agent failures, degraded performance, and tool bottlenecks in real time.
  • Build lightweight tracing systems that help visualize agent workflows and simplify root cause analysis.

Cost Optimization & Usage Analysis

  • Monitor and manage cost metrics associated with agentic operations including API call usage, toolchain overhead, and model inference costs.
  • Set up proactive alerts for usage anomalies, implement cost dashboards, and propose strategies for reducing operational expenses without compromising performance.

Collaboration & Continuous Improvement

  • Work closely with product, backend, and AI teams to evolve the agentic infrastructure design and tool orchestration workflows.
  • Drive the adoption of best practices for Agentic AI DevOps, including retraining automation, secure deployments, and compliance in cloud-hosted environments.
  • Participate in design reviews, postmortems, and architectural roadmap planning to continuously improve reliability and scalability.

Requirements

  • 2+ years of experience in DevOps, MLOps, or Cloud Infrastructure with exposure to AI/ML systems.
  • Deep expertise in AWS serverless architecture, including hands-on experience with:
    • AWS Lambda – function design, performance tuning, cold-start optimization.
    • Amazon API Gateway – managing REST/HTTP APIs and integrating with Lambda securely.
    • Step Functions – orchestrating agentic workflows and managing execution states.
    • S3, DynamoDB, EventBridge, SQS – event-driven and storage patterns for scalable AI systems.
  • Strong proficiency in Terraform to build and manage serverless AWS environments using reusable, modular templates.
  • Experience deploying and managing CI/CD pipelines for serverless and agent-based applications using AWS CodePipeline, CodeBuild, CodeDeploy, or GitHub Actions.
  • Hands-on experience with agent and tool development in Python, including debugging and performance tuning in production.
  • Solid understanding of IAM roles and policies, VPC configuration, and least-privilege access control for securing AI systems.
  • Deep understanding of monitoring, alerting, and distributed tracing systems (e.g., CloudWatch, Grafana, OpenTelemetry).
  • Ability to manage environment parity across dev, staging, and production using automated infrastructure pipelines.
  • Excellent debugging, documentation, and cross-team communication skills.

Benefits

  • Health insurance, PTO, and leave time
  • Ongoing paid professional training and certifications
  • Fully remote work opportunity
  • Strong onboarding and training programs

Are you ready to Join the Revolution?

If you're ready to take on this exciting challenge and believe you meet our requirements, we encourage you to apply. Let's shape the future of AI-driven sales together! See more about us at https://www.rethem.ai/

EEO Statement

All qualified applicants to Expedite Commerce are considered for employment without regard to race, color, religion, age, sex, sexual orientation, gender identity, national origin, disability, veteran status, or any other protected characteristic.
