AI Agent Software Engineer - Agent Performance Engineering

San Francisco, CA

Full Time | Mid-level / Intermediate | USD 150K - 300K

Assembled

Assembled is a support operations platform that combines modern workforce management and AI-powered issue resolution to scale exceptional customer support. Leading brands use Assembled to optimize omnichannel staffing, gain visibility into...

Posted 1 day ago

About Assembled

Assembled builds the infrastructure that underpins exceptional customer support, empowering companies like CashApp, Etsy, and Robinhood to deliver faster, better service at scale. With solutions for workforce management, BPO collaboration, and AI-powered issue resolution, Assembled simplifies the complexities of modern support operations by uniting in-house, outsourced, and AI-powered agents in a single operating system. Backed by $70M in funding from NEA, Emergence Capital, and Stripe, and driven by a team of experts passionate about problem-solving, we’re at the forefront of support operations technology.

The Team

Our Agent Intelligence team is focused on maximizing AI agent performance and driving volume automation across voice, chat, and email channels. This team is vital to our vision of creating the ultimate omnichannel AI customer support agent and represents one of the fastest-growing areas both inside our company and across the industry.

We're currently one of the largest companies deploying AI agents in production at massive scale, working with enterprise clients like Canva, Etsy, and Ramp. The engineering challenges are significant and largely unsolved: building sophisticated evaluation systems, automating knowledge generation, optimizing AI agent accuracy, and creating the infrastructure needed to continuously improve agent intelligence at scale.

Your Impact

As part of the Agent Performance Engineering team, you'll be working on the core systems that make our AI agents smarter, more accurate, and more capable of handling complex customer interactions. You'll be building the evaluation frameworks, knowledge automation systems, and intelligence optimization tools that directly impact our ability to automate customer support at enterprise scale.

Responsibilities

Build foundational evaluation infrastructure: Develop comprehensive evaluation systems from the ground up, including golden dataset creation, automated benchmarking, and model comparison tools. You'll help create the frameworks that enable us to measure and optimize AI agent performance across all communication channels.
Automate knowledge generation: Design and implement systems that automatically create synthetic guides, documentation, and metadata to improve agent knowledge bases. You'll work on cutting-edge approaches to knowledge extraction and augmentation.
Optimize AI agent accuracy: Enhance our retrieval systems, implement advanced prompt optimization techniques, and build tools that continuously improve agent responses through automated evaluation and refinement.
Develop intelligence infrastructure: Architect systems that enable rapid model upgrades, A/B testing of different AI approaches, and scalable evaluation pipelines that support enterprise deployment.
Drive volume automation: Focus on the north-star goal of maximizing automated resolutions across voice, chat, and email by building the intelligence systems that make it possible.

About You

You might be a good fit if you:

Have 5+ years of experience in software engineering as an individual contributor
Have experience with AI evaluation systems, data pipelines, or AI model optimization
Are passionate about building systems that measure and improve AI performance
Have worked with retrieval systems, knowledge bases, or information extraction
Enjoy building tools and infrastructure that enable other engineers and AI systems to perform better
Are highly ambitious and driven, setting high goals for yourself and others
Put customers first, focusing on solving real problems that impact support quality
Enjoy fast-paced environments and can quickly adjust when new insights emerge
Have a bit of a maverick streak that helps you come up with creative solutions
Have made a noticeable impact on small teams and have solid experience in startups or smaller companies
Stay humble and open to feedback, value teamwork, and are always ready to learn and grow

Technologies You'll Work With

Python/Golang for evaluation systems and data processing
LLMs and prompt optimization frameworks
Vector databases and retrieval systems
AI evaluation and benchmarking tools
Data pipeline and automation infrastructure

Our U.S. benefits

Generous medical, dental, and vision benefits
Paid company holidays, sick time, and unlimited time off
Monthly credits for each of the following: professional development, general wellness, Assembled customers, and commuting
Paid parental leave
Hybrid work model with catered lunches every day (M-F), snacks, and beverages in our SF & NY offices
401(k) plan enrollment

Categories: Deep Learning Jobs, Engineering Jobs

Tags: A/B testing, Data pipelines, Engineering, Golang, LLMs, Pipelines, Python, Testing