AI Researcher – Datadog AI Research
New York, New York, USA
Full Time Senior-level / Expert USD 130K - 265K
Datadog has recently expanded its AI Research initiatives. Building on our proven track record of AI-powered solutions (e.g., Bits AI, Watchdog, and Toto), our research team is tackling high-risk, high-reward projects grounded in real-world challenges in cloud observability and security.
We are currently focused on three key research areas:
- Observability Foundation Models – Building state-of-the-art models for advanced forecasting, anomaly detection, and multi-modal telemetry analysis (logs, metrics, traces, etc.). These models will also provide the foundation for our agents (described below) to natively analyze telemetry data.
- Site Reliability Engineering (SRE) Autonomous Agents – Creating AI agents to automatically detect, diagnose, and resolve incidents in production environments, pushing the boundaries of multi-step planning, reasoning, and domain-specific knowledge.
- Production Code Repair Agents – Developing agents and models that leverage code, logs, runtime data, and other signals to identify, fix, and even preempt performance issues and security vulnerabilities in production code.
As a researcher on our team, you will help drive these efforts—working on fundamental research problems and collaborating with Datadog’s Product and Engineering teams to help translate research advances into tangible benefits for our customers.
What You’ll Do:
- Conduct cutting-edge research in Generative AI and Machine Learning, aiming to build specialized Foundation Models and AI Agents for observability, site reliability engineering, and code repair
- Leverage large-scale distributed training infrastructure to train and fine-tune state-of-the-art models on diverse, real-world telemetry data
- Lead and contribute to research publications, present findings at top-tier conferences (e.g., NeurIPS, ICLR, ICML), and help open-source key model artifacts and benchmarks
- Collaborate with cross-functional teams (e.g., Product, Engineering) to integrate advanced AI capabilities (such as multi-modal analysis or automated incident resolution planning) into Datadog’s product ecosystem
- Stay at the forefront of LLMs, Foundation Models, and Generative AI research and engage with the external research community
- Foster a culture of scientific rigor, innovation, and practical impact, e.g., by actively participating in reading groups and mentoring interns
Who You Are:
- You hold a PhD in Computer Science, Machine Learning, or a related field (or have equivalent experience), with deep expertise in areas like generative modeling, AI agents, reinforcement learning, or natural language processing
- You possess extensive experience in designing and implementing deep learning models, and have a strong background in distributed training frameworks (e.g., DeepSpeed, Megatron-LM) and ML libraries (e.g., PyTorch, TensorFlow)
- You have a proven track record of conducting impactful research in the field with publications at top-tier venues (e.g., NeurIPS, ICLR, ICML, TMLR)
- You're familiar with efficient training, fine-tuning, and inference techniques for large foundation models
- You excel at explaining complex models and research findings to both technical and non-technical audiences
- You have a strong interest in open science and open-source contributions, including establishing rigorous benchmarks and sharing research with the community
Bonus Points (any of the following):
- You have a demonstrated ability to bridge cutting-edge research and real-world product applications, ideally with an emphasis on large foundation models, generative AI agents, or domain-specific LLM deployments
- You’re passionate about pushing the boundaries of AI while maintaining a strong focus on customer impact, scalability, and responsible deployment of new technologies
- You have hands-on experience with GPU programming and optimization, including experience in CUDA
- You have experience writing production data pipelines and applications
Datadog values people from all walks of life. We understand not everyone will meet all the above qualifications on day one. That's okay. If you’re passionate about AI Research and want to grow your skills, we encourage you to apply.
Benefits and Growth:
- Competitive global benefits
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Opportunity to collaborate closely with colleagues across the Datadog offices in New York City and Paris
- Opportunity to attend and present at conferences and meetups
- Intra-departmental mentor and buddy program for in-house networking
- An inclusive company culture, with the ability to join our Community Guilds (Datadog employee resource groups)
Benefits and Growth listed above may vary based on the country of your employment and the nature of your employment with Datadog.
Datadog offers a competitive salary and equity package, which may include variable compensation. Actual compensation is based on factors such as the candidate's skills, qualifications, and experience. In addition, Datadog offers a wide range of best-in-class, comprehensive, and inclusive employee benefits for this role, including healthcare, dental, parental planning, and mental health benefits; a 401(k) plan and match; paid time off; fitness reimbursements; and a discounted employee stock purchase plan.
The reasonably estimated yearly salary for this role at Datadog is: $130,000 - $265,000 USD
About Datadog:
Datadog (NASDAQ: DDOG) is a global SaaS business, delivering a rare combination of growth and profitability. We are on a mission to break down silos and solve complexity in the cloud age by enabling digital transformation, cloud migration, and infrastructure monitoring of our customers’ entire technology stacks. Built by engineers, for engineers, Datadog is used by organizations of all sizes across a wide range of industries. Together, we champion professional development, diversity of thought, innovation, and work excellence to empower continuous growth. Join the pack and become part of a collaborative, pragmatic, and thoughtful people-first community where we solve tough problems, take smart risks, and celebrate one another. Learn more about #DatadogLife on Instagram, LinkedIn, and Datadog Learning Center.
Equal Opportunity at Datadog:
Datadog is an Affirmative Action and Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. Here are our Candidate Legal Notices for your reference.
Your Privacy:
Any information you submit to Datadog as part of your application will be processed in accordance with Datadog’s Applicant and Candidate Privacy Notice.