Senior Applied Data Scientist

Remote - Canada

Full Time | Senior-level / Expert | CAD 66K - 122K *

Abnormal Security

Advanced email protection to prevent credential phishing, business email compromise, account takeover, and more.

Posted 1 week ago

About the Role

Abnormal Security is looking for an Applied Data Scientist to join the Message Detection - Attack Detection team. At Abnormal, we protect our customers against nefarious adversaries who are constantly evolving their techniques and tactics to outwit and undermine the traditional approaches to security. That’s what makes our novel behavior-based approach so…Abnormal. Abnormal has consistently been named one of the top cybersecurity startups, and our behavioral AI system has helped us win various cybersecurity accolades, resulting in our being trusted to protect more than 17% of the Fortune 1000 (and growing).

In a landscape where a single successful attack can lead to financial losses of millions of dollars, the Attack Detection team plays the central role of building an extremely high-recall Detection Engine that can operate on hundreds of millions of messages at millisecond latency. The Attack Detection team’s mission is to provide world-class detector efficacy against a changing attack landscape, using a combination of generalizable, automatically trained models and specific detectors for high-value attack categories.

This team is solving a multi-layered detection problem, which involves modeling communication patterns to establish enterprise-wide baselines, incorporating these patterns as robust signals, and combining these signals with contextual information to create extremely precise systems. The team builds discriminative signals at various levels, including the message level (e.g., presence of particular phrases), the sender level (e.g., frequency of sender), and the recipient level (e.g., likelihood of receiving a safe message). These signals are then combined and used to train highly accurate model-based and heuristic detectors. Additionally, to continuously adapt to new, unseen attacks, the team builds out the different stages of our automated model retraining pipelines, including data analytics and generation stages, modeling stages, production evaluation stages, and automated deployment stages.

This role also offers the opportunity to have a significant impact on the overall charter, direction, and roadmap of the team. The Applied Data Scientist would be expected to deeply understand the domain of false negatives, i.e., the current and future attacks that can cause significant customer workflow disruption, and to form a strong understanding of our features. They would help define the technical roadmap required to address the most pressing customer problems while operating our detection decisioning system at extremely high recall.

What you will do

Perform deep inspection and row-level data analysis of our false negatives and false positives, and produce data and feature insights to iteratively improve our detection efficacy.
Understand the features that distinguish safe emails from email attacks, and incorporate them effectively into our model stack and engine.
Train models and develop detectors on well-defined datasets to improve model efficacy on specialized attacks.
Identify and recommend new feature groups or ML modeling approaches that can significantly improve detection efficacy for a product. Work with infrastructure and systems engineers to productionize signals that feed into the detection system.
Write code with testability, readability, edge cases, and errors in mind.
Actively monitor and improve false negative (FN) rates and efficacy rates for the attack categories of our message detection product through feature engineering, rules, and ML modeling.

Contribute to other areas of the stack when the occasion arises: building and debugging data pipelines, or presenting results back to customers in our tools.

Must Haves

5+ years of experience designing and building product machine learning applications in one of the following domains: text understanding, entity recognition, NLP, computer vision, recommendation systems, or search.
Experience with data analytics, using SQL and pandas both to build metric and evaluation pipelines and to answer critical questions about counterfactual treatments.
Ability to understand business requirements thoroughly, with a bias toward designing the simplest yet generalizable ML model / system that can accomplish the goal.
Ability to rapidly iterate on 0-to-1 model prototypes, interpret results, and pivot an approach in order to evaluate the most promising solutions as new problems arise.
A systematic approach to debugging data issues in both ML and heuristic models.
Fluency in Python and machine learning toolkits such as NumPy, scikit-learn, PyTorch, and TensorFlow.
Effective programming skills for quickly adding incremental logic to our codebase with readable, well-tested, and efficient code.
BS degree in Computer Science, Applied Sciences, Information Systems, or another related engineering field.