Senior Data Scientist – AI Developer (Flexible Hybrid)

Washington, DC, United States

Full Time | Senior-level / Expert | USD 121K - 226K * (est.)

Fannie Mae

We facilitate equitable and sustainable access to homeownership and quality, affordable rental housing across America.

Posted 1 month ago

Company Description

At Fannie Mae, futures are made. The inspiring work we do helps make a home a possibility for millions of homeowners and renters. Every day offers compelling opportunities to use tech to tackle housing’s biggest challenges and impact the future of the industry. You’ll be a part of an expert team thriving in an energizing, flexible environment. Here, you will grow your career and help create access to fair, affordable housing finance.

Job Description

Fannie Mae is expanding its Data Science talent to further push the frontiers of modeling, AI, and advanced analytics. Are you passionate about advanced analytics algorithms, AI techniques, and creating new AI solutions and technologies? Do you have creative and innovative approaches to developing new AI products? We’re seeking data scientists who have domain knowledge of, or an interest in, generative AI, large language models, machine learning, natural language processing, and image processing, and who want to apply them to solve the most complex problems in business.

If you are ready for an exciting opportunity working hands-on with the world’s most advanced data science technologies, and you thrive in a highly dynamic environment where you are counted on to develop advanced analytics and AI products, this role is for you:

THE IMPACT YOU WILL MAKE
The Senior Data Scientist – AI Developer role will offer you the flexibility to make each day your own, while working alongside people who care so that you can deliver on the following responsibilities:

Collaborate with product and/or business owners, data engineers, and platform teams to understand business needs and current capabilities, data availability, and alternative uses.
Implement new statistical modeling capabilities.
Apply analytic capabilities and build upon advanced analytic capabilities to enhance the delivery of business applications, and support the integration of data and statistical models or algorithms. Apply industry practices in research and testing to product development, deployment, and maintenance.
Design new modeling applications to support risk measurement, financial valuation, decision making, and business performance.
Design data visualizations, technical documentation, and non-technical presentation materials to communicate complex ideas and solutions to business partners.

Qualifications

THE EXPERIENCE YOU BRING TO THE TEAM

Minimum Required Experiences:

2 years of relevant experience in building large scale machine learning or deep learning models and/or systems
Bachelor's degree in Business Analytics, Computer Science, Data Science, Engineering, Finance, Math, Physics, Statistics, or a related field
Work or educational background in one or more of the following areas: machine learning, computational linguistics, deep learning, artificial intelligence, data science and/or data analytics, generative AI, symbolic AI, causal AI, operations research, computer science, mathematics, business analytics, or knowledge management
Demonstrated experience programming with R / Python, Linux, and Spark in an AWS cloud environment, or knowledge and algorithmic design experience in Python (3+ years)
Proficient with AWS SageMaker, Jupyter Notebook, Python scikit-learn, and deep learning and machine learning tools such as TensorFlow
Experience with image processing models such as COCO, CLIP, ResNet, or comparable models
Demonstrated experience with machine learning techniques including natural language processing and large language models (GPTv4-o1, o3, OpenAI APIs, Llama, Claude, etc.)
Experience developing AI agents and development proficiency using agentic programming
Proficient in natural language processing (NLP) and natural language generation (NLG), including prior projects in any of the following categories: topic modeling of text, sentiment analysis of text, part-of-speech tagging, Named Entity Recognition (NER), Bag of Words, text extraction
Experience building and working with any of these components: Vector DB, BERT, RoBERTa (or comparable tools), spaCy, LLM and GenAI tools
Experience with LoRA, LangChain, RAG, LLM Fine Tuning and PEFT, Knowledge Graphs.
Strong skills in developing GraphRAG, Chain of Thought (CoT), Tree of Thought (ToT), Reinforcement learning and AI development architectures with Human-in-the-Loop (HITL)
Demonstrated experience with SQL and any relational database technologies, such as Oracle, PostgreSQL, MySQL, RDS, Redshift, Hadoop EMR, Hive, etc.
Demonstrated experience processing structured and unstructured data sources, data cleansing, data normalization and prep for analysis
Demonstrated experience with code repositories and build/deployment pipelines, specifically Jenkins and/or Git/GitHub/GitLab.
Demonstrated experience using Tableau, Kibana, QuickSight, or other similar data visualization tools.
Very comfortable working with ambiguity (e.g. imperfect data, loosely defined concepts, ideas, or goals)

Desired Experiences:

Education: MS in Computer Science, Statistics, Math, Engineering, or related field, PhD preferred
3+ years of relevant experience in building large scale machine learning or deep learning models and/or systems
1+ year of experience specifically with deep learning (e.g., CNN, RNN, LSTM)
1+ year of experience building NLP and NLG tools.
Experience with a wide range of LLMs (Llama, Claude, OpenAI, Cohere, etc.), LoRA, LangChain, RAG, LLM Fine Tuning, and PEFT is preferred.
Demonstrated skills with Jupyter Notebook, AWS SageMaker, Domino Data Lab, or comparable environments
Passion for solving complex data problems and generating cross-functional solutions in a fast-paced environment
Knowledge of Python and SQL, object-oriented programming, and service-oriented architectures
Strong scripting skills with Shell script and SQL
Strong coding skills and experience with Python (including SciPy, NumPy, and/or PySpark) and/or Scala.
Knowledge and implementation experience with NLP techniques (topic modeling, bag of words, text classification, TF-IDF, sentiment analysis) and NLP technologies such as Python NLTK, spaCy, or comparable technologies
Knowledge and implementation experience with statistical and machine learning models (regression, classification, clustering, graph models, etc.)
Hands-on experience building models with deep learning frameworks like TensorFlow, Keras, Caffe, PyTorch, Theano, H2O, or similar
Experience with LLM Agents, Agentic programming
Experience with search architecture (for instance: Solr, ElasticSearch, AWS OpenSearch)
Experience building and querying ontologies with technologies such as Zeno, OWL, RDF, SPARQL, or comparable is preferred
Knowledge of and experience with microservices, service mesh, API development, and test automation is preferred
Demonstrated experience using Docker, Kubernetes, and/or other similar container frameworks is preferred

Skills

Ability to translate business ideas into analytics models that have major business impact
Demonstrated experience working with multiple stakeholders
Demonstrated communication skills, e.g. explaining complex technical issues to more junior data scientists, in graphical, verbal, or written formats.
Demonstrated experience developing tested, reusable and reproducible work.
Transparently documenting code and methodologies
Ability to work in Agile, Lean and rapid development processes

Additional Information

The future is what you make it to be. Discover compelling opportunities at careers.fanniemae.com.

Fannie Mae is a flexible hybrid company. We embrace flexibility for our employees to work where they choose, while also providing office space for in-person work if desired. At times, business need may call for on-site collaboration, which means proximity within a reasonable commute to your designated office location is preferred unless job is noted as open to remote.

Fannie Mae is an Equal Opportunity Employer, which means we are committed to fostering a diverse and inclusive workplace. All qualified applicants will receive consideration for employment without regard to race, religion, national origin, gender, gender identity, sexual orientation, personal appearance, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation in the application process, email us at careers_mailbox@fanniemae.com.

The hiring range for this role is set forth on each of our job postings located on Fannie Mae's Career Site. Final salaries will generally vary within that range based on factors that include but are not limited to, skill set, depth of experience, certifications, and other relevant qualifications. This position is eligible to participate in a Fannie Mae incentive program (subject to the terms of the program). As part of our comprehensive benefits package, Fannie Mae offers a broad range of Health, Life, Voluntary Lifestyle, and other benefits and perks that enhance an employee’s physical, mental, emotional, and financial well-being. See more here.

* Salary range is an estimate based on our AI, ML, Data Science Salary Index 💰

Categories: Data Science Jobs Deep Learning Jobs Engineering Jobs

Tags: Agile API Development APIs Architecture AWS BERT Business Analytics Caffe Classification Claude Clustering CoHere Computer Science Deep Learning Docker Elasticsearch Engineering Finance Generative AI Git GitHub GitLab Hadoop Jenkins Jupyter Keras Kibana Kubernetes LangChain Linguistics Linux LLaMA LLMs LoRA LSTM Machine Learning Mathematics Microservices ML models MySQL NLG NLP NLTK NumPy OpenAI OpenSearch Oracle PhD Physics Pipelines PostgreSQL PySpark Python PyTorch R RAG RDBMS RDF Redshift Reinforcement Learning Research ResNet RNN RoBERTa SageMaker Scala Scikit-learn SciPy spaCy Spark SQL Statistical modeling Statistics Tableau TensorFlow Testing Theano Topic modeling Unstructured data