[Job - 22203] Senior Data Scientist (NLP), Brazil
Brazil
We are seeking a highly skilled Senior Data Scientist with a strong focus on Natural Language Processing (NLP) to drive AI initiatives within the American health industry. This role emphasizes developing agentic intelligence systems, building Retrieval-Augmented Generation (RAG) frameworks, and creating ontologies that enhance business outcomes through advanced data-driven insights.
Responsibilities:
- Conduct thorough data exploration to validate requirements for NLP contexts and ensure data quality.
- Perform NLP pre-processing tasks, including tokenization, lexical analysis, syntactic analysis, semantic analysis, and pragmatic analysis.
- Define and implement optimal NLP models that align with expected business outcomes.
- Contribute to building agentic intelligence systems and RAG frameworks that enhance data-driven decision-making.
- Develop and manage ontologies to support effective data utilization and enhance understanding across teams.
- Train and validate models using rigorous experimentation to evaluate and enhance their performance.
- Document model development processes, methodologies, and results for both internal and external stakeholders.
- Engage in text classification and sentiment analysis, employing both traditional machine learning classifiers and deep learning models.
- Continuously evaluate and improve NLP model performance through systematic experimentation and analysis.
Requirements for this challenge:
- Solid experience as a Data Scientist, specifically in NLP projects.
- Proficiency in programming with Python, particularly using libraries such as NLTK, spaCy, and Gensim.
- Strong understanding of NLP techniques, including but not limited to: Topic Extraction, Summarization, Categorization, and Sentiment Analysis.
- Demonstrated experience with sequence-to-sequence models for tasks such as machine translation, text summarization, and question answering.
- Familiarity with advanced topic modeling techniques, including Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF).
Data Science Pipeline:
- Comprehensive understanding of the entire data science pipeline, including data gathering, preprocessing, model development, validation, and deployment.
- Experience with quantitative model validation metrics, such as accuracy, precision, recall, F1-score, and ROUGE score.
Ethical Considerations:
- Awareness of ethical considerations in NLP, including biases in data and models, privacy concerns, and potential societal impacts.
Problem Solving and Creativity:
- Strong critical thinking skills with the ability to troubleshoot issues and creatively apply different NLP techniques to solve real-world problems.
Communication Skills:
- Advanced oral and written communication skills in English, with the ability to clearly convey complex concepts to diverse audiences.
- Experience working on international projects, demonstrating adaptability and cultural awareness.
Collaboration and Innovation:
- Ability to work collaboratively with cross-functional teams to define the best NLP models aligned with business objectives.
- Commitment to leveraging state-of-the-art techniques for handling, analyzing, and visualizing large datasets.
Nice to Have:
- Experience with Databricks.
- Familiarity with Transformers, BERT, and Named Entity Recognition (NER).
- Background in data engineering and MLOps, including knowledge of Azure ML and Azure DevOps.
- Knowledge of data protection regulations (e.g., PII, CCPA, HIPAA) and best practices.
#MidSenior #LI-JP3
CI&T is an equal-opportunity employer. We celebrate and appreciate the diversity of our CI&Ters’ identities and lived experiences. We are committed to building, promoting, and retaining a diverse, inclusive, and equitable company and culture focused on creating a better tomorrow.
At CI&T, we recognize that innovation and transformation only happen in diverse, inclusive, and safe work environments. Our teams are most impactful when people from all backgrounds and experiences collaborate to share, create, and hear ideas. Before applying for our opportunities, take a look at the Conflict of Interest Policy on our website.
We strongly encourage candidates from diverse and underrepresented communities to apply for our vacancies.
Our benefits:
- Health and dental insurance
- Meal and food allowance
- Childcare assistance
- Extended paternity leave
- Wellhub (Gympass)
- TotalPass
- Profit-sharing (PLR)
- Life insurance
- CI&T University
- Discount club
- Free online platform dedicated to physical, mental, and overall well-being
- Pregnancy and responsible parenting course
- Partnerships with online learning platforms
- Language learning platform
And many more!
More details about our benefits here: https://ciandt.com/br/pt-br/carreiras
Collaboration is our superpower, diversity unites us, and excellence is our standard. We value diverse identities and life experiences, fostering a diverse, inclusive, and safe work environment. We encourage applications from diverse and underrepresented groups to our job positions.