Middle Machine Learning Engineer (Document Management System)
Warsaw, Poland
Sigma Software
Sigma Software is a multinational IT company that provides custom software development solutions. Become one of us!
Company Description
We invite you to join our ML Competence Centre, a key part of Sigma Software’s dynamic organizational structure that integrates diverse clients, intriguing projects, and opportunities to enhance your professional skills.
Your initial project will place you in the ML R&D Centre on the client’s side, where you will help verify and implement groundbreaking ideas driven by advancements in ML, particularly NLP, to enhance the customer’s products and deliver added value to their clients.
CUSTOMER
Our client is a leading provider of high-quality IT products in the Swedish and Danish public sectors, with over 12 years of successful cooperation with us. Their automation solutions serve 80% of government agencies in Sweden. The company specializes in document management, enterprise content management (ECM), data sharing, digital preservation, GDPR compliance, ERP solutions, and more. They cater to various sectors, including government, banking, retail, manufacturing, and life sciences.
PROJECT
Currently, our client is establishing a PoC initiative to explore innovative ideas for potential advancements and improvements.
Job Description
- Work under the supervision of the client’s research team to validate various ideas through PoCs, and to implement, train, test, and tune different learning models
- Collaborate with the research team to brainstorm and experiment with optimal models and open-source alternatives for NLP-based ideas to achieve the best possible results
- Apply best practices to effectively and securely implement pipelines for integrating models and ML-based solutions
Qualifications
- 3+ years of hands-on experience in ML, with a focus on NLP (text extraction & classification, anonymization and pseudonymization, document processing, clustering and other NLP tasks)
- Proficiency in Python
- Proven experience with LLM APIs and/or open-weight models (e.g., LLaMA, Mistral), using frameworks such as LangChain or LlamaIndex to implement solutions and features
- Proven experience with NLP techniques such as tokenization, stemming, lemmatization, and named entity recognition (NER)
- Experience with NLP libraries and frameworks like NLTK, spaCy, and Hugging Face Transformers
- Excellent communication skills for effective collaboration with cross-functional teams
- Strong problem-solving skills and ability to troubleshoot issues in model development and deployment
- At least an Upper-Intermediate level of English
WOULD BE A PLUS
- Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, or a related field
- Experience with PoC development and prototyping in enterprise solutions
- Experience with LLM evaluation frameworks (e.g., DeepEval, MLflow, RAGAs, Deepchecks)
- Experience with fine-tuning open-source LLMs for specific domains and tasks
- Knowledge of MLOps pipelines and tools
- Familiarity with deep learning frameworks such as TensorFlow, PyTorch, or Keras
- Familiarity with the deployment of NLP models in production environments
- Knowledge of cloud platforms (AWS, Google Cloud, Azure) and their machine-learning services
Tags: APIs AWS Azure Banking Classification Clustering Computer Science Deep Learning GCP Google Cloud Keras LangChain LLaMA LLMs Machine Learning MLFlow ML models MLOps NLP NLTK Open Source Pipelines Prototyping Python PyTorch R R&D Research spaCy TensorFlow Transformers
Perks/benefits: Career development