Data scientist

Chennai, Tamil Nadu, India

Full Time, USD 51K - 94K * (est.)

Ford Motor Company

Since 1903, we have helped to build a better world for the people and communities that we serve. Welcome to Ford Motor Company.

Posted 1 day ago

Comfortable following Python project management best practices (setup.py, logging, pytest, relative module imports, Sphinx docs, etc.)
Familiarity with GitHub (clone, fetch, pull/push, raising issues and PRs, etc.)
Strong familiarity with deep learning (DL) theory and practice in NLP applications
Comfortable coding with Hugging Face, LangChain, Chainlit, TensorFlow and/or PyTorch, scikit-learn, NumPy, and pandas
Comfortable using two or more open-source NLP libraries such as spaCy, TorchText, fastai.text, farm-haystack, and others
Knowledge of fundamental text data processing (regex, token/word analysis, spelling correction and noise reduction in text, segmenting noisy or unfamiliar sentences/phrases at the right places, deriving insights from clustering, etc.)
Real-world experience implementing fine-tuned BERT or other transformer models (sequence classification, NER, or QA), from data preparation, model creation, and inference through deployment
Use of GCP services such as BigQuery, Cloud Functions, Cloud Run, Cloud Build, and Vertex AI
Good working knowledge of other open-source packages for benchmarking and deriving summaries
Experience using GPUs/CPUs on cloud and on-prem infrastructure
Skills to leverage cloud platforms for data engineering, big data, and ML needs
Use of Docker (experience with experimental Docker features, docker-compose, etc.)
Familiarity with orchestration tools such as Airflow and Kubeflow
Experience with CI/CD and infrastructure-as-code tools such as Terraform
Kubernetes or any other containerization tool, with experience in Helm, Argo Workflows, etc.
Ability to develop APIs using compliant, ethical, secure, and safe AI tools
Good UI skills to visualize and build better applications using Gradio, Dash, Streamlit, React, Django, etc.
Deeper understanding of JavaScript, CSS, Angular, HTML, etc. is a plus.

Design NLP/LLM/GenAI applications/products by following robust coding practices
Explore SoTA models/techniques so that they can be applied to automotive industry use cases
Conduct ML experiments to train models and run inference; where needed, build models that meet memory and latency constraints
Deploy REST APIs or a minimal UI for NLP applications using Docker and Kubernetes tools
Showcase NLP/LLM/GenAI applications to users in the best way possible through web frameworks (Dash, Plotly, Streamlit, etc.)
Consolidate multiple bots into super apps using multimodal LLMs
Develop agentic workflows using AutoGen, Agentbuilder, and LangGraph
Build modular AI/ML products that can be consumed at scale

Data Engineering:

Skills in distributed computing (specifically parallelism and scalability in data processing, modeling, and inference using Spark, Dask, RAPIDS AI, or RAPIDS cuDF)
Ability to build Python-based APIs (e.g., using FastAPI, Flask, or Django)
Experience with Elasticsearch, Apache Solr, and vector databases is a plus.

Education: Bachelor’s or Master’s Degree in Computer Science, Engineering, Maths or Science

Completion of modern NLP/LLM courses or participation in open competitions is also welcome.

* Salary range is an estimate based on our AI, ML, Data Science Salary Index.

Category: Data Science Jobs

Tags: Airflow Angular APIs BERT Big Data BigQuery CI/CD Classification Clustering Computer Science Django Docker Engineering fastai Flask GCP Generative AI GitHub GPU Gradio Haystack Helm HuggingFace JavaScript Kubeflow Kubernetes LangChain LLMs Machine Learning NLP NumPy Open Source Pandas Plotly Python PyTorch React Scikit-learn spaCy Spark Streamlit TensorFlow Terraform Vertex AI