Data Scientist
HOUSTON, TX, United States
GHD
GHD is a global multidisciplinary professional services network offering integrated solutions across digital, engineering, environmental, design and construction.
Join a global professional services leader. We are committed to solving the world’s biggest challenges in the areas of environment, water, waste, energy, and urbanization.
Job Description
GHD is seeking a passionate and energetic Data Scientist for our team. The successful candidate will be responsible for analyzing complex datasets using statistical and machine learning methodologies. The candidate needs to be highly experienced in leveraging Computer Vision (CV) and Natural Language Processing (NLP) to extract and model knowledge from unstructured data in multiple forms (point clouds, video, documents, tables, figures, images, etc.). The ideal candidate will have experience in applied data science and machine learning, in developing, training, and validating modern AI frameworks, and in leveraging Large Language Models (LLMs) and graph databases. The candidate should be highly self-motivated and intensely curious, an effective and proactive communicator, and able to generate many testable ideas.
Responsibilities
- Own and manage existing CV project codebases, which involve model development and inference, data manipulation, and end-user applications.
- Develop new CV solutions with masking, alignment, resampling, object detection, image classification, contour detection and approximation, morphological and affine transformations.
- Perform NLP tasks such as text summarization, extractive question answering, text classification, named-entity recognition, and topic modeling, using transformers, NMF, and LDA in Python.
- Combine CV, NLP, and off-the-shelf document parsing libraries (e.g., Adobe API) to provide comprehensive knowledge extraction from arbitrarily complex PDF documents.
- Develop and own maintainable solutions through expertise in object-oriented programming (OOP), code documentation, version control, testing, and other good development practices
- Advise others on data science and programming related architectural and design questions.
- Demonstrate strong ability to pick up on new tools, techniques, and domain knowledge quickly, with minimal formal training. This includes previously unseen data science technology stacks and low-code tools for data manipulation and data engineering.
- Serve as an internal thought leader on how to experiment with, explore, and implement emerging data science technologies, especially those related to unstructured data.
- Deliver highly engaging and persuasive presentations, utilizing skills in visual and verbal storytelling, whether in the form of presentations, interactive dashboards, blog articles, or other inventive media.
- Document technical as well as high-level methodology for your own and the team’s data science deliverables in a beautiful, client-ready way.
- Create superb illustrations and visualizations of data that communicate statistical modeling results as measures of business impact.
Requirements
- Ph.D. or master’s degree in a STEM field, Quantitative Environmental Science, Civil Engineering, or a related field, with 3 years of experience in applying data science and machine learning to real-world problems.
- 1 to 2 years of hands-on experience implementing CV and NLP solutions for processing unstructured data. Must be proficient in Python and Python-based data science libraries, such as OpenCV, NLTK, spaCy, Hugging Face, etc.
- Foundation in machine learning theory and application, ranging from conventional multivariate statistical learning models to deep learning and beyond. Mastery in putting ML theory into practice, using libraries such as scikit-learn, TensorFlow, PyTorch, and Keras.
- Experience with spatial-temporal data analysis and visualization libraries such as GeoPandas, Leaflet, matplotlib, seaborn, plotnine, plotly, or similar is a plus.
- Experience in optimizing model performance through proper model selection, hyperparameter tuning, feature engineering, model validation and testing is a plus
As a diverse and inclusive organization, we encourage individual achievement and recognize the strength of a diverse workforce. GHD is an equal-opportunity employer. Upon request, GHD will provide reasonable accommodation for applicants with disabilities throughout the recruitment and selection process.