Data Extractor

Mexico City, Mexico

Citian

Citian immerses transportation planners and engineers directly in their built environment through our advanced digital twin technology.

Citian is a fast-growing, venture-backed SaaS technology company based in Washington, DC. Our software solutions revolutionize how our transportation systems – roads, rail, transit, bicycle, pedestrian – operate.

Our tech solutions:

  • Reduce traffic fatalities
  • Enhance pedestrian accessibility
  • Empower system operators to save time & money

You’ll work with some of the brightest minds in the software and transportation industries. Our software engineers and data scientists apply the latest in emerging tech, Artificial Intelligence, and Machine Learning to build smarter, more advanced tools for our diverse client base. We work with clients across the United States, with global ambitions in the years ahead.

 

Who You Are:

We are looking to hire a Data Extractor to join our growing Data Support team and help with the collection, labeling, and analysis of open-source data. In this role, you will analyze unstructured and semi-structured data and apply algorithms on distributed, clustered, and cloud-based infrastructures.

Expected Duties:

  • Assist in the collection, preprocessing, and labeling of large-scale open-source datasets from various structured and unstructured sources.
  • Help develop and maintain data pipelines to ensure efficient data flow and integration across different platforms.
  • Perform exploratory data analysis to uncover patterns and insights in open-source or client datasets that align with the company's objectives.
  • Collaborate with data engineers, software developers, and our GIS team to understand data requirements.
  • Write clear, well-documented code to automate data processing tasks and streamline workflows.
  • Generate reports and visualizations that communicate findings to both technical and non-technical stakeholders.
  • Stay up to date with trends in open-source data tools, technologies, and methodologies to continuously improve data collection and pipelining processes.

Required Experience:

  • Bachelor’s degree in Computer Science, Data Science, Statistics, or related field, or equivalent work experience.
  • 2+ years of work experience focused on open-source data collection, cleaning, and pipelining.
  • Programming skills in Python, including familiarity with libraries such as Pandas, NumPy, and requests for data manipulation and processing.
  • Experience working with APIs and web scraping tools to collect and integrate data from open-source platforms.
  • Understanding of data cleaning, transformation, and storage best practices.
  • Strong problem-solving skills, with an ability to manage multiple tasks and projects simultaneously.

Preferred Skills:

  • Experience working with cloud platforms (e.g., AWS, Google Cloud, Azure) for data storage and processing.
  • Familiarity with SQL or NoSQL databases for querying and managing datasets.
  • Experience with natural language processing (NLP) and Large Language Models (LLMs) is a plus, but not required.
  • Prior experience working on government or public sector projects.

 

Your Citian Advantage:

  • Opportunity to gain valuable experience and make a significant impact in a fast-growing, venture-backed tech startup.
  • Competitive salary and benefits package, including medical, dental, and vision insurance, and generous paid time off.
  • 401(k) company match and monthly commuter transportation benefit.
  • On-site gym and free snacks in the office.
  • High-growth potential and opportunities for advancement and more!

Let's build smarter cities together! Learn more about Citian here: www.citian.co

** Citian is an organization committed to diversity and inclusion to drive our business results and create a better future every day for our diverse employees, clients, partners, and communities. We believe a diverse workforce allows us to match our growth ambitions and drive inclusion across the business. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, age, national origin, or protected veteran status and will not be discriminated against on the basis of disability.

Equal Opportunity/Affirmative Action Employer: Minorities/Females/Protected Veterans/Persons with Disabilities

Tags: APIs AWS Azure Computer Science Data analysis Data pipelines EDA GCP Google Cloud LLMs Machine Learning NLP NoSQL NumPy Open Source Pandas Pipelines Python SQL Statistics

Perks/benefits: 401(k) matching Career development Competitive pay Fitness / gym Health care Insurance Startup environment

Region: North America
Country: Mexico
