Junior Data Engineer/Scientist
London, United Kingdom
M&C Saatchi Group
We are a global creative solutions company of five specialist divisions, connected through data, technology & culture, to deliver Meaningful Change for clients.
Fluency (part of M&C Saatchi Group) are seeking a talented Junior Data Engineer/Scientist to join our team. You will design, build, and maintain efficient web scrapers using Python. You will collect, process, and analyze large volumes of data from various websites, ensuring data quality and compliance.
This is a 3-month Fixed Term Contract.
What you'll do:
- Design, develop, and maintain web scrapers using Python to collect data from various websites
- Analyze data requirements and develop efficient scraping strategies to ensure data quality and completeness
- Utilize HTML/CSS knowledge to identify and extract relevant information from web pages
- Implement and customize web scraping tools and libraries such as BeautifulSoup, Scrapy, and Selenium
- Collaborate with cross-functional teams to understand data needs and provide insights
- Adhere to ethical and legal standards while scraping data, ensuring compliance with terms of service and robots.txt files
- Process and store scraped data in appropriate formats (e.g., CSV, JSON, databases) for further analysis
- Monitor and maintain existing web scrapers to ensure optimal performance
- Stay up to date with industry developments and emerging technologies to improve web scraping capabilities
What you'll bring:
- Programming Knowledge:
  - Python (libraries such as BeautifulSoup, Scrapy, and Selenium)
  - JavaScript, for JavaScript-heavy sites (for example, Puppeteer)
- Understanding of HTML/CSS:
  - Knowledge of web page structure to locate and extract the desired information effectively.
- Familiarity with Web Scraping Tools and Libraries:
  - BeautifulSoup: For parsing HTML and XML documents.
  - Scrapy: A web scraping framework for handling large volumes of data efficiently.
  - Selenium: For interacting with web pages that require JavaScript rendering.
- Knowledge of HTTP Protocols:
  - Understanding how requests and responses work is crucial for accessing web content and handling sessions or authentication.
- Data Cleaning and Processing:
  - Skills in tools like Pandas or NumPy to clean and process the scraped data effectively.
- Ethical and Legal Awareness:
  - Understand the legal implications of web scraping, including terms of service and robots.txt files, to ensure compliance with a site's policies.
- Handling Data Formats:
  - Proficiency in working with CSV, JSON, or databases, depending on where and how the scraped data will be stored and used.
- Problem-Solving Skills:
  - Ability to troubleshoot issues, such as handling CAPTCHAs, dynamic content, or rate limits imposed by websites.
ABOUT FLUENCY:
Fluency is an award-winning global data consultancy that helps brands make smarter, faster business decisions that deliver stronger outcomes. Combined with the M&C Saatchi Group’s intellectual capital in brand, marketing, design, and CRM, Fluency specialises in delivering insight, foresight, and oversight solutions fuelled by high quality data, analytics, and technologies.
Our team of creative minds operate out of hubs in New York, London and Sydney. Clients include Amazon, NIKE, UK Government, and Ford.
https://fluency-mcsaatchi.com/
WHAT YOU'LL GET
For the right candidate, we will offer a competitive salary and benefits package which includes 27 days annual holiday, private healthcare, employer contributory pension, life assurance and income protection.
Our commitment to Diversity, Equity and Inclusion sees us offer inclusive bank holidays, learning opportunities around DE&I, targeted mentoring programmes and the opportunity to participate in active Employee Led Networks and associated events.
CLOSING DATE: Wednesday, 11th December. NO RECRUITERS.
ABOUT M&C SAATCHI GROUP
M&C Saatchi Group is a creative group of companies who navigate, create, and lead meaningful change for clients. Across five specialist divisions connected through people, culture, data, technology and creativity, M&C Saatchi Group aims to unlock new value for clients and leave a positive impact on the world.
M&C Saatchi Group’s work is informed by two core principles - Brutal Simplicity of Thought and Diversity of Thought. Together they guide how problems are solved and integrated specialist teams are built. Headquartered in London, M&C Saatchi Group have circa 2,500 employees globally of which 720 employees are UK based. Our operations span 23 countries with major hubs in the UK, Europe, Middle East & Africa, Asia and Australia. Having floated in 2004, M&C Saatchi plc is a constituent of the FTSE AIM. It has a market capitalisation of £250 million and ambitious growth plans.
M&C Saatchi Group is an Equal Opportunity Employer which does not discriminate, celebrates diversity and bases all hiring and promotion decisions solely on talent and capability, without regard for any personal characteristics.
All employee information is kept confidential according to General Data Protection Regulation (GDPR).