Data Engineer (Webint automation engineer) Poland
Warsaw, Masovian Voivodeship, PL
ActiveFence
Protect your platform with AI safety solutions built to detect harmful content, manage AI risks, and ensure secure, compliant user experiences.
Description
Are you a Python master? Do you have a creative mind? Do you like challenges? Do you have a hacker-like mentality? Do you enjoy web intelligence (WEBINT)?
Join our Extremism Department and help us detect malicious activity at scale in the fields of counter-terrorism, hate speech, child safety, and many more.
Responsibilities:
- Independently build, maintain, and use various automation tools that will support gathering data from all corners of the internet
- Process and analyze large amounts of data for report generation and other deliverables for our customers
- Work with internal stakeholders (product, research, and others) on automated data collection
- Build a collection strategy that not only addresses current requirements but also demonstrates innovation and forward thinking about future goals and needs
Requirements:
- Strong Python development experience, especially for automation and web scraping
- Deep knowledge of HTTP, HTML parsing, and web structures
- Proficiency with libraries such as requests, BeautifulSoup, and regular expressions
- Understanding of APIs, network protocols, and data parsing techniques
- Experience in building robust scraping pipelines that handle timeouts, retries, proxies, rate-limiting, and headless scraping
- Solid understanding of CSV, JSON, and data transformation workflows using pandas
- Analytical and decision-making skills
- Ability to multitask and deliver under tight deadlines
- Fluent English (reading, writing)
- Ability to work independently while being a team player
- Proactive problem-solving skills
- Out-of-the-box thinking, creative mind
Nice-to-haves:
- Previous experience in building OSINT tools in the context of threat detection, counter-fraud, disinformation analysis, or scam detection
- Experience in handling multi-step scraping flows (navigating to shortened URLs and extracting final destinations, ...)
- Familiarity with tools like Selenium, Scrapy, or undetected-chromedriver for bypassing bot protection
- Knowledge of asynchronous scraping techniques for efficient data collection
- Basic knowledge of Google Sheets API, or tools like gspread for automated reporting
- Previous experience with tools such as Postman or Insomnia for API development
- Experience in designing prompts or leveraging LLMs through APIs (such as OpenAI or Claude) for content moderation or entity extraction
About ActiveFence
ActiveFence is the leading provider of security and safety solutions for online experiences, safeguarding more than 3 billion users, top foundation models, and the world’s largest enterprises and tech platforms every day.
As a trusted ally to major technology firms and Fortune 500 brands that build user-generated and GenAI products, ActiveFence empowers security, AI, and policy teams with low-latency Real-Time Guardrails and a continuous Red Teaming program that pressure-tests systems with adversarial prompts and emerging threat techniques. Powered by deep threat intelligence, unmatched harmful-content detection, and coverage of 117+ languages, ActiveFence enables organizations to deliver engaging and trustworthy experiences at global scale while operating safely and responsibly across all threat landscapes.
Perks/benefits: Team events