Data Analyst
Hanoi, Hanoi, VN
ActiveFence
ActiveFence empowers Trust & Safety and online security professionals in their quest to keep platform users and the public safe from harm.
Description
1. Research & Development of GenAI Tools
- Identify, evaluate, and benchmark state-of-the-art Generative AI tools (e.g., OpenAI, Stability AI, RunwayML, ElevenLabs, Pika Labs, and other open-source tools).
- Stay updated with emerging AI models in video synthesis, face manipulation, deep fake detection, and text-to-video technologies.
- Experiment with new model architectures, APIs, and frameworks for scalable content generation.
2. Orchestration & Automation
- Design automated workflows for orchestrating multiple GenAI tools to generate videos at scale.
- Develop pipelines integrating text, audio, and video generation models (e.g., combining LLMs with synthetic media tools).
- Optimize GPU/Cloud-based processing for efficient batch generation of synthetic datasets.
- Ensure seamless data pipeline integration for AI/ML model training.
3. Synthetic Data Generation for Deep Fake Classification
- Generate large-scale synthetic deep fake datasets using various AI-driven tools.
- Develop procedural rules to create diverse video content, mimicking real-world deep fake patterns.
- Implement labeling and annotation workflows to tag deep fake and real video content accurately.
- Work with ML engineers to improve dataset quality for deep fake classifiers.
4. Data Analysis & Performance Monitoring
- Analyze video synthesis outputs to assess realism, quality, and AI model bias.
- Conduct data-driven experiments to measure the effectiveness of generated datasets.
- Develop dashboards, reports, and insights to track synthetic data performance.
- Identify and troubleshoot model weaknesses and anomalies in deep fake detection.
5. Data Feed Development
- Design and implement data feeds from external sources, ensuring accuracy, reliability, and efficiency.
- Develop automated processes for data collection and ingestion, utilizing appropriate tools and technologies.
6. Data Scraping and Analysis
- Conduct data scraping from diverse external sources to gather relevant information.
- Perform heuristic analysis and data exploration to derive insights for better prioritization of tasks and projects.
7. Collaboration and Communication
- Work effectively with a diverse team of multicultural freelancers, fostering a collaborative and inclusive work environment.
- Maintain clear and open communication channels to facilitate seamless coordination and feedback exchange.
Requirements
Technical Skills:
- At least 3 years of project management experience: The role demands excellent project management capabilities, including planning, execution, and tracking project progress. The ability to manage timelines, resources, and stakeholder expectations is crucial.
- Proficient in Python programming: The ideal candidate must have extensive experience in Python and be capable of writing efficient, clean, and well-documented code.
- Expertise in SQL: Applicants should possess strong SQL skills and be able to design, query, and manage databases effectively, with a focus on data manipulation and optimization.
- Experience with data processing libraries: Candidates should have practical experience with PySpark and/or Pandas for data processing and analysis. Proficiency in handling large datasets and performing complex data transformations is essential.
- Familiarity with AWS Services: Knowledge of AWS cloud services is required, including but not limited to EC2, S3, Lambda.
- Gen AI experience
Soft Skills:
- Independent: The ideal candidate should be able to work independently, with minimal supervision, efficiently managing their workload and making informed decisions.
- Proactive: We are looking for individuals who are proactive in nature, always looking for ways to improve processes, solve problems before they escalate, and take initiative in their work.
- Self-Learner: The ability to learn new technologies and methodologies quickly and effectively is essential. Candidates should demonstrate a strong capacity for self-directed learning and staying current with industry trends.
Advantages:
- Candidates with experience using Databricks for data engineering and analysis will have an advantage.
- Candidates who have served in the intelligence force as malware / cyber security analysts have an advantage.
- Candidates who have worked in a diverse team of professionals of different nationalities have an advantage.
About ActiveFence
ActiveFence is the leading tool stack for Trust & Safety teams, worldwide. By relying on ActiveFence’s end-to-end solution, Trust & Safety teams – of all sizes – can keep users safe from the widest spectrum of online harms, unwanted content, and malicious behavior, including child safety, disinformation, fraud, hate speech, terror, nudity, and more.
Using cutting-edge AI and a team of world-class subject-matter experts to continuously collect, analyze, and contextualize data, ActiveFence ensures that in an ever-changing world, customers are always two steps ahead of bad actors. As a result, Trust & Safety teams can be proactive and provide maximum protection to users across a multitude of abuse areas, in 70+ languages.
Backed by leading Silicon Valley investors such as CRV and Norwest, ActiveFence has raised $100M to date, employs 300 people worldwide, and has contributed to the online safety of billions of users across the globe.
Tags: APIs Architecture AWS Classification Data analysis Databricks EC2 Engineering Generative AI GPU Lambda LLMs Machine Learning Model training OpenAI Open Source Pandas Pipelines PySpark Python R&D Research Security SQL