Data Engineer AI/ML
Brazil (Remote)
Rocket Lawyer
Rocket Lawyer makes the law affordable and simple. Create and sign legal documents online, get legal advice from attorneys, incorporate your business, and more!
About the Role
We are seeking a highly skilled and passionate Data Engineer to join our growing team focused on building and deploying cutting-edge AI/ML solutions. As a Data Engineer, you will play a crucial role in designing, building, and maintaining the data infrastructure powering the AI models for Rocket Copilot, our AI legal assistant. You will work closely with Machine Learning Engineers, Data Scientists, and Product Managers to ensure the availability of high-quality data for training, fine-tuning, and evaluating generative models. This role requires a strong understanding of data engineering principles, experience with large-scale data processing, and a passion for pushing the boundaries of AI.
We value a fun, collaborative, team-oriented work environment, where we celebrate our accomplishments.
Responsibilities
- Design, develop, and maintain robust, scalable, and efficient data pipelines for ingesting, processing, transforming, and storing large datasets used for training and evaluating generative AI models.
- Perform data cleaning, normalization, transformation, and feature engineering to prepare data for model training. This includes handling unstructured data like text, images, and audio.
- Build and manage the data infrastructure, including data lakes, data warehouses, and databases, optimized for AI workloads.
- Implement data quality checks and monitoring systems to ensure data accuracy, completeness, and consistency.
- Contribute to the development and implementation of MLOps best practices for data management and model deployment.
- Work with GCP and Snowflake and their data and AI offerings.
- Optimize data pipelines and infrastructure for performance, scalability, and cost-effectiveness.
Requirements
- 5+ years of Python experience.
- 3+ years of experience with technologies such as Airflow and Apache Spark.
- Experience working with large language models (LLMs), diffusion models, or other generative models.
- Experience with MLOps tools and practices.
- Strong understanding of data architectures and patterns.
- Experience with containerization technologies (e.g., Docker, Kubernetes).
- Contributions to open-source projects.
- Experience in DataOps implementation and support.
- Experience in MLOps implementation and support.
- Experience in building and supporting AI/ML platforms.
Benefits & Perks
- Private health insurance
- Life insurance
- Dental insurance
- Meal/Food voucher
- Totalpass & Wellhub partnership
- Mental health assistance
- Birthday off
- Daycare assistance
- Financial support for those who have children with special needs and disabilities
- Employee referral program
- Free Rocket Lawyer account with online access to an extensive legal documents library and brilliant licensed attorneys at discounted rates
Actual compensation packages are determined by various factors unique to each candidate, including but not limited to skill set, depth of experience, certifications, specific work location, and performance during the interview process.
Employment type: CLT
Brazil Monthly Compensation
R$22.000–R$24.500 BRL
If you apply for this position, your data will be processed as per the Rocket Lawyer Privacy Policy.
Tags: Airflow Architecture Copilot Data management DataOps Data pipelines Data quality Diffusion models Docker Engineering Feature engineering GCP Generative AI Generative modeling Kubernetes LLMs Machine Learning MLOps Model deployment Model training Open Source Pipelines Privacy Python R Snowflake Spark Unstructured data
Perks/benefits: Career development Health care