Data Engineer
United States
Varonis
The world's only fully automated DSPM. Continuously discover and classify critical data, remove exposures, and stop threats in real-time with AI-powered automation.
Position Overview:
We seek an experienced Data Engineer with expertise in modern data architectures, pipeline engineering, and cloud-based data ecosystems. In this role, you will design and build efficient, scalable data pipelines that enable robust data integration for AI, ML, SLM, LLM, and advanced analytics initiatives. You will also work closely with our data scientists and engineers to ensure high-quality, high-performance data flow and contribute to our data-driven culture.
Responsibilities:
- Design, build, and maintain scalable ETL/ELT pipelines to integrate data from diverse sources, optimizing for performance and cost efficiency.
- Leverage Databricks and other modern data platforms to manage, transform, and process data for ML and AI models, supporting both real-time and batch processing workflows.
- Work with cross-functional teams to implement data solutions that support model training, monitoring, and production deployment.
- Collaborate with a cybersecurity research team to understand emerging threats and develop solutions that leverage advanced data analytics.
- Design and develop innovative prompts and instruction sets to enhance our autonomous LLM-based labeling platform.
- Optimize prompts to generate high-quality, coherent, and contextually relevant responses.
- Collaborate with software and data engineers to integrate ML/LLM techniques into production systems.
Requirements:
- 3+ years of experience in data engineering, including cloud-based data solutions.
- Proven expertise in implementing large-scale data solutions.
- Proficiency in Python. PySpark is a plus.
- Experience with cloud and big data technologies such as Databricks and Azure Data Factory.
- Experience with prompt engineering techniques.
- Experience with vector databases and embedding techniques is a plus.
- Experience with MLOps is a plus.
- Strong analytical and problem-solving skills, with the ability to evaluate and interpret complex data.
- Excellent communication and collaboration skills, with the ability to work effectively in a multidisciplinary team.
- Proven track record of delivering high-quality results in a fast-paced and dynamic environment.
Category:
Engineering Jobs
Tags: Architecture Azure Big Data Data Analytics Databricks Data pipelines ELT Engineering ETL LLMs Machine Learning MLOps Model training Pipelines Prompt engineering PySpark Python Research
Regions:
Remote/Anywhere
North America
Country:
United States