Staff Product Engineer / Product Specialist - Spark SME
Bengaluru
Acceldata
Maximize the ROI on your data investments by ensuring reliability, eliminating operational blind spots, and reducing spend with Acceldata data observability.
Position Summary:
We are seeking an Apache Spark Subject Matter Expert (SME) who will be responsible for designing, optimizing, and scaling Spark-based data processing systems. This role involves hands-on experience in Spark architecture and core functionalities, focusing on building resilient, high-performance distributed data systems. You will collaborate with engineering teams to deliver high-throughput Spark applications and solve complex data challenges in real-time processing, big data analytics, and streaming.
If you’re passionate about working in fast-paced, dynamic environments and want to be part of the cutting edge of data solutions, this role is for you.
We’re looking for someone who can:
- Design and optimize distributed Spark-based applications, ensuring low-latency, high-throughput performance for big data workloads.
- Troubleshooting: Diagnose and resolve data and performance issues in Spark jobs and clusters at an expert level.
- Data Processing Expertise: Work extensively with large-scale data pipelines using Spark's core components (Spark SQL, DataFrames, RDDs, Datasets, and Structured Streaming).
- Performance Tuning: Conduct deep-dive performance analysis, debugging, and optimization of Spark jobs to reduce processing time and resource consumption.
- Cluster Management: Collaborate with DevOps and infrastructure teams to manage Spark clusters on platforms like Hadoop/YARN, Kubernetes, or cloud platforms (AWS EMR, GCP Dataproc, etc.).
- Real-time Data: Design and implement real-time data processing solutions using Apache Spark Streaming or Structured Streaming.
What makes you the right fit for this position:
- Expert in Apache Spark: In-depth knowledge of Spark architecture, execution models, and core components (Spark Core, Spark SQL, Spark Streaming, etc.).
- Data Engineering Practices: Solid understanding of ETL pipelines, data partitioning, shuffling, and serialization techniques to optimize Spark jobs.
- Big Data Ecosystem: Knowledge of related big data technologies such as Hadoop, Hive, Kafka, HDFS, and YARN.
- Performance Tuning and Debugging: Demonstrated ability to tune Spark jobs, optimize query execution, and troubleshoot performance bottlenecks.
- Experience with Cloud Platforms: Hands-on experience in running Spark clusters on cloud platforms such as AWS, Azure, or GCP.
- Containerization & Orchestration: Experience with containerized Spark environments using Docker and Kubernetes is a plus.
Good to have:
- Certification in Apache Spark or related big data technologies.
- Experience working with Acceldata's data observability platform or similar tools for monitoring Spark jobs.
- Demonstrated experience with scripting languages like Bash, PowerShell, and Python.
- Familiarity with concepts related to application, server, and network security management.
- Certifications from leading cloud providers (AWS, Azure, GCP) and expertise in Kubernetes are significant advantages.
Categories: Engineering Jobs, Leadership Jobs, Product Jobs
Tags: Architecture, AWS, Azure, Big Data, Data Analytics, Data pipelines, Dataproc, DevOps, Docker, Engineering, ETL, GCP, Hadoop, HDFS, Kafka, Kubernetes, Pipelines, Python, Security, Spark, SQL, Streaming
Region: Asia/Pacific
Country: India