Principal Engineer, Data Analytics Engineering
Bengaluru, India
Sandisk
Leaders in NVMe SSD, USB Flash, and Memory Cards; a new Sandisk is coming soon.
Company Description
Sandisk understands how people and businesses consume data and we relentlessly innovate to deliver solutions that enable today’s needs and tomorrow’s next big ideas. With a rich history of groundbreaking innovations in Flash and advanced memory technologies, our solutions have become the beating heart of the digital world we’re living in and that we have the power to shape.
Sandisk meets people and businesses at the intersection of their aspirations and the moment, enabling them to keep moving and pushing possibility forward. We do this through the balance of our powerhouse manufacturing capabilities and our industry-leading portfolio of products that are recognized globally for innovation, performance and quality.
Sandisk has two facilities recognized by the World Economic Forum as part of the Global Lighthouse Network for advanced 4IR innovations. These facilities were also recognized as Sustainability Lighthouses for breakthroughs in efficient operations. With our global reach, we ensure the global supply chain has access to the Flash memory it needs to keep our world moving forward.
Job Description
We are seeking a passionate candidate dedicated to building robust data pipelines and handling large-scale data processing. The ideal candidate will thrive in a dynamic environment, demonstrate a commitment to optimizing and maintaining efficient data workflows, and have hands-on experience with Python, MariaDB, SQL, Linux, Docker, Airflow administration, and CI/CD pipeline creation and maintenance. The application is built with Python Dash, and the role involves application deployment, server administration, and keeping the application running smoothly and up to date.
Key Responsibilities:
- 9+ years of experience developing data pipelines using Spark.
- Ability to design, develop, and optimize Apache Spark applications for large-scale data processing.
- Ability to implement efficient data transformation and manipulation logic using Spark RDDs and DataFrames.
- Manage server administration tasks, including monitoring, troubleshooting, and optimizing performance. Administer and manage databases (MariaDB) to ensure data integrity and availability.
- Ability to design, implement, and maintain Apache Kafka pipelines for real-time data streaming and event-driven architectures.
- Deep development skill in Python, PySpark, Scala, and SQL, including stored procedures.
- Working knowledge of Unix/Linux tools such as awk, ssh, and crontab.
- Ability to write Transact-SQL and to develop and debug stored procedures and user-defined functions in Python.
- Working experience with PostgreSQL and/or Redshift/Snowflake is required.
- Exposure to CI/CD tools such as Bitbucket, Jenkins, Ansible, Docker, and Kubernetes is preferred.
- Solid understanding of relational database systems and their concepts.
- Ability to handle large tables/datasets (2+ TB) in a columnar database environment.
- Ability to integrate data pipelines with Splunk/Grafana for real-time monitoring and analysis, and with Power BI for visualization.
- Ability to create and schedule Airflow jobs.
Qualifications
- Minimum of a bachelor's degree in Computer Science or Engineering; master's degree preferred.
- AWS Developer certification is preferred.
- Certification in SDLC (Software Development Life Cycle) methodology, integrated source control, or continuous development and continuous integration is preferred.
Additional Information
Sandisk thrives on the power and potential of diversity. As a global company, we believe the most effective way to embrace the diversity of our customers and communities is to mirror it from within. We believe the fusion of various perspectives results in the best outcomes for our employees, our company, our customers, and the world around us. We are committed to an inclusive environment where every individual can thrive through a sense of belonging, respect and contribution.
Sandisk is committed to offering opportunities to applicants with disabilities and ensuring all candidates can successfully navigate our careers website and our hiring process. Please contact us at jobs.accommodations@sandisk.com to advise us of your accommodation request. In your email, please include a description of the specific accommodation you are requesting as well as the job title and requisition number of the position for which you are applying.