Lead QA Engineer (Spark Data Integrity & Performance Testing)
Pune, Maharashtra, India
InfraCloud
InfraCloud helps companies build GPU Cloud and modernize applications and infrastructure with our expertise in cloud native technologies.
Must-Have Skills:
Apache Spark expertise: design, configure, and optimize Spark clusters (or similar engines like Dremio)
Data Integrity QA: create and execute test cases to validate accuracy, consistency, and completeness of data; implement and maintain automated test scripts
Performance Testing: architect and run benchmark tests; analyze the impact of Spark cluster configurations on query/workflow performance
Leadership: mentor and guide other testers on best practices in both data integrity and performance testing
Good-to-Have Skills:
Familiarity with performance testing tools (e.g., JMeter, Gatling)
Experience integrating tests into CI/CD pipelines (e.g., Jenkins, GitLab CI)
Exposure to cloud-based Spark services (AWS EMR, Azure Synapse)
Who You Are
A data-driven QA leader passionate about ensuring both correctness and speed in big-data pipelines
Comfortable translating complex requirements into repeatable, automated test suites
Skilled at troubleshooting anomalous results, performing root-cause analysis, and optimizing system configurations
A collaborative mentor who elevates team practices and drives continuous improvement
What You’ll Do & Learn
Design & run comprehensive data integrity tests for Spark queries, investigating failures and ensuring zero data discrepancies
Implement automated validation scripts that integrate with our CI/CD workflows
Define & execute performance benchmarks across varied Spark cluster setups; report on metrics like throughput, latency, and resource utilization
Tune Spark configurations to meet SLAs for data freshness and query response times
Lead test planning sessions, coach junior testers, and document best practices for reproducible, scalable testing
Collaborate with data engineering, DevOps, and product teams to embed quality gates into the development lifecycle
Why Join Us?
Own the quality and performance of our core big-data platform powering mission-critical analytics
Work alongside experienced data engineers and architects on cutting-edge Spark deployments
Drive automation and efficiency in a growing, innovation-focused environment
Enjoy opportunities for professional growth, training, and attending industry conferences