Data Engineer
Kolkata, WB India
Lexmark
Lexmark creates innovative imaging solutions and technologies that help customers worldwide print, secure and manage information with ease, efficiency and unmatched value.
Job Description/Responsibilities:
The candidate will work in Global IT as a Data Engineer who will be responsible for the creation, management, and optimization of data pipelines, data architectures, and systems that store and process large volumes of data. They will work collaboratively with data scientists and analysts to ensure that data is clean, well-organized, and available for decision-making processes. The ideal candidate will have a strong background in programming, database management, cloud technologies, and data systems.
Qualification: BE/B.Tech/ME/MCA with 2+ years of IT experience.
Key Responsibilities:
1. Data Pipeline Development:
• Design, implement, and optimize scalable data pipelines for batch and real-time processing.
• Ensure seamless integration of data from various sources (APIs, databases, flat files, etc.) into data warehouses or data lakes.
• Automate repetitive tasks for data collection, transformation, and loading.
2. Database Design and Management:
• Design and manage relational and NoSQL databases (e.g., SQL, PostgreSQL, MongoDB, Cassandra).
• Maintain the integrity and security of the data within databases and data warehouses.
• Optimize queries and data storage techniques for performance and scalability.
3. ETL (Extract, Transform, Load) Processes:
• Develop and manage ETL processes that extract data from various sources, transform it into useful formats, and load it into a storage solution.
• Ensure data quality, consistency, and reliability during the ETL process.
4. Collaboration with Data Science/Analytics Teams:
• Work closely with Data Scientists and Analysts to understand their data needs and deliver appropriate solutions.
• Assist in preparing data for machine learning and predictive analytics.
5. Cloud Infrastructure & Big Data Technologies:
• Work with cloud platforms such as Azure and AWS for data storage, processing, and analytics.
• Utilize big data technologies (e.g., Hadoop, Spark, Kafka) to handle large datasets efficiently.
6. Data Quality Assurance:
• Monitor data quality, consistency, and accuracy throughout the data lifecycle.
• Implement data validation techniques to ensure the integrity of data used for reporting and analysis.
7. Performance Tuning and Optimization:
• Analyze system performance and optimize queries, data pipelines, and storage to improve speed and efficiency.
• Proactively address performance bottlenecks.
8. Documentation and Reporting:
• Document the design, architecture, and processes of data systems.
• Generate reports and metrics on data pipeline performance, issues, and optimization progress.
Must-Have Skills/Skill Requirements:
Technical Skills:
• Programming Languages: Proficiency in Python, Java, Scala, or similar programming languages.
• SQL and NoSQL: Strong experience with SQL (MySQL, PostgreSQL, etc.) and NoSQL databases (MongoDB, Cassandra, etc.).
• Data Pipeline Tools: Experience with tools like Apache Kafka, Apache Airflow, Apache NiFi, and other ETL frameworks.
• Big Data Technologies: Familiarity with Hadoop, Spark, Hive, and related tools.
• Cloud Platforms: Knowledge of Azure (Data Lake, Databricks) and AWS (Redshift, S3, Lambda).
• Data Warehousing & Data Lakes: Experience with data warehousing technologies (e.g., Lakehouse, Snowflake, Redshift).
Soft Skills:
• Problem-Solving: Strong analytical and problem-solving skills.
• Collaboration: Ability to work cross-functionally with Data Scientists, Analysts, and other stakeholders.
• Attention to Detail: High level of precision in handling and manipulating data.
• Time Management: Ability to prioritize tasks and work in a fast-paced environment.
How to Apply
Are you an innovator? Here is your chance to make your mark with a global technology leader. Apply now!
Global Privacy Notice
Lexmark is committed to appropriately protecting and managing any personal information you share with us.