Job Description:

Title: GCP Data Engineer with Hive, Spark, and BigQuery Experience

The client is looking for a Data Engineer with experience in ETL workflows and data pipelines using tools such as Hive, Spark, and Airflow.
The ideal candidate is an expert in developing, testing, and deploying data pipelines, supporting data analytics efforts, proactively identifying and resolving issues, and building alerting mechanisms using traditional, new, and emerging technologies.
Excellent written and verbal communication skills and the ability to liaise with everyone from technologists to executives are key to success in this role.

Skills 

• Proven experience as a Data Engineer, preferably in a big data environment. 
• Expertise in Hive, Spark, and Apache Hudi for big data processing and storage. 
• Hands-on experience with BigQuery and Google Cloud Platform (GCP) services such as GCS, Dataflow, and Pub/Sub. 
• Strong programming skills in Scala and Python, with experience in building data pipelines and ETL processes. 
• Proficiency with workflow orchestration tools like Apache Airflow. 
• Solid understanding of data warehousing concepts, data modelling, and schema design. 
• Knowledge of distributed systems and parallel processing. 
• Strong problem-solving skills and ability to work with large datasets in a fast-paced environment. 

Responsibilities 

• Design, develop, and maintain robust and scalable ETL workflows and data pipelines using tools like Hive, Spark, and Airflow.
• Implement and manage data storage and processing solutions using Apache Hudi and BigQuery. 
• Develop and optimize data pipelines for structured and unstructured data in GCP environments, leveraging GCS for data storage. 
• Write clean, maintainable, and efficient code in Scala and Python to process and transform data. 
• Ensure data quality, integrity, and consistency by implementing appropriate data validation and monitoring techniques. 
• Work with cross-functional teams to understand business requirements and deliver data solutions that drive insights and decision-making. 
• Troubleshoot and resolve performance and scalability issues in data processing and pipelines. 
• Stay updated with the latest developments in big data technologies and tools and incorporate them into the workflow as appropriate. 

We are an equal opportunity employer. All aspects of employment, including the decision to hire, promote, discipline, or discharge, will be based on merit, competence, performance, and business needs. We do not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, citizenship/immigration status, veteran status, or any other status protected under federal, state, or local law.

             
