Key Responsibilities:
· Lead the architecture and design of robust ETL pipelines to process, transform, and load large datasets (8 billion+ records).
· Design, implement, and optimize data workflows using AWS Glue, Step Functions, Python, and PySpark.
· Build and maintain scalable data architectures for cloud-based systems, focusing on performance, scalability, and reliability.
· Collaborate with data engineers, data analysts, and stakeholders to ensure that data is integrated, processed, and delivered with high quality and speed.
· Oversee the ETL process to manage data flow from source systems to MySQL RDS, Redshift, and other data storage solutions.
· Troubleshoot and resolve performance issues, optimize resource utilization, and ensure data pipelines run efficiently.
· Ensure best practices in data security, governance, and compliance are followed throughout the ETL process.
· Provide technical leadership, mentorship, and guidance to junior team members.
· Work in an agile environment, delivering high-quality solutions and meeting deadlines effectively.
Skills and Qualifications:
· Strong expertise in Python, PySpark, and cloud-based data engineering technologies.
· Experience with AWS Glue, AWS Step Functions, MySQL RDS, and Amazon Redshift.
· Proven track record in designing scalable and performant ETL systems for large-scale data.
· Familiarity with data modeling, data warehousing, and cloud architectures.
· Deep understanding of data pipeline orchestration and automation.
· Experience working with distributed systems and parallel processing.
· Strong problem-solving skills and ability to optimize data flows and processing.
Equal Opportunity Employer:
We are an equal opportunity employer. All aspects of employment, including the decision to hire, promote, discipline, or discharge, will be based on merit, competence, performance, and business needs. We do not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, pregnancy, genetic information, gender, sexual orientation, gender identity or expression, citizenship/immigration status, veteran status, or any other status protected under federal, state, or local law.