Data Engineer
Onsite Bentonville, AR - hybrid
Job Summary:
We are looking for a GCP Data Engineer with 5-8 years of hands-on experience to join our team. The ideal candidate will have extensive practical knowledge of SQL, Python, and data modeling, along with a track record of solving complex ETL/ELT challenges. In this role, you will design, develop, and maintain data pipelines and architectures on Google Cloud Platform (GCP) to support data-driven decision-making.
Key Responsibilities:
- Data Modeling: Create and maintain data models and schemas to support business intelligence and reporting needs. Ensure data integrity and consistency across various sources.
- Data Engineering: Design, build, and manage scalable data pipelines and ETL/ELT processes using GCP tools such as BigQuery, Dataflow, Dataproc, and Pub/Sub.
- SQL Development: Write efficient and complex SQL queries for data extraction, transformation, and analysis. Optimize query performance for large datasets.
- Python Programming: Develop Python scripts and applications for data processing, automation, and integration with GCP services.
- Problem Solving: Troubleshoot and resolve data-related issues, ensuring data accuracy and performance. Implement best practices for data management and optimization.
- Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet business needs.
- Documentation: Maintain comprehensive documentation for data pipelines, models, and processes. Ensure clear communication of technical concepts to non-technical stakeholders.
Qualifications:
- Education: Bachelor's degree in Computer Science, Data Engineering, Mathematics, or a related field. Advanced degree is a plus.
- Experience: 5-8 years of hands-on experience as a Data Engineer, particularly with GCP services and tools. Extensive practical experience in SQL and Python is essential.
- Technical Skills:
  - Proficiency in SQL for querying and managing large datasets.
  - Strong programming skills in Python for data processing and automation.
  - Experience with GCP services such as BigQuery, Dataflow, Dataproc, Pub/Sub, and Cloud Storage.
  - Expertise in data modeling and designing data architectures.
  - Solid understanding of ETL/ELT processes and best practices.
- Problem-Solving: Demonstrated ability to tackle complex data problems and implement effective solutions.
- Communication: Good verbal and written communication skills. Ability to convey technical information to a non-technical audience.
Preferred Skills:
- Experience with other cloud platforms (e.g., Azure) is a plus.
- Familiarity with data visualization tools (e.g., Looker, Power BI) is advantageous.