Full-time

Lead Data Engineer - Python, PySpark & SQL

Posted by Princeton IT Services, Inc • Canada

📍 Canada 🕒 March 01, 2026


About the Role

Job Title Lead Data Engineer – Python, PySpark & SQL 
Location Canada 
Job Type Full-time contract
Responsibilities
Build scalable data ingestion and transformation pipelines using Python, PySpark, and SQL.
Process raw CSV/text files from AWS S3, including header validation, schema checks, and malformed-file detection.
Convert raw data into structured DataFrames and implement reusable data quality checks. 
Develop advanced transformations using SQL/PySpark (window functions, LAG(), grouping logic, date gap detection, etc.).
Deploy and tune PySpark applications on AWS EMR, optimizing executor memory, cores, shuffle behavior, and cluster performance. 
Work with AWS services such as S3, EMR, Glue, Lambda, and IAM.
Debug performance issues (OOM errors, shuffle spill, GC problems) and improve pipeline reliability. 
Lead design discussions and code reviews, and mentor junio...

Job Details

Location Canada
Job Type Full-time
Category Other-General
Posted March 01, 2026
Deadline April 10, 2026

Ready to Apply?

Submit your application today and take the next step in your career journey with Princeton IT Services, Inc.
