About the Job:
Data Engineers at AdTheorent collaborate with data science and analytics teams to design, build, and scale very large datasets and the data pipelines that enable analytics, modeling, experimentation, and reporting. The Data Engineer position is ideal for a data-focused engineer with strong Python, Spark, and SQL experience who wants to work with massive amounts of data in a cutting-edge, cloud-based environment. You will be responsible for building data solutions using tools from the Apache and AWS ecosystems, including Amazon Redshift, S3, and Lambda, Apache Spark and Airflow, and ClickHouse.
This is a remote, permanent, full-time position.
Responsibilities:
- Using a variety of open-source technologies (Python, Spark, Airflow), build data pipelines to extract, cleanse, and integrate data from a variety of sources and formats (see the Airflow sketch after this list)
- Design, develop, and implement dimensional data marts to support analytics and reporting requirements
- Develop and maintain ETL processes to populate dimensional data marts from various data sources
- Develop scalable and re-usable solutions that support real-time, batch, and event-based data processing
- Own data quality across the data pipelines
- Fulfill ad-hoc requests from business partners for data residing in Amazon Redshift and/or S3
- Interact with business partners to lead technical solution discussions
- Work within an Agile/Scrum framework for product development
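To give a concrete flavor of the pipeline work above, below is a minimal sketch of an extract/cleanse/load job using Airflow's TaskFlow API (Airflow 2.4+ assumed). The inline records and the pipeline name are hypothetical; a production task would read from S3 and load Redshift rather than print.

```python
# Minimal sketch only: the record shapes and the load target are
# hypothetical, and real extract/load logic is elided.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def example_events_pipeline():
    """Extract raw events, cleanse them, and load a reporting target."""

    @task
    def extract() -> list[dict]:
        # Hypothetical inline records; a real task would read from S3.
        return [{"event_id": 1, "amount": "12.50"}, {"event_id": 2, "amount": None}]

    @task
    def cleanse(rows: list[dict]) -> list[dict]:
        # Drop incomplete records and normalize the amount to a float.
        return [
            {"event_id": r["event_id"], "amount": float(r["amount"])}
            for r in rows
            if r["amount"] is not None
        ]

    @task
    def load(rows: list[dict]) -> None:
        # Placeholder sink; a real task would COPY into Redshift or write S3.
        print(f"loading {len(rows)} cleansed rows")

    load(cleanse(extract()))


example_events_pipeline()
```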
Requirements:
- Bachelor’s or Master’s Degree in Computer Science, Engineering, Math, or other related discipline; or equivalent work experience.
- Must have 2+ years of programming experience with Python
- Proven experience in data engineering with a focus on building dimensional data marts
- Solid understanding of dimensional modeling concepts and experience implementing star schemas, snowflake schemas, and slowly changing dimensions (a Type 2 sketch follows this list)
- 1+ years of hands-on experience with Amazon Web Services, specifically Redshift, S3, and EC2
- Strong ANSI SQL, NoSQL, and Spark SQL query skills
- Strong problem-solving skills; collaborator and data advocate
- Excellent written and oral communication skills and customer service skills
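As an illustration of the dimensional-modeling skills listed above, here is a minimal sketch of a Type 2 slowly changing dimension update in PySpark with Spark SQL. All table names, columns, the '9999-12-31' open-date sentinel, and the surrogate-key scheme are hypothetical placeholders.

```python
# Minimal sketch only: names and the key scheme are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("scd2_sketch").getOrCreate()

# Hypothetical current dimension (one row per account version) and a
# hypothetical staged snapshot from the source system.
dim = spark.createDataFrame(
    [(1, "acct-1", "NY", "2024-01-01", "9999-12-31")],
    ["account_sk", "account_id", "region", "valid_from", "valid_to"],
)
stg = spark.createDataFrame([("acct-1", "CA")], ["account_id", "region"])
dim.createOrReplaceTempView("dim_account")
stg.createOrReplaceTempView("stg_account")

# Expire the current version of rows whose tracked attribute changed.
expired = spark.sql("""
    SELECT d.account_sk, d.account_id, d.region, d.valid_from,
           CAST(current_date() AS STRING) AS valid_to
    FROM dim_account d JOIN stg_account s ON d.account_id = s.account_id
    WHERE d.valid_to = '9999-12-31' AND d.region <> s.region
""")

# Insert an open-ended new version carrying the changed attribute.
new_versions = spark.sql("""
    SELECT d.account_sk + 1000 AS account_sk,  -- placeholder key scheme
           s.account_id, s.region,
           CAST(current_date() AS STRING) AS valid_from,
           '9999-12-31' AS valid_to
    FROM stg_account s JOIN dim_account d ON d.account_id = s.account_id
    WHERE d.valid_to = '9999-12-31' AND d.region <> s.region
""")

# Rows with no change (and all closed history) carry forward as-is.
unchanged = spark.sql("""
    SELECT d.* FROM dim_account d
    WHERE NOT (d.valid_to = '9999-12-31' AND EXISTS (
        SELECT 1 FROM stg_account s
        WHERE s.account_id = d.account_id AND s.region <> d.region))
""")

result = unchanged.unionByName(expired).unionByName(new_versions)
result.orderBy("account_id", "valid_from").show()
```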
Recommended skills:
- Minimum of 2 years of demonstrated, hands-on experience with the AWS ecosystem of tools and technologies, or with other related cloud technologies
- Experience with Apache open-source tools (e.g., Airflow)
- Experience building Spark-based data pipelines on near-real-time data (see the streaming sketch after this list)
- Ability to build ETL pipelines from scratch using streaming, near-real-time, and micro-batch data
- Experience with complex multi-server environments & high availability environments
- Experience with system monitoring, log management, and error notification
- Experience in the ad-tech industry is a plus
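To illustrate the streaming and micro-batch experience called out above, below is a minimal PySpark Structured Streaming sketch. The socket source and console sink are hypothetical stand-ins; a production job would typically read from Kinesis or Kafka and land results in S3 or Redshift.

```python
# Minimal sketch only: the socket source and console sink are hypothetical
# stand-ins (run `nc -lk 9999` locally to feed it lines of text).
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("streaming_sketch").getOrCreate()

# Read a stream of newline-delimited events; the column is named "value".
events = (
    spark.readStream.format("socket")
    .option("host", "localhost")
    .option("port", 9999)
    .load()
)

# Cleanse, then count events per type across the stream.
counts = (
    events.select(F.trim(F.col("value")).alias("event_type"))
    .where(F.col("event_type") != "")
    .groupBy("event_type")
    .count()
)

# Emit each 10-second micro-batch to the console.
query = (
    counts.writeStream.outputMode("complete")
    .format("console")
    .trigger(processingTime="10 seconds")
    .start()
)
query.awaitTermination()
```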
Benefits:
Compensation range: $100K-$110K base + 20% bonus potential. We offer full health coverage, generous PTO, and an award-winning office culture!
The base range provided is AdTheorent’s current assessment for this role. The confirmed salary will be commensurate with experience, education, skills, and other factors. This range is subject to change but will be no less than the stated minimum. We encourage all to apply, as applicants will be assessed on an individual basis. Job title and base salary will depend on qualifications and experience.
We are an Equal Opportunity Employer and seek to foster community, inclusion, and diversity within the organization. We encourage all qualified candidates, regardless of racial, religious, sexual, or gender identity, to apply. #LI-Remote
NO EXTERNAL RECRUITERS OR VENDORS PLEASE.
PREFERENCE WILL BE GIVEN TO CANDIDATES LOCAL TO THE JACKSONVILLE, FL REGION FOR THIS ROLE.