Data Engineer (Remote – California Residents Only)

We are looking for a data engineering expert to join our Infrastructure Team. As part of the Infrastructure Team, you will be a key strategic leader in the expansion, refinement, and continued development of our infrastructure on Amazon Web Services (AWS), as well as in the continued build-out of our data analytics and engineering tools. This role will be instrumental in helping us scale our work statewide and open the door to expanded use of our higher education planning platform for K-12 students, parents, and educators.

Qualifications
- SQL: 2 years (Required)
- Cloud infrastructure: 2 years (Required)
- US work authorization (Required)
- Data Engineer: 2 years
- Remote/Virtual: all candidates must live in California

The ideal candidate for this team has extensive experience building out complex data architectures, including the testing, maintenance, and refinement of all data pipelines. They also have experience successfully scaling an organization's data intake to terabytes and petabytes, as well as optimizing data delivery and automating manual processes. This person is comfortable working with structured and unstructured data, can troubleshoot data loading and processing tools via SQL, Python, shell scripting, AWS, etc., and can take on a leadership role in special projects when needed. Extensive experience developing robust data documentation and data governance protocols is required for this role. There are no direct supervisory responsibilities in this position, but you must be able to proactively and successfully partner and collaborate with other subject matter experts on a project management basis. You must be comfortable leading and project-managing work with many unknowns and ambiguous solutions, carry a deep knowledge of (or curiosity about) the needs and behaviors of students, educators, and parents, and have a passion for educational equity.
What Will You Be Doing:
- Manage, refine, and enhance our AWS cloud services infrastructure, including EC2, VPC, RDS, ECS, CloudWatch, CloudFormation, CloudTrail, AWS Transfer for SFTP, S3, Lambda, Secrets Manager, and Route 53
- Lead ETL/ELT processes, including the development, refinement, and implementation of data loading and processing tools such as Snowflake, Airflow, Python, SQL, dbt, and shell scripts
- Review, redesign, and expand existing analytics and data processing architecture to create optimal data pipelines
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, and designing infrastructure for greater scalability and automation
- Maintain and update documentation of data architecture from both the macro view (e.g., architectural diagrams) and the micro view (e.g., script-level tasks in Airflow)
- Lead meetings, research processes, collect data, and analyze information
- Collaborate with the key stakeholders of your projects
- Develop and maintain expert knowledge of our platform and organization
- Continuously improve data pipelines and architecture by staying current on industry trends and best practices

Necessary Technical Skills:
- Build processes supporting data transformation, data structures, metadata, dependency, and workload management
- Advanced proficiency with cloud analytical tools: Snowflake, Redshift, Hadoop, Spark, Kafka, etc.
- Advanced proficiency with ETL tools such as Matillion, dbt, and Talend
- Advanced proficiency with data pipeline and workflow management tools: Airflow, Azkaban, Luigi, etc.
- Advanced proficiency building and optimizing data pipelines, cloud architectures, and data sets
- Advanced proficiency with scripting languages such as Python, R, and Scala
- Advanced proficiency developing cloud infrastructure in AWS, or currently holding (or pursuing) the AWS Certified Solutions Architect – Associate certification
- Expert SQL knowledge (SQL Server, PostgreSQL, MySQL, etc.) and understanding of relational databases, query authoring, and optimization, as well as working familiarity with a variety of databases
- A successful history of manipulating, processing, and extracting value from large structured and unstructured datasets

Your Strengths:
- Strong decision-making skills and a collaborative spirit, with the ability to turn abstract brainstorming into concrete proposals for action
- Advance projects without detailed supervision, balance multiple responsibilities, and provide colleagues with actionable proposals for advancing collective efforts
- Thrive in a fast-paced environment with changing priorities and deadlines
- Juggle multiple projects of various scopes with ease and grace
- Meticulous attention to detail
- Excellent verbal communication skills; ability to communicate with professionals at all levels
- Strong organizational, project, and time management skills

We are committed to providing an environment of mutual respect where equal employment opportunities (EEO) are available to all employees and applicants without regard to race, color, ancestry, national origin, genetic characteristics, sex, gender identity, gender expression, sexual orientation, marital/parental status, political affiliation, religion, age, disability, pregnancy, childbirth, breastfeeding, or veteran status. In addition to federal law requirements, we comply with applicable state and local laws governing nondiscrimination in employment. We are committed to workplace policies and hiring practices that comply with federal, state, and local law. We are interested in hiring qualified candidates who are eligible and authorized to work in the United States. At this time, we are not able to sponsor visas.
As a result, we cannot hire applicants who currently require, or will in the future require, immigration sponsorship for work authorization (e.g., an H-1B or F-1 student visa).