Analytics Engineer at Datumo #vacancy #remote

Datumo specializes in providing Data Engineering and Cloud Computing consulting services to clients from all over the world, primarily in Western Europe, Poland and the USA. Core industries we support include e-commerce, telecommunications and life sciences. Our team consists of exceptional people whose commitment allows us to deliver highly demanding projects.

Our team members tend to stick around for more than 3 years, and when a project wraps up, we don’t let them go – we embark on a journey to discover exciting new challenges for them. It’s not just a workplace; it’s a community that grows together! 

What we expect: 

Must-have: 

● at least 3 years of commercial experience in programming

● proven track record with a cloud provider (GCP preferred, Azure or AWS)

● good knowledge of SQL and a JVM language (Scala or Java)

● good knowledge of Python

● data modeling and data storage experience 

● ensuring solution quality through automated tests, CI/CD and code review

● proven experience collaborating with business stakeholders

● English proficiency at B2 level, communicative in Polish 

Nice to have: 

● experience with Redshift/BigQuery/Snowflake/Databricks or similar

● knowledge of dbt, Docker and Kubernetes, Apache Kafka 

● familiarity with Airflow or similar pipeline orchestrator 

● another JVM programming language (Java/Scala/Kotlin)

● experience in Machine Learning projects 

● experience with Flink

● understanding of Spark or similar distributed data processing framework

● willingness to share knowledge (conferences, articles, open-source projects) 

What’s on offer: 

● 100% remote work, with workation opportunity 

● 20 days off

● onboarding with a dedicated mentor

● project switching possible after a certain period 

● individual budget for training and conferences 

● benefits: Medicover private medical care, co-financing of the Medicover Sport card

● opportunity to learn English with a native speaker 

● regular company trips and informal get-togethers 

Development opportunities in Datumo: 

● participation in industry conferences 

● establishing Datumo’s online brand presence 

● support in obtaining certifications (e.g. GCP, Azure, Snowflake) 

● involvement in internal initiatives, like building technological roadmaps

● training budget 

● access to internal technological training repositories 

Discover our example projects:

Cost optimization on Snowflake data platform

Datumo optimized a Snowflake-based platform for a pharmaceutical company, aiming to reduce costs and improve ELT processes. Before we stepped in, the client had to manage 1 petabyte of data across 200 tables. The platform was orchestrated with Airflow, with Python scripts extracting data as full snapshots containing hundreds of millions of records. Strategic use of deltas, external tables, and shorter time travel periods led to an almost 50% cut in storage volume.

Analytics engineering on Google Cloud Platform

The project involves creating and improving data pipelines on Google Cloud Platform (GCP) to support analytics and data science teams. The objective is to optimize data workflows using Cloud Composer (Apache Airflow) for scheduling, BigQuery for warehousing, and Dataproc (Apache Spark) for processing. Key responsibilities include optimizing SQL queries for better performance, developing internal libraries to streamline tasks, and advocating for data processing best practices. The project also offers opportunities to progress into data science or MLOps.

Recruitment process: 

● Quiz – 15 minutes 

● Soft skills interview – 30 minutes

● Technical interview – 60 minutes 

Find out more by visiting our website.

If you like what we do and you dream about creating this world with us – don’t wait, apply now!

