NIX, a global supplier of software engineering and IT outsourcing services, is looking for a Medior Data Engineer (Python) for its office in Budapest (Vaci Greens, 13th district). You'll be part of a team of professionals ready to find the best tailor-made IT solutions for multinational clients in various industries and to solve complex problems.

YOUR RESPONSIBILITIES:
- Collaborate with product owners and team leads to identify, design, and implement new features that support growing data needs.
- Build and maintain an optimal architecture to extract, transform, and load data from a wide variety of data sources, including external APIs, data streams, and data lakes.
- Implement data privacy and data security requirements to ensure solutions stay compliant with security standards and frameworks.
- Monitor and anticipate trends in data engineering and propose changes in alignment with organizational goals and needs.
- Share knowledge with other teams on data engineering and project-related topics.
- Collaborate with the team to decide which tools and strategies to use in specific data integration scenarios.

REQUIREMENTS:
- 3+ years of commercial experience in data engineering.
- Strong programming skills in Python.
- Solid understanding of distributed computing approaches, patterns, and technologies (PySpark).
- Experience working with any cloud platform (GCP, AWS, Azure) and its data-oriented components.
- Proficiency in SQL and query tuning.
- Understanding of data warehousing principles and modeling concepts, including OLTP/OLAP, SCD, (de)normalization, and dimensional (star/snowflake) modeling.
- Expertise with at least one of the listed relational databases (PostgreSQL, MSSQL, or MySQL).
- Experience orchestrating data flows with tools such as Apache Airflow, Prefect, AWS Glue, or Azure Data Factory.
- A team player with excellent collaboration skills.
- Minimum English level B2.

WILL BE A PLUS:
- Expertise in stream processing with current industry standards (e.g., AWS Kinesis, Kafka Streams, Spark/PySpark).
- Expertise in data storage design principles; understanding of the pros and cons of SQL/NoSQL solutions, their types, and configurations (standalone/cluster, column/row-oriented, key-value/document stores).
- Experience building modern data warehouses with Snowflake, Amazon Redshift, or BigQuery.
- Deep knowledge of Spark internals (tuning, query optimization).
- Experience with data integration and business intelligence architecture.
- Experience with data lakes and lakehouses (Azure Data Lake, Apache Hudi, Apache Iceberg, Delta Lake).
- Experience with containerized (Docker, ECS, Kubernetes) or serverless (Lambda) deployment.
- Good knowledge of popular data standards and formats (e.g., JSON, XML, Proto, Parquet, Avro, ORC, etc.).
- Experience with platforms such as Informatica, Databricks, Talend, or Fivetran.
- Experience in data science and machine learning, including building ML models.

WHAT WE OFFER:
- Competitive compensation packages.
- Stable employment, based on a full-time employment contract.
- Private health insurance (Medicover Clinic).
- AYCM sport pass, providing discounts at various sports facilities in Hungary.
- Interesting tasks and diverse opportunities for developing your skills.
- Free training courses, including English.
- Participation in internal and external thematic events and technical conferences.
- A spacious office in the heart of Budapest (13th district).
- All necessary devices and tools for your work.
- Active corporate life.
- A friendly and supportive atmosphere within the team.
- International projects, a mentoring program, and office perks: bike parking, a playroom, showers, free snacks and coffee, and no dress code.
If you feel you’re ready to join the team, apply for this job now! We’re already looking forward to meeting you!