Data Engineer

Location: Hartford, CT or Remote
Duration: Contract
Rate: DOE

US Citizens, GC, EAD (H4, L2), and E3/TN visa holders preferred; NO third-party corp-to-corp accepted for this job.

Required skills:
- Strong experience designing and developing data pipelines for data ingestion and transformation using Spark (illustrated in the first sketch after this list).
- Distributed computing experience using PySpark or Python.
- Good understanding of the Spark framework and Spark architecture.
- Experience working in cloud-based big data infrastructure.
- Excellent at troubleshooting performance and data skew issues (see the skew-mitigation sketch after this list).
- Good understanding of Spark runtime metrics and the ability to tune applications based on them.
- Deep knowledge of partitioning and bucketing concepts for data ingestion.
- Good understanding of AWS services such as Glue, Athena, S3, Lambda, and CloudFormation.
- Working knowledge of data lake ETL implementation using AWS Glue, Databricks, etc. preferred (see the Glue job skeleton after this list).
- Experience with data modeling techniques for cloud data stores and on-prem databases such as Teradata and Teradata Vantage (TDV).
- Working experience in ETL development on Teradata Vantage and in data migration from on-prem systems to Teradata Vantage preferred.
- Proficiency in SQL, relational and non-relational databases, query optimization, and data modeling.
- Experience with source code control systems such as GitLab.
- Experience with large-scale distributed relational and NoSQL database systems.
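
As a rough illustration of the Spark pipeline, partitioning, and bucketing skills listed above, here is a minimal PySpark sketch; the S3 path, table name, bucket count, and column names (event_ts, user_id, event_date) are hypothetical and not part of any actual project codebase.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("ingest-events").getOrCreate()

# Ingest raw JSON and derive a date column to partition on.
raw = spark.read.json("s3://example-bucket/raw/events/")  # hypothetical path
events = raw.withColumn("event_date", F.to_date("event_ts"))

# partitionBy lets queries prune scans by date; bucketBy(32, "user_id")
# co-locates rows that are frequently joined or aggregated by user.
# Note: bucketBy requires writing through saveAsTable.
(events.write
    .partitionBy("event_date")
    .bucketBy(32, "user_id")
    .sortBy("user_id")
    .mode("overwrite")
    .saveAsTable("analytics.events"))  # placeholder database.table
```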
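For the skew-troubleshooting bullet, a sketch of two common mitigations: enabling Adaptive Query Execution's skew-join handling (real Spark 3.x configuration keys) and manually salting a hot join key. The DataFrames and key values here are made up for the example.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("skew-demo").getOrCreate()

# Spark 3.x: let Adaptive Query Execution split oversized skewed partitions.
spark.conf.set("spark.sql.adaptive.enabled", "true")
spark.conf.set("spark.sql.adaptive.skewJoin.enabled", "true")

# Manual fallback: salt a hot join key so a single task no longer owns it.
big_df = spark.range(1_000_000).withColumn("key", F.lit("hot"))  # skewed side
small_df = spark.createDataFrame([("hot", "dim-value")], ["key", "attr"])

SALT = 16
big = big_df.withColumn("salt", (F.rand() * SALT).cast("int"))
small = small_df.withColumn(
    "salt", F.explode(F.array(*[F.lit(i) for i in range(SALT)])))
joined = big.join(small, on=["key", "salt"]).drop("salt")
```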
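And for the AWS Glue bullet, the standard skeleton of a Glue PySpark job script; the catalog database, table name, partition column, and S3 output path are placeholders.

```python
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a Glue Data Catalog table, deduplicate, land curated Parquet on S3.
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="events")       # placeholder names
df = dyf.toDF().dropDuplicates()
df.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/events/")        # placeholder path

job.commit()
```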
Technologies: PySpark, Python, AWS services, Teradata Vantage, CI/CD technologies, Terraform, SQL