Staff Data Platform Engineer - GenAI (remote) at The Hartford #vacancy #remote

Staff Data Engineer – GE07CE

We’re determined to make a difference and are proud to be an insurance company that goes well beyond coverages and policies. Working here means having every opportunity to achieve your goals – and to help others accomplish theirs, too. Join our team as we help shape the future.

At The Hartford, we are seeking a GenAI Data Engineer responsible for building fault-tolerant infrastructure to support Generative AI applications, and for designing, developing, and deploying data pipelines that solve complex problems and drive innovation at scale. You will help bring the transformative power of Generative AI to re-imagine the “art of the possible”, serve our internal customers, and transform the business.

We are founding a dedicated Generative AI platform engineering team to build our internal developer platform, and we are looking for an experienced Staff Data Platform Engineer – Generative AI to help us build the foundation of our Generative AI capability. You will work on a wide range of initiatives: building ETL pipelines, training a retrieval re-ranker, working with the DevSecOps team on the CI/CD pipeline, designing Generative AI infrastructure that conforms to our strict security standards and guardrails, or partnering with the data science team to improve the accuracy of LLM-based applications. This role requires versatility and expertise across a wide range of skills; an engineer at heart with a diverse background will fit in seamlessly. The Generative AI team comprises multiple cross-functional groups that work in unison to ensure a sound move from our research activities to scalable solutions. You will collaborate closely with our cloud, security, infrastructure, enterprise architecture, and data science teams to conceive and execute essential functionality.

This role can have a Hybrid or Remote work arrangement. Candidates who live near one of our office locations will be expected to work in an office 3 days a week (Tuesday through Thursday). Candidates who do not live near an office will have a remote work arrangement, with the expectation of coming into an office as business needs arise. Candidates must be eligible to work in the US without sponsorship now or in the future.

Responsibilities:
- Design and build fault-tolerant infrastructure to support the Generative AI reference architectures (RAG, summarization, agents, etc.).
- Ensure code is delivered without vulnerabilities by enforcing engineering practices, code scanning, etc.
- Build and maintain IaC (Terraform/CloudFormation) and CI/CD tooling (Jenkins, CodePipeline, uDeploy, and GitHub Actions).
- Partner with our shared-service teams (Architecture, Cloud, Security, etc.) to design and implement platform solutions.
- Collaborate with the data science team to develop a self-service internal developer Generative AI platform.
- Design and build the data ingestion pipeline for fine-tuning LLMs (a minimal sketch follows this list).
- Create templates (Architecture as Code) implementing the reference architecture’s application topology.
- Build a feedback system using human-in-the-loop (HITL) review for supervised fine-tuning.
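To give a flavor of the ingestion work described above, here is a minimal sketch of a batch pipeline that normalizes raw documents into JSONL fine-tuning records. It is illustrative only: the directory layout, the `text` record field, and the `clean_text`/`chunk` heuristics are assumptions for this example, not The Hartford’s actual pipeline, and the record schema would depend on the training framework used.

```python
import json
import re
from pathlib import Path

def clean_text(text: str) -> str:
    """Strip control characters and collapse whitespace (assumed heuristic)."""
    text = re.sub(r"[\x00-\x08\x0b-\x1f]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

def chunk(text: str, max_chars: int = 2000, overlap: int = 200):
    """Yield fixed-size character windows with overlap so context spans chunk edges."""
    step = max_chars - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        yield text[start:start + max_chars]

def build_dataset(raw_dir: Path, out_path: Path) -> int:
    """Convert every .txt file under raw_dir into JSONL fine-tuning records."""
    n = 0
    with out_path.open("w", encoding="utf-8") as out:
        for doc in sorted(raw_dir.glob("*.txt")):
            body = clean_text(doc.read_text(encoding="utf-8", errors="replace"))
            for piece in chunk(body):
                # Hypothetical record shape; real schemas depend on the target model.
                out.write(json.dumps({"text": piece}, ensure_ascii=False) + "\n")
                n += 1
    return n

if __name__ == "__main__":
    count = build_dataset(Path("raw_docs"), Path("train.jsonl"))
    print(f"wrote {count} records")
```

In a production setting the same shape would typically run as a Lambda or Step Functions workflow reading from S3 rather than a local directory.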
Qualifications:
- Bachelor’s degree in Computer Science, Computer Engineering, or a related technical field.
- 4+ years of experience with the AWS cloud.
- 8+ years of experience designing and building data-intensive solutions using distributed computing.
- 8+ years building and shipping software and/or platform infrastructure solutions for enterprises.
- Extensive programming experience with Python and Java.
- Experience with CI/CD pipelines, automated testing, automated deployments, Agile methodologies, and unit and integration testing tools.
- Experience building scalable serverless applications (real-time/batch) on the AWS stack (Lambda + Step Functions).
- Knowledge of distributed NoSQL database systems.

Preferred Qualifications:
- Proficiency in embeddings, ANN/KNN search, vector stores, database optimization, and performance tuning (a brute-force retrieval sketch appears at the end of this posting).
- Experience with LLM orchestration frameworks such as LangChain and LlamaIndex.
- Foundational understanding of Natural Language Processing and Deep Learning.
- Experience with HPC, vector embeddings, and hybrid/semantic search technologies.
- Experience with data engineering, ETL technology, and conversational UX is a plus.
- Experience with AWS OpenSearch, Step Functions/Lambda, SageMaker, API Gateway, and ECS/Docker.
- Proficiency in customization techniques across the stages of the RAG pipeline, including model fine-tuning, retrieval re-ranking, and hierarchical navigable small world (HNSW) graphs.
- Excellent problem-solving and communication skills and the ability to work in a collaborative team environment.

Compensation
The listed annualized base pay range is primarily based on analysis of similar positions in the external market. Actual base pay could vary and may be above or below the listed range based on factors including, but not limited to, performance, proficiency, and demonstration of the competencies required for the role. Base pay is just one component of The Hartford’s total compensation package for employees. Other rewards may include short-term or annual bonuses, long-term incentives, and on-the-spot recognition. The annualized base pay range for this role is $123,280 – $184,920.

Equal Opportunity Employer/Females/Minorities/Veterans/Disability/Sexual Orientation/Gender Identity or Expression/Religion/Age
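As a small illustration of the embeddings, KNN, and retrieval re-ranking skills listed under Preferred Qualifications, the sketch below performs brute-force cosine-similarity search over an in-memory vector store and then re-ranks the top hits with a keyword-overlap score. Everything here is a placeholder: the `embed` function is a toy hashed bag-of-words, and the corpus and scoring are invented for the example. A production system would use a real embedding model and an ANN index such as OpenSearch’s HNSW support.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Placeholder embedding: hashed bag-of-words. A real system would call a model."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

def knn(query_vec: np.ndarray, index: np.ndarray, k: int = 5) -> list[int]:
    """Brute-force cosine KNN; rows of `index` are unit vectors, so dot = cosine."""
    scores = index @ query_vec
    return list(np.argsort(scores)[::-1][:k])

def rerank(query: str, docs: list[str], candidates: list[int]) -> list[int]:
    """Toy re-ranker: order candidates by keyword overlap with the query."""
    q_tokens = set(query.lower().split())
    return sorted(candidates,
                  key=lambda i: len(q_tokens & set(docs[i].lower().split())),
                  reverse=True)

if __name__ == "__main__":
    docs = [
        "auto policy deductible and coverage limits",
        "home insurance claim process overview",
        "generative ai platform engineering at scale",
        "retrieval augmented generation with vector search",
    ]
    index = np.stack([embed(d) for d in docs])
    query = "vector search for retrieval augmented generation"
    hits = knn(embed(query), index, k=3)
    print([docs[i] for i in rerank(query, docs, hits)])
```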

GenAI Natural language processing (NLP) Agile CI/CD Python Amazon Web Services (AWS) Data Engineering devsecops automated testing amazon-ecs deep-learning Docker hpc Java ETL NoSQL Retrieval-Augmented Generation (RAG)
