(Remote) Sr Site Reliability Engineer at First American #vacancy #remote

Who We Are

Join a team that puts its People First! Since 1889, First American (NYSE: FAF) has held an unwavering belief in its people. They are passionate about what they do, and we are equally passionate about fostering an environment where all feel welcome, supported, and empowered to be innovative and reach their full potential. Our inclusive, people-first culture has earned our company numerous accolades, including being named to the Fortune 100 Best Companies to Work For® list for eight consecutive years. We have also earned awards as a best place to work for women, diversity and LGBTQ+ employees, and have been included on more than 50 regional best places to work lists. First American will always strive to be a great place to work, for all. For more information, please visit

What We Do

We are looking for a Senior Site Reliability Engineer to support the reliability of First American’s mission-critical software systems. This transformative role involves automating IT infrastructure tasks and driving SRE best practices, tools, and processes. The ideal candidate should exhibit a growth mindset and proactively monitor and respond to incidents for optimal user experience.

What You’ll Do

  • Maintain and improve reliability of core software systems.
  • Prioritize customer satisfaction in all efforts.
  • Continuously learn and adapt to new technologies and methodologies.
  • Collaborate effectively with stakeholders and other engineers.
  • Quickly respond to changes and resolve issues.
  • Take accountability for issue resolution and prevention.
  • Utilize automation tools to streamline processes and minimize manual intervention.

What You’ll Bring (At least 5-7 years’ experience)

  • Bachelor’s degree in Computer Science, Information Technology, or equivalent education and experience.
  • Expertise in application performance monitoring, observability, and proactive alert correlation, including monitoring containers and failure-based alerting.
  • Skilled in defining service level objectives, measuring service level indicators, and setting up error budgets.
  • Strong understanding of SRE practices: incident response, change/release management, capacity planning, infrastructure automation, elastic environments, chaos engineering, and blameless postmortems.
  • Successful in improving CI/CD pipelines and build/release processes.
  • Experienced in creating an SRE adoption framework and onboarding procedure.

Technology Stack

  • Cloud Computing Platform: AWS (Lambda, EC2, ECS, EKS, Fargate, RDS, S3, DynamoDB, SQS)
  • Monitoring and Logging Tool(s): AppDynamics, Splunk, ELK Stack, Datadog, Prometheus, AWS CloudWatch/X-Ray
  • Networking Technology: Protocols, Load Balancers, Firewalls
  • Programming: C# .NET, PowerShell, Python, YAML
  • Code Repos: Azure Repos, GitHub
  • Infrastructure as Code: Terraform, Ansible
  • Automation Tools: Jenkins, Chef, Puppet

Pay Range: $87,945 – $182,655 Annually

This hiring range is a reasonable estimate of the base pay range for this position at the time of posting. Pay is based on a number of factors which may include job-related knowledge, skills, experience, business requirements and geographic location.

What We Offer

By choice, we don’t simply accept individuality – we embrace it, we support it, and we thrive on it! Our People First Culture celebrates diversity, equity and inclusion not simply because it’s the right thing to do, but also because it’s the key to our success. We are proud to foster an authentic and inclusive workplace For All. You are free and encouraged to bring your entire, unique self to work. First American is an equal opportunity employer in every sense of the term.

Based on eligibility, First American offers a comprehensive benefits package including medical, dental, vision, 401k, PTO/paid sick leave and other great benefits like an employee stock purchase plan.

