Data Scientist at BlueConduit

About BlueConduit:

BlueConduit was founded in 2019 in response to the Flint water crisis. First in Flint, and then across the country, we pioneered the predictive modeling approach to lead service line identification and replacement, accelerating the removal of this significant health concern and saving communities millions of dollars in avoided digs. We are passionate about using data science and AI for public good, improving social equity, and protecting the environment, and are now working in new ways to help serve communities’ needs outside lead service line replacement. We are a remote-first company with opportunities to co-work in offices in Brooklyn, NY and Ann Arbor, MI.

About the Role:

BlueConduit is hiring a Mid-Level Data Scientist to play a key role in developing and implementing advanced predictive models and decision optimization algorithms that solve our municipal customers’ critical infrastructure issues. Leveraging your expertise in data analysis, statistical modeling, and machine learning, you will collaborate with our team to (1) deliver customers actionable insights, and (2) build the engine that will drive our software products. This role will report to the VP of Data Science.

Key Responsibilities:

Collaborate closely with BlueConduit’s Data Scientists and coordinate with our Software Engineers to continuously improve the software that automates workflows and the “human in the loop” processes used to deliver results to customers
Implement our predictive models and decision optimization algorithms using our proprietary data science and machine learning processes
Work to improve model performance and interpretability for each customer.
Collaborate with BlueConduit’s Customer Success team (Solution Architects and Project Managers) to deliver model results into customers’ ESRI ArcGIS systems, which includes configuring internal tools/code and presenting explainable model insights with data visualization
Understand customers’ needs and requirements, and be able to effectively communicate our processes, requirements and recommendations to non-technical audiences (internal and external).
Stay current with the latest advancements in data science, machine learning, and predictive modeling techniques.
Actively engage in R&D to identify how to continue to scale the impact of BlueConduit’s data science work

Qualifications:

Curiosity to learn and desire to value the human side of data science
Passion for data science for social good and environmental justice
Experience building and improving machine learning models and data pipelines
Experience writing reproducible data science code that has been the basis for production code in software products
Experience with building human-in-the-loop data science processes that blend automate-able and difficult-to-automate tasks
Customer-centric and service-driven around timeliness, attention to detail and quality
Bachelor’s or Master’s degree in Computer Science, Statistics, Mathematics, or related field.
5+ years of post-graduation experience in data analysis, statistical modeling, and machine learning.
Proficiency and substantial experience in Python, especially pandas, scikit-learn, PySpark, and NumPy.
Proficiency in Git workflow
Solid understanding of machine learning and statistical models and techniques, including validation and evaluation of model performance.
Experience with issues related to modeling (e.g., selection biases, causal inference)
Experience working with messy data, iterating with clients on a shared dataset
Experience with geospatial data models and visualization tools such as ESRI.
Strong problem-solving skills and ability to work independently or collaboratively in a team environment.
Excellent data visualization and communication skills with the ability to explain complex technical concepts to non-technical customers.
Previous experience working in either a predictive modeling-/analytics-focused company or a software-as-a-service start-up company.

Preferred Qualifications:

Experience with data models and pipelines in Databricks or other cloud providers.
Experience with building production-level ML pipelines from scratch using PySpark, or other Spark or similar frameworks.
Experience training, evaluating, implementing, and communicating results of bespoke models in a client-facing context.
Previous experience working directly with customers.
Previous data science or analytics role at a software-as-a-service company

Nice-to-have Qualifications:

Experience using cloud computing systems to scale data science capabilities.
Experience with Agile product development (e.g., sprints, standups, scrums)
Familiarity with infrastructure, water quality, or government data

Location: Remote

Compensation:

Salary range: $125K-$145K, depending on level of experience
Stock options
Health benefits
Simple IRA benefit
Co-working space/work place stipend

