Reporting to the Technical Manager, Digital Products, the Data Scientist analyzes structured and unstructured data, models complex problems, and identifies opportunities for process and product optimization by using statistical, algorithmic, mining, and visual techniques. This role develops machine learning (ML) predictive and prescriptive analytics models through the innovative understanding and use of large data sets and the verification of effectiveness to improve clinical processes and patient outcomes. The Data Scientist supports Providence Health Care (PHC) strategic priorities by understanding the clinical, financial, and operational issues to be solved and working closely with stakeholders, clinical and technical experts, and functional teams to leverage knowledge, interpret outputs, deploy solutions, and provide actionable insights. The role is also key to developing a solid and sustainable machine learning foundation and competency for PHC.
Thorough knowledge of the principles, processes, procedures, and methods involved in data mining, data analysis, statistical methods, and machine learning.
Demonstrated skills in AI product design and the analysis of quantitative data for the purpose of creating actionable insights and measurable impact on organizational outcomes.
Proven ability to plan, organize, and coordinate AI product activities.
Demonstrated proficiency using machine learning methods and techniques (including neural networks, reinforcement learning, and adversarial learning) and machine learning software packages, and in manipulating large datasets.
Knowledge of supervised machine learning, decision trees, and logistic regression.
Comprehensive understanding of, and skill in using, statistical and data mining techniques such as GLM/regression, random forests, boosting, trees, text mining, network analysis, simulation, scenario analysis, and clustering analysis.
Demonstrated ability to perform analytical functions and transform database structures including creating datasets and writing computer code to execute complex queries using statistical computer languages such as Python, R, and SQL.
Demonstrated proficiency working with large volumes of data across multiple servers using distributed data/computing tools such as Hadoop, Spark, MySQL, and AWS.
Demonstrated proficiency working with both relational (SQL) and non-relational databases (NoSQL).
Demonstrated understanding of data privacy, security, and related tools such as anonymization and encryption.
Demonstrated ability to use web services such as Redshift, S3, and DigitalOcean.
Demonstrated skills in using data visualization tools (such as Jupyter, Matplotlib, D3, ggplot, Periscope, Business Objects) and in visually presenting complex data to stakeholders for consideration.
Demonstrated skills in knowledge synthesis and translation activities, including working with, sorting, and manipulating unstructured data from different platforms.
Excellent oral and written communication skills and ability to clearly and fluently translate technical findings to non-technical partners and to communicate to multiple audiences using data storytelling and graphics.
Demonstrated ability to work collaboratively in an interdisciplinary environment and to develop recommendations using facilitation and consensus building.
Strong analytical, critical thinking, and evaluation skills to discern and help solve the important problems facing health care, to identify new ways to leverage our data, and to direct efforts in the right direction.
A Master’s degree in Mathematics, Statistics, Computer Science, Engineering, or another quantitative field is required, plus five (5) to seven (7) years’ experience working with large datasets and machine learning models, including experience using statistical and data mining techniques and distributed data/computing tools; writing computer code; querying databases; and using statistical computer languages.
1. Transforms data into critical information and knowledge by working with clinical management and staff, project/program managers, and members of the health informatics team to develop and implement ML models. Uses these advanced ML models to identify patterns, trends, and opportunities to make predictions or reduce workload that will have a significant impact across various clinical domains within PHC.
2. Identifies, cleans, and integrates large sets of structured and unstructured datasets from disparate sources for use in ML models and products. Enhances data collection procedures to include information that is relevant for building advanced ML models. Provides input to applications, databases, and systems used to assess study data quality.
3. Uses advanced ML processes to convert data from non-functional forms, such as scanned image text, to functional forms ready for use in further ML models.
4. Develops predictive and prescriptive analytic models in support of the organization’s clinical and business initiatives and priorities by applying advanced statistical and computational methods and innovative use of data, collaborating with and guiding Developers in the construction of analytic models, and maintaining detailed project status plans to achieve ML development cycle timelines and avoid development delays.
5. Reviews clinical data at aggregate levels on a regular basis using analytical reporting tools to support the identification of risks and data patterns or trends. Creates analytical reports and presentations to facilitate review and adoption of data-driven choices. Collaborates with project/program teams to address data-related questions and to recommend potential solutions.
6. Makes recommendations to management regarding strategic actions to maintain the ML development pipeline, analytic architectures, and life cycle, to avoid potential negative consequences and system failures, and to increase the positive impacts of ML systems.
7. Works closely with clinical and management teams across PHC to strategize, develop, and implement artificial intelligence (AI) products that translate into improved quality of care, clinical outcomes, reduced costs, temporal efficiencies, and process improvements.
8. Identifies, engages, and collaborates with specific stakeholders as required for the development of AI products designed around PHC’s strategic priorities and clinical/business problems. Assesses and implements improvements to AI products as needed and creates anomaly detection systems to track performance and data accuracy.
9. Communicates analytic solutions to management and shares AI product status throughout the various stages of the product lifecycle.
10. Supports management in the development of strategies for scaling successful projects across the organization based on feedback from clinical/business clients and end-users by maintaining project and other documentation, reviewing findings, and presenting analysis and actionable insights for further discussion and decision.
11. Works to foster and develop a solid and sustainable machine learning foundation and competency for PHC. Assists management with the dissemination of successes and failures in an effort to increase analytics literacy and adoption across PHC.
12. Keeps up to date with the latest technology trends and methods by following state-of-the-art literature in the fields of operations research, statistical modeling, statistical process control, and mathematical optimization.
13. Performs other related duties as required.