Overview

Data science is concerned with analyzing and reporting on many kinds of data, including structured data stored in organizational databases and unstructured data that is often text-rich and not collected according to a particular data model. Work in this field requires specialized techniques and tools that draw on both statistical and computational methods to address complex real-world problems, and it employs multidisciplinary analytics to derive knowledge from large data sources (big data).

The following Data Science webinar series provides an introduction to this rapidly growing field, with a particular focus on machine learning methods and analytic techniques that can serve the needs of health and environmental researchers working to understand trends in society, health, and human behavior.

The presentations are intended for those who are interested in a broad overview of basic data science analytics. The sessions will benefit health and environmental researchers, analysts, and related professionals who want an introduction to data science approaches for data analytics using R software (Python code will also be provided). The webinar series includes four modules, each with an introductory session and a practicum session. Each module focuses on the application of specific machine learning methods and analytic techniques. General formulas are presented, but the sessions do not delve into the underlying statistical theory.

Requirements

To benefit from the webinar presentations, registrants should have knowledge of simple and multiple linear regression models and categorical data analysis such as logistic regression.

No prior working knowledge of R or Python is required, but some familiarity with R would be beneficial for following the practicum sessions.

To supplement this series, you may wish to review our new free online resource: Data Management and Cleaning for Analysis with R software.

Webinar module resources

All modules include presentation recordings, slide decks, training data, R and Python code, and related references for further reading/study.

Module format

Each module includes two webinar presentations:

  • Session 1: A one-hour introductory presentation
  • Session 2: A two-hour practicum session focused on applied analytics, using training data with R code and supplementary Python code