Learn to explore, visualize, and analyze data to understand natural phenomena, investigate patterns, model outcomes, and make predictions, all in a reproducible and shareable manner. Gain experience in data wrangling and munging, exploratory data analysis, predictive modeling, data visualization, and effective communication of results. Work on problems and case studies inspired by and based on real-world questions and data. The course will focus on the R statistical computing language.
A practical introduction to statistical programming focusing on the R programming language. Students will engage with the programming challenges inherent in the various stages of modern statistical analyses including everything from data collection/aggregation/cleaning to visualization and exploratory analysis to statistical model building and evaluation. This course places an emphasis on modern approaches/best practices for programming including: source control, collaborative coding, literate and reproducible programming, and distributed and multicore computing.
Statistical programming, computation using selected languages and environments (Python, R, MATLAB, and/or C/C++) and interfaces with custom code development for statistical models. Best practices and software development for reproducible results, selecting topics from: use of markup languages, understanding data structures, design of graphics, object-oriented programming, vectorized code, scoping, documenting code, profiling and debugging, building modular code, and version control, all in the context of specific applied statistical analyses.
Fall 2017, 2018
Obtaining and managing data using statistical software. Data visualization and graphics. Special challenges in working with high-dimensional data.
This course introduces students to the field of data science via the R programming environment. Students will learn to manipulate data objects, produce advanced graphics, tidy and wrangle data, and generate reproducible statistical reports using R Markdown. Students will be exposed to the fundamentals of probability and statistical inference and to rudimentary machine learning techniques. Data science ethics and version control will also be introduced.
Summer 2019, 2018, 2017, 2015
A first course in probability and statistics primarily for business majors. Data analysis, probability models, random variables, confidence intervals, and tests of hypotheses with business applications.
Data analysis, probability models, random variables, estimation, tests of hypotheses, confidence intervals, and simple linear regression.
Calculus-based course in probability and statistics. Probability models and random variables. Estimation, confidence intervals, tests of hypotheses, and simple linear regression with applications in the sciences.