
One of the biggest holes in how data science (and computer science) is taught is the lack of vocational courses on how to use data. Most CS programs have a database course that covers the core trade-offs in building a database, but gives only a cursory glance at how to write SQL. Similarly, when teaching how to manipulate data in Python, there is usually only a week or two spent on Pandas, despite it being the most common tool for exploring data.
The notes found here take a fingers-on-the-keys approach to teaching these two topics: a distinctly practical way of learning to use both tools. This course has been taught at the undergraduate, graduate and executive certificate levels in a variety of course structures.
This page contains the most recent version of those notes, the one currently taught to students in the MS-CAPP Program at the University of Chicago.
The data used in this course can be found in the repository here. The table of contents with links to specific chapters can be found below. Please note that this is a work in progress: each time I teach the course, the notes change in both minor and major ways.
You can find a combined PDF with all notes here. Individual chapters can be found below, but links do not currently work in the chapter-specific PDFs.
.Table of Contents
Introduction and Errata | Introduction
Relational Databases
Rows and Columns | Chapter 1
Basic Manipulations | Chapter 2
Subqueries, Distinct & Case | Chapter 3
Database Internals: Transactions | Chapter 4
Aggregations | Chapter 5
Dates and Types | Chapter 6
Averages | Chapter 7
Joins | Chapter 8
Advanced Joins | Chapter 9
Analytic Functions & CTE's | Chapter 10
Database Internals: Performance Evaluation | Chapter 11
Extensions [TBD] | Chapter 12
Interview Hints | Chapter 13
Pandas
Introduction | Chapter 14
More Manipulations and Types | Chapter 15
Aggregations | Chapter 16
Joins | Chapter 17
Window Functions | Chapter 18
Appendix
Data Dictionaries | Appendix A
Connecting SQL to Python or R | Appendix B
Assignments
Example Exams |