These notes take a distinctly practical, hands-on approach to learning both pandas and SQL. While both topics may be touched upon in traditional database courses or introductory data science courses, learning to write efficient, modern code with these tools is often a lower priority than other topics. These notes approach the basics of data management from a different angle: you need to be comfortable manipulating data directly before you can easily internalize additional concepts.
This course has been taught at undergraduate, graduate, and executive certificate levels in a variety of course structures.
This page contains the most recent version of those notes, currently taught to students in the MS-CAPP Program at the University of Chicago.
The data used in this course can be found in the repository here. Below is the table of contents with links to specific chapters. Please note that some sections are works in progress or TBD.
A combined PDF with all the notes is available here. Specific chapters are listed below, but please be aware that links in the chapter-specific PDFs are not currently functional.
Table of Contents
Introduction and Errata
Rows and Columns
Subqueries, Distinct & Case
Database Internals: Transactions
Dates and Types
Analytic Functions & CTE's
Database Internals: Performance Evaluation
More Manipulations and Types
Connecting SQL to Python or R