As part of the first annual NGCM Summer Academy, Chris Fonnesbeck of Vanderbilt University and Skipper Seabold of Civis Analytics delivered a two-day training course on the Pandas data analysis library for the Python programming language. Pandas is a Python package designed to handle tabular, matrix and time series data, providing structures and methods which make data analysis faster and more straightforward.
Participants were introduced to techniques for handling and organising data, tools for plotting and visualisation, and methods for statistical modelling in Pandas. Particular focus was given to handling missing data, a common issue in real-world data analysis problems. A more detailed description of the material covered can be viewed on the NGCM website; the complete material is available through GitHub.
Chris Fonnesbeck introduces background material at the beginning of the first day of the Pandas course.
Skipper Seabold discusses plotting and data handling in Pandas on the second day of the course.