I’ve been using Python with pandas on and off to automate analysis of reading-test data at the non-profit where I work. If you’re regularly analysing data – even just running means and standard deviations on Excel files – you can save huge amounts of time and frustration by automating things with Python.
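To give a flavour of what that looks like, here is a minimal sketch of the means-and-standard-deviations case. The column names and scores are invented for illustration, and the data is typed in by hand so the snippet runs on its own; in practice you would probably load a spreadsheet with `pd.read_excel` instead (the file name in the comment is hypothetical).

```python
import pandas as pd

# Made-up reading-test scores. With a real spreadsheet this might instead be:
#   scores = pd.read_excel("reading_scores.xlsx")  # hypothetical file name
scores = pd.DataFrame(
    {
        "student": ["A", "B", "C", "D"],
        "year_group": [3, 3, 4, 4],
        "score": [12, 18, 20, 30],
    }
)

# Mean and sample standard deviation of the score within each year group.
summary = scores.groupby("year_group")["score"].agg(["mean", "std"])
print(summary)
```

Once this is in a script, rerunning the analysis on next term's file is a matter of changing the input, which is where the time saving comes from.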
These are (amazing, free) resources for pandas, most of which I came across via Tom Augspurger’s site. I’ve dipped in and out of them, and I would have saved a lot of time if I’d found them earlier and been a bit more systematic about learning the basics of pandas.
Some familiarity with Python or another programming language is a big help before using these resources (PY4E or Think Python is enough).
The list:
- Wes McKinney’s Python for Data Analysis, 3rd ed., open access version
- Greg Reda’s Intro to Pandas Data Structures
- Tom Augspurger’s Modern Pandas
- Easier data analysis in Python with pandas (video series)
- Think Stats (Allen Downey)
- DataCarpentry.org Ecology Workshop (this seems to be the best one – a good place to start for thinking about what good data looks like and how to format and handle it)