Choosing the right data file format for numerical computing

Presented by: David Hoese

Thursday, October 10, 2019, 6:00 PM
Madison Public Library, 201 W Mifflin St, Room 301
RSVP on Meetup (Note: This Meetup has already occurred.)

This talk goes over the pros and cons of data file formats common in scientific Python workflows. We'll cover the main concerns when storing data on disk and how popular formats address them. The file formats covered include CSV, flat binary, HDF5, NetCDF4, Parquet/Arrow, and Zarr.
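As a small taste of the trade-offs involved, the sketch below writes the same values as CSV text and as flat binary, then reads both back. It is an illustration only and is not taken from the talk materials: the sample values and variable names are made up, and it uses only the Python standard library rather than the scientific packages the talk covers.

```python
import array
import csv
import io

# Hypothetical sample data: a handful of 64-bit floats.
values = [0.1, 1.0 / 3.0, 2.5, 1e-10, 12345.678]

# CSV: human-readable text, one value per row.
# repr() gives the shortest string that parses back to the same float.
text_buf = io.StringIO()
writer = csv.writer(text_buf)
for v in values:
    writer.writerow([repr(v)])
csv_bytes = text_buf.getvalue().encode("ascii")

# Reading CSV back means parsing every field from text.
reader = csv.reader(io.StringIO(csv_bytes.decode("ascii")))
csv_values = [float(row[0]) for row in reader]

# Flat binary: the raw bytes of each double, with no header at all.
binary_bytes = array.array("d", values).tobytes()

# The reader must already know the element type ("d");
# nothing in the bytes themselves records it.
restored = array.array("d")
restored.frombytes(binary_bytes)
binary_values = restored.tolist()

print("CSV size in bytes:   ", len(csv_bytes))
print("Binary size in bytes:", len(binary_bytes))
```

Both round trips recover the original values, but the CSV reader pays for text parsing while the flat binary reader depends on out-of-band knowledge of the type and layout. Self-describing formats such as HDF5, NetCDF4, and Zarr store that kind of metadata alongside the data.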

Materials from this talk can be found on GitHub, and a live version of the presented notebook is available here.

David Hoese is a software developer at the Space Science and Engineering Center at the University of Wisconsin-Madison. He graduated from UW-Madison with a Bachelor's degree in Computer Engineering. David writes software tools for atmospheric scientists, with a focus on analyzing satellite and ground-based instrument data.