General Best Practices: Guides and Primers


Borer, ET, EW Seabloom, MB Jones, and M Schildhauer. 2009. Some simple guidelines for effective data management. Bulletin of the Ecological Society of America, April 2009: 205-214. DOI: 10.1890/0012-9623-90.2.205

White, EP, E Baldridge, ZT Brym, KJ Locey, DJ McGlinn, and SR Supp. 2013. Nine simple ways to make it easier to (re)use your data. Ideas in Ecology and Evolution 6(2): 1-10. DOI: 10.7287/peerj.preprints.7v2

Whitlock, MC. 2011. Data archiving in ecology and evolution: best practices. Trends in Ecology and Evolution 26(2): 61-65. DOI: 10.1016/j.tree.2010.11.006 (Best viewed in a browser other than Chrome)

Cook, RB, RJ Olson, P Kanciruk, and LA Hook. 2001. Best practices for preparing ecological data sets to share and archive. Bulletin of the Ecological Society of America 82(2): 138-141. Available from ORNL as PDF

Strasser, C, R Cook, W Michener, and A Budden. 2012. Primer on Data Management: What You Always Wanted to Know. A DataONE publication, available via the California Digital Library. DOI: 10.5060/D2251G48

Goodman, A, A Pepe, A Blocker, C Borgman, K Cranmer, et al. 2014. Ten Simple Rules for the Care and Feeding of Scientific Data. PLoS Comput Biol 10(4): e1003542. DOI: 10.1371/journal.pcbi.1003542

Strasser, C. 2014. Slides on best practices: best practices tips start at slide No. 19. Available on SlideShare

Spreadsheet Best Practices


Harkins, S. 2011. Five tips for avoiding data entry errors in Excel. TechRepublic, published May 16 2011.

Bewig, PL. 2005. How do you know your spreadsheet is right? Principles, Techniques and Practice of Spreadsheet Style. July 28, 2005. arxiv.org/abs/1301.5878

Excel tips and tricks. Two blog posts from Data Pub, 2012. Part 1 and Part 2

Abandon all hope, ye who enter dates in Excel. Data Pub blog post from Kara Woo, 2014.

Spreadsheet Tools


OpenRefine: openrefine.org

From the website: OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; extending it with web services; and linking it to databases like Freebase.


RightField: rightfield.org

From the website: RightField is an open-source tool for adding ontology term selection to Excel spreadsheets. RightField is used by a 'Template Creator' to create semantically aware Excel spreadsheet templates. The Excel templates are then reused by Scientists to collect and annotate their data; without any need to understand, or even be aware of, RightField or the ontologies used... RightField is a standalone Java application which uses Apache-POI for interacting with Microsoft documents. It enables users to import Excel spreadsheets, or generate new ones from scratch.


CSV Fingerprint: web application and blog post with more information

From ACRL blog dh+lib: Victor Powell has released CSV Fingerprints, a tool for spotting errors in CSV files, such as instances “when the data itself has a comma in it.” As Powell describes, "The idea is to provide a birdseye view of the file without too much distracting detail. The idea is similar to Tufte’s Image Quilts…a qualitative view, as opposed to a rendering of the data in the file themselves. In this sense, the CSV Fingerprint is a sort of meta visualization." Users can use the browser-based software or download the source code from GitHub.
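
The kind of check described above (a stray comma inside the data shifting a row's columns) can be approximated in a few lines of Python. This is only a rough sketch of the general idea, not CSV Fingerprint's own code: the function names `classify` and `fingerprint`, the one-letter cell codes, and the sample data are all assumptions made for illustration.

```python
import csv
import io


def classify(cell):
    """Map a cell to a one-letter code: E = empty, N = number, T = text."""
    stripped = cell.strip()
    if not stripped:
        return "E"
    try:
        float(stripped)
        return "N"
    except ValueError:
        return "T"


def fingerprint(text):
    """Summarize CSV text as (expected_width, [(codes, width_ok), ...]).

    The expected width is taken from the first row (assumed to be the header).
    A row whose field count differs from it is flagged with width_ok = False,
    which is what an unquoted comma inside a value typically causes.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return 0, []
    width = len(rows[0])
    summary = []
    for row in rows:
        codes = "".join(classify(cell) for cell in row)
        summary.append((codes, len(row) == width))
    return width, summary


# Hypothetical sample: the third line has an unquoted comma in the notes field.
sample = "site,count,notes\nA,12,ok\nB,7,wet, muddy\nC,,\n"

width, summary = fingerprint(sample)
for line_number, (codes, width_ok) in enumerate(summary, start=1):
    flag = "" if width_ok else "  <-- unexpected number of fields"
    print(f"{line_number}: {codes}{flag}")
```

Each row is reduced to a short string of type codes, so a misaligned row stands out by its length without the reader having to look at the values themselves. Quoting the affected value ("wet, muddy") in the source file would remove the flag.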