2015-07-08
Requested retinal wave data from colleagues.
Beta 2 knockout mouse was a key focus.
Handled wide range of data formats.
Determined what meta data is crucial.
Converted to a common format . . .
Native compression
All meta data contained within the file.
h5ls -r ~/proj/carmen/waverepo/waverepo/hdf5/Wong1993/Wong1993_P0.h5
/                        Group
/array                   Dataset {1}
/epos                    Dataset {2, 39}
/meta                    Group
/meta/age                Dataset {1}
/meta/key                Dataset {1}
/meta/species            Dataset {1}
/names                   Dataset {39}
/sCount                  Dataset {39}
/spikes                  Dataset {13336}
/summary                 Group
/summary/N               Dataset {1}
/summary/duration        Dataset {1}
/summary/frate           Dataset {39}
/summary/totalspikes     Dataset {1}
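The listing shows 13336 spike times in /spikes and 39 entries in /sCount, one per unit. A minimal sketch of how such a layout could be unpacked, assuming (this is not stated on the slide) that /spikes is the concatenation of the per-unit spike trains and /sCount gives the length of each train. The function name split_spikes and the toy values are hypothetical, and plain Python lists stand in for the HDF5 datasets.

```python
from itertools import accumulate


def split_spikes(spikes, s_count):
    """Split a flat spike-time vector into one list per unit.

    Assumption: spikes holds the trains back to back, and s_count[i]
    is the number of spikes belonging to unit i.
    """
    if sum(s_count) != len(spikes):
        raise ValueError("sCount does not add up to the number of spikes")
    ends = list(accumulate(s_count))   # index one past the end of each train
    starts = [0] + ends[:-1]           # index of the start of each train
    return [spikes[a:b] for a, b in zip(starts, ends)]


# Toy values for illustration only, not taken from any real recording.
spikes = [0.1, 0.5, 2.0, 0.3, 1.2, 4.0]
s_count = [3, 0, 2, 1]
trains = split_spikes(spikes, s_count)
print(trains)
```

Storing one flat vector plus counts keeps the file layout simple even when units have very different numbers of spikes, at the cost of a small unpacking step like this one on the reader's side.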
Most tables and graphs in our report were automated and recomputed dynamically, e.g. every time the database extended.
Provided all code and data online on our web page together with PDF.
Cook for one month…
(Aside: reviews were signed and are now public). http://www.gigasciencejournal.com/content/3/1/3/prepub
From: http://www.gigasciencejournal.com/manuscript/review/attachment/pdf/6813937211125312.pdf
I would use an ordinate log scale for this bottom right panel (as done in Fig. 3). But since the authors gave me everything, I can do it! by redefining fourplot as follows:
The journal was keen to see reproducible research (RR), so they wrote a press release. A Nature Neuroscience podcast followed a few weeks later.
(Hint: contact http://www.communications.cam.ac.uk/)
Since Oct 2014, Nature journals wish to see code relating to papers.
We are now working with Nature Neuroscience on how to check this.
Started 2013 with reproducibility projects in Psychology and Cancer Biology.
From its DOI we get to OSF storage.
e.g. Badge earners in Psychological Science.
I try to write my papers now in this format, so that I can bundle data and code with manuscript.
Docker also makes it easy to test on fresh systems, and for others to test:
docker run -d -p 8787:8787 sje30/waverepo
open http://192.168.59.103:8787/
Login with "rstudio" as username and password. Open waverepo/paper/waverepo_paper.Rnw
and hit the "Compile PDF" button.
Does anyone (apart from me) have Docker installed on their laptop and want to try a live demo?
GUI available via https://kitematic.com/
People were willing to give away their published data.
Only one group could not find data to share.
Test sets for data are important. How do you know you've got it right?
Data papers can be well-received. Media attention.
Reproducible research takes a bit longer in the short-term, but should benefit in long-term.
Designing meta data is hard. This version was deliberately minimal and lasted approximately two months.
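One way to approach "how do you know you've got it right?" is to run cheap consistency checks on every converted file. The sketch below is illustrative only and is not the checks used for the repository: a plain dict stands in for one HDF5 file, the field names (names, sCount, spikes, meta/age, meta/key, meta/species) follow the h5ls listing shown earlier, and the function name check_record and the toy records are hypothetical.

```python
REQUIRED_META = ("age", "key", "species")  # meta fields seen in the h5ls listing


def check_record(record):
    """Return a list of problems found in one converted recording (empty if none)."""
    problems = []
    n_units = len(record["names"])
    if len(record["sCount"]) != n_units:
        problems.append("sCount length differs from number of names")
    if sum(record["sCount"]) != len(record["spikes"]):
        problems.append("sCount does not sum to number of spikes")
    for field in REQUIRED_META:
        if field not in record.get("meta", {}):
            problems.append("missing meta field: " + field)
    return problems


# Toy records for illustration only.
good = {
    "names": ["ch_1", "ch_2"],
    "sCount": [2, 1],
    "spikes": [0.5, 1.5, 0.7],
    "meta": {"age": 0, "key": "toy", "species": "mouse"},
}
bad = {
    "names": ["ch_1", "ch_2"],
    "sCount": [2, 2],
    "spikes": [0.5, 1.5, 0.7],
    "meta": {"age": 0},
}
good_problems = check_record(good)
bad_problems = check_record(bad)
print(good_problems)
print(bad_problems)
```

Returning a list of problems rather than raising on the first one makes it easy to report every issue in a newly contributed file in a single pass.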
… including these slides. Made with markdown in R. You can grab the source from
https://github.com/bioinformatics-core-shared-training/rep-research-masterclass
and regenerate them in R with:
Rscript -e 'rmarkdown::render("neurocase1.Rmd")'
or see the Makefile rule to regenerate.
Building repositories allows for reproducible research.
Data papers are important and useful.
Embedding meta data in files is the future.
More generally:
Share your data
Share your code
Share your papers http://biorxiv.org
CARMEN project: (Evelyne Sernagor, Jennifer Simonotto, Mike Weeks, Mark Jessop, Tom Jackson)
Waverepo data providers.
Malin Sandström (INCF).
Ben Marwick (Docker)
Wellcome Trust, EPSRC, BBSRC.