## It's a challenging time
* increasing anthropogenic pressure on biosphere
* decreasing funding and anti-science sentiment
* complex scientific questions
## It's a challenging time
* collaboration faster and easier
* computing more powerful
* larger datasets easier to obtain and analyze
## It's a challenging time
* new skills needed
* Scientists recognize they lack bioinformatics skills and want to learn them
![](img/embl_poll.png)
## It's a challenging time
* new skills needed
* currently self-taught or treated as "born with it"
* yet these skills can greatly accelerate research
## How to address this disconnect?
![](img/frustration1.png)
## How to address this disconnect?
![](img/happy.jpg)
## What to learn?
- How to collect data and record metadata?
- How to obtain, format, and combine large and heterogeneous data?
- How to document analysis for reproducibility?
- How and where to publish data and analysis alongside the manuscript?
## Who should learn?
* Students
* Collection managers
* PIs
* IT staff
## Why learn these skills?
* Accelerate science
* More robust results
* Data management skills
* Data re-use (citations)
## Why learn these skills?
![](img/automation.png)
## Where to learn these skills?
* Field to Database
* Data Carpentry
* Data sharing, data standards, and demystifying the IPT
* Managing Natural History Collections Data for Global Discoverability (upcoming)
* Reproducible Science (upcoming)
Good quality data starts in the field
![](img/field.jpg)
How to clean up your data?
![](img/cleanup.jpg)
How to obtain publicly available data? (API)
![](img/api.png)
Once the data is clean, it can be published
![](img/publishing.jpg)
How to use spreadsheets efficiently?
Basics of working with data (R or Python)
* Data visualization
* or other topics: text mining, HDF5, etc.
* Sister organization of Software Carpentry
* No programming knowledge required
* Request a workshop at your institution!
What is reproducibility? Why do you/we need it?
How to organize your research projects to make them reproducible?
Getting started with literate programming
How to automate your workflow?
How and where to publish your research artifacts?
## The workshop model
* pre- and post-workshop survey + feedback
* good planning
* competent instructors and helpers
* testing of software/material/exercises in advance
* teaching material open and easily accessible via GitHub
* **CODE OF CONDUCT**