ONLINE COURSE – Reproducible and collaborative data analysis with R (RACR01). This course will be delivered live.
31 October 2022 - 2 November 2022, £350.00
Monday, October 31st, 2022
This is a ‘LIVE COURSE’ – the instructor will deliver lectures and coach attendees through the accompanying computer practicals via video link; a good internet connection is essential.
TIME ZONE – CET – however all sessions will be recorded and made available allowing attendees from different time zones to follow.
Please email firstname.lastname@example.org for full details or to discuss how we can accommodate you.
About This Course
The computational part of a research project is considered reproducible when other scientists (including ourselves in the future) can obtain identical results using the same code, data, workflow and software. Research results are often based on complex statistical analyses that rely on a variety of software. In this context, it becomes rather difficult to guarantee the reproducibility of the research, which is increasingly considered a requirement for assessing the validity of scientific claims. Moreover, reproducibility is not only important for findings published in academic journals. It is also relevant when sharing analyses within a team, with external collaborators and with one’s supervisor. During this three-day course, participants will be introduced to a suite of tools they can use in combination with R to make the computational part of their own research reproducible. A strong emphasis is placed on collaboration, and participants will learn how to set up a project to work efficiently with other people.
On day 1, participants learn about the most important aspects that make research reproducible, which go beyond simply sharing R code. This includes problems arising from the use of different package versions, R versions, and operating systems. The concept of a research compendium is introduced and proposed as a general framework for organising any research project. Day 2 is dedicated to version control with Git and GitHub, which are fundamental tools for keeping track of code changes and for collaborating with other people on the same project. We will cover both basic and more advanced features, such as tagging, branching, and merging. On day 3, participants are introduced to literate programming using RMarkdown, with a focus on writing a scientific article. The aim is to bind the outputs of the R analysis (i.e. results, tables, and figures) together with the text of the article. Participants will also learn how to use templates to fulfil the requirements of different journals.
Availability – 20 places
Duration – 3 days
Contact hours – Approx. 20 hours
ECTS – Equal to 2 ECTS credits
Language – English
On each day, participants will get an introduction to a different tool and practise its use together with the instructor. Lecture-style presentations will explain the different problems that make research non-reproducible and present possible solutions. Lectures will alternate with hands-on sessions guided by the instructor and group exercises to enhance collaboration skills.
Assumed quantitative knowledge
A basic knowledge of statistics is required.
Assumed computer background
Equipment and software requirements
A laptop computer with a working version of R and RStudio is required. R and RStudio are both available as free and open source software for PCs, Macs, and Linux computers. R may be downloaded by following the links here: https://www.r-project.org/. RStudio may be downloaded by following the links here: https://www.rstudio.com/.
Participants should be able to install additional software on their own computer during the course (please make sure you have administrator rights on your computer). Participants should also create a GitHub account in order to attend the second day of this course. Instructions on how to create the account and how to install Git will be provided during the first day.
A large monitor and a second screen, although not absolutely necessary, could improve the learning experience. Participants are also encouraged to keep their webcam active to increase the interaction with the instructor and other students.
A working webcam is desirable for enhanced interactivity during the live sessions; we encourage attendees to keep their cameras on during live Zoom sessions.
Cancellations are accepted up to 28 days before the course start date, subject to a 25% cancellation fee. Cancellations later than this may be considered; contact email@example.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances, a full refund of the course fees will be credited.
- DAY 1
- – Intro to the reproducibility crisis
- – Examples of problems arising from different operating systems, R versions, and package versions
- – What happens when you start R
- – RStudio projects
- – Project organization
- – Code style
- – Reproducible R environment
- DAY 2
- – Intro to Git and GitHub
- – Configure Git and GitHub
- – Git basics from the command line
- – Create a local repository and push it to GitHub
- – Craft a good commit
- – Clone and fork a GitHub repository
- – Craft a pull request
- – Git branch, merge, and tag
- – Git checkout, reset, and revert
- – Use Git with RStudio
- – Ignore files
- DAY 3
- – Literate programming
- – RMarkdown to produce HTML, Word, and PDF outputs
- – Manage references with Zotero
- – Use templates for Word outputs
- – Write your scientific article with RMarkdown
- – Reference tables and figures in the text
Dr. Sergio Vignali
Sergio Vignali is a postdoctoral researcher at the University of Bern (Switzerland), in the division of Conservation Biology of the Institute of Ecology and Evolution. His research focuses on spatial predictive models for animal movements and distributions. Sergio combines his strong scientific interest in animal ecology, particularly birds, with his computational and statistical background to develop new methodological approaches. He is the developer of SDMtune, an R package to tune and evaluate species distribution models. Sergio is also an advocate of open source software and is committed to improving transparency and reproducibility in research.