
  • This event has passed.

ONLINE COURSE – Data visualization with ggplot2 using R and RStudio (DVGG04). This course will be delivered live.

20 February 2024 - 22 February 2024

£250.00

Event Date

Tuesday, March 26th, 2023

COURSE FORMAT

This is a ‘LIVE COURSE’ – the instructor will deliver lectures and coach attendees through the accompanying computer practicals via video link. A good internet connection is essential.

COURSE PROGRAM

TIME ZONE – Central Time Zone. However, all sessions will be recorded and made available, allowing attendees in other time zones to follow.

Please email oliverhooker@prstatistics.com for full details or to discuss how we can accommodate you.

Course Details

During this course we provide a comprehensive introduction to data visualization in R using ggplot. We begin with a brief overview of the general principles of data visualization and of the general principles behind ggplot. We then proceed to cover the major types of plots for visualizing distributions of univariate data: histograms, density plots, barplots, and Tukey boxplots. In all of these cases, we will consider how to visualize multiple distributions simultaneously on the same plot using different colours and “facet” plots. We then turn to the visualization of bivariate data using scatterplots. Here, we will explore how to apply linear and nonlinear smoothing functions to the data, how to add marginal histograms to the scatterplot, how to add labels to points, and how to scale each point by the value of a third variable. We then cover some additional plot types that are often related but not identical to the major types covered at the beginning of the course: frequency polygons, area plots, line plots, uncertainty plots, violin plots, and geospatial mapping. We then consider more fine-grained control of the plot by changing axis scales, axis labels, axis tick points, colour palettes, and ggplot “themes”. Finally, we consider how to make plots for presentations and publications. Here, we will introduce how to insert plots into documents using RMarkdown, and also how to create labelled grids of subplots of the kind seen in many published articles.

Intended Audiences

This course is aimed at anyone who is interested in using R for data science or statistics. R is widely used in all areas of academic scientific research, and also throughout the public and private sectors.

 

Venue

Delivered remotely

Course Information

Time zone – GMT+1

Availability – TBC

Duration – 2 days

Contact hours – Approx. 15 hours

ECTS – Equal to 1 ECTS credit

Language – English

Teaching Format

This course will be largely practical, hands-on, and workshop based. For each topic, there will first be some lecture-style presentation, i.e., using slides or blackboard, to introduce and explain key concepts and theories. Then, we will cover how to perform the various statistical analyses using R. Any code that the instructor produces during these sessions will be uploaded to a publicly available GitHub site after each session. For the breaks between sessions, and between days, optional exercises will be provided. Solutions to these exercises will be presented and briefly discussed after each break.

Assumed quantitative knowledge

None needed.

Assumed computer background

Some familiarity with R.

Equipment and software requirements

A laptop computer with a working version of R or RStudio is required. R and RStudio are both available as free and open source software for PCs, Macs, and Linux computers.

Participants should be able to install additional software on their own computer during the course (please make sure you have administration rights to your computer). 

A large monitor and a second screen, although not absolutely necessary, could improve the learning experience. Participants are also encouraged to keep their webcam active to increase the interaction with the instructor and other students.

Tickets

DVGG04 ONLINE
£ 250.00
Unlimited

PLEASE READ – CANCELLATION POLICY

Cancellations are accepted up to 28 days before the course start date, subject to a 25% cancellation fee. Cancellations later than this may be considered; contact oliverhooker@prstatistics.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances, a full refund of the course fees will be credited.

If you are unsure about course suitability, please get in touch by email to find out more: oliverhooker@prstatistics.com

COURSE PROGRAMME

Tuesday 26th

Classes from 12:00 to 16:00 (Central Time Zone)

DAY 1

Topic 1: What is data visualization? Data visualization is a means to explore and understand our data and should be a major part of any data analysis. Here, we briefly discuss why data visualization is so important and what the major principles behind it are.

Topic 2: Introducing ggplot. Though there are many options for visualization in R, ggplot is simply the best. Here, we briefly introduce the major principles behind how ggplot works, namely how it is a layered grammar of graphics.

Wednesday 27th

Classes from 12:00 to 16:00 (Central Time Zone)

DAY 2

Topic 3: Visualizing univariate data. Here, we cover a set of major tools for visualizing distributions over single variables: histograms, density plots, barplots, Tukey boxplots. In each case, we will explore how to plot multiple groups of data simultaneously using different colours and also using facet plots.

Topic 4: Scatterplots. Scatterplots and their variants are used to visualize bivariate data. Here, in addition to covering how to visualize multiple groups using colours and facets, we will also cover how to provide marginal plots on the scatterplots, labels to points, and how to obtain linear and nonlinear smoothing of the plots.

Topic 5: More plot types. Having already covered the most widely used general purpose plots in Topics 3 and 4, we now turn to a range of other major plot types: frequency polygons, area plots, line plots, uncertainty plots, violin plots, and geospatial mapping. Each of these is an important and widely used type of plot, and knowing them will expand your repertoire.

Thursday 28th

Classes from 12:00 to 16:00 (Central Time Zone)

DAY 3

Topic 6: Fine control of plots. Thus far, we will have mostly used the defaults for plot styles and layouts. Here, we will introduce how to modify things like the limits and scales on the axes, the positions and nature of the axis ticks, the colour palettes that are used, and the different types of ggplot themes that are available.

Topic 7: Plots for publications and presentations. Thus far, we have primarily focused on data visualization as a means of interactively exploring data. Often, however, we also want to present our plots in, for example, published articles or slide presentations. It is simple to save a plot in different file formats and then insert it into a document. However, a much more efficient way of doing this is to use RMarkdown to run the R code and automatically insert the resulting figure into, for example, a Word document, PDF document, or HTML page. In addition, here we will also cover how to make labelled grids of subplots like those found in many scientific articles.

 

Course Instructor

Dr. Rafael De Andrade Moral

  • Rafael is an Associate Professor of Statistics at Maynooth University, Ireland. With a background in Biology and a PhD in Statistics from the University of São Paulo, Rafael has a deep passion for teaching and conducting research in statistical modelling applied to Ecology, Wildlife Management, Agriculture, and Environmental Science. As director of the Theoretical and Statistical Ecology Group, Rafael brings together a community of researchers who use mathematical and statistical tools to better understand the natural world. As an alternative teaching strategy, Rafael has been producing music videos and parodies to promote Statistics in social media and in the classroom. His personal webpage can be found here


 

Details

Start: 20 February 2024
End: 22 February 2024
Cost: £250.00

Venue

Delivered remotely (United Kingdom)
Western European Time, United Kingdom
