
Data visualization using ggplot2 (R and RStudio) (DVGGPR)

1st January 2030

£250.00

Course Format

Pre-recorded

About This Course

In this two-day course, we provide a comprehensive introduction to data visualization in R using ggplot.

On the first day, we begin with a brief overview of the general principles of data visualization and of the general principles behind ggplot. We then cover the major types of plots for visualizing distributions of univariate data: histograms, density plots, barplots, and Tukey boxplots. In all of these cases, we will consider how to visualize multiple distributions simultaneously on the same plot using different colours and “facet” plots. We then turn to the visualization of bivariate data using scatterplots. Here, we will explore how to apply linear and nonlinear smoothing functions to the data, add marginal histograms to the scatterplot, add labels to points, and scale each point by the value of a third variable.

On Day 2, we begin by covering some additional plot types that are related, but not identical, to the major types covered on Day 1: frequency polygons, area plots, line plots, uncertainty plots, violin plots, and geospatial mapping. We then consider more fine-grained control of the plot by changing axis scales, axis labels, axis tick points, colour palettes, and ggplot “themes”. Finally, we consider how to make plots for presentations and publications. Here, we will introduce how to insert plots into documents using RMarkdown, and how to create labelled grids of subplots of the kind seen in many published articles.

Intended Audiences

This course is aimed at anyone who is interested in using R for data science or statistics. R is widely used in all areas of academic scientific research, and throughout the public and private sectors.

Course Details

Last Updated – 08/04/2021

Duration – Approx. 15 hours

ECTS – Equal to 1 ECTS credit

Language – English

Teaching Format

This course will be largely practical, hands-on, and workshop-based. For each topic, there will first be some lecture-style presentation, i.e., using slides or blackboard, to introduce and explain key concepts and theories. Then, we will cover how to perform the various statistical analyses using R. Any code that the instructor produces during these sessions will be uploaded to a publicly available GitHub site after each session. For the breaks between sessions, and between days, optional exercises will be provided. Solutions to these exercises will be presented and briefly discussed after each break.

Although not strictly required, using a large monitor or preferably even a second monitor will make the learning experience better, as you will be able to see my RStudio and your own RStudio simultaneously.

All the sessions will be video recorded, and made available immediately on a private video hosting website. Any materials, such as slides, data sets, etc., will be shared via GitHub.

Assumed quantitative knowledge

We will assume only a very minimal amount of familiarity with some general statistical concepts. Anyone who has taken any undergraduate (Bachelor’s) level introductory course on (applied) statistics can be assumed to have sufficient familiarity with these concepts.

Assumed computer background

Minimal prior experience with R and RStudio is required. Attendees should be familiar with some basic R syntax and commands, how to write code in the RStudio console and script editor, how to load up data from files, etc.

Equipment and software requirements

A laptop computer with a working version of R and RStudio is required. R and RStudio are both available as free and open source software for PCs, Macs, and Linux computers. R may be downloaded by following the links here: https://www.r-project.org/. RStudio may be downloaded by following the links here: https://www.rstudio.com/.

All the R packages that we will use in this course can be downloaded and installed during the workshop itself as and when they are needed, and a full list of required packages will be made available to all attendees prior to the course.

A working webcam is desirable for enhanced interactivity during the live sessions. We encourage attendees to keep their cameras on during live Zoom sessions.

Although not strictly required, using a large monitor or preferably even a second monitor will improve the learning experience.

Tickets

DVGGPR (PRE RECORDED)
£ 250.00
Unlimited

PLEASE READ – CANCELLATION POLICY

Cancellations/refunds are accepted as long as the course materials have not been accessed.

There is a 20% cancellation fee to cover administration and possible bank fees.

If you need to discuss cancelling, please contact oliverhooker@prstatistics.com.

If you are unsure about course suitability, please get in touch by email to find out more: oliverhooker@prstatistics.com.

COURSE PROGRAMME

Day 1

Approx. 6 Hours

Topic 1: What is data visualization? Data visualization is a means to explore and understand our data and should be a major part of any data analysis. Here, we briefly discuss why data visualization is so important and what the major principles behind it are.

Topic 2: Introducing ggplot. Though there are many options for visualization in R, ggplot is simply the best. Here, we briefly introduce the major principles behind how ggplot works, namely how it is a layered grammar of graphics.

Topic 3: Visualizing univariate data. Here, we cover a set of major tools for visualizing distributions over single variables: histograms, density plots, barplots, and Tukey boxplots. In each case, we will explore how to plot multiple groups of data simultaneously using different colours and also using facet plots.

Topic 4: Scatterplots. Scatterplots and their variants are used to visualize bivariate data. Here, in addition to covering how to visualize multiple groups using colours and facets, we will also cover how to add marginal plots to the scatterplots, add labels to points, and apply linear and nonlinear smoothing to the plots.

 

Day 2

Approx. 6 Hours

Topic 5: More plot types. Having already covered the most widely used general purpose plots on Day 1, we now turn to a range of other major plot types: frequency polygons, area plots, line plots, uncertainty plots, violin plots, and geospatial mapping. Each of these is an important and widely used type of plot, and knowing them will expand your repertoire.

Topic 6: Fine control of plots. Thus far, we will have mostly used the defaults for plot styles and layouts. Here, we will introduce how to modify things like the limits and scales on the axes, the positions and nature of the axis ticks, the colour palettes that are used, and the different types of ggplot themes that are available.

Topic 7: Plots for publications and presentations. Thus far, we have primarily focused on data visualization as a means of interactively exploring data. Often, however, we also want to present our plots in, for example, published articles or slide presentations. It is simple to save a plot in different file formats and then insert it into a document. However, a much more efficient way of doing this is to use RMarkdown to run the R code and automatically insert the resulting figure into, for example, a Word document, PDF document, or HTML page. In addition, we will also cover how to make labelled grids of subplots like those found in many scientific articles.

Course Instructor

    • Dr. Mark Andrews

    Works At
    Senior Lecturer, Psychology Department, Nottingham Trent University, England

    • Teaches
    • Free 1 day intro to R and RStudio (FIRR)
    • Introduction to statistics using R and RStudio (IRRS03)
    • Introduction to generalised linear models using R and RStudio (IGLM)
    • Introduction to mixed models using R and RStudio (IMMR)
    • Nonlinear regression using generalized additive models (GAMR)
    • Introduction to hidden Markov and state space models (HMSS)
    • Introduction to machine learning and deep learning using R (IMDL)
    • Model selection and model simplification (MSMS)
    • Data visualization using ggplot2 (R and RStudio) (DVGG)
    • Data wrangling using R and RStudio (DWRS)
    • Reproducible data science using RMarkdown, Git, R packages, Docker, Make & drake, and other tools (RDRP)
    • Introduction/fundamentals of Bayesian data analysis statistics using R (FBDA)
    • Bayesian data analysis (BADA)
    • Bayesian approaches to regression and mixed effects models using R and brms (BARM)
    • Introduction to Stan for Bayesian data analysis (ISBD)
    • Introduction to Unix (UNIX01)
    • Introduction to Python (PYIN03)
    • Introduction to scientific, numerical, and data analysis programming in Python (PYSC03)
    • Machine learning and deep learning using Python (PYML03)
    • Python for data science, machine learning, and scientific computing (PDMS02)


Details

Date:
1st January 2030
Cost:
£250.00

Venue

Recorded
United Kingdom
