ONLINE COURSE – Data wrangling using R and RStudio (DWRS03). This course will be delivered live.

21 November 2023 - 23 November 2023

£125.00 – £250.00
Event Date

Tuesday, November 21st, 2023

COURSE FORMAT

This is a ‘LIVE COURSE’: the instructor will deliver lectures and coach attendees through the accompanying computer practicals via video link. A good internet connection is essential.

COURSE PROGRAM

TIME ZONE – Central Time Zone. All sessions will be recorded and made available, allowing attendees in other time zones to follow.

Please email oliverhooker@prstatistics.com for full details or to discuss how we can accommodate you.

Course Details

This course provides a comprehensive, practical introduction to data wrangling using R. In particular, we focus on tools provided by R’s tidyverse, including dplyr, tidyr, and purrr. Data wrangling is the art of taking raw, messy data and formatting and cleaning it so that data analysis, visualization, and so on can be performed on it. Done poorly, it can be time-consuming, laborious, and error-prone. Fortunately, the tidyverse tools allow us to do data wrangling in a fast, efficient, and high-level manner, which can dramatically improve the ease and speed with which we analyse data.

We start with how to read data of different types into R. We then cover in detail the dplyr tools such as select, filter, and mutate. Here, we will also cover the pipe operator (%>%), which is used to create data wrangling pipelines that take raw messy data at one end and return cleaned tidy data at the other. We then cover how to compute descriptive or summary statistics using dplyr’s summarize and group_by functions. Next, we turn to combining and merging data: how to concatenate data frames, including concatenating all data files in a folder, and the powerful SQL-like join operations that allow us to merge information in different data frames. The final topic is how to “pivot” data from a “wide” to a “long” format and back using tidyr’s pivot_longer and pivot_wider.

Intended Audiences
This course is aimed at anyone who is interested in using R for data science or statistics. R is widely used in all areas of academic scientific research, and throughout the public and private sectors.
Venue
Delivered remotely
Course Information

Time zone – GMT+1

Availability – TBC

Duration – 3 x 1/2 days

Contact hours – Approx. 12 hours

ECTS – Equal to 1 ECTS credit

Language – English

Teaching Format

This course will be largely practical, hands-on, and workshop based. For each topic, there will first be some lecture-style presentation, i.e., using slides or blackboard, to introduce and explain key concepts and theories. Then, we will cover how to perform the various statistical analyses using R. Any code that the instructor produces during these sessions will be uploaded to a publicly available GitHub site after each session. For the breaks between sessions, and between days, optional exercises will be provided. Solutions to these exercises will be presented and briefly discussed after each break.

The course will take place online using Zoom. On each day, the live video broadcasts will occur during UK local time (GMT+0) at:
• 12pm-2pm
• 3pm-5pm
• 6pm-8pm

All sessions will be video recorded and made available to all attendees as soon as possible, hopefully soon after each 2hr session.

If some sessions are not at a convenient time due to different time zones, attendees are encouraged to join as many of the live broadcasts as possible. For example, attendees from North America may be able to join the live sessions from 3pm-5pm and 6pm-8pm, and then catch up with the 12pm-2pm recorded session once it is uploaded. By joining live sessions attendees will be able to benefit from asking questions and having discussions, rather than just watching prerecorded sessions.

At the start of the first day, we will ensure that everyone is comfortable with how Zoom works, and we’ll discuss the procedure for asking questions and raising comments.

Although not strictly required, using a large monitor or preferably even a second monitor will make the learning experience better, as you will be able to see my RStudio and your own RStudio simultaneously.

All the sessions will be video recorded, and made available immediately on a private video hosting website. Any materials, such as slides, data sets, etc., will be shared via GitHub.

Assumed quantitative knowledge

We will assume familiarity with only the most basic of statistical concepts, such as descriptive statistics. We will not even assume that participants will have taken university level courses on statistics.

Assumed computer background

Minimal prior experience with R and RStudio is required. Attendees should be familiar with some basic R syntax and commands, how to write code in the RStudio console and script editor, how to load up data from files, etc.

Equipment and software requirements

Attendees of the course will need to use a computer on which RStudio can be installed. This includes Mac, Windows, and Linux, but not tablets or other mobile devices. Instructions on how to install and configure all the required software, which is all free and open source, will be provided before the start of the course. We will also provide time during the workshops to ensure that all software is installed and configured properly.

UNSURE ABOUT SUITABILITY? THEN PLEASE ASK oliverhooker@prstatistics.com

Equipment and software requirements
Attendees will need to install/update R/RStudio and various additional R packages.

This can be done on Macs, Windows, and Linux.

R – https://cran.r-project.org/

RStudio – https://www.rstudio.com/products/rstudio/download/

Tickets

DWRS03 ONLINE EARLY BIRD (book before 8th November) – £125.00 (unlimited)
DWRS03 ONLINE – £250.00 (unlimited)
PLEASE READ – CANCELLATION POLICY

Cancellations are accepted up to 28 days before the course start date, subject to a 25% cancellation fee. Cancellations later than this may be considered; contact oliverhooker@prstatistics.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances, a full refund of the course fees will be credited.

If you are unsure about course suitability, please get in touch by email to find out more

info@clovertraining.co.uk

COURSE PROGRAMME

Tuesday 21st

Classes from 12:00 to 16:00 (Central Time Zone)

Topic 1: Reading in data. We will begin by reading data into R using tools such as readr and readxl. Almost all types of data can be read into R, and here we will consider many of the main types, such as csv, xlsx, sav, etc. We will also consider how to control how data are parsed, e.g., so that they are read as dates, numbers, strings, etc.
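As a rough illustration of controlling how columns are parsed, the sketch below uses readr's read_csv with an explicit col_types specification. The file contents and column names (id, visit_date, score) are invented for this example and are not course materials.

```r
library(readr)

# Write a tiny CSV to a temporary file so the example is self-contained.
# The column names and values are made up for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("id,visit_date,score",
    "001,2023-11-21,3.5",
    "002,2023-11-22,4.0"),
  csv_path
)

# col_types controls parsing: keep `id` as a string (preserving its
# leading zeros), parse `visit_date` as a Date, and `score` as a double.
visits <- read_csv(
  csv_path,
  col_types = cols(
    id = col_character(),
    visit_date = col_date(format = "%Y-%m-%d"),
    score = col_double()
  )
)

print(visits)
```

Without the col_types argument, readr would guess the column types and would most likely read id as a number, dropping the leading zeros. Excel files are typically read in a similar way with readxl::read_excel(). SPSS .sav files are commonly read with haven::read_sav(), a package this listing does not name.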

Topic 2: Wrangling with dplyr. For the remainder of Day 1, we will cover the very powerful dplyr R package. This package supplies a number of so-called “verbs” (select, rename, slice, filter, mutate, arrange, etc.), each of which focuses on a key data manipulation task, such as selecting or changing variables. All of these verbs can be chained together using “pipes” (represented by %>%). Together, these create powerful data wrangling pipelines that take raw data as input and return cleaned data as output. Here, we will also learn about the key concept of “tidy data”, which is roughly where each row of a data frame is an observation and each column is a variable.
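A minimal sketch of such a pipeline might look like the following. The data frame and its column names are invented for illustration; the point is only how several dplyr verbs chain together with %>%.

```r
library(dplyr)

# A small made-up data frame standing in for raw data.
raw <- data.frame(
  Subject = c("a", "b", "c", "d"),
  height_cm = c(170, 182, NA, 165),
  weight_kg = c(65, 90, 72, 58),
  notes = c("", "late", "", ""),
  stringsAsFactors = FALSE
)

clean <- raw %>%
  select(-notes) %>%                                  # drop a column we do not need
  rename(subject = Subject) %>%                       # use a consistent lower-case name
  filter(!is.na(height_cm)) %>%                       # remove rows with missing height
  mutate(bmi = weight_kg / (height_cm / 100)^2) %>%   # derive a new variable
  arrange(desc(bmi))                                  # sort by the new variable, largest first

print(clean)
```

Each verb takes a data frame and returns a data frame, which is what makes the steps composable in a single pipeline.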

Wednesday 22nd

Classes from 12:00 to 16:00 (Central Time Zone)

Topic 2 continued:

Topic 3: Summarizing data. The summarize and group_by tools in dplyr can be used to great effect to summarize data using descriptive statistics.
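For instance, grouped descriptive statistics could be computed along these lines. The scores data frame is a made-up example, and the choice of statistics (count, mean, standard deviation) is only one possibility.

```r
library(dplyr)

# Made-up scores for two groups.
scores <- data.frame(
  group = c("control", "control", "treatment", "treatment", "treatment"),
  score = c(10, 14, 20, 22, 27),
  stringsAsFactors = FALSE
)

# group_by() splits the rows by group; summarize() collapses each
# group to a single row of descriptive statistics.
summary_by_group <- scores %>%
  group_by(group) %>%
  summarize(
    n = n(),
    mean_score = mean(score),
    sd_score = sd(score),
    .groups = "drop"   # return an ungrouped data frame
  )

print(summary_by_group)
```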

Thursday 23rd

Topic 4: Merging and joining data frames. There are multiple ways to combine data frames, with the simplest being “bind” operations, which are effectively horizontal or vertical concatenations. Much more powerful are the SQL-like “join” operations. Here, we will consider the inner_join, left_join, right_join, and full_join operations. In this section, we will also consider how to use purrr to read in and automatically merge large sets of files.
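The sketch below illustrates the four joins on two small invented data frames, and then one possible way to read and stack every CSV file in a folder using purrr's map together with dplyr's bind_rows. The folder and file names are created on the fly purely for the example.

```r
library(dplyr)
library(purrr)
library(readr)

# Two made-up data frames sharing an `id` key.
people <- data.frame(id = c(1, 2, 3), name = c("Ann", "Bob", "Cy"),
                     stringsAsFactors = FALSE)
visits <- data.frame(id = c(2, 3, 4), visits = c(5, 1, 7))

inner <- inner_join(people, visits, by = "id")  # only ids present in both
left  <- left_join(people, visits, by = "id")   # all rows of people
right <- right_join(people, visits, by = "id")  # all rows of visits
full  <- full_join(people, visits, by = "id")   # all ids from either side

# Reading and concatenating all CSV files in a folder.
# A temporary folder with two small files stands in for real data.
data_dir <- tempfile("csv_folder")
dir.create(data_dir)
write_csv(data.frame(id = 1:2, value = c(0.1, 0.2)), file.path(data_dir, "part1.csv"))
write_csv(data.frame(id = 3:4, value = c(0.3, 0.4)), file.path(data_dir, "part2.csv"))

files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

combined <- files %>%
  map(read_csv, col_types = cols()) %>%  # read each file into a list of data frames
  bind_rows()                            # stack them vertically

print(combined)
```

Rows without a match in the other data frame are filled with NA in left, right, and full joins, so it is worth checking the row counts after each join.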

Topic 5: Pivoting data. Sometimes we need to change data frames between “long” and “wide” formats. The R package tidyr provides the tools pivot_longer and pivot_wider for doing this.
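As a small sketch of a round trip between the two formats, the example below pivots an invented wide data frame (one column per time point) into long format and back again. The column names are assumptions made for the illustration.

```r
library(tidyr)

# Made-up wide data: one row per subject, one column per time point.
wide <- data.frame(
  subject = c("s1", "s2"),
  time_1 = c(5.1, 6.3),
  time_2 = c(5.8, 6.0),
  stringsAsFactors = FALSE
)

# Wide to long: one row per subject and time point.
long <- pivot_longer(
  wide,
  cols = starts_with("time_"),
  names_to = "time",
  names_prefix = "time_",     # strip the prefix so `time` holds "1", "2"
  values_to = "measurement"
)

# Long back to wide: restore one column per time point.
wide_again <- pivot_wider(
  long,
  names_from = time,
  names_prefix = "time_",     # add the prefix back to the new column names
  values_from = measurement
)

print(long)
print(wide_again)
```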

Course Instructor

    • Dr. De Andrade Moral
