Genetic data analysis and exploration using R (GDAR03)
23 October 2017 - 27 October 2017
£260.00 - £580.00
This course provides a comprehensive introduction to exploratory statistical methods used in population genetics and molecular ecology. Participants will become proficient in a range of approaches for uncovering genetic structure in common types of genetic data, including most genetic markers (e.g. microsatellites, SNPs, AFLP) and genetic sequences (DNA or amino acid). After covering different types of phylogenetic reconstruction and basic population genetics tests, the course places a strong emphasis on factorial methods (e.g. Principal Component Analysis) for investigating genetic diversity. In particular, we will focus on identifying and describing genetic clusters and on characterising spatial genetic patterns. The last day of the course will be an open problem day, where participants will be able to analyse their own data.
The course is aimed at PhD students, research postgraduates, and practicing academics as well as persons in industry working with genetic data in fields such as molecular ecology, evolutionary biology, and phylogenetics.
We offer COURSE ONLY and ACCOMMODATION PACKAGES:
• COURSE ONLY – Includes lunch and refreshments.
• ACCOMMODATION PACKAGE (to be purchased in addition to the course only option) – Includes breakfast, lunch, dinner, refreshments, minibus to and from the meeting point, and accommodation. Accommodation is in multiple-occupancy (max 3 people), single-sex en-suite rooms. Arrival Sunday 22nd October and departure Friday 27th October PM.
To book ‘COURSE ONLY’ with the option to add the ‘ACCOMMODATION PACKAGE’, please scroll to the bottom of this page.
Other payment options are available; please email email@example.com.
Cancellation policy: Cancellations are accepted up to 28 days before the course start date, subject to a 25% cancellation fee. Cancellations later than this may be considered; contact firstname.lastname@example.org. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that PRstatistics must cancel this course due to unforeseen circumstances, a full refund for the course will be credited. However, PRstatistics cannot be held responsible for any travel fees, accommodation or other expenses you incur as a result of the cancellation.
The course is a mixture of lectures and hands-on practicals. Data sets for computer practicals will be provided by the instructors, but participants are welcome to bring their own data.
Assumed quantitative knowledge
A basic understanding of concepts in population genetics and the statistical analysis of genetic data.
Assumed computer background
Previous experience with data analysis using R is required, such as the ability to import/export data, manipulate data frames, fit basic statistical models, and generate simple exploratory and diagnostic plots.
Equipment and software requirements
A laptop/personal computer with a working version of R and RStudio installed. R and RStudio are available for both PC and Mac and can be downloaded for free.
It is essential that you come with all necessary software and packages already installed (you will be sent a list of packages prior to the course), as internet access may not always be available.
UNSURE ABOUT SUITABILITY? THEN PLEASE ASK: email@example.com
Monday 23rd – Classes from 09:00 to 17:00
Intro to phylogenetic reconstruction.
Module 1a: Reconstructing phylogenies from genetic sequence data. Three main approaches covered: distance-based phylogenies; maximum parsimony; and likelihood-based approaches.
Module 1b: Short R refresher.
Practical 1: Phylogenetic reconstruction using R. Three main approaches plus rooting a tree; assessing/testing for a molecular clock; and bootstrapping.
Main packages: ape, phangorn.
Tuesday 24th – Classes from 09:00 to 17:00
Intro to multivariate analysis of genetic data
Module 2: Key concepts in multivariate analysis. Focus on using factorial methods for genetic data analysis.
Practical 2: Basics of multivariate analysis of genetic data in R. Topics include: data handling, basic population genetics tests, and analysis of population structure using factorial methods (PCA, PCoA).
Main packages: adegenet, ade4, ape.
Wednesday 25th – Classes from 09:00 to 17:00
Exploring group diversity
Module 3: Approaches to identifying and describing genetic clusters. Topics include: hierarchical clustering, K-means, population-level multivariate analysis (between-group-PCA, DA, DAPC).
Practical 3: Applying the approaches covered in the morning lecture, with emphasis on their strengths and weaknesses.
Main packages: adegenet, ade4.
Thursday 26th – Classes from 09:00 to 17:00
Spatial genetic structures
Module 4: Discussing the origin and significance of spatial genetic patterns, and how to test for them.
Practical 4: Visualising and analysing spatial genetic data. Topics: spatial density estimates, Moran/Mantel tests, mapping principal components in PCA, spatial PCA.
Main packages: adegenet, adehabitat, ade4.
Friday 27th – Classes from 09:00 to 16:00
Using R for reproducible science.
Module 5: Using R for reproducible science
Practical 5: Practical session based on the morning lecture.
Main packages: knitr, Sweave, rmarkdown