  • This event has passed.

ONLINE COURSE – Introduction To Scientific, Numerical, And Data Analysis Programming In Python (PYSC03)

This course will be delivered live

4th May 2022 - 5th May 2022

£275.00

Event Date

Wednesday, May 4th, 2022

Course Format

This is a ‘LIVE COURSE’ – the instructor will deliver lectures and coach attendees through the accompanying computer practicals via video link. A good internet connection is essential.

Time Zone

TIME ZONE – GMT – however, all sessions will be recorded and made available, allowing attendees from different time zones to follow.

Please email oliverhooker@prstatistics.com for full details or to discuss how we can accommodate you.

About This Course

Python is one of the most widely used and highly valued programming languages in the world, and is especially widely used in data science, machine learning, and other scientific computing applications. To use Python confidently and competently for these applications, it is necessary to have a solid foundation in the fundamentals of scientific, numerical, and data analysis programming in Python. This two-day course provides a general introduction to numerical programming in Python, particularly using numpy; data processing in Python using pandas; and data analysis in Python using statsmodels and rpy2. We will also cover the major data visualization and graphics tools in Python, particularly matplotlib, seaborn, and ggplot. Finally, we will cover some other major scientific Python tools, such as those for symbolic mathematics, parallel programming, and code acceleration. Note that in this course we will not be teaching Python fundamentals and general purpose programming. This knowledge will be assumed, and is provided in a preceding two-day course.

Intended Audiences
This course is aimed at anyone who is interested in learning the fundamentals of Python generally, and especially in ultimately using Python for data science and scientific applications. Although these applications are not covered directly here (they are covered in a subsequent course), the fundamentals taught here are vital for mastering data science and scientific applications of Python.
Venue
Delivered remotely
Course Details

Availability – TBC

Duration – 2 days

Contact hours – Approx. 15 hours

ECTS – Equal to 1 ECTS credit

Language – English

Teaching Format

This course will be hands-on and workshop based. Throughout each day, there will be some brief introductory remarks for each new topic, introducing and explaining key concepts.

The course will take place online using Zoom. On each day, the live video broadcasts will occur at the following times (UK local time):
• 10am-12pm
• 1pm-3pm
• 4pm-6pm

All sessions will be video recorded and made available to all attendees on a private video hosting website as soon as possible, hopefully soon after each 2hr session. Attendees in different time zones will be able to join some of the live broadcasts, even if not all of them are at convenient times. Joining whichever live sessions are possible allows attendees to benefit from asking questions and having discussions, rather than just watching prerecorded sessions. Although not strictly required, using a large monitor or preferably even a second monitor will make the learning experience better. Any materials, such as slides, data sets, etc., will be shared via GitHub.

Assumed quantitative knowledge

We will assume familiarity with some general statistical and mathematical concepts such as matrix algebra, calculus, and probability distributions. However, expertise with these concepts is not necessary. Anyone who has taken any undergraduate (Bachelor’s) level course in mathematics, or even advanced high school level, can be assumed to have sufficient familiarity with these concepts.

Assumed computer background

We assume familiarity with using Python and knowledge of general purpose programming in Python. These topics are covered comprehensively in a preceding two-day course, which will provide all the prerequisites for this course.

Equipment and software requirements

A laptop computer with a working version of R or RStudio is required. R and RStudio are both available as free and open source software for PCs, Macs, and Linux computers. R may be downloaded by following the links here https://www.r-project.org/. RStudio may be downloaded by following the links here: https://www.rstudio.com/.

All the R packages that we will use in this course will be possible to download and install during the workshop itself as and when they are needed, and a full list of required packages will be made available to all attendees prior to the course.

A working webcam is desirable for enhanced interactivity during the live sessions. We encourage attendees to keep their cameras on during live Zoom sessions.

Although not strictly required, using a large monitor or preferably even a second monitor will improve the learning experience.

PLEASE READ – CANCELLATION POLICY

Cancellations are accepted up to 28 days before the course start date, subject to a 25% cancellation fee. Cancellations later than this may be considered; please contact oliverhooker@prstatistics.com. Failure to attend will result in the full cost of the course being charged. In the unfortunate event that a course is cancelled due to unforeseen circumstances, a full refund of the course fees will be credited.

If you are unsure about course suitability, please get in touch by email at oliverhooker@prstatistics.com to find out more.

 

COURSE PROGRAMME

Wednesday 4th – Classes from 10:00 to 18:00

Topic 1: Numerical programming with numpy. Although not part of Python’s official standard library, the numpy package is part of the de facto standard library for any scientific and numerical programming. Here we will introduce numpy, especially numpy arrays and their built-in functions (i.e. “methods”). We will also consider how to speed up numpy code using the Numba just-in-time compiler.
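As a rough illustration of the kind of array operations this topic covers, the following sketch builds a small numpy array and uses a few array methods. The data and variable names are invented for this example and are not taken from the course materials. Numba is not shown here.

```python
import numpy as np

# A 2-D array: three hypothetical samples (rows) by two variables (columns).
x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Array methods can operate along a chosen axis.
col_means = x.mean(axis=0)   # mean of each column
row_sums = x.sum(axis=1)     # sum of each row

# Arithmetic is vectorised and broadcasts, so each column
# can be centred on its mean without writing a loop.
centred = x - col_means

# Matrix product of the 2x3 transpose with the 3x2 array gives a 2x2 matrix.
gram = x.T @ x

print(col_means, row_sums, centred.shape, gram.shape)
```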

Topic 2: Data processing with pandas. The pandas library provides means to represent and manipulate data frames. Like numpy, pandas can be seen as part of the de facto standard library for data oriented uses of Python. Here, we will focus on data wrangling, including selecting rows and columns by name and other criteria, applying functions to the selected data, and aggregating the data. For this, we will use pandas directly, and also helper packages like siuba.
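The wrangling steps named above (selecting, applying functions, aggregating) might look something like the following in plain pandas. This is only a sketch: the data frame, its column names, and the threshold are hypothetical, and siuba is not used.

```python
import pandas as pd

# A small, made-up data frame for illustration only.
df = pd.DataFrame({
    "species": ["a", "a", "b", "b", "b"],
    "mass": [10.0, 12.0, 20.0, 22.0, 24.0],
    "length": [1.0, 1.2, 2.0, 2.1, 2.3],
})

# Select columns by name.
subset = df[["species", "mass"]]

# Select rows by a condition on a column.
heavy = df.loc[df["mass"] > 11.0]

# Apply a function to a selected column to create a new column.
df["mass_kg"] = df["mass"].apply(lambda grams: grams / 1000.0)

# Aggregate: mean mass within each species.
mean_mass = df.groupby("species")["mass"].mean()

print(mean_mass)
```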

Thursday 5th – Classes from 10:00 to 18:00

Topic 3: Data Visualization. Python provides many options for data visualization. The matplotlib library is a low level plotting library that allows for considerable control of the plot, albeit at the price of a considerable amount of low level code. Based on matplotlib, and providing a much higher level interface to the plot, is the seaborn library. This allows us to produce complex data visualizations with a minimal amount of code. Similar to seaborn is ggplot, which is a direct port of the widely used R based visualization library.
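To give a sense of what "low level" means for matplotlib, here is a minimal sketch in which each element of the plot is set explicitly. The data are invented, and the non-interactive "Agg" backend and in-memory buffer are assumptions made so that the example runs without a display or writing a file. seaborn and ggplot are not shown.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: no display needed
import matplotlib.pyplot as plt

# Made-up data for illustration.
x = [0, 1, 2, 3, 4]
y = [0, 1, 4, 9, 16]

# Create the figure and axes, then set each plot element explicitly.
fig, ax = plt.subplots(figsize=(4, 3))
ax.plot(x, y, color="tab:blue", marker="o", linestyle="--", label="y = x^2")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Explicit control with matplotlib")
ax.legend()
fig.tight_layout()

# Render to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```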

Topic 4: Statistical data analysis. In this section, we will describe how to perform widely used statistical analyses in Python. Here we will start with statsmodels, which provides linear and generalized linear models as well as many other widely used statistical models. We will also cover rpy2, which is an interface from Python to R. This allows us to access all of the power of R from within Python.
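As a hedged illustration of a linear model in statsmodels, the sketch below fits an ordinary least squares regression to simulated data using the formula interface. The data, the true coefficients (intercept 1, slope 2), and the variable names are assumptions made for this example. rpy2 is not shown.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulate data from y = 1 + 2x + noise (values chosen for illustration).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)
df = pd.DataFrame({"x": x, "y": y})

# Fit a linear model using an R-style formula.
model = smf.ols("y ~ x", data=df).fit()

# Estimated coefficients, indexed by term name ("Intercept" and "x").
print(model.params)
```

The estimates should land close to the simulated intercept and slope, and `model.summary()` gives a fuller table of results.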

Topic 5: Symbolic mathematics. Symbolic mathematics systems, also known as computer algebra systems, allow us to algebraically manipulate and solve symbolic mathematical expressions. In Python, the principal symbolic mathematics library is sympy. This allows us to simplify mathematical expressions, compute derivatives, integrals, and limits, solve equations, algebraically manipulate matrices, and more.
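Each of the operations listed above has a direct counterpart in sympy. The following sketch shows one small case of each, with expressions chosen purely for illustration:

```python
import sympy as sp

x = sp.symbols("x")

# Simplify an expression using a trigonometric identity.
simplified = sp.simplify(sp.sin(x)**2 + sp.cos(x)**2)

# Derivative of x^3 with respect to x.
derivative = sp.diff(x**3, x)

# Indefinite integral of 2x (sympy omits the constant of integration).
integral = sp.integrate(2 * x, x)

# Limit of sin(x)/x as x approaches 0.
limit_value = sp.limit(sp.sin(x) / x, x, 0)

# Solve x^2 - 4 = 0 for x.
roots = sp.solve(x**2 - 4, x)

# Symbolic matrix operations: determinant of a 2x2 matrix.
M = sp.Matrix([[1, 2], [3, 4]])
determinant = M.det()

print(simplified, derivative, integral, limit_value, roots, determinant)
```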

Topic 6: Parallel processing. In this section, we will cover how to parallelize code to take advantage of multiple processors. While there are many ways to accomplish this in Python, here we will focus on the multiprocessing
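A minimal sketch of process-based parallelism, assuming the topic refers to Python's standard-library multiprocessing module. The function names `square` and `parallel_squares` and the worker count are hypothetical choices for this example.

```python
import multiprocessing as mp


def square(n):
    """A stand-in for a CPU-bound task (hypothetical example function)."""
    return n * n


def parallel_squares(values, n_workers=2):
    """Apply `square` to each value using a pool of worker processes."""
    with mp.Pool(processes=n_workers) as pool:
        # map splits the inputs across the workers and preserves input order.
        return pool.map(square, values)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import.
    print(parallel_squares(range(10)))
```

The worker function is defined at module level so that it can be pickled and sent to the worker processes.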

Course Instructor

    • Dr. Mark Andrews

    Works At
    Senior Lecturer, Psychology Department, Nottingham Trent University, England

    Teaches
    • Free 1 day intro to R and RStudio (FIRR)
    • Introduction to statistics using R and RStudio (IRRS03)
    • Introduction to generalised linear models using R and RStudio (IGLM)
    • Introduction to mixed models using R and RStudio (IMMR)
    • Nonlinear regression using generalized additive models (GAMR)
    • Introduction to hidden Markov and state space models (HMSS)
    • Introduction to machine learning and deep learning using R (IMDL)
    • Model selection and model simplification (MSMS)
    • Data visualization using ggplot2 (R and RStudio) (DVGG)
    • Data wrangling using R and RStudio (DWRS)
    • Reproducible data science using R Markdown, Git, R packages, Docker, Make & drake, and other tools (RDRP)
    • Introduction/fundamentals of Bayesian data analysis statistics using R (FBDA)
    • Bayesian data analysis (BADA)
    • Bayesian approaches to regression and mixed effects models using R and brms (BARM)
    • Introduction to Stan for Bayesian data analysis (ISBD)
    • Introduction to Unix (UNIX01)
    • Introduction to Python (PYIN03)
    • Introduction to scientific, numerical, and data analysis programming in Python (PYSC03)
    • Machine learning and deep learning using Python (PYML03)
    • Python for data science, machine learning, and scientific computing (PDMS02)

     


Mark Andrews is a Senior Lecturer in the Psychology Department at Nottingham Trent University in Nottingham, England. Mark is a graduate of the National University of Ireland and obtained an MA and PhD from Cornell University in New York. Mark’s research focuses on developing and testing Bayesian models of human cognition, with particular focus on human language processing and human memory. Mark’s research also focuses on general Bayesian data analysis, particularly as applied to data from the social and behavioural sciences. Since 2015, he and his colleague Professor Thom Baguley have been funded by the UK’s ESRC funding body to provide intensive workshops on Bayesian data analysis for researchers in the social sciences.
