
PROFESSIONAL TRAINING
Hands-On Data Science Training Sessions
From introductory to advanced courses in R, Python, SQL and Git, to machine learning and industry-specific modeling, our professional training courses can be tailored to fit your organization's needs. Our instructors are published authors in the field of data science, professors at prominent universities, or both.

Introduction to R
We begin with an introduction to the Posit user interface, familiarizing students with the R programming language and providing a general overview of its efficiency and capacity via tools and packages.

Working in the Tidyverse
The Tidyverse is a collection of R packages that share a common philosophy, grammar and data structures. This hands-on class covers each stage of the data analysis workflow: import, tidy, transform, visualize and model.

Elegant Reporting in RMarkdown & Quarto
Compelling presentation of an analysis is as important as the analysis itself, and yet is too frequently treated as an afterthought. This course covers several means of communicating data, including Markdown for PDF and HTML reports and ioslides for presentations.

Computing in R
Building upon the general concepts of Introduction to R, this course leads to more advanced programming concepts, including assigning variables and using dplyr. Topics include: writing functions, control statements, iterating with loops, reshaping data, group manipulation and more.

Workflow & Visualization in R
Among the most important steps in analysis is visualization. This course focuses on ggplot2 (a powerful tool for plotting) and Quarto (used to interweave R code with ordinary text to produce easy-to-modify, well-formatted, automated data analysis reports).

Shiny Dashboards
Shiny is a framework for building interactive dashboards and advanced analytics in R. It supports the development of web-based dashboards that run statistical, machine learning and deep learning methods. Training in Shiny covers key aspects of dashboard design and development, code optimization and reactivity.

Advanced Statistics in R: Modeling and Analytics
We begin with the basics (descriptive statistics like mean and variance) and progress to more advanced modeling techniques (like linear models and time series). The focus will be on applied programming, though theoretical properties and derivations will be taught where appropriate.

Machine Learning in R
We focus on the available methods for implementing machine learning algorithms in R, and will examine some of the underlying theory. We will explore several models and techniques, including linear regression, elastic net, tree-based models, clustering, bootstrapping and cross-validation.

High Performance Computing in R
In the era of complex data, in which more of the data is stored in the cloud, handling data in volume has become a necessity. There are a variety of ways in which R handles the processing of large amounts of data. This course focuses on improving processing speed in R.

Building R Packages
One of the benefits of coding is that your work is repeatable. This course will walk through the whole process of converting code into a package, writing documentation for help files and writing tests to ensure everything works.

Integrating C++ in R
Looking to migrate from C++ to R? Learn how to write high-performance C++ code that integrates smoothly into R, with topics including Rcpp, C++ data types, writing functions, syntactic sugar and the differences between C++ and R style.

Forecasting with R
This course walks through the whole process of fitting time series models. First, we introduce the time series object in R and various ways to plot time-based data. Then we work through forecasting techniques, starting with simple methods like mean forecasting and progressing to exponential smoothing and ARMA models.
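
To give a flavor of the simplest methods mentioned above, here is a minimal sketch of mean forecasting and simple exponential smoothing. It is written in plain Python purely for illustration (the course itself works in R), and the function names `mean_forecast` and `ses_forecast` as well as the choice of the first observation as the starting level are assumptions made for this example, not course material.

```python
def mean_forecast(series, horizon):
    """Forecast every future step as the average of the observed series."""
    if not series:
        raise ValueError("series must contain at least one observation")
    average = sum(series) / len(series)
    return [average] * horizon


def ses_forecast(series, alpha, horizon):
    """Simple exponential smoothing with a flat forecast.

    alpha is the smoothing weight in (0, 1]; larger values react faster
    to recent observations. The level is initialized to the first
    observation, which is one common convention among several.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for value in series[1:]:
        # Blend the newest observation with the previous smoothed level.
        level = alpha * value + (1 - alpha) * level
    # Without trend or seasonality, every future step gets the last level.
    return [level] * horizon
```

Both functions return a flat forecast. The difference is that the mean forecast weights every observation equally, while exponential smoothing gives more weight to recent ones.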

Putting R in Production
Using R in production is easier than ever thanks to tools like renv, plumber and Docker. In this course we start with a fresh Posit project and Git repo, populate the project with functionality and expose it as a REST API using the plumber package.

Introduction to Python
This course provides an introduction to Python programming with a focus on data analytics using the Pandas library. It covers Python installation, environments, and IDEs, along with key topics like data manipulation, visualization, and importing data from various sources using tools like NumPy, Matplotlib, and Statsmodels.

Machine Learning in Python
We focus on the available methods for implementing machine learning in Python, and will examine some of the underlying theory. We will explore popular tools for data science such as NumPy and Pandas as well as tools for machine and deep learning such as scikit-learn, TensorFlow, and PyTorch. We will also discuss how some of these tools can be optimized on NVIDIA graphics cards using their CUDA library.

Mastering & Collaborating with Git
Git is a distributed version-control system for tracking changes in source code during software development. It is designed for coordinating work among programmers, but it can be used to track changes in any set of files. Training in Git covers the key aspects of working with it, including best practices for collaboration.

SQL Foundations: Querying Databases
Our SQL course offers a great introduction to the most widely used language for working with databases. You’ll learn to filter, sort and join data, use aggregate functions, and write queries to analyze data. We’ll also discuss databases like PostgreSQL, DuckDB, and Kinetica, highlighting their uses and unique features.
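
As a taste of the kind of query the course builds toward, the sketch below combines a filter, a join, an aggregate and a sort in a single statement. It is only an illustration: it uses Python's built-in `sqlite3` module and an in-memory SQLite database so that it runs anywhere (SQLite is an assumption here, not one of the databases named above), and the `customers` and `orders` tables, their columns and the helper names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two small, made-up tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER,
            amount REAL
        );
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
        INSERT INTO orders VALUES
            (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0), (4, 3, 0.0);
    """)
    return conn


def top_customers(conn, min_total):
    """Return (name, order_count, total) rows, largest total first."""
    query = """
        SELECT c.name, COUNT(o.id) AS order_count, SUM(o.amount) AS total
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id   -- join
        WHERE o.amount > 0                         -- filter rows
        GROUP BY c.name                            -- aggregate per customer
        HAVING SUM(o.amount) >= ?                  -- filter groups
        ORDER BY total DESC                        -- sort
    """
    # The ? placeholder keeps the threshold out of the SQL string itself.
    return conn.execute(query, (min_total,)).fetchall()


if __name__ == "__main__":
    for row in top_customers(build_demo_db(), 0):
        print(row)
```

The same SELECT statement should carry over to most relational databases with little or no change, although placeholder syntax and connection setup differ between drivers.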