[[r:r]]

R is a Free Open Source Software (FOSS) implementation of the S programming language. It's a wonderful piece of software (although it has its limitations), and if you're a serious statistician you ignore learning it at your peril.

Learning R

There are tons of resources out there for learning R. I've collated and categorised those that I've found useful and consider to be of high quality.

Installing Packages

Details of the packages I will typically install are here along with how to update packages when you update R and how to list installed packages.

The packages you want or need will depend on you, but the CRAN Task Views provide a useful overview of packages for particular tasks or areas of usage.
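As a rough illustration of listing installed packages, here is a minimal sketch using only base R. The names pkg_table and is_installed are my own, made up for this example, and the output will of course differ from machine to machine.

```r
# List installed packages with their versions (base R only).
pkgs <- installed.packages()[, c("Package", "Version"), drop = FALSE]
rownames(pkgs) <- NULL  # drop row names so duplicates across libraries cause no trouble
pkg_table <- as.data.frame(pkgs, stringsAsFactors = FALSE)

# Sort by package name and show the first few entries.
pkg_table <- pkg_table[order(pkg_table$Package), ]
head(pkg_table)

# Hypothetical helper: check whether a particular package is installed.
is_installed <- function(pkg) pkg %in% pkg_table$Package
is_installed("stats")
```

Checking against the table like this avoids loading the package just to find out whether it is there.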

Archived Packages

snippet.bash
# Install an archived package from source using its CRAN archive tarball
ascii_url <- "https://cran.r-project.org/src/contrib/Archive/ascii/ascii_2.1.tar.gz"
install.packages(ascii_url, repos = NULL, type = "source")
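If you find yourself doing this for more than one package, the URL can be built rather than typed. The following is only a sketch: cran_archive_url is a name I've made up, and it assumes the archive keeps the same pkg/pkg_version.tar.gz layout as the ascii example above.

```r
# Hypothetical helper: build the URL of an archived source tarball on CRAN.
# Assumes the layout <base>/<pkg>/<pkg>_<version>.tar.gz, as in the ascii example.
cran_archive_url <- function(pkg, version,
                             base = "https://cran.r-project.org/src/contrib/Archive") {
  stopifnot(is.character(pkg), length(pkg) == 1,
            is.character(version), length(version) == 1)
  sprintf("%s/%s/%s_%s.tar.gz", base, pkg, pkg, version)
}

ascii_url <- cran_archive_url("ascii", "2.1")
ascii_url

# Installing needs network access and build tools, so it is left commented out here.
# install.packages(ascii_url, repos = NULL, type = "source")
```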

Data Management

You'll have to either import data into R or generate data depending on whether you are performing analyses or simulations.
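A small sketch of both routes, using base R only. The column names and values are invented for illustration, and the CSV is read from a string so the example runs without any file on disk; with real data you would pass a file path to read.csv() instead.

```r
# Import: read CSV data. The text argument stands in for a file on disk.
csv_text <- "id,group,score
1,a,10.5
2,b,12.0
3,a,9.8"
imported <- read.csv(text = csv_text, stringsAsFactors = FALSE)
str(imported)

# Generate: simulate data, fixing the seed so the results are reproducible.
set.seed(42)
simulated <- data.frame(
  id    = 1:100,
  group = sample(c("a", "b"), size = 100, replace = TRUE),
  score = rnorm(100, mean = 10, sd = 2)
)
summary(simulated$score)
```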

Analysis

There are many analyses that can be performed within R, and the number is growing rapidly as people write and make available extensions. Regardless, there are some data manipulation techniques and approaches to analysis that make life a lot easier.

Avoiding Loops

Loops can be quite slow in R, particularly when they grow an object on each iteration, and where possible the problem should be cast using one of the apply() family of functions (there are several, including mapply(), rapply(), sapply() and lapply()). Personally, it took me some time to get my head around using these, and I found A Brief Introduction to apply in R invaluable. There are also other solutions, such as the helper functions in the plyr package, and in a lot of instances using the reshape2 package to melt() the data greatly facilitates the workflow.
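To make the idea concrete, here is a minimal sketch comparing a loop with a few members of the apply family on a toy list. The data are invented, and vapply() is included as a stricter cousin of sapply() even though it isn't mentioned above.

```r
x <- list(a = 1:5, b = c(2, 4, 6), c = 10)

# Loop version: pre-allocate the result, then fill it in.
means_loop <- numeric(length(x))
for (i in seq_along(x)) {
  means_loop[i] <- mean(x[[i]])
}
names(means_loop) <- names(x)

# lapply() returns a list; sapply() simplifies to a vector where it can.
means_list <- lapply(x, mean)
means_vec  <- sapply(x, mean)

# vapply() is like sapply() but checks the type and length of each result.
means_safe <- vapply(x, mean, FUN.VALUE = numeric(1))

# mapply() iterates over several vectors in parallel.
sums <- mapply(function(u, v) u + v, 1:3, 4:6)
```

All four approaches give the same means here; the apply versions simply say what is being done to each element without the bookkeeping.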

More recently the purrr package provides a consistent and “tidy” approach to repeating tasks across lists under the guise of “functional programming”.

Because this is quite a large and varied topic I've split the details out to a separate page.

Output

R integrates seamlessly with LaTeX using the wonderful knitr package. Tips and tricks to facilitate common tasks are documented as and when I come across a problem and resolve it.

More recently, though, I've switched to using RMarkdown, which is much more flexible, allowing the production of HTML, LaTeX/PDF and Microsoft Word documents, and even integrating Shiny to produce dynamic/interactive web pages.

Graphics

Whilst essentially a form of output, graphics is such a huge area that it warrants its own section. There are many options for graphics in R, but they basically fall into two categories: lattice graphics or ggplot2. I've opted to dedicate time to learning the latter, ggplot2, so there won't be much here on lattice.

Programming

Essentially any script written in R constitutes programming, but in this section I go into slightly greater detail about writing functions and keeping related functions grouped together as packages.
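As a small taste of what writing a function looks like, here is a sketch of a helper for the standard error of the mean. The name std_error and its arguments are my own choices for illustration, not something from a package.

```r
# Hypothetical helper: standard error of the mean.
std_error <- function(x, na.rm = FALSE) {
  # Fail early with a clear message if the input is not numeric.
  stopifnot(is.numeric(x))
  if (na.rm) {
    x <- x[!is.na(x)]
  }
  sd(x) / sqrt(length(x))
}

std_error(c(2, 4, 4, 4, 5, 5, 7, 9))
std_error(c(1, NA, 3), na.rm = TRUE)
```

Checking the arguments at the top of the function tends to give far more helpful errors than letting a bad input fail somewhere deep inside.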

There is a lot to R programming and, unsurprisingly, a lot has been written…

Error Messages

If you're anything like me you'll regularly encounter error messages whilst working with R. I have attempted to curate those that I come across along with some of their meanings.
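When you want to see what a message actually says without the script stopping, base R's tryCatch() is handy. The sketch below wraps log() in a made-up function, safe_log, that reports the warning or error and returns NA instead.

```r
# Hypothetical wrapper: report warnings and errors from log() and return NA.
safe_log <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) {
      message("Warning caught: ", conditionMessage(w))
      NA_real_
    },
    error = function(e) {
      message("Error caught: ", conditionMessage(e))
      NA_real_
    }
  )
}

safe_log(10)    # no problem, returns the natural log
safe_log(-1)    # log() warns about NaNs; the handler returns NA
safe_log("a")   # log() errors on non-numeric input; the handler returns NA
```

conditionMessage() extracts the text of the message, which is usually the bit worth searching for.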

Updating Packages

A neat trick to update all installed packages whenever there is a major release of R is the following code…

snippet.bash
# Reinstall, from source, every package found in the first library path
lib  <- .libPaths()[1]
pkgs <- installed.packages(lib.loc = lib)[, "Package"]
install.packages(pkgs = pkgs, lib = lib, type = "source")

Installing Manually

Occasionally I've been provided with an updated version of a package that is newer than the one on CRAN, so I need to install it manually rather than relying on CRAN. This can be done with…

snippet.bash
install.packages('~/path/to/source-file-1.0.1.tar.gz',
                 repos = NULL,
                 type  = 'source')

Missing Packages

The following was found on Reddit and is purportedly adapted from StackOverflow…

snippet.bash
list.of.packages <- c("assertr",
                      "ggplot2",
                      "tidyverse",
                      "magrittr",
                      "stringr",
                      "lubridate")
# Install missing packages
missing.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[, "Package"])]
if (length(missing.packages) > 0) {
  install.packages(missing.packages, dependencies = TRUE)
}
# Load packages (invisible() suppresses the list of TRUE/FALSE results)
invisible(lapply(list.of.packages, require, character.only = TRUE))
# Cleanup
rm(list.of.packages, missing.packages)

Modelling

R, being a statistical programming language, is really useful for statistical modelling.
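By way of a minimal example, here is a linear regression using lm() on the mtcars dataset that ships with R. The choice of variables is arbitrary and just for illustration, as are the values in new_cars.

```r
# Fit a linear model of fuel efficiency on weight and horsepower.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Inspect the fit and pull out the coefficients.
summary(fit)
coef(fit)

# Predict for two hypothetical cars, with confidence intervals.
new_cars <- data.frame(wt = c(2.5, 3.5), hp = c(100, 150))
preds <- predict(fit, newdata = new_cars, interval = "confidence")
preds
```

The same formula interface (response ~ predictors) is used by most modelling functions in R, so it is worth getting comfortable with.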

XGBoost and SHAP

A useful post on using XGBoost and SHAP for interpreting complex models is here

Shiny

R on Android

A lot of people carry little computers around in their pockets these days (i.e. smartphones and tablets). Wouldn't it be great to have R in your pocket too? Well, you can, and I've written up how to do this: Installing R on Android.

Another method using Termux is detailed here.

R Validation

Many people state that they are worried about using R because it's open source and not validated. This is nonsense: all commercial products come with indemnity clauses that absolve the authors and publishers of any responsibility should the software be found to be faulty; that's Stata, SPSS, SAS and many more. The R Foundation has a document on compliance with the US Food and Drug Administration (FDA) requirements: R-FDA.

The NIHR published Validation of Statistical Programming, and its appendix includes an example of validating base commands in R.

The question has cropped up on R-help many times (from memory, haven't time to search now I'm afraid).

If there are serious concerns then it's possible to use ValidR from Mango Solutions, which provides a validated version of R. I would hazard a guess that many people are perfectly happy using Excel for work because it comes with a licence, but there are HUGE problems with it.

For the most part such fears are caused by ignorance, and hopefully the above resources help to educate and inform people, allay their concerns and encourage them to embrace open source software.

Links

Programming

Modelling

Documentation

Graphics

Palettes

Books

Bayesian

Reproducible Research

Docker

Production Environments

CI/CD

Blogs

You only really need to follow one blog to keep abreast of many people who blog about R….

Podcasts

Essentially audio blogs…

HowTos

Miscellaneous

Blogs

statistics:R statistics:statistics statistics:programming

r/r.txt · Last modified: 2023/12/05 22:05 by admin
CC Attribution-Share Alike 4.0 International