An Introduction to R and RStudio

This is the first lecture in a series on how to use the SEMinR package in R for PLS-SEM.

What is R?

  • R is a statistical computing language (R Core Team, 2021). In this series, it is the language used to import and clean data as well as to create and analyze PLS path models.
  • R is free, open-source software that enables users to write and execute code that analyzes data. Note that the name “R” can refer to both the programming language and the primary software that runs code written in this language.
  • Further, open source refers to the kind of software whose underlying code is made freely available and is generally open to suggested improvements or new features built by others.
  • The open-source nature of the R software makes code written in the R language highly reproducible, shareable, testable, scalable, and deployable to larger automated applications.
  • An ever-expanding community of R users supports, tests, documents, and provides add-on resources for each other.
  • The R language was designed with computational statistics in mind. In its simplest form, it can be run from your operating system’s command line or from the R console (see Fig.).
  • However, I recommend using R from the convenience of an integrated development environment (IDE), such as RStudio.
  • An IDE is a programming environment that offers tools such as project management, tabs for easily managing multiple script files, and additional developer tools.
  • We discuss the layout of the RStudio IDE in more detail in the next section. Throughout this series, we will demonstrate the use of R from within the RStudio IDE.

Downloading R and RStudio

RStudio Layout

R Packages

  • R includes a lot of preinstalled packages containing many of the standard functions and algorithms you will use in your statistical computations.
  • Examples of such standard functions are mean() and sd() for calculating the mean and standard deviation, respectively, or lm() for generating linear regression models.
  • While you should be able to fulfill much of your computational needs with the standard packages bundled in R, you might need to install further software libraries containing newer or more complicated algorithms.
  • Such software libraries are bundled as packages that, when installed, add a new range of functions and operations. Examples of popular packages are dplyr, ggplot2, and, of course, the package used in this series, seminr.
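
As a brief illustration of the standard functions mentioned above, the following sketch applies mean(), sd(), and lm() to a small data set. The variable names and values are made up for this example and do not come from the series.

```r
# Illustrative data (made up for this example)
hours <- c(1, 2, 3, 4, 5)
score <- c(3, 5, 7, 9, 11)   # constructed so that score = 2 * hours + 1

avg_hours <- mean(hours)     # arithmetic mean
sd_hours  <- sd(hours)       # sample standard deviation

# Fit a linear regression of score on hours
fit <- lm(score ~ hours)
coefs <- coef(fit)           # estimated intercept and slope

print(avg_hours)
print(sd_hours)
print(coefs)
```

All three functions come from packages that are bundled with R, so no installation step is needed to run this.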

Installing R Packages

  • The packages can be installed from the command line or from the packages tab. Note that you will need internet access to install packages from CRAN.
  • To install new packages, select the Packages tab in the lower right window of the RStudio IDE, click the Install button, set Install from to Repository (CRAN), and enter the package name in the Packages field: “seminr”. Next, click on Install.
  • Packages can also be installed from the command line using the install.packages() function. In this case, we wish to install the swirl package, which teaches you R programming (for more details on the swirl package, see https://swirlstats.com/). We therefore set the pkgs parameter equal to "swirl".
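
The command-line route described above comes down to calling install.packages(pkgs = "swirl"). The sketch below wraps that call in a small helper that only installs a package when it is missing. The helper name install_if_missing() is hypothetical and introduced here for illustration; it is not part of R or of any package.

```r
# Hypothetical helper (not part of base R): install a package only if missing
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkgs = pkg)   # downloads from CRAN; needs internet access
  }
  # Report whether the package is now available
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# Demonstrated on "stats", which ships with R, so nothing is downloaded here
stats_available <- install_if_missing("stats")
print(stats_available)
```

To install swirl this way, you would call install_if_missing("swirl"). Outside RStudio, R may first ask you to choose a CRAN mirror.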

You can also install a package from the menu: from Tools, select Install Packages.

Loading R Packages

  • Note that packages are installed to the local software library on your computer but are not automatically loaded into your R session.
  • Once a package is installed, it is available for computation in R but has to be loaded using the library() function prior to use.
  • Packages must be loaded in each session in which you wish to use their functions.
  • If the package is not loaded in a new session (i.e., after closing and reopening R), its features will not be available until you load the package by using the library() function.
# Load the swirl package (R is case sensitive: library(), not Library())
library(swirl)
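
To see the difference between an installed package and a loaded one, the sketch below uses tools, a package that ships with R but is usually not attached at startup, so it runs without installing anything. The helper is_attached() is a hypothetical name introduced for this example.

```r
# Hypothetical helper: is the package on the search path of this session?
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

attached_before <- is_attached("tools")  # typically FALSE in a fresh session
library(tools)                           # load and attach the package
attached_after <- is_attached("tools")

# file_ext() comes from tools and can now be called without a prefix
ext <- file_ext("analysis.R")

print(attached_before)
print(attached_after)
print(ext)
```

The same pattern applies to swirl or seminr once they are installed: the functions exist on disk, but they only become available after library() has been called in the current session.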

Explaining the Syntax

  • Throughout the presentations in this series and the tutorials on the website, we will need to discuss various elements of the code when explaining how to perform analytic operations in R.
  • Code will be presented separately throughout the series.

Table provides a summary of the syntax.

Before we go into detail, let's explain a few terms.

What is Code?

  • Code (short for source code) is a term used to describe text that is written using the protocol of a particular language.

What is an Argument?

  • An argument is a way for you to provide more information to a function.

What is a Code Block?

  • A code block is a set of lines of code that are presented or run together.

What is a Function?

  • Functions are self-contained modules of code that accomplish a specific task.

  • Code blocks can also include comments that describe the purpose of the line of code that follows.
  • Comments are not run by the programming language and only serve as communication to other users of the code about the purpose.
  • Comments in the R language begin with a pound symbol (“#”).
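
The terms defined above (code, code block, function, argument, comment) can be seen together in one short code block. This is a minimal sketch; the data values and the function name value_range() are made up for illustration.

```r
# Comments start with "#" and are ignored when the code runs
values <- c(2, 4, NA, 9)          # NA marks a missing value

# mean() is a function; x and na.rm are its arguments
mean_with_na    <- mean(x = values)                # NA, because one value is missing
mean_without_na <- mean(x = values, na.rm = TRUE)  # missing value dropped first

# A user-defined function: a self-contained module with one task
# (hypothetical name, for illustration only)
value_range <- function(x) {
  max(x, na.rm = TRUE) - min(x, na.rm = TRUE)
}
spread <- value_range(values)

print(mean_with_na)
print(mean_without_na)
print(spread)
```

The two calls to mean() differ only in the na.rm argument, which shows how an argument gives a function the extra information it needs to change its behavior.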

Writing R Scripts

  • Computational analyses are conducted by writing a series of instructions to the computer on how to import data, modify data, run algorithms for analyzing the data, and then report the results of those analyses.
  • These instructions take the form of R scripts that are typically entered into a file, which contains all the scripts related to a single analysis or computation.
  • These R script files have the suffix .R and are stored in your project directory.
  • To successfully conduct such analyses, you need to learn the form and function of the scripts that R can process.
  • A key reason for using free, open-source software like R is the community support and resources typically available for such software.
  • A simple Internet search with the keywords “R coding lesson” should turn up hundreds of high-quality resources. One recommended resource is swirl (https://swirlstats.com/), which teaches you R programming by offering simple and useful lessons.
  • This package helps the user become experienced at working with R’s command-based interface and can be downloaded and used from the R console command line.

  • In addition to online tutorials and code lessons, there are many free e-books describing both introductory and advanced usage of R and RStudio.
  • A good archive for textbooks is available at the CRAN website (https://www.r-project.org/other-docs.html).
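
The import, modify, analyze, and report workflow described at the start of this section can be sketched as a short R script. The file name analysis.R, the column names, and the data values are assumptions made for illustration; the script creates its own small CSV file so that it runs on its own.

```r
# analysis.R (hypothetical file name): import, modify, analyze, report

# Create a small CSV file so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:5, y = c(3, 5, 7, 9, 11)), csv_path, row.names = FALSE)

# 1. Import the data
dat <- read.csv(csv_path)

# 2. Modify the data: add a derived column
dat$x_squared <- dat$x^2

# 3. Analyze: regress y on x
fit <- lm(y ~ x, data = dat)

# 4. Report the estimated coefficients
print(coef(fit))
```

In a real project, the CSV file would already exist in the project directory, and the script would be saved there with the .R suffix.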

Finding Help in RStudio

  • Due to the complexity of a programming language – and the almost endless number of software libraries that can be installed adding to the functions and resources available to you – it can become difficult to keep track of how functions are called, what arguments they take, and what output they provide.
  • Packages have a range of files that are designed to document and demonstrate the use of the functions they provide.
  • These files take the form of R documentation, vignettes, and demonstration files.
  • R documentation describes the purpose, input, implementation, and output of a function and provides examples applying the syntax. The contents of an R document are described in Table.

  • This documentation can be accessed in the help tab in the lower right window of the RStudio IDE. Help topics and functions can be searched for in the search field of the help window or from the command line in the console window using the ? operator.
  • For example, we can search for help on the read.csv() function by typing the following into the console window in RStudio:
# Search for help on read.csv() using the ? operator
?read.csv
  • In Fig. we can see an excerpt of the contents of the R documentation for read.csv().
  • When you encounter a new function or an error, the R documentation is the first place to look.
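
Alongside the ? operator shown above, base R offers a few related ways to look up a function. The sketch below uses only functions that ship with R; the variable names are chosen for this example.

```r
# help() is the function form of the ? operator
help_page <- help("read.csv")

# Inspect a function's arguments without opening the documentation
read_csv_args <- names(formals(read.csv))
print(read_csv_args)

# Search all installed documentation by keyword (same as ??csv);
# left commented out because it opens a results page
# help.search("csv")
```

Printing help_page, or typing ?read.csv directly, opens the documentation in the Help tab when run inside RStudio.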

Finding Help in RStudio - Vignette

  • Another very important document to consult for help using a package or function is the vignette. Vignettes are designed as an all-purpose user’s guide for the package – they describe the problem that the package seeks to solve and how it is used.
  • This document usually describes the functioning of the package in detail and provides examples and demonstrations of the problems and solutions.
  • You can access a list of installed vignettes by calling the vignette() function. This will output a list of available vignettes to an R vignette tab in the top left window of RStudio.
  • You can then run vignette("SEMinR") to access a particular vignette – in this case, the vignette for the package SEMinR (see Fig.).
# List all installed vignettes
vignette()
# Open the SEMinR vignette
vignette("SEMinR")
  • In Fig. we can see the SEMinR vignette.
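
The list returned by vignette() can also be narrowed to a single package. The sketch below uses grid, a package that ships with R, so that it runs before seminr is installed; the variable names are chosen for this example.

```r
# List the vignettes of one package instead of all installed packages.
# "grid" ships with R; replace it with "seminr" once that package is installed.
grid_vignettes <- vignette(package = "grid")

# The result stores one row per vignette; "Item" holds the topic name
topics <- grid_vignettes$results[, "Item"]
print(topics)
```

Any topic name in this list can be passed to vignette() to open that document, in the same way as vignette("SEMinR") above.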
 

Reference

Hair Jr, J. F., Hult, G. T. M., Ringle, C. M., Sarstedt, M., Danks, N. P., & Ray, S. (2021). Partial Least Squares Structural Equation Modeling (PLS-SEM) Using R: A Workbook.

The SEMinR tutorials are based on the book cited above. The book is open access and available for free download.