Week 2: Reproducible Analyses with Quarto

The theme of this lesson is good management of your files and data. In part two of this week’s coursework you will learn how to identify folders and paths, and create Quarto documents.

📖 Readings: 45 minutes

📽 Watch Videos: 15 min

💻 Activities: 5-10 min

✅ Check-ins: 1

1 Principles of Reproducibility

1.1 File Management

As boring as it sounds, file management is arguably one of the most important skills a data scientist should have. The reproducibility of a project depends just as much on how the project is stored as on the computing tools used. Using R and Quarto is an important step toward a reproducible analysis, but other pieces, such as file management, are just as important.

Evidently, there has been a bit of a generational shift as computers have evolved: the “file system” metaphor itself is outdated because no one uses physical files anymore. This article is an interesting discussion of the problem: it argues that with modern search capabilities, most people use their computers as a laundry hamper instead of as a nice, organized filing cabinet.

File Systems

Regardless of how you tend to organize your personal files, it is probably helpful to understand the basics of what is meant by a computer file system – a way to organize data stored on a hard drive. Since data is always stored as 0’s and 1’s, it’s important to have some way to figure out what type of data is stored in a specific location, and how to interpret it.

Stop watching at 4:16.

File Paths

That’s not enough, though - we also need to know how computers remember the location of what is stored where. Specifically, we need to understand file paths.

When you write a program, you may have to reference external files - data stored in a .csv file, for instance, or a picture. Best practice is to create a file structure that contains everything you need to run your entire project in a single file folder (you can, and sometimes should, have sub-folders).
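As a minimal sketch of the single-folder layout described above, the R code below builds a small project folder with two sub-folders. The folder names (`stat-210-sp26`, `data`, `labs`) are illustrative assumptions, and the folder is created in R's temporary directory so nothing in your own files is changed.

```r
# Build a throwaway project folder inside R's temporary directory.
# All folder names here are hypothetical examples.
project <- file.path(tempdir(), "stat-210-sp26")

# One folder for the whole project, with sub-folders for data and labs
dir.create(file.path(project, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(project, "labs"), showWarnings = FALSE)

# Everything the project needs now lives under a single folder
list.files(project)
```

In a real project you would usually create these folders once by hand rather than in code; the point is only that every file the project needs sits somewhere under one folder.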

For now, it is enough to know how to find files using file paths, and how to refer to a file using a relative file path from your base folder. In this situation, your “base folder” is known as your working directory - the place your program thinks of as home.

1.2 Directories, Paths, and Projects

In R, there are two ways to set up your file path and file system organization:

  1. Set your working directory in R (not recommended)
  2. Use R Projects (preferred!)

Working Directories in R

To find where your working directory is in R, you can either look at the top of your console or type getwd() into your console.

getwd()
[1] "C:/Users/cann4817/Desktop/spring-2026/weeks"
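To see how a relative path relates to the working directory, here is a small sketch. The sub-folder `data` and the file `survey.csv` are made-up names for illustration, and the file does not need to exist for the code to run.

```r
# Hypothetical file: "survey.csv" inside a "data" sub-folder of the project
relative_path <- file.path("data", "survey.csv")
relative_path

# R resolves a relative path by starting from the working directory,
# so this is the full location R would look in
absolute_path <- file.path(getwd(), relative_path)
absolute_path

# Building a path does not create a file;
# file.exists() reports whether something is actually there
file.exists(relative_path)
```

The relative path stays the same no matter whose computer the project is on; only the working directory in front of it changes.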

Although it is not recommended, you can set your working directory in R with setwd().

setwd("/path/to/my/assignment/folder")

R Projects

📖 Required Reading: Workflow and Projects

Since there are often many files necessary for a project (e.g. data sources, images, etc.), R has a nice built-in system for setting up your project organization with R Projects. You can either create a new folder on your computer containing an R Project (e.g., you have not yet created a folder for this class) or you can add an R Project to an existing folder on your computer (e.g., you have already created a folder for this class).

To create an R Project, first open RStudio on your computer and click File > New Project, then:

Give your folder a name that reflects STAT 210 (e.g., stat-210-sp26 rather than my-stat331). It is good practice for a folder name not to contain spaces.

Then, browse on your computer for a location to save this folder to. For example, mine is saved in my Documents. Make sure you know how to find this; it should NOT be saved in your Downloads!

This new folder should now live in your Documents folder (or wherever you save it to) and contain a stat-210-sp26.Rproj file or whatever you named it. This is your new “home” base for this class - whenever you refer to a file with a relative path it will begin to look for it here.

To add an R Project to an existing folder on your computer (e.g., you already created a folder for this class), first open RStudio on your computer and click File > New Project, then:

Then, browse on your computer to select the existing folder you wish to add your R Project to. For example, mine is saved in my Documents and called my-stat331.

Your existing folder should now contain a my-stat331.Rproj file. This is your new “home” base for this class - whenever you refer to a file with a relative path it will begin to look for it here.

Caution: Your folder cannot sync with anything online!

Your STAT 210 folder cannot be in a folder stored on OneDrive or iCloud! Storing your folder in one of these locations will cause your code to periodically not run, and I cannot help you fix it.

2 Reproducible Documents

Over the last ten years, science has experienced a “reproducibility” crisis: a substantial portion of scientific findings could not be recreated because people didn’t sufficiently document the processes they used. As such, a foundational aspect of scientific research is using tools which allow others to reproduce your findings.

Enter Quarto—a dynamic document that allows us to interweave R code and written text in the same document. Gone are the days of copying and pasting the results of your R code into a Word document—breaking the connection between your analysis and your report. Quarto is here to save the day!

2.1 Downloading Quarto

The software associated with Quarto is automatically downloaded with the newest versions of RStudio. So, if you are using the most up-to-date version of RStudio (as instructed in Part 1 of this week’s coursework), you should already have Quarto installed on your computer. But, let’s test it out.

To ensure you have Quarto installed, carry out the following process:

  • Open RStudio
  • Click on “File” (in the upper navigation bar)
  • Select “New File” (in the dropdown options)
  • Select “Quarto Document…” (in the dropdown option)

A screenshot of the dropdown options for creating a new Quarto document. The image shows the options under the 'File' menu (in the upper option bar), the 'New File' option in the 'File' menu has been selected, opening a popout. In the popout options, the cursor is highlighting the 'Quarto Document' option.

If you have Quarto installed, you should be prompted with the following menu:

A screenshot of the menu that should appear when you carry out the process described above. The menu is a square box with a title reading 'New Quarto Document'. On the left hand side of the box, the user can select what type of Quarto product they wish to create (Document, Presentation, or Interactive). On the right hand side, the user can control various aspects of their document, including the title, the author, the type of rendered document (HTML, PDF, or Word). At the bottom there are options to 'Create an Empty Document' (a barebones document), to 'Create' a document (with the user specified options), or 'Cancel'.

If, instead, you receive a message saying Quarto is not installed on your computer, you need to download Quarto: https://quarto.org/docs/download/

2.2 Introduction to Quarto

📖 Required Reading: Intro to Quarto

HTML Documents

We will exclusively use HTML documents in this course. If you are interested in learning more about formatting options for Quarto HTML documents, I would recommend checking out:

The biggest thing to realize is that a Quarto document contains three types of content:

  1. A YAML header surrounded by ---s at the top.
  2. Chunks of R code surrounded by ```{r} and ```.
  3. Text mixed with simple text formatting like # heading and _italics_.

For our class (STAT 210), you will generally only be editing the Chunks and the Text.

Basic Markdown

Markdown can format a document just like you format your Word documents in other classes, but we’ll just cover some of the basics here. You’ve already learned italics, bold and code displays. Markdown can make lists as well, using a variety of syntaxes (see the online tutorial). You’ve probably also noticed that when you format your text, its display color changes to help you see the formatting more easily.

| Unrendered | Rendered |
|---|---|
| `*italics*` or `_italics_` | italics or italics |
| `**bold**` | bold |
| `` `code` `` | code |
| `subscript~2~` | subscript₂ |
| `superscript^2^` | superscript² |
| `> Text block for answers` | Text block for answers |
| `# Header Size 1` | Header Size 1 |
| `## Header Size 2` | Header Size 2 |
| `### Header Size 3` | Header Size 3 |
| `Markdown Comments Will Not Render: <!--- this comment won't show up rendered --->` | Markdown Comments Will Not Render: |

One last thing of note - markdown works best with “space to breathe.” If something isn’t rendering correctly, it often helps to make sure there is a blank line between the text you are trying to format and the lines before and after it.

For example:
> Although this is green unrendered, it doesn't appear with the formatting we want. Try adding an "enter" between the "For example:" and the start of this line and render again. Did it fix it?
You'll also notice that if there are no line breaks, this formatting doesn't end, so you don't need to put a ">" before each line answer. 

For example: > Although this is green unrendered, it doesn’t appear with the formatting we want. Try adding an “enter” between the “For example:” and the start of this line and render again. Did it fix it? You’ll also notice that if there are no line breaks, this formatting doesn’t end, so you don’t need to put a “>” before each line answer.

Notice what happens when we add spacing:

For example:

> Although this is green unrendered, it doesn't appear with the formatting we want. Try adding an "enter" between the "For example:" and the start of this line and render again. Did it fix it?
You'll also notice that if there are no line breaks, this formatting doesn't end, so you don't need to put a ">" before each line answer. 

For example:

Although this is green unrendered, it doesn’t appear with the formatting we want. Try adding an “enter” between the “For example:” and the start of this line and render again. Did it fix it? You’ll also notice that if there are no line breaks, this formatting doesn’t end, so you don’t need to put a “>” before each line answer.

Mathematics

We often want to write small mathematics equations or write mathematical symbols. This is actually much easier in .qmd than in Word.

We surround an equation with $ signs: $e=mc^2$ which will appear as \(e=mc^2\) in the rendered document.

Here are some examples of the type of mathematics you will use within your Quarto documents.

Greek Letters

We often want to use statistical symbols within markdown to denote parameters with Greek letters. They are written with a backslash (\) followed by the full name of the Greek letter, such as \alpha, and surrounded by $ signs so that they are read in equation mode:

$\sigma$  
$\alpha$  
$\mu$  
$\beta$  

\(\sigma\)
\(\alpha\)
\(\mu\)
\(\beta\)

Superscripts and Subscripts

We can include sub- and superscripts to make our parameters and statistics specific. Note: once the equation is surrounded by $’s, we don’t need the markdown syntax for superscripts and subscripts (^2^ and ~2~) shown above. In math mode, you might see:

$x^2$  
$H_0$  
$\mu_1$  

\(x^2\)
\(H_0\)
\(\mu_1\)

We might want to make our subscript a longer phrase in statistics, to help denote our population of interest when we have more than one. We surround whatever we want to stay in a subscript with {}, such as:

$\mu_{female}$  
$\mu_{male}$  

\(\mu_{female}\)
\(\mu_{male}\)

Symbols over Letters

We can call our special statistical symbols in a similar way, such as \bar{x}. In this case, the bar symbol will be applied over the x, to make our statistical symbol for sample mean: \(\bar{x}\).

$\bar{x}$  
$\hat{p}$  
$\hat{\beta}_1$  

\(\bar{x}\)
\(\hat{p}\)
\(\hat{\beta}_1\)

Equations

Finally, we can write little equations for our outputs, like \(\bar{x} = 3.54\).

$\bar{x} = 3.54$  
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$  

\(\bar{x} = 3.54\)
\(\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x\)

Code Chunks

You’ve learned how to insert code chunks using the shortcut Ctrl-Alt-I (or Cmd-Option-I on a Mac) or by clicking the green Insert +C button above, but you can also just type the two sets of characters that surround a code chunk: ```{r} to begin and ``` to end. Note that those are backticks, found above the Tab key, not single quotation marks.

We can also add labels with #| label: to support easy navigation, but be careful: DO NOT REPEAT LABELS, or you will get an error when you render the .qmd file.

```{r}
#| label: simple-addition
1 + 1
```
[1] 2

At the beginning of each Quarto document there should be a special code chunk called the setup chunk. This is where we load the packages used in the rest of the document. We often use #| include: false as an option in the setup code chunk to hide its code and output in the rendered document.
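A setup chunk might look like the sketch below. The label `setup` and the package `tools` are only assumptions for illustration (`tools` ships with R, so the chunk runs without installing anything); in your own documents you would list whichever packages your analysis uses.

```{r}
#| label: setup
#| include: false

# Load the packages used in the rest of the document.
# `tools` is a stand-in; replace it with the packages you need.
library(tools)
```

Because of #| include: false, the chunk still runs when the document renders, but neither the code nor any messages it prints appear in the report.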

You can also comment your code, which is a way of annotating it without affecting the code, using a # followed by text.

```{r}
#| label: print-answer
42*42  #the Answer to the Ultimate Question of Life, the Universe, and Everything, Squared
```
[1] 1764

Checking code & Rendering your Document

You can run a specific code chunk to check its output by pressing the green play button in that chunk, or press the icon next to it to run all of the code chunks above it.

Viewing Your Rendered Document

If you want to check your Markdown editing, you can render the document and view it in the ‘Viewer’ Pane by pressing the ‘Render’ button at the top of the Quarto document pane. Make sure the setting (gear icon above) is set to ‘Preview in Viewer pane’.

Changing the Document Type

You can change the document type you render to by changing the output type in the YAML header from format: html to format: docx and pressing the ‘Render’ button. Your Word document will be created in the project folder! (In this class, we will always use HTML documents; please don’t edit the YAML header of your labs. This is shown simply for future reference.)

---
title: "My report"
format: html
execute:
  echo: true
  error: false
---  

Errors in your Code

If there is an error in your file, it will not render. When .qmd files render, they run in a completely clean environment. So if you loaded data (or anything else) in some way other than with code in your .qmd file, the rendering process will produce an error.

If you cannot figure out the error, you can generally still get your document to render by changing error: false to error: true in the YAML code at the top.

---
title: "My report"
format: html
execute:
  echo: true
  error: true
---  

This will print out any errors in your code chunks, but still render the document.

Errors in your Markdown

The most common markdown errors are deleting backticks (`) from around your code chunks or dashes (-) from around the YAML header, so always check those first.

2.3 ✅ Check-in: Quarto Documents

Question 1: What are the options at the top of a Quarto document (between the --- and --- symbols) called?






Question 2: What symbols begin an R code chunk?

  1. ```
  2. ```{r}
  3. {r}
  4. `{r}`





Question 3: What symbol defines a heading?






Question 4: When working in a Quarto document, the ____ editor will display the raw Quarto document, whereas the ____ will display the document as it will appear when it is rendered.

Question 5: To produce an HTML report from your Quarto document, you need to click the ____ button.