
Tutorial 7

In this tutorial, you’ll get started working with the linear model. The goal of this tutorial is to get a feel for the equation of a line and the basic structure of the lm() function. We’ll be spending a lot of time on linear models, so just focus on getting the basics down for now.

Setting Up

All you need is this tutorial and RStudio. Remember that you can easily switch between windows with the Alt + ↹ Tab (Windows) and ⌘ Command + ⇥ Tab (Mac OS) shortcuts.

Task 1

Create a new week_08 project and open it in RStudio. Then, create two new folders in the new week_08 folder: r_docs and data. Finally, open a new R Markdown file and save it in the r_docs folder. Since we will be practising reporting (writing up results), we will need R Markdown later on. For the tasks, get into the habit of creating new code chunks as you go.

Remember, you can add new code chunks by:

  1. Using the RStudio toolbar: Click Code > Insert Chunk
  2. Using a keyboard shortcut: the default is Ctrl + Alt + I (Windows) or ⌘ Command + Alt + I (MacOS), but you can change this under Tools > Modify Keyboard Shortcuts…
  3. Typing it out: ```{r}, press Enter, then ``` again.
  4. Copying and pasting a code chunk you already have (but be careful of duplicated chunk names!)

 

Task 2 (punk!)

Add and run the command to load the tidyverse package in the setup code chunk.

 

The Linear Model

Let’s start with a refresher to get warmed up. In the lecture we learned about the equation of a line, a very important equation that we will be using again and again.


\(y_{i} = b_{0} + b_{1}x_{1i}\)


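This equation has the following elements:

  • \(y_{i}\), the outcome: the value of y for case i
  • b0, the intercept: the value of y when x is 0
  • b1, the slope: the change in y for each unit change in x
  • \(x_{1i}\), the predictor: the value of x for case i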

These definitions are important for you to know, but if they don’t make a lot of sense at the moment, don’t worry - that’s what this tutorial is for. We’ll start by getting a handle on what the b-values do first, and then we’ll move on to creating a linear model with the familiar gensex dataset to get some practice using these values.
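As a minimal sketch of how the equation turns b-values into predictions, the short function below computes y from x. The function name line_equation and the values b0 = 1 and b1 = 2 are made up for illustration and do not come from any dataset.

```r
# Hypothetical helper: the equation of a line, y = b0 + b1 * x
line_equation <- function(x, b0, b1) {
  b0 + b1 * x
}

# Illustrative values only: intercept of 1, slope of 2
line_equation(x = 0, b0 = 1, b1 = 2)    # at x = 0, y equals the intercept: 1
line_equation(x = 1, b0 = 1, b1 = 2)    # one unit along, y goes up by the slope: 3
line_equation(x = 0:3, b0 = 1, b1 = 2)  # works on several x values at once: 1 3 5 7
```

Notice that each one-unit step in x changes y by exactly b1, which is what "slope" means.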

Using the Equation

Before we jump into the data, let’s have a go practicing with the model itself and getting the hang of how it works. To do this, we’ll use the beautiful interactive visualisation below, courtesy of Milan.

Using the Visualisation

The yellow line, our linear model, is determined by two values:

  • b0, the intercept, in purple
  • b1, the slope, in teal

You can change the values of each by moving the sliders. The numbers in the coloured circles correspond to the line above. You can reset the sliders to 0 by double clicking on them.

As you move the teal b1 slider, notice the solid horizontal black line that moves up and down. Where this line connects with the y axis is the predicted value of y when x = 1.

Task 3 (punk!)

Spend a minute playing with the visualisation and moving both sliders to get the hang of how it works. Then try out the quiz questions below.

What happens to the line as you change the value of b0 (purple slider)?

The location of the line changes (up and down)

What happens to the line as you change the value of b1 (teal slider)?

The slope of the line changes

Set the sliders so that the line slopes down from left to right. What is the direction of the relationship quantified by b1?

Negative

Set b0 to -0.66 and b1 to 4.65. What’s the predicted value of y when x = 1, to the nearest whole number?

4


Well done so far. You can keep playing with the visualisation as much as you like - it will help a lot if you can understand how these values work in the linear model equation.
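If you’d like to check the last quiz answer by hand, here is a quick sketch in R. It plugs the slider values from the question into the equation of the line, assuming x = 1 (the point that the horizontal black line in the visualisation marks).

```r
# Slider values from the quiz question
b0 <- -0.66
b1 <- 4.65

# The visualisation shows the predicted value of y when x = 1
x <- 1

predicted_y <- b0 + b1 * x
predicted_y         # 3.99
round(predicted_y)  # 4
```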

Flatlining

Before we move on, let’s take a look at a special scenario. Move or reset the sliders so that the line is perfectly horizontal, then answer the following questions.

When the linear model is perfectly horizontal, what is the value of b1?

0


When b1 is 0, what is the relationship between x and y?

No relationship

This is an important point, so make sure you think it through. Remember that the slope of the line captures the change in y for each unit change in x. If the slope is 0, that means that no matter how much x changes, y doesn’t change at all. So, in other words, there is no relationship between x and y.

Why is this so important? You might recognise “no relationship between x and y” as a way we often state the null hypothesis. We’ll talk more next week about hypothesis testing for linear models, but right now, the key is to realise that a b1 of 0 means no relationship.
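To see the flat line in numbers, here is a small sketch. The intercept of 2 and the x values are arbitrary choices for illustration; any other values would show the same pattern.

```r
# Hypothetical intercept; the slope is set to 0 for a perfectly horizontal line
b0 <- 2
b1 <- 0

# A spread of very different x values
x <- c(-10, 0, 1, 5, 100)

# With b1 = 0, the b1 * x term vanishes, so every prediction is just b0
predicted_y <- b0 + b1 * x
predicted_y  # 2 2 2 2 2
```

However much x changes, the predicted y stays at b0, which is exactly what “no relationship” looks like.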

Now that we have a sense of how the b-values in the linear model work, let’s move on to looking at some data. Remember that you can always come back to the visualisation if you’d like to practice with the equation more.

lm() in R

Let’s have another look at the gensex data we’ve been working with all term. We’ve only had a look at a few variables thus far, but there’s lots more interesting info here!

Creating the Model

Task 4 (punk!)

Read in the gensex data, using the link below.

Link: https://and.netlify.app/datasets/gensex_2022.csv

This time we’ll use some different variables. I’ll be using romantic_freq and sexual_freq, which are ratings of how frequently the participants experience romantic and sexual attraction respectively. A higher score indicates higher frequency.

Task 5 (Prog-rocK)

What do you think the relationship between your two variables will be? Take some notes in your RMarkdown document. Consider:

Task 6

Run the linear model with the following steps.

Task 6.1 (punk!)

Write the formula for lm(), using frequency of romantic attraction as the predictor and frequency of sexual attraction as the outcome.

Task 6.2 (punk!)

Use the lm() function to create a linear model with these variables and save it as freq_lm.
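If the general shape of lm() is still fuzzy, the sketch below shows the pattern on made-up data rather than on gensex. Everything here (toy_data, predictor_var, outcome_var, toy_lm, and the numbers used to simulate the data) is hypothetical; the point is only the outcome ~ predictor structure of the formula and the data argument.

```r
# Hypothetical toy data, simulated purely for illustration
set.seed(1)
toy_data <- data.frame(predictor_var = 1:20)
toy_data$outcome_var <- 2 + 0.5 * toy_data$predictor_var + rnorm(20, sd = 0.3)

# The formula reads "outcome predicted by predictor": outcome ~ predictor
toy_lm <- lm(outcome_var ~ predictor_var, data = toy_data)

# Calling the object prints the formula and the estimated b-values
toy_lm

# coef() extracts the b-values as a named vector:
# "(Intercept)" is b0, and the value named after the predictor is b1
coef(toy_lm)
```

For the task itself, the same pattern applies with the gensex variables in place of the toy ones.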

Interpreting the Model

Call the freq_lm object and use the output to answer the following questions. Round all your answers to 2 decimal places.

Hint

If you get stuck on any of these questions, see the solution to the task after the quiz.

What is the value of b0 in this analysis?

3.38


What is the value of b1 in this analysis?

0.47


What is the direction of the relationship between these variables?

Positive

Task 7 (Prog-rocK)

In your RMarkdown document, write the equation of the linear model for this analysis.
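As one possible way of writing it, assuming the rounded b-values from the quiz above and spelling out the variable names in place of y and x (a notational choice, not necessarily the one used in the lecture), the equation of a line becomes:

\(\text{sexual\_freq}_{i} = 3.38 + 0.47 \times \text{romantic\_freq}_{i}\)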

Task 8

In your RMarkdown document, write down your interpretation of the b1 value you obtained. What does it tell you?

Task 9

Move the sliders in the interactive visualisation above to set the visualisation to the same values from your model. Is this the strength and direction of the relationship you predicted? Write down your thoughts in your RMarkdown.

Task 10 (Prog-rocK)

Create a scatterplot of the same variables, with romantic_freq on the x axis and sexual_freq on the y axis, and tweak the formatting to make it look professional.

Hint

If you don’t remember how to create a scatterplot, have a look back at Skills Lab 2.

Task 10.1 (jazz...)

Optionally, figure out how to add a line of best fit to this plot. Does it look like you expected based on the visualisation?

Hint

Do some Googling and figure out how to add a line of best fit to your plot. There are tons of resources on the Internet, and lots of ways you can accomplish this task.

Alternatively, check out the R Code panels for the plots in the lecture.
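As one of those many ways, here is a sketch using geom_smooth() from ggplot2 (loaded as part of the tidyverse). It uses simulated toy data, so toy_data, x_var, y_var, and the axis labels are all hypothetical; this is not necessarily the approach taken in the lecture.

```r
library(ggplot2)

# Hypothetical toy data, simulated purely for illustration
set.seed(2)
toy_data <- data.frame(x_var = runif(50, min = 1, max = 9))
toy_data$y_var <- 3 + 0.5 * toy_data$x_var + rnorm(50)

best_fit_plot <- ggplot(toy_data, aes(x = x_var, y = y_var)) +
  geom_point(alpha = 0.6) +
  # method = "lm" draws the line from a linear model of y on x;
  # se = FALSE hides the shaded confidence band
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(x = "Predictor (hypothetical)", y = "Outcome (hypothetical)") +
  theme_minimal()

best_fit_plot
```

The line drawn this way comes from the same kind of model as lm(), so its intercept and slope should match the b-values from a model fitted to the same variables.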

Task 11

Imagine you have a friend who would rate their frequency of romantic attraction as a 2. What does your model predict their frequency of sexual attraction would be?

4.32
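To see where this number comes from, the sketch below plugs the rating into the equation of the line, using the b-values from the quiz above rounded to 2 decimal places. The object names are made up for illustration.

```r
# b-values from the quiz above, rounded to 2 decimal places
b0 <- 3.38
b1 <- 0.47

# The friend's rating of their frequency of romantic attraction
romantic_rating <- 2

# Predicted frequency of sexual attraction: b0 + b1 * x
predicted_sexual_freq <- b0 + b1 * romantic_rating
predicted_sexual_freq  # 4.32
```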


Task 12 (Prog-rocK)

Overall, what have we discovered about the relationship between the frequency of sexual and romantic attraction? What don’t we know yet? Write down your thoughts in your RMarkdown document.

Hint

Really take the time to think about this! What does the positive value of b1 tell you? How can you interpret the plot? What does this tell you about attraction?

Recap

Well done on all of your hard work! Make sure you work on these ideas and get them down clearly; they will be very important for the rest of the module. You should now be able to do the following:

 

Good job!

That’s all for today. See you soon!