Open RStudio and click on File > New File > R Script.

As we go through each step, you can copy and paste the code from the text boxes directly into your script. To run the code, highlight the lines you want to run and click on the Run button on the top right of the text editor (or press Ctrl + Enter on the keyboard).

To install the packages you need for the analysis, run this code (you only need to do this once):

install.packages("ggplot2")

Next, load the packages into your R environment by running this code (you need to do this every time you restart R):

library(ggplot2)

Follow these four steps for each dataset:

1. In RStudio, go to File > Import Dataset > From Text (base).
2. Choose the data file you have downloaded (income.data or heart.data), and an Import Dataset window pops up.
3. In the Data Frame window, you should see an X (index) column and columns listing the data for each of the variables (income and happiness or biking, smoking, and heart.disease).
4. Click on the Import button and the file should appear in your Environment tab on the upper right side of the RStudio screen.

After you’ve loaded the data, check that it has been read in correctly using summary().

Simple regression

summary(income.data)

Because both our variables are quantitative, when we run this function we see a table in our console with a numeric summary of the data. This tells us the minimum, median, mean, and maximum values of the independent variable (income) and dependent variable (happiness).

Multiple regression

Again, because the variables are quantitative, running summary() on heart.data produces a numeric summary of the data for the independent variables (smoking and biking) and the dependent variable (heart disease).

Step 2: Make sure your data meet the assumptions

We can use R to check that our data meet the four main assumptions for linear regression.
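
The import and summary() check described in this tutorial can also be done from a script instead of the RStudio menus. The sketch below is only an illustration: it assumes the downloaded file is comma-separated, and because the real file is not available here it builds a small simulated stand-in (the values, the name demo.data, and the temporary file path are all made up for this example).

```r
# Simulated stand-in for the income dataset.
# These values are invented for illustration; they are NOT the tutorial's data.
set.seed(42)
n <- 100
income <- runif(n, min = 1.5, max = 7.5)
happiness <- 0.2 + 0.7 * income + rnorm(n, sd = 0.7)
demo.data <- data.frame(income = income, happiness = happiness)

# Write the stand-in to a temporary CSV file. write.csv() also writes the
# row names as an unnamed first column, which read.csv() will label "X".
csv.path <- tempfile(fileext = ".csv")
write.csv(demo.data, csv.path)

# Script equivalent of File > Import Dataset > From Text (base).
# With the real file, replace csv.path with the path to your download.
income.data <- read.csv(csv.path, header = TRUE)

# Check that the data were read in correctly: expect an X (index) column
# plus one column per variable, each with a numeric summary.
print(summary(income.data))

# Confirm that every column was read as a quantitative (numeric) variable.
print(sapply(income.data, is.numeric))
```

If a column shows up as character rather than numeric in this check, the file was probably read with the wrong separator or header setting, which is worth fixing before moving on to the assumption checks.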