IST 387 Fall 2020


[Front]


True or False: You need to "library" a package each time you start a new R session.
[Back]


True

IST 387 Fall 2020 - Details

Questions:

20 questions
How do you declare a variable in R?
Assign a value to a name with the <- operator:
my_value <- 5
my_str <- "Hello world"
my_vector <- c(5, 65, 23, 1)
names <- c("Ann", "Bob", "Clyde", "Lu")
my_df <- data.frame(names, my_vector)
my_df$names <- as.character(my_df$names)
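The as.character() step in the answer above matters mainly on older R versions, where data.frame() converted strings to factors by default; since R 4.0.0 strings stay as character. A minimal sketch that makes the behaviour explicit on any version by passing stringsAsFactors = FALSE (the variable names are the ones from the card):

```r
# Build the same vectors as in the flashcard answer.
names <- c("Ann", "Bob", "Clyde", "Lu")
my_vector <- c(5, 65, 23, 1)

# stringsAsFactors = FALSE keeps the names column as character
# regardless of the R version in use.
my_df <- data.frame(names, my_vector, stringsAsFactors = FALSE)

# Inspect the result: class() reports the type of one column,
# str() summarises the whole data frame.
names_class <- class(my_df$names)
print(names_class)
str(my_df)
```

With this argument set, the separate as.character() conversion is no longer needed.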
What is a factor variable and how can you create one in R?
A factor variable is a variable that can take on a limited number of discrete values, i.e. a categorical variable.
mtcars$gear_factor <- as.factor(mtcars$gear)
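To see what the conversion produces, it can help to inspect the factor's levels. A small sketch, working on a copy of mtcars (the name cars_copy is made up here) so the built-in dataset is left untouched:

```r
# Work on a copy so the built-in mtcars dataset is not modified.
cars_copy <- mtcars
cars_copy$gear_factor <- as.factor(cars_copy$gear)

# levels() lists the distinct categories the factor can take.
gear_levels <- levels(cars_copy$gear_factor)
print(gear_levels)

# table() counts how many rows fall into each category.
print(table(cars_copy$gear_factor))
```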
What is the difference between a histogram and bar chart?
Histograms are used for continuous variables; bar charts are used for discrete/categorical variables. With ggplot2 loaded:
library(ggplot2)
ggplot(data = mtcars, aes(x = mpg)) + geom_histogram()
ggplot(data = mtcars, aes(x = gear)) + geom_bar()
How do you create a boxplot in R?
boxplot(mpg ~ am, data = mtcars)
What are scatterplots useful for and how can you create one in R?
Scatterplots show the relationship between two numeric variables.
Method 1, using base graphics: plot(airquality$Ozone, airquality$Wind)
Method 2, using ggplot2: ggplot(airquality, aes(x = Ozone, y = Wind)) + geom_point()
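One detail worth knowing about this particular example: the Ozone column in airquality contains missing values. Base plot() leaves those points out silently, while geom_point() drops them with a warning. A hedged sketch using only base R that counts the missing readings and puts a number on the relationship the scatterplot shows (the axis labels and title are illustrative choices):

```r
# How many Ozone readings are missing?
n_missing <- sum(is.na(airquality$Ozone))
print(n_missing)

# Correlation computed only on rows where both values are present.
r <- cor(airquality$Ozone, airquality$Wind, use = "complete.obs")
print(r)

# Base-graphics scatterplot with labelled axes.
plot(airquality$Ozone, airquality$Wind,
     xlab = "Ozone", ylab = "Wind",
     main = "Ozone vs. Wind (airquality)")
```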
What packages can be used for data mining in R?
ggplot2: visualization
tm: text mining
lm() (a function in the built-in stats package, not a separate package): linear regression
arules: association rules mining
caret: machine learning
Name an R package which can be used for data imputation.
imputeTS (for time series data); imputeR
How do you install a package in R?
install.packages("name_of_package")
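Installing happens once per machine, while loading with library() is needed in every new session. A common pattern is to install only when the package is missing and then load it. A minimal sketch, using the "tools" package that ships with R as a stand-in so the example runs without a network connection; replace it with the package you actually need:

```r
# Name of the package to load. "tools" is only a placeholder that is
# always available; swap in e.g. "ggplot2" in real use.
pkg <- "tools"

# Install only if the package is not already available.
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# Load the package for this session. character.only = TRUE tells
# library() that pkg is a variable holding the name.
library(pkg, character.only = TRUE)
```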
What is R?
R is an open-source language for statistical computing and data science. It can be used interactively from the command line or through R scripts, either stand-alone (base R) or inside an integrated development environment (IDE) such as RStudio. RStudio is also available in the cloud as RStudio Cloud.
What is the basic syntax in R?
<- is the "assignment operator," used to create new variables and assign values to them (technically, = can be used for assignment too).

# at the beginning of a line of code marks that line as a comment (aka "commenting it out").

name_of_function() - you can identify functions in R by the parentheses following them. The function arguments, i.e. what you want to apply the function to, go inside the parentheses. For example, mean(df$likelihood_to_recommend) applies the mean() function to all numbers in a dataframe column and returns a single value, the average of those numbers.

$ - this operator is used for "getting inside" a dataframe. E.g. df$likelihood_to_recommend means we want to access the likelihood_to_recommend column in the df dataframe. df$text means we want to access another column in that dataframe, the column called "text."

new_df <- df[df$likelihood_to_recommend == 8, ] - this is a typical way of "subsetting" a dataframe called df. Inside the square brackets, the comma separates the rows we want (before the comma) from the columns we want (after the comma). Here, new_df contains all of df's columns, because nothing follows the comma, but only the rows for which the likelihood_to_recommend column has a value of exactly 8. You can modify this condition: e.g. if you change == to >, only rows with likelihood_to_recommend values greater than 8 are included in the new dataframe.
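The pieces above can be tried together on a small made-up dataframe. The column names follow the card; the values are invented purely for illustration:

```r
# A tiny example dataframe (values are made up for illustration).
df <- data.frame(
  likelihood_to_recommend = c(8, 10, 6, 8, 9),
  text = c("ok", "great", "meh", "fine", "good"),
  stringsAsFactors = FALSE
)

# $ accesses a column; mean() returns a single value.
avg <- mean(df$likelihood_to_recommend)

# Rows where the score is exactly 8, all columns kept.
new_df <- df[df$likelihood_to_recommend == 8, ]

# Same pattern with > instead of ==: scores greater than 8.
high_df <- df[df$likelihood_to_recommend > 8, ]

print(avg)
print(new_df)
print(high_df)
```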
What are some of the advantages and disadvantages of R?
+ Open-source
+ Runs on all major platforms
+ Large and active R user community = ample online resources
+ Developed by statisticians specifically for data analysis
+ One of the top programming languages for data science
- Its performance depends on your machine's memory resources (in particular, your RAM)
- Because of that, it may be slower than Python for data-intensive operations
- Some of us experienced difficulties loading certain packages: package compatibility issues and conflicts between different packages (e.g. tidyverse and ggplot2) are a drawback
What are some common data types in R?
Logical (TRUE or FALSE)
Numeric (e.g. 5, 0.643, 1.e+9)
Character (e.g. "a", "abc", "Hello", "This is my code")
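The type of any value can be checked with class(). A short sketch using the example values from the card:

```r
# class() reports the data type of a value.
logical_class   <- class(TRUE)
numeric_class   <- class(0.643)
big_num_class   <- class(1.e+9)
character_class <- class("Hello")

print(c(logical_class, numeric_class, big_num_class, character_class))
```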
What are some common data objects in R?
Single data values (e.g. 6, 23455, "What is this?", y)
Vectors
Data frames
Matrices
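As a rough illustration, each of these objects can be created in one line. The object names and values below are invented for the example:

```r
# A single value (in R this is a vector of length 1).
single_value <- 6

# A vector: several values of the same type.
vec <- c(5, 65, 23, 1)

# A matrix: values of one type arranged in rows and columns.
mat <- matrix(1:6, nrow = 2)

# A data frame: columns may have different types.
dfm <- data.frame(id = 1:2, label = c("a", "b"))

print(length(vec))
print(dim(mat))
str(dfm)
```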
Why is R useful for data science?
R was created specifically for statistical analysis, which makes it well suited to data science work: it offers strong functionality for data cleaning, model building and evaluation, and data visualization. There are also R packages geared specifically towards data science, such as caret.
How do you get the name of the current working directory in R?
The working directory is the folder on your computer where R looks for a file whenever you import data using only a file name. For example, you can set your Downloads folder as your working directory, and then you only need to supply the name of the file you want to import instead of the full path to that file (read_csv() comes from the readr package):
df <- read_csv("myFile.csv")
instead of:
df <- read_csv("C:\\User\\Downloads\\myFile.csv")
To see what your current working directory is, type:
getwd()
And to change it:
setwd("path\\to\\new\\working\\directory")
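The effect of the working directory can be tried safely in a temporary folder. In this sketch, base R's read.csv() stands in for readr's read_csv() so that no extra package is needed, and the file is created by the example itself:

```r
# Remember the current working directory so it can be restored.
old_wd <- getwd()

# Switch to a temporary folder and create a small file there.
setwd(tempdir())
write.csv(data.frame(x = 1:3), "myFile.csv", row.names = FALSE)

# Because the file sits in the working directory, its name alone is enough.
df <- read.csv("myFile.csv")
print(df)

# Restore the original working directory.
setwd(old_wd)
```

Restoring the original directory at the end keeps the rest of the session unaffected.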