Add sub-chapters in chp 1

Vinit-Sehgal · Jan 22, 2024 · 644f474
1 parent c669525
commit 644f474
Showing 67 changed files with 1,018 additions and 2,432 deletions.
diff --git a/_quarto.yml b/_quarto.yml
@@ -18,9 +18,6 @@ book:
   chapters:
     - index.qmd
     - ch1.qmd
-    - ch2.qmd
-    - ch3.qmd
-    - ch4.qmd
 
 format:
   html:

diff --git a/awesome_plot.png b/awesome_plot.png
diff --git a/ch1.qmd b/ch1.qmd
@@ -1,141 +1,28 @@
 ---
 title-block-banner: true
 ---
+# Basics of R
 
-# Operators and data types
 
-## Basic operators
+```{r child="ch1_a.qmd", message = FALSE, warning = FALSE}
 
-In this section, we will learn about some basic R operators that are used to perform operations on variables. Some most commonly used operators are shown in the table below.
 
-<center>![](images/Basic_operators.png){width="70%"}</center>
-
-> R follows the conventional order (sequence) to solve mathematical operations, abbreviated as BODMAS: Brackets, Orders (exponents), Division, Multiplication, Addition, and Subtraction
-
-```{r Chapter 1, message = FALSE, warning = FALSE}
-2+4+7 # Sum
-4-5   # Subtraction
-2*3   # Multiplication
-1/2   # Division
-
-# Order of operation
-1/2*3+4-5
-1/2*(3+4-5)
-1/(2*(3+4-5))
-1/(2*3+4-5) 
-# Notice how output changes with the placement of operators
-
-# Other operators:
-2^3
-log(10)
-sqrt(4)
-pi
-
-# Clear the Environment
-rm(list=ls()) # rm is for remove,ls is short for list. The empty parenthesis i.e. () signifies all content. 
 ```
 
-## Basic data operations
-
-In this section, we will create some vector data and apply built-in operations to examine the properties of a dataset.
-
-```{r Basic data operation, message = FALSE, warning = FALSE}
-# The "is equal to" or "assignment operator in R is "<-" or "=" 
-
-# Generate sample data. Remember "c" comes from for "concatenate". 
-data<-c(1,4,2,3,9)    # Try data = c(1,4,2,3,9). Is there any difference in data in both cases?
-
-# rbind combines data by rows, and hence "r"bind
-# cbind combines data by columns, and hence "c"bind
-
-# Checking the properties of a dataset. Note: the na.rm argument ignores NA values in the dataset.
-data=rbind(1,4,2,3,9) 
-dim(data)           # [5,1]: 5 rows, 1 column
-data[2,1]           # Show the value in row 2, column 1
-data[c(2:5),1]      # Show a range of values in column 1
-mean(data, na.rm=T) # Mean
-max(data)           # Maximum
-min(data)           # Minimum
-sd(data)            # Standard deviation
-var(data)           # Variance
-
-summary(data) 
-str(data)        # Prints structure of data
-head(data)       # Returns the 1st 6 items in the object
-head(data, 2)    # Print first 2
-tail(data, 2)    # Print last 2
 
-# Do the same, but with "c()" instead of "rbind"
-data=c(1,4,2,3,9) 
-dim(data)        # Note: dim is NULL
-length(data)     # Length of a dataset is the number of variables (columns)
+```{r child="ch1_b.qmd", message = FALSE, warning = FALSE}
 
-data[2]          # This should give you 4 
-
-# Other operators work in the same way
-mean(data)       # Mean
-max(data)        # Maximum
-min(data)        # Minimum
-sd(data)         # Standard deviation
-var(data)        # Variance
-
-# Text data
-data=c("LSU","SPESS","AgCenter","Tigers") 
-data             # View
-data[1]
-
-# Mixed data
-data=c(1,"LSU",10,"AgCenter") # All data is treated as text if one value is text
-data[3]                       # Note how output is in quotes i.e. "10"
 
 ```
 
-> *For help with a function in R, just type ? followed by the function to display information in the help menu. Try pasting `?sd` in the console.*
-
-## Data types
-
-In R, data is stored as an "array", which can be 1-dimensional or 2-dimensional. A 1-D array is called a "vector" and a 2-D array is a "matrix". A table in R is called a "data frame" and a "list" is a container to hold a variety of data types. In this section, we will learn how to create matrices, lists and data frames in R.
-
-<center>![](images/list_visual.png){width="80%"}</center>
 
-```{r Data types, message = FALSE, warning = FALSE}
-# Lets make a random matrix
-test_mat = matrix( c(2, 4, 3, 1, 5, 7), # The data elements 
-  nrow=2,         # Number of rows 
-  ncol=3,         # Number of columns 
-  byrow = TRUE)   # Fill matrix by rows 
+```{r child="ch1_c.qmd", message = FALSE, warning = FALSE}
 
-test_mat = matrix( c(2, 4, 3, 1, 5, 7),nrow=2,ncol=3,byrow = TRUE) # Same result 
-test_mat
 
+```
 
-test_mat[,2]      # Display all rows, and second column
-test_mat[2,]      # Display second row, all columns
-
-# Types of datasets
-out = as.matrix(test_mat)
-out               # This is a matrix
-out = as.array(test_mat)
-out               # This is also a matrix
-out = as.vector(test_mat)
-out               # This is just a vector
-
-# Data frame and list
-data1=runif(50,20,30) # Create 50 random numbers between 20 and 30  
-data2=runif(50,0,10)  # Create 50 random numbers between 0 and 10  
-
-# Lists
-out = list()        # Create and empty list
-out[[1]] = data1    # Notice the brackets "[[ ]]" instead of "[ ]"
-out[[2]] = data2
-out[[1]]          # Contains data1 at this location
 
-# Data frame
-out=data.frame(x=data1, y=data2)
+```{r child="ch1_d.qmd", message = FALSE, warning = FALSE}
 
-# Let's see how it looks!
-plot(out$x, out$y)
-plot(out[,1])
-```
 
-> For a data frame, the dollar "\$" sign invokes the variable selection. Imagine how one would receive merchandise in a store if you give \$ to the cashier. Data frame will list out the variable names for you of you when you show it some \$.
+```
diff --git a/ch1_a.qmd b/ch1_a.qmd
@@ -0,0 +1,141 @@
+---
+title-block-banner: true
+---
+
+## Operators and data types
+
+### Basic operators
+
+In this section, we will learn about some basic R operators that are used to perform operations on variables. Some of the most commonly used operators are shown in the table below.
+
+<center>![](images/Basic_operators.png){width="70%"}</center>
+
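+> R follows the conventional order (sequence) to solve mathematical operations, abbreviated as BODMAS: Brackets, Orders (exponents), Division, Multiplication, Addition, and Subtraction. Division and multiplication share one precedence level and are evaluated left to right, as are addition and subtraction.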
+
+```{r Chapter 1, message = FALSE, warning = FALSE}
+2+4+7 # Sum
+4-5   # Subtraction
+2*3   # Multiplication
+1/2   # Division
+
+# Order of operation
+1/2*3+4-5
+1/2*(3+4-5)
+1/(2*(3+4-5))
+1/(2*3+4-5) 
+# Notice how output changes with the placement of operators
+
+# Other operators:
+2^3
+log(10)
+sqrt(4)
+pi
+
+# Clear the Environment
+rm(list=ls()) # rm removes objects; ls() lists the objects in the current environment, so this call clears them.
+```
+
+### Basic data operations
+
+In this section, we will create some vector data and apply built-in operations to examine the properties of a dataset.
+
+```{r Basic data operation, message = FALSE, warning = FALSE}
+# The assignment operator in R is "<-" or "="
+
+# Generate sample data. Remember that "c" is short for "combine" (often read as "concatenate").
+data<-c(1,4,2,3,9)    # Try data = c(1,4,2,3,9). Is there any difference in data in both cases?
+
+# rbind combines data by rows, and hence "r"bind
+# cbind combines data by columns, and hence "c"bind
+
+# Checking the properties of a dataset. Note: the na.rm argument ignores NA values in the dataset.
+data=rbind(1,4,2,3,9) 
+dim(data)           # [5,1]: 5 rows, 1 column
+data[2,1]           # Show the value in row 2, column 1
+data[c(2:5),1]      # Show a range of values in column 1
+mean(data, na.rm=TRUE) # Mean
+max(data)           # Maximum
+min(data)           # Minimum
+sd(data)            # Standard deviation
+var(data)           # Variance
+
+summary(data) 
+str(data)        # Prints structure of data
+head(data)       # Returns the 1st 6 items in the object
+head(data, 2)    # Print first 2
+tail(data, 2)    # Print last 2
+
+# Do the same, but with "c()" instead of "rbind"
+data=c(1,4,2,3,9) 
+dim(data)        # Note: dim is NULL
+length(data)     # Length of a vector is its number of elements (5 here)
+
+data[2]          # This should give you 4 
+
+# Other operators work in the same way
+mean(data)       # Mean
+max(data)        # Maximum
+min(data)        # Minimum
+sd(data)         # Standard deviation
+var(data)        # Variance
+
+# Text data
+data=c("LSU","SPESS","AgCenter","Tigers") 
+data             # View
+data[1]
+
+# Mixed data
+data=c(1,"LSU",10,"AgCenter") # All data is treated as text if one value is text
+data[3]                       # Note how output is in quotes i.e. "10"
+
+```
+
+> *For help with a function in R, just type ? followed by the function to display information in the help menu. Try pasting `?sd` in the console.*
+
+### Data types
+
+In R, data is stored as an "array", which can be 1-dimensional or 2-dimensional. A 1-D array is called a "vector" and a 2-D array is a "matrix". A table in R is called a "data frame" and a "list" is a container to hold a variety of data types. In this section, we will learn how to create matrices, lists and data frames in R.
+
+<center>![](images/list_visual.png){width="80%"}</center>
+
+```{r Data types, message = FALSE, warning = FALSE}
+# Let's make a small matrix
+test_mat = matrix( c(2, 4, 3, 1, 5, 7), # The data elements 
+  nrow=2,         # Number of rows 
+  ncol=3,         # Number of columns 
+  byrow = TRUE)   # Fill matrix by rows 
+
+test_mat = matrix( c(2, 4, 3, 1, 5, 7),nrow=2,ncol=3,byrow = TRUE) # Same result 
+test_mat
+
+
+test_mat[,2]      # Display all rows, and second column
+test_mat[2,]      # Display second row, all columns
+
+# Types of datasets
+out = as.matrix(test_mat)
+out               # This is a matrix
+out = as.array(test_mat)
+out               # This is also a matrix
+out = as.vector(test_mat)
+out               # This is just a vector
+
+# Data frame and list
+data1=runif(50,20,30) # Create 50 random numbers between 20 and 30  
+data2=runif(50,0,10)  # Create 50 random numbers between 0 and 10  
+
+# Lists
+out = list()        # Create an empty list
+out[[1]] = data1    # Notice the brackets "[[ ]]" instead of "[ ]"
+out[[2]] = data2
+out[[1]]          # Contains data1 at this location
+
+# Data frame
+out=data.frame(x=data1, y=data2)
+
+# Let's see how it looks!
+plot(out$x, out$y)
+plot(out[,1])
+```
+
+> For a data frame, the dollar sign "\$" selects a variable (column). Imagine how you would receive merchandise in a store when you give \$ to the cashier: the data frame lists its variable names for you when you show it some \$.
diff --git a/ch2.qmd → ch1_b.qmd b/ch2.qmd → ch1_b.qmd
@@ -2,11 +2,11 @@
 title-block-banner: true
 ---
 
-# Plotting with base R
+## Plotting with base R
 
 If you need to quickly visualize your data, base `R` has some functions that will help you do this in a pinch. In this section we'll look at some basics of visualizing univariate and multivariate data.
 
-## Overview
+### Overview
 
 ```{r data frame and list, message = FALSE, warning = FALSE}
 # Create 50 random numbers between 0 and 100  
@@ -26,7 +26,7 @@ plot(density(data))  # Plot with density distribution
 
 ```
 
-## Plotting univariate data
+### Plotting univariate data
 
 Let's dig deeper into the plot function. Here, we will look at how to adjust the colors, shapes, and sizes for markers, axis labels and titles, and the plot title.
 
@@ -63,7 +63,7 @@ hist(data,col="red",
 
 ```
 
-## Plotting multivariate data
+### Plotting multivariate data
 
 Here, we introduce you to data frames: equivalent of tables in `R`. A data frame is a table with a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column.
 
@@ -88,7 +88,7 @@ legend("bottomright", legend = 1:4, col=1:4, pch=1)        # Add legend at the b
 
 ```
 
-## Time series data
+### Time series data
 
 Working with time series data can be tricky at first, but here's a quick look at how to quickly generate a time series using the as.Date function.
 
@@ -102,7 +102,7 @@ plot(df,type="o")
 
 ```
 
-## Combining plots
+### Combining plots
 
 You can build plots that contain subplots. Using base R, we start by calling the "par" function and then plot as we saw before.
 
@@ -149,7 +149,7 @@ hist(data,col="red",
 )
 ```
 
-## Saving figures to disk
+### Saving figures to disk
 
 Plots can be saved as image files or a PDF. This is done by specifying the output file type, its size and resolution, then calling the plot.