revise all hw

emse-p4a-gwu · Jan 3, 2025 · 61c02a6
1 parent cea797a
commit 61c02a6

Showing 31 changed files with 1,543 additions and 26 deletions.
diff --git a/figs/grades-1.png b/figs/grades-1.png
diff --git a/fragments/hw.qmd b/fragments/hw.qmd
@@ -30,7 +30,7 @@ url_template <- paste0(
 
 > **Due**: `r due`
 >
-> **Weight**: This assignment is worth 4% of your final grade.
+> **Weight**: This assignment is worth 3.75% of your final grade.
 >
 > **Purpose**: `r params$purpose`
 >

diff --git a/hw/1-getting-started.qmd b/hw/1-getting-started.qmd
@@ -19,4 +19,110 @@ params:
     > - Know the distinctions between how R handles different data types (numbers, strings, & logicals).
 ---
 
-Coming soon!
+```{r child = here::here("fragments", "hw.qmd")}
+```
+
+### Readings
+
+The readings from the last week will serve as a helpful reference as you complete this assignment. You can review them here:
+
+<blockquote class="blockquote">
+
+```{r}
+#| echo: false
+#| message: false
+#| results: asis
+
+htmltools::HTML(readings_current)
+```
+
+</blockquote>
+
+### 1) Class setup [SOLO, 10%]
+
+For this class, you'll need to install some software and register for some tools. You should have already done this, but in case you haven't, go to the course [software](../software.html) page to get set up.
+
+Once you have joined the [class slack]({{< var slack >}}), make a post to the `#welcome` channel introducing yourself - provide your name, year / program, and something interesting about yourself.
+
+### 2) Getting familiar with the course [SOLO, 10%]
+
+Follow [Snoop's advice](https://www.youtube.com/watch?v=Tnlaokj1opA) and read the entire [Course Syllabus](../syllabus.html) (actually read the whole thing). Then review the [schedule](../schedule.html) and make sure to note important upcoming deadlines.
+
+### 3) Staying organized [SOLO, 10%]
+
+Open RStudio and create a new R project called "hw1" (see the [reading](https://p4a.jhelvy.com/getting-started.html#rstudio-projects){target="_blank"} for details on how to do this). Within your project, create a new R _script_ (a ".R" file) and save it as "hw1.R". When you save it, it should show up in the R project folder you just created. Finally, copy the following code to the top of this script and fill out your name, netID, and the names of anyone you worked with on this assignment (your netID is the part of your email address before `"@gwu.edu"`):
+
+```{r eval=FALSE}
+# Name:  Last, First
+# netID: Insert your netID here
+
+# I worked with the following classmates on this assignment:
+# 1) Name: Last, First
+# 2) Name: Last, First
+```
+
+Write your responses to all other questions in this assignment in your R file.
+
+### 4) Objects & Operators: Converting Time [COLLABORATIVE, 20%]
+
+Create objects to store each of the following four values - be sure to use [meaningful variable names](https://p4a.jhelvy.com/getting-started.html#use-meaningful-variable-names){target="_blank"} when creating your objects:
+
+- The number of seconds in a minute
+- The number of minutes in an hour
+- The number of hours in a day
+- The number of days in a typical year (not a leap year)
+
+Now, say you have another object called `time_in_seconds` that contains an integer number of seconds (for example, `time_in_seconds <- 8675309`). Write code to convert the value stored in `time_in_seconds` into the units described below. Your solution may only use arithmetic operators and the objects you created (i.e. you may **not** use any numbers). You may also use the new objects you create in sequential order. For example, you may use the object created in part a) to create the object in part b), and so on.
+
+a) The value of `time_in_seconds` in minutes
+b) The value of `time_in_seconds` in hours
+c) The value of `time_in_seconds` in days
+d) The value of `time_in_seconds` in years
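+
+As a hedged illustration of the same pattern applied to a different quantity (the distance example and all object names below are hypothetical and not part of the assignment), a conversion can be built entirely from named objects and arithmetic operators:
+
+```r
+# Conversion factors stored under meaningful names
+inches_per_foot <- 12
+feet_per_mile   <- 5280
+
+# The quantity to convert
+distance_in_inches <- 126720
+
+# Each step reuses the previous result; no literal numbers appear here
+distance_in_feet  <- distance_in_inches / inches_per_foot
+distance_in_miles <- distance_in_feet / feet_per_mile
+
+distance_in_miles
+```
+
+Storing each factor once means a correction to a factor only has to be made in one place, and every later step picks it up.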
+
+### 5) Logical and relational operators [SOLO, 20%]
+
+Consider the following objects:
+
+```{r}
+w <- FALSE
+x <- TRUE
+y <- FALSE
+z <- TRUE
+```
+
+Write code to answer the following questions:
+
+a) Write a statement with _logical_ operators that compares the objects `x`, `y`, and `z` and returns `TRUE`
+b) Write a statement with _logical_ operators that compares the objects `x`, `y`, and `z` and returns `FALSE`
+c) Fill in _relational_ operators to make the following statement return `TRUE`:
+
+`! (x __ y) & ! (z __ y)`
+
+d) Fill in _relational_ operators to make this statement return `FALSE`:
+
+`! (w __ y) | (z __ y)`
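+
+For reference, here is a short sketch of how the two operator families behave on logical values. The objects `a` and `b` are hypothetical and chosen so they do not overlap with the objects defined above:
+
+```r
+a <- TRUE
+b <- FALSE
+
+# Logical operators combine logical values
+a & b    # AND: TRUE only if both sides are TRUE
+a | b    # OR: TRUE if at least one side is TRUE
+!b       # NOT: flips the value
+
+# Relational operators compare two values and return a logical
+a == b   # equal to
+a != b   # not equal to
+a > b    # TRUE is treated as 1 and FALSE as 0 in comparisons
+```
+
+Note that `!` applies to whatever immediately follows it, so wrapping a comparison in parentheses controls what gets negated.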
+
+### 6) Data types [COLLABORATIVE, 20%]
+
+Consider the following objects:
+
+```{r}
+number    <- typeof('3')
+character <- typeof(7)
+false     <- typeof("FALSE")
+true      <- typeof(TRUE)
+```
+
+Write code to answer the following questions:
+
+a) Write a statement with both _relational_ & _logical_ operators that compares the four objects `number`, `character`, `false`, and `true` and returns `TRUE`.
+b) Write a statement with both _relational_ & _logical_ operators that compares the four objects `number`, `character`, `false`, and `true` and returns `FALSE`.
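+
+Keep in mind that each of the four objects above stores the _result_ of `typeof()`, which is a character string naming a type. The sketch below shows this on a few unrelated values; the object `kind` is a hypothetical name used only for illustration:
+
+```r
+# typeof() returns a character string naming the underlying type
+typeof(3.14)     # "double"
+typeof(2L)       # "integer"
+typeof("hello")  # "character"
+typeof(FALSE)    # "logical"
+
+# Because the result is a string, it can be compared with relational operators
+kind <- typeof(2L)
+kind == "integer"
+kind != "double"
+```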
+
+### 7) Read and reflect [SOLO, 10%]
+
+```{r child = here::here("fragments", "reflection.qmd")}
+```
+
+### Submit
+
+[Create a zip file](https://p4a.seas.gwu.edu/2024-Spring/faq.html#how-do-i-make-a-zip-file-for-my-homework) of all the files in your R project folder for this assignment, then submit your zip file on the corresponding assignment submission on Blackboard.
diff --git a/hw/10-data-visualization-temp.qmd b/hw/10-data-visualization-temp.qmd
@@ -0,0 +1,109 @@
+---
+title: "Homework 10 - Data Visualization"
+params:
+  number: 10
+  purpose: |
+    The purposes of this assignment are:
+    >
+    > - To practice exploring data frames in R using the **dplyr** package
+    > - To practice generating charts using the **ggplot2** package
+---
+
+```{r child = here::here("fragments", "hw.qmd")}
+```
+
+### Readings
+
+The readings from the last week will serve as a helpful reference as you complete this assignment. You can review them here:
+
+<blockquote class="blockquote">
+
+```{r}
+#| echo: false
+#| message: false
+#| results: asis
+
+htmltools::HTML(readings_current)
+```
+
+</blockquote>
+
+### **Using AI tools**
+
+> On this assignment, you are encouraged to use ChatGPT and other AI tools (e.g. GitHub Copilot). But don't just blindly copy-paste code. The code provided by these tools is not perfect, and you will likely need to modify it to get the correct solution. If you do use an AI tool, you must include the prompt(s) you used (in a comment with `#`) to receive full credit. If you had to change anything in your prompt to get better results, write that down too in a comment in your code. Learn to use tools like ChatGPT as a learning assistant - a tool to help you accomplish the task - rather than just a solutions manual. One way of using it makes you a better and more efficient coder; the other robs you of that.
+
+### 1) Staying organized [5%]
+
+Download and use [this template](templates/hw10.zip) for your assignment. Inside the "hw10" folder, open and edit the R script called `hw10.R` and fill out your name, GW netID, and the names of anyone you worked with on this assignment.
+
+### 2) Choose and load some data [5%]
+
+For this assignment, you will choose a dataset and create **three** summary visualizations. To keep things manageable, choose one of the datasets listed below, grouped by the package that provides it. To load any of these data frames, all you need to do is install and load the package.
+
+**dplyr**:
+
+- `storms`
+- `starwars`
+
+**ggplot2**:
+
+- `diamonds`
+- `economics`
+- `midwest`
+- `mpg`
+- `msleep`
+- `txhousing`
+
+**dslabs**:
+
+- `gapminder`
+- `movielens`
+- `murders`
+- `stars`
+
+### 3) Inspect your data [10%]
+
+Once you've chosen a data set, open your `hw10.R` file and begin exploring the data (be sure to load the package that contains the dataset at the top of your file). Write some code in code chunks to preview and summarize the data frame using some of the methods we've used in class. You should be able to quickly get an understanding of what variables are included and their nature. Consider the following questions in your exploration (you don't have to write out answers to these questions - just write code to help you answer them by previewing the data in different ways):
+
+- What is the total size of the data frame?
+- What type of data is each variable (numeric, character, logical, date)?
+- Do any variables have missing values? Why might that be?
+- What are the "boundaries" of each period of observation:
+    - For numeric variables, what are the min and max values?
+    - For character variables, what are the unique values in the variable?
+    - For date variables, what time period do the observations in these data frames span?
+
+**Do not brush this step off** - the more thoroughly you inspect your dataset, the easier (and better) your data exploration will be. This will be absolutely critical for making your charts. Make sure you take the time to develop an understanding of the variables in your dataset, as it is nearly impossible to imagine what different charts might be worth creating otherwise.
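+
+As a rough sketch of what this inspection can look like, the code below previews a data frame using base R functions only. It uses `airquality` from the built-in **datasets** package, which is an assumption made for illustration and is not one of the datasets listed for this assignment:
+
+```r
+# Hypothetical example data: airquality ships with base R (datasets package)
+df <- datasets::airquality
+
+dim(df)              # number of rows and columns
+str(df)              # type of each variable
+colSums(is.na(df))   # count of missing values per variable
+range(df$Temp)       # min and max of a numeric variable
+unique(df$Month)     # distinct values of a variable
+summary(df)          # quick overview of every column
+```
+
+Functions such as `dplyr::glimpse()` offer a similar overview; which ones you use matters less than looking at the data from several angles.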
+
+### 4) Make charts [40%]
+
+Now that you have a basic understanding of the dataset, make some charts to explore the variables in the data and their potential relationships. You may use base R plotting functions or the **ggplot2** package to make your figures, but you must make at least two different types of figures, including:
+
+1. A scatterplot involving at least two variables.
+2. A bar chart involving at least one variable.
+
+You can choose to plot whichever variables you wish, but you must be able to interpret the results of your chart.
+
+### 5) Interpret your charts [15%]
+
+Below the code for each of your charts, write a description and interpretation of your chart in a comment. Make sure you address at least the following questions:
+
+1. Describe what variables you are plotting and why.
+2. Describe the primary relationship / trend / information you hope the reader will gain from your visualization.
+
+### 6) Save your charts [15%]
+
+At the bottom of your `hw10.R` file, write code to save each of your charts in the `plots` folder. Save them as .png files.
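+
+One possible way to do this for a ggplot2 chart is with `ggsave()`. The sketch below is a minimal example under stated assumptions: the chart, the object name `wait_plot`, the file name, and the dimensions are all hypothetical, and the `dir.create()` call is only a safeguard in case the `plots` folder does not exist yet:
+
+```r
+library(ggplot2)
+
+# Hypothetical chart built from a base R dataset
+wait_plot <- ggplot(faithful, aes(x = eruptions, y = waiting)) +
+  geom_point()
+
+# Create the output folder if it does not exist yet
+dir.create("plots", showWarnings = FALSE)
+
+# Save the chart as a .png; width and height are in inches by default
+ggsave(
+  filename = file.path("plots", "wait_plot.png"),
+  plot     = wait_plot,
+  width    = 6,
+  height   = 4
+)
+```
+
+Passing the chart explicitly through the `plot` argument avoids relying on whichever chart happened to be displayed last.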
+
+### 7) Read and reflect [SOLO, 10%]
+
+Read and reflect on the following readings to preview what we will be covering next:
+
+> - Sections 26.3 (Data Frame Functions) & 26.4 (Plot Functions) in Hadley Wickham's book R4DS: [https://r4ds.hadley.nz/functions.html#data-frame-functions](https://r4ds.hadley.nz/functions.html#data-frame-functions)
+> - This blog post on iteration with the {purrr} package: [https://www.rebeccabarter.com/blog/2019-08-19_purrr](https://www.rebeccabarter.com/blog/2019-08-19_purrr)
+
+Afterwards, in a comment (`#`) in your .R file, write a short reflection on what you've learned and any questions or points of confusion you have about what we've covered thus far. This can just be a few sentences related to this assignment, next week's readings, things going on in the world that remind you of something from class, etc. If there's anything that jumped out at you, write it down.
+
+### Submit
+
+{{< var hw_submit >}}
diff --git a/hw/11-programming-with-data-temp.qmd b/hw/11-programming-with-data-temp.qmd
@@ -0,0 +1,122 @@
+---
+title: "Homework 11 - Programming with Data"
+params:
+  number: 11
+  purpose: |
+    The purposes of this assignment are:
+    >
+    > - Be able to compose functions using the 'tidy evaluation' syntax to work with data.
+    > - Be able to parse through lists using `purrr::map()` functions.
+---
+
+```{r child = here::here("fragments", "hw.qmd")}
+```
+
+### Readings
+
+The readings from the last week will serve as a helpful reference as you complete this assignment. You can review them here:
+
+> - Sections 26.3 (Data Frame Functions) & 26.4 (Plot Functions) in Hadley Wickham's book R4DS: [https://r4ds.hadley.nz/functions.html#data-frame-functions](https://r4ds.hadley.nz/functions.html#data-frame-functions)
+> - This blog post on iteration with the {purrr} package: [https://www.rebeccabarter.com/blog/2019-08-19_purrr](https://www.rebeccabarter.com/blog/2019-08-19_purrr)
+
+### **Using AI tools**
+
+> On this assignment, you are encouraged to use ChatGPT and other AI tools (e.g. GitHub Copilot). But don't just blindly copy-paste code. The code provided by these tools is not perfect, and you will likely need to modify it to get the correct solution. If you do use an AI tool, you must include the prompt(s) you used (in a comment with `#`) to receive full credit. If you had to change anything in your prompt to get better results, write that down too in a comment in your code. Learn to use tools like ChatGPT as a learning assistant - a tool to help you accomplish the task - rather than just a solutions manual. One way of using it makes you a better and more efficient coder; the other robs you of that.
+
+### 1) Staying organized [SOLO, 5%]
+
+Download and use [this template](templates/hw11.zip) for your assignment. Inside the "hw11" folder, open and edit the R script called `hw11.R` and fill out your name, GW netID, and the names of anyone you worked with on this assignment.
+
+## Data Frame Functions
+
+> For questions 2 - 5, after writing your function, demonstrate it using a data frame of your choice from the `dslabs` package. For example, for question 2, you could use `var_summary(dslabs::movielens, rating)` (so, obviously, you should use a different example).
+
+### 2) `var_summary(df, var)` [SOLO, 10%]
+
+Write the function `var_summary(df, var)` that takes a data frame (`df`) and a variable (`var`) as inputs, and returns the minimum, maximum, mean, and median value of that variable. The function should remove any `NA` values in `var` when computing these summary statistics. The object returned should be a single data frame / tibble (not a vector).
+
+### 3) `group_summary(df, var, group_var)` [SOLO, 10%]
+
+Write the function `group_summary(df, var, group_var)` that takes a data frame (`df`), a variable (`var`), and a grouping variable (`group_var`) as inputs, and returns a summary table showing the count, mean, and standard deviation of the variable `var` grouped by `group_var`. The function should remove any `NA` values in `var` when computing these summary statistics. The object returned should be a single data frame / tibble (not a vector).
+
+### 4) `var_hist(df, var, bins)` [SOLO, 10%]
+
+Write the function `var_hist(df, var, bins)` that takes a data frame (`df`), a variable (`var`), and the number of bins (`bins`) as inputs, and returns a histogram of that variable with a user-specified number of bins as a ggplot object. The default number of bins should be `30`.
+
+### 5) `scatterplot(df, x, y)` [SOLO, 10%]
+
+Write the function `scatterplot(df, x, y)` that takes a data frame (`df`) and two variables (`x` and `y`) as inputs, and returns a scatter plot of those two variables as a ggplot object.
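+
+Questions 2 - 5 all depend on passing an unquoted column name into a dplyr or ggplot2 call from inside a function. As a hedged sketch of that 'tidy evaluation' pattern on an unrelated task, the hypothetical helper `count_missing()` below uses the embrace operator `{{ }}` and is demonstrated on the built-in `airquality` data frame rather than a `dslabs` one:
+
+```r
+library(dplyr)
+
+# Hypothetical helper: count missing values in one column of a data frame.
+# The {{ }} ("embrace") operator passes the unquoted column name through
+# to the dplyr verb.
+count_missing <- function(df, var) {
+  df |>
+    summarise(
+      n_missing = sum(is.na({{ var }})),
+      n_total   = n()
+    )
+}
+
+count_missing(airquality, Ozone)
+```
+
+The same embrace syntax also works inside `aes()` when the function builds a ggplot object.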
+
+## Iteration across lists with `purrr`
+
+### Problems using `word_list`
+
+For these questions, we will work with the `sentences` vector that comes loaded with the `stringr` package (which is loaded when you load the `tidyverse` package):
+
+```{r}
+library(tidyverse)
+
+head(stringr::sentences)
+```
+
+This vector contains lots of random sentences. When we break those sentences into individual words using `str_split()`, we will get a _list_ back where each item in the list is a vector of words:
+
+```{r}
+word_list <- str_split(stringr::sentences, " ")
+
+word_list[1:3]
+```
+
+We will use this `word_list` for questions 6 - 8.
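+
+As a quick refresher on the `map()` family, here is a minimal sketch on a small hypothetical list (`number_list` is an assumption for illustration, not part of the assignment data):
+
+```r
+library(purrr)
+
+# Hypothetical list where each item is a vector of a different length
+number_list <- list(a = 1:3, b = 1:10, c = integer(0))
+
+# map() always returns a list
+map(number_list, length)
+
+# map_int() returns an integer vector instead
+map_int(number_list, length)
+
+# An anonymous function can compute something custom for each item
+map_dbl(number_list, function(x) sum(x))
+```
+
+The typed variants (`map_int()`, `map_dbl()`, `map_chr()`, `map_lgl()`) are often the simplest route when the goal is a vector rather than a list.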
+
+### 6) [COLLABORATIVE, 5%]
+
+Using `map()`, write R code to obtain a vector of how many words are in each item in `word_list`.
+
+### 7) [COLLABORATIVE, 5%]
+
+Using `map()`, write R code to obtain a vector of the total number of characters in each item in `word_list`.
+
+### 8) [COLLABORATIVE, 5%]
+
+Using `map()`, write R code to obtain a vector of the number of times the word `"the"` appears in each item in `word_list`. Your result should ignore casing, so both `"the"` and `"The"` should count.
+
+### Problems using `sw_people`
+
+As we saw in class, the `sw_people` list contains a list of information about each character in Star Wars. You can load the list from the `repurrrsive` package:
+
+```{r}
+library(repurrrsive)
+```
+
+We will use the `sw_people` and `sw_films` lists for questions 9 & 10.
+
+### 9) [COLLABORATIVE, 10%]
+
+Using `map()` and the `sw_films` list, write R code to obtain a vector of integers that contains the number of characters in each Star Wars film.
+
+### 10) [COLLABORATIVE, 10%]
+
+Using `map_df()`, create a data frame where each row represents a character from `sw_people`. The columns should contain the following: 
+
+- `name`: The character's name, as a character.
+- `height`: The character's height, as a number.
+- `is_male`: Whether the character's gender is `"male"` (`TRUE` or `FALSE`).
+- `film_count`: The number of films they have appeared in, as an integer.
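+
+The general shape of this kind of solution can be sketched on a small hypothetical list. The `pets` records and all column names below are assumptions for illustration only. Note that `map_df()` needs the **dplyr** package installed, and that recent versions of purrr mark it as superseded in favor of `map()` followed by `list_rbind()`:
+
+```r
+library(purrr)
+
+# Hypothetical records: values are stored as strings, much like raw API data
+pets <- list(
+  list(name = "Rex", age = "4", toys = c("ball", "rope")),
+  list(name = "Tom", age = "7", toys = "mouse")
+)
+
+# Build a one-row data frame per item; map_df() binds the rows together
+pets_df <- map_df(pets, function(pet) {
+  data.frame(
+    name      = pet$name,
+    age       = as.numeric(pet$age),
+    is_senior = as.numeric(pet$age) > 5,
+    toy_count = length(pet$toys)
+  )
+})
+
+pets_df
+```
+
+Each item contributes exactly one row, so any field that holds several values (here `toys`) has to be reduced to a single value such as a count.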
+
+### 11) [SOLO, 10%]
+
+For the last problem, write your own homework question that requires the student (you) to use `map()` in the solution. You can use any lists of data you want for your question (e.g. `sw_people`, `sw_films`, `got_chars`, etc.). Then provide the answer to your question. As with all the other questions, if you use an AI tool to help you create and / or solve your question, include the prompt you used and comment on any changes you had to make to improve your outcome.
+
+### 12) Read and reflect [SOLO, 10%]
+
+Read and reflect on the following readings to preview what we will be covering next:
+
+> - Chapter 25 (Web scraping) in Hadley Wickham's book R4DS: [https://r4ds.hadley.nz/webscraping.html](https://r4ds.hadley.nz/webscraping.html)
+> - This post on accessing and collecting data with APIs in R: [https://statisticsglobe.com/api-in-r](https://statisticsglobe.com/api-in-r)
+
+Afterwards, in a comment (`#`) in your .R file, write a short reflection on what you've learned and any questions or points of confusion you have about what we've covered thus far. This can just be a few sentences related to this assignment, next week's readings, things going on in the world that remind you of something from class, etc. If there's anything that jumped out at you, write it down.
+
+### Submit
+
+{{< var hw_submit >}}