solution by Yue for Jun Gi #337

Draft: wants to merge 4 commits into base: main

@martin-raden (Member) commented Dec 9, 2024

Hi @Yue-Z9

I have extracted your solution files for @jungihong10 from his own branch into the new branch yue_for_jungi.

Here is the solutions file for reviewing.

Please pull and switch to this branch before working on your solution!

Best,
Martin

@martin-raden marked this pull request as draft December 9, 2024 10:02

@martin-raden (Member, Author) left a comment

Hi Yue,
here are some technical comments on your solution.
Best,
Martin

library(ggplot2)
library(dplyr)
library(tidyr)
library(readxl)

It might be simpler to load the meta-package with library(tidyverse) instead of the individual packages.
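As a minimal sketch of that suggestion: library(tidyverse) attaches the core packages, which include ggplot2, dplyr and tidyr. One caveat worth knowing is that readxl is installed together with the tidyverse but is not attached by it, so it still needs its own library() call.

```r
# Attaches the core tidyverse packages (ggplot2, dplyr, tidyr, among others).
library(tidyverse)

# readxl ships with the tidyverse installation but is not part of the
# attached core set, so it is loaded explicitly.
library(readxl)
```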

#the first one
#read xlsx file
file<-"Volleyball Passing- USA and TU.xlsx"
data <- read_excel(file,sheet = "TU sort by pass score")

  • It is best to set the working directory to the source file's location before calling read_excel(), so that the file is found.
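One possible way to make a missing file obvious is to build the path explicitly and fail early with a readable message. The helper name input_path() below is hypothetical and not part of the reviewed solution; it is only a sketch of the idea.

```r
# Hypothetical helper: build the path relative to a known directory and
# stop with a clear message if the file is not there.
input_path <- function(file, dir = getwd()) {
  path <- file.path(dir, file)
  if (!file.exists(path)) {
    stop("Input file not found: ", path,
         " (working directory: ", getwd(), ")")
  }
  path
}

# In an interactive RStudio session, the script's own folder could be used as
# the working directory instead (requires the rstudioapi package):
# setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
```

With this in place, the call in the solution could become read_excel(input_path(file), sheet = "TU sort by pass score").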

Pass_Score = as.factor("Pass Score"),
Attempts = as.numeric(Attempts),
Points_Won = as.numeric(Points_Won),
Points_Lost = as.numeric(Points_Lost))

  • It is best to recompute these values from the first two columns. Maybe the author of the file made a mistake when aggregating this table?

names_to = "Outcome",
values_to = "Count")

clean_data <- tidy_data %>%

  • Since you do not use tidy_data again, join the two pipes.

Don't create variables that you need only once (directly afterwards).
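A sketch of what the joined pipeline could look like. The data frame here is a toy stand-in with made-up counts, and the columns passed to pivot_longer() are an assumption, since the quoted snippet does not show them. The group_by() call is left out because no grouped operation follows it in the quoted code.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the sheet; the counts are invented for illustration only.
data <- data.frame(
  "Pass Score" = c("zeros", "ones", "twos", "threes"),
  Points_Won   = c(1, 2, 3, 4),
  Points_Lost  = c(4, 3, 2, 1),
  check.names  = FALSE
)

score_levels <- c("zeros", "ones", "twos", "threes")

# One pipeline, no intermediate tidy_data variable.
clean_data <- data %>%
  pivot_longer(cols = c(Points_Won, Points_Lost),
               names_to = "Outcome",
               values_to = "Count") %>%
  filter(`Pass Score` %in% score_levels) %>%
  mutate(`Pass Score` = factor(`Pass Score`, levels = score_levels))
```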

clean_data <- tidy_data %>%
group_by(`Pass Score`)%>%
filter(`Pass Score` %in% c("zeros", "ones", "twos", "threes"))%>%
mutate(`Pass Score` = factor(`Pass Score`, levels = c("zeros", "ones", "twos", "threes")))

I think you can omit line 39, since creating a factor with predefined levels turns any values that don't match into NA.
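A small base R sketch of this behaviour. The label "total" is a hypothetical stand-in for any row that is not one of the four pass scores. Note that the unmatched entries are still present as NA afterwards, so they would need to be dropped explicitly if they should not show up as an NA category in a plot.

```r
# Hypothetical labels; "total" stands in for any non-score row in the sheet.
pass_score   <- c("zeros", "ones", "twos", "threes", "total")
score_levels <- c("zeros", "ones", "twos", "threes")

# Values outside `levels` become NA instead of raising an error.
scores <- factor(pass_score, levels = score_levels)
is.na(scores)

# The NA entries remain in the vector, so remove them before plotting.
kept <- scores[!is.na(scores)]
levels(kept)
```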

PassingScore = c(2.1, 2.3, 2.2, 2.0, 2.4),
PointsWon = c(140, 161, 117, 93, 150),
PointsLost = c(81, 58, 30, 67, 53)
)

Ugh... NEVER hard-code data if you have a raw file. ;)

  • Please replace this with loading the data from the file (and yes, it is an ugly file 😉👍).


  • Try to extract and load ALL TEAMS' DATA (as needed for your targeted statistics) from the first sheet of the Excel file.
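A rough sketch of reading a block of cells from the first sheet with readxl instead of typing the numbers in. The helper read_team_block() is hypothetical, and the cell range is only a placeholder: the actual range that holds the team data has to be looked up in the workbook.

```r
library(readxl)

# Hypothetical helper: read one rectangular block from a sheet
# (the first sheet by default).
read_team_block <- function(path, range, sheet = 1) {
  stopifnot(file.exists(path))
  read_excel(path, sheet = sheet, range = range)
}

file <- "Volleyball Passing- USA and TU.xlsx"
if (file.exists(file)) {
  # "A1:D20" is a placeholder, not the real location of the team table.
  teams <- read_team_block(file, range = "A1:D20")
  str(teams)
}
```

Reading with an explicit range is one way to cope with a sheet that mixes several tables and notes, since everything outside the range is ignored.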

@jungihong10 (Collaborator)

Hi @Yue-Z9, I just reviewed your solution. I'm not quite sure why there is an N/A in the stacked bar chart. For volleyball there is only a max of 3 passes, so maybe exclude that part?

Also, the second graph seems reasonable, but I'm not sure if the r values for both are exactly 1.
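One quick way to check the r values would be to compute them directly with cor(). The sketch below uses the hard-coded vectors quoted earlier in this review; whether the plot was drawn from exactly these values is an assumption.

```r
# Values copied from the hard-coded data frame quoted in the review.
PassingScore <- c(2.1, 2.3, 2.2, 2.0, 2.4)
PointsWon    <- c(140, 161, 117, 93, 150)
PointsLost   <- c(81, 58, 30, 67, 53)

# Pearson correlation of passing score with each outcome.
r_won  <- cor(PassingScore, PointsWon)
r_lost <- cor(PassingScore, PointsLost)

print(c(won = r_won, lost = r_lost))
```

If the values printed here differ from the ones shown in the plot annotation, the annotation is probably computed from something other than these two columns.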

Other than that, all looks good.
