#1 By: Fantasy Douche, February 25th, 2016 17:23
Hey folks, some of our writers are going to be getting more into R, or at least are interested in learning more, so we've started a GitHub repository for some of the tutorial-ish code that we're passing around. I figured we could also post it to the message board, and then we might have some help-type discussions here as well.
Here's the first file https://github.com/rotoviz/team-R-code/blob/master/Ex1-download-pfr-table.R
That file will walk you through downloading a table from PFR, and then doing some simple calculations and analysis.
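For anyone who wants a feel for the workflow before opening the file, here is a minimal sketch of the same idea. It is not the code from Ex1: it assumes the rvest package is installed, and it parses a small made-up HTML table instead of a real PFR page, so the table id, player names, and numbers are all placeholders.

```r
library(rvest)  # assumption: rvest is installed; the Ex1 file may use a different package

# Made-up stand-in for a stats page. read_html() also accepts a URL string,
# which is how you would point it at a real page.
html <- '<table id="passing">
  <tr><th>Player</th><th>Att</th><th>Yds</th></tr>
  <tr><td>Player A</td><td>500</td><td>4000</td></tr>
  <tr><td>Player B</td><td>400</td><td>2800</td></tr>
</table>'

page <- read_html(html)

# Pull the table out of the page and turn it into a data frame
passing <- html_table(html_element(page, "table#passing"))

# A simple derived column: yards per attempt
passing$YPA <- passing$Yds / passing$Att

# Sort from highest to lowest yards per attempt
passing[order(-passing$YPA), ]
```

Swapping the HTML string for a page URL and the `table#passing` selector for the id of the table you want should be most of what changes for a real page.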
Note: to get started, I recommend downloading RStudio and using that to run your R code. Once you have RStudio installed, you can open the file linked above and then click "Run" to step through each line.
Let's keep the discussion in this thread for now.
#2 By: Kyle Peterson, February 25th, 2016 17:53
Not sure if y'all would be up for it, but adding a data dump to this repo would be awesome. My biggest gripe about NFL data is the scarcity of it and having to write site-specific data crawlers. I can add some of my crawlers, too (e.g. MFL10 data mining scripts).
#3 By: Fantasy Douche, February 25th, 2016 17:58
We'll probably add the Armchair Analysis data set at some point. One thing is that when we do that we'll likely have to make the repo private just so we're not giving away that data set.
#4 By: Jim Kloet, February 25th, 2016 18:03
A data repo would be great. Would the idea be to upload clean CSVs there or host the DB somehow?
Also, is there an easy way to share scripts from one github account into another? I've got a few scripts to share as well. You can check them out here otherwise https://github.com/jimtheflash
#5 By: Fantasy Douche, February 25th, 2016 18:18
I added you as a collaborator for the rotoviz github repo.
My thought was that we could host the necessary CSVs that are in the AA dataset. I really only use about 8 of them.
#6 By: Jim Kloet, February 25th, 2016 18:57
Thanks! Hosting CSVs makes the most sense, I think, for how people are going to use this. Nick and I were discussing how to make the coaching data joinable to AA as well, but I'm not sure when we'll get going on that. If anyone on this thread has any interest in working on that, let me know. We can add those CSVs to the mix too when they're ready.
#7 By: Fantasy Douche, February 26th, 2016 11:33
Also wanted to mention this podcast for anyone interested in doing more analysis. It's great.
#8 By: Fantasy Douche, February 26th, 2016 13:33
New R file added, which goes through the process of downloading a Google sheet and parsing out some combine measurements: https://github.com/rotoviz/team-R-code/blob/master/Ex2-download-google-sheet.R
#9 By: Ben Gretch, February 27th, 2016 01:11
1) This is all awesome. The first PFR example was really cool to work through.
2) Working on this Google sheet one, I got to the point to read the sheet and hopefully get a nested list of 9, but I'm getting a nested list of 3 instead. (Tried it twice from the beginning with the same result.) Trying to turn table 2 into a data frame gets me the QB table (trying table 1 returns an error and table 3 returns essentially a list of the positions).
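A quick way to see what actually came back is to inspect the list before converting anything. This is only a sketch: `tables` is a hypothetical stand-in built by hand to mimic "a list of 3", and the real object and its name in Ex2 may differ.

```r
# Hypothetical stand-in for whatever the read step returned
tables <- list(
  NULL,
  data.frame(Player = c("A", "B"), Forty = c(4.8, 4.9)),
  c("QB", "RB", "WR")
)

length(tables)               # how many elements came back
str(tables, max.level = 1)   # one-line summary of each element

# Which elements are already data frames?
is_df <- vapply(tables, is.data.frame, logical(1))
which(is_df)
```

Running the same three checks on the real object should show which element holds which table.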
#10 By: Fantasy Douche, February 27th, 2016 10:18
Actually, later in the day I was getting the same behavior. Will try to figure that out today.
#11 By: Kevin Cole, February 27th, 2016 10:55
Sub out the functions and just use this. You have to install "googlesheets". I use piping, so you have to install "dplyr" too. The "ws = 2" means the second worksheet.
library(googlesheets)
library(dplyr)
rbTbl <- gs_url("https://docs.google.com/spreadsheets/d/1MhmzWDgIqCIoYL0K1c8MG43W1djmSw5trl5-oJfe24Q/") %>%
  gs_read(ws = 2)
#12 By: Kevin Cole, February 27th, 2016 10:58
Check this out for more documentation on the googlesheets package.
#13 By: Fantasy Douche, February 27th, 2016 12:48
Those functions are perplexing just because they worked about 10 times yesterday while I was testing it. Then when I went to actually update the Box Score Scout they wouldn't work. I found a workaround just by copying the sheet over and then re-ordering the tabs. But I was never able to get those functions to work again.
Also I can't get the gs_read function to work as it's throwing an error right now. Will try to figure this out later. Gotta run now.
#14 By: Fantasy Douche, February 27th, 2016 12:49
Actually this page describes the error I was getting using gs_read if anyone else is getting the same
#15 By: Ben Gretch, February 27th, 2016 14:07
I've been getting what I think is a different error. Thought it might be the same as FD's, but I tried installing the GitHub version of that package and still got it. Added hadley/xml2 from Kev's link and I'm still getting it.
Looks like this:
Error in stop_for_content_type(req, expected = "application/atom+xml; charset=UTF-8") :
#16 By: Ben Gretch, February 27th, 2016 20:58
A general question for those of you who use R a lot: when you need info on something, do you have a preferred method for searching? In the tutorials I've done and stuff I've read, I've heard about CRAN and the help functions inside R, and obviously there's googling or looking for other specific online resources.
I'm thinking of cases where I've learned the general process for doing stuff but am trying to figure out how to do one specific thing. Is there a best (or most helpful) help resource for that? It can be kind of difficult to parse through the multitude of options for a n00b like myself.
#17 By: Fantasy Douche, February 28th, 2016 10:06
I would say that 95% of the solutions to my problems come from Stack Overflow.
#18 By: Jim Kloet, February 28th, 2016 13:11
100% agree with FD, Stack Overflow is almost always the solution to the problem. Googling the specific error output you're seeing also yields pretty useful results much of the time (usually a Stack Overflow thread).
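For the help functions inside R that came up in the question, here is a short sketch of the ones in base R. `read.csv` and the search terms are just example inputs, not anything specific to the repo.

```r
# Help page for a function whose name you already know
help("read.csv")               # same as typing ?read.csv

# Search function names on the search path by regular expression
matches <- apropos("^read\\.")
print(matches)

# Full-text search across installed help pages when you only know a keyword
help.search("linear model")    # same as typing ??"linear model"

# Show the arguments a function accepts without leaving the console
args(read.csv)
```

These only cover packages you have installed, so they tend to work best once you know roughly which function you need.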
#19 By: CK, March 1st, 2016 13:55
okay I don't R, but if you R, this would be interesting and worthwhile to scrape probably. http://www.footballstudyhall.com/pages/2015-college-football-advanced-statistical-profiles
i would gladly give you a cheeseburger for the data.
#20 By: Jim Kloet, March 1st, 2016 14:26
I can haz cheezburger? Let me take a look tonight, it looks like they're mostly html tables so could be fairly straightforward. If anyone beats me to it, post here so I don't do the same thing. If it works nicely, I'll post the code in the repo.