[R] Data cleaning & Data preparation, what do R users want?

Christopher W. Ryan cryan at binghamton.edu
Wed Nov 29 17:52:20 CET 2017


Great question. What do I want? I want my co-workers to stop using Excel
spreadsheets for data entry, storage, and sharing! I want them to
understand the value of data discipline. But alas . . . .

I work in a county health department in the US. Between dplyr, stringr,
grep, grepl, and the base R read.*() functions, I'm doing OK.

I need to learn more about APIs, so I can see if I can make R directly
grab data from, e.g. our state health department sources. My biggest
hassle is having to download a data file, save it somewhere, and then
open R and read it in. I'd like to be able to do it all in R. Would make
the generation of recurring reports easier.
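
A minimal sketch of that all-in-R workflow, assuming the source publishes a
plain CSV file at a stable URL. read.csv() in base R accepts a URL in place
of a file path, so no manual download step is needed. The helper name
read_remote_csv, the example.org address, and the sample data are all
hypothetical; a temporary file addressed by a file:// URL stands in for a
real endpoint so the example runs offline.

```r
# Hypothetical helper: read a CSV straight from a URL into a data frame.
# read.csv() accepts a complete URL as well as a local file path.
read_remote_csv <- function(url, ...) {
  read.csv(url, stringsAsFactors = FALSE, ...)
}

# Offline stand-in for a real endpoint: write a tiny CSV to a temp file
# and address it with a file:// URL. The data are made up for illustration.
tmp <- tempfile(fileext = ".csv")
writeLines(c("county,cases", "A,12", "B,5"), tmp)
demo_url <- paste0("file://", normalizePath(tmp, winslash = "/"))

cases <- read_remote_csv(demo_url)
print(cases)

# With a real source, only the URL changes (placeholder, not a real address):
# cases <- read_remote_csv("https://example.org/path/to/data.csv")
```

If the source serves JSON rather than CSV, a package such as jsonlite would
be needed in place of read.csv(), and an endpoint that requires
authentication would need more than this sketch covers.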

--Chris Ryan

Robert Wilkins wrote:
> R has a very wide audience, clinical research, astronomy, psychology, and
> so on and so on.
> I would consider data analysis work to be three stages: data preparation,
> statistical analysis, and producing the report.
> This regards the process of getting the data ready for analysis and
> reporting, sometimes called "data cleaning" or "data munging" or "data
> wrangling".
> 
> So as regards tools for data preparation, speaking to the highly diverse
> audience mentioned, here is my question:
> 
> What do you want?
> Or are you already quite happy with the range of tools that is currently
> before you?
> 
> [BTW,  I posed the same question last week to the r-devel list, and was
> advised that r-help might be a more suitable audience by one of the
> moderators.]
> 
> Robert Wilkins


