[R] Data cleaning & Data preparation, what do R users want?

Bert Gunter bgunter.4567 at gmail.com
Wed Nov 29 17:49:12 CET 2017


Oh Crap! I mistakenly replied onlist. PLEASE IGNORE -- these are only my
ignorant opinions.

-- Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Wed, Nov 29, 2017 at 8:48 AM, Bert Gunter <bgunter.4567 at gmail.com> wrote:

> I don't think my view is of interest to many, so offlist.
>
> I reject this:
>
> " I would consider data analysis work to be three stages: data preparation,
> statistical analysis, and producing the report."
>
> For example, there is no such thing as "outliers" -- data to be removed as
> part of cleaning/preparation -- without a statistical model to be an
> "outlier" **from**, which is part of the statistical analysis. And the
> structure of the data (data preparation) may need to change depending on
> the course of the analysis (including graphics, also part of the analysis).
> So I think your view reflects a naïve view of the nature of data analysis,
> which is an iterative and holistic process. I suspect your training is as a
> computer scientist and you have not done much 1-1 consulting with
> researchers, though you should certainly feel free to reject this canard.
> Building software for large-scale automated analysis of data requires a
> much different analytical paradigm than the statistical consulting model,
> which is largely my background.
>
> No reply necessary. Just my opinion, which you are of course free to trash.
>
> Cheers,
> Bert
>
> On Wed, Nov 29, 2017 at 8:37 AM, Robert Wilkins <iwritecode2 at gmail.com>
> wrote:
>
>> R has a very wide audience: clinical research, astronomy, psychology, and
>> so on.
>> I would consider data analysis work to be three stages: data preparation,
>> statistical analysis, and producing the report.
>> My question concerns the first stage: getting the data ready for analysis
>> and reporting, sometimes called "data cleaning", "data munging", or "data
>> wrangling".
>>
>> So as regards tools for data preparation, speaking to the highly diverse
>> audience mentioned, here is my question:
>>
>> What do you want?
>> Or are you already quite happy with the range of tools that is currently
>> before you?
>>
>> [BTW, I posed the same question last week to the r-devel list, and one of
>> the moderators advised that r-help might be a more suitable audience.]
>>
>> Robert Wilkins
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
