[R] Cleaning up messy Excel data

jim holtman jholtman at gmail.com
Thu Mar 1 14:15:40 CET 2012

But there are some important reasons to use Excel.  In my work I have
to send the equivalent of a data.frame to a lot of people who want to
look at the data, possibly slice/dice it differently, and then send
updates back to me.  These folks do not know how to use R, but they do
have Microsoft Office installed on their computers and know how to use
its products.

I have been very successful in conveying what I am doing for them by
communicating via Excel spreadsheets.  It is also an important medium
in dealing with some international companies who provide data via
Excel and expect responses back via Excel.

When dealing with data in a tabular form, Excel does provide a way for
a majority of the people I work with to understand the data.  Yes,
there are problems with some of the ways that people use Excel, and
yes I have had to invest time in scrubbing some of the data that I get
from them, but if I did not, then I would probably not have a job
working for them.  I use R exclusively for the analysis that I do, but
find it convenient to use Excel to provide a communication mechanism
to the majority of the non-R users that I have to deal with.  It is a
convenient "work-around" because I would never get them to invest the
time to learn R.

So in the real world there is a need for Excel and we are not going to
cause it to go away; we have to learn how to live with it, and from my
standpoint, it has definitely benefited me in being able to
communicate with my users and continuing to provide them with results
that they are happy with.  They refer to letting me work my "magic" on
the data; all they know is they see the result via Excel and in the
background R is doing the heavy lifting that they do not have to know
about.

On Wed, Feb 29, 2012 at 4:41 PM, Rolf Turner <rolf.turner at xtra.co.nz> wrote:
> On 01/03/12 04:43, John Kane wrote:
>> (mydata<- as.factor(c("1","2","3", ">2", "5", ">2")))
>> str(mydata)
>> newdata<- as.character(mydata)
>> newdata[newdata==">2"]<- 0
>> newdata<- as.numeric(newdata)
>> str(newdata)
>> We really need to keep Excel (and other spreadsheets) out of people's
>> hands.
> Amen, bro'!!!
>    cheers,
>        Rolf Turner

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
