[R] Statistical computing

Stephen C. Upton upton at mitre.org
Mon Mar 31 17:54:17 CEST 2003


Hi Tanya,

I would also like to second Bill's comment on not underestimating R for "data
cleaning". I have had such success with simple R scripts and functions for
parsing data that I've abandoned my use of Python - not that there is anything
wrong with Python! It's just that I can do everything I need (selecting
subsets, etc.) in R, so I found no need for another, supplemental language - not
to mention the extra learning curve. FWIW, if you have some rather large files
(GBs), get lots of memory!

HTH
steve

"Pikounis, Bill" wrote:

> Hi Tanya,
> You really cannot lose with either Perl or Python.  Either of them, along
> with other tools mentioned, will suffice for making your work SAS-free. But
> I would also not underestimate R for "data-cleaning"...
>
> > Is there a fairly easy way to become SAS-free for data management and
> > cleaning? I'm told R is really not ideal for data cleaning.
>
> I must admit that I am always eager to debunk the myth that SAS is (so much)
> better than the S language for data management, because to me the myth
> mostly points out that many statisticians have never used anything else but
> SAS.
>
> Best Regards,
> Bill
>
> ----------------------------------------
> Bill Pikounis, Ph.D.
> Biometrics Research Department
> Merck Research Laboratories
> PO Box 2000, MailDrop RY84-16
> 126 E. Lincoln Avenue
> Rahway, New Jersey 07065-0900
> USA
>
> v_bill_pikounis at merck.com
>
> Phone: 732 594 3913
> Fax: 732 594 1565
>
> > -----Original Message-----
> > From: Tanya Murphy [mailto:tmurph6 at po-box.mcgill.ca]
> > Sent: Monday, March 31, 2003 9:04 AM
> > To: Bashir Saghir (Aztek Global); r-help at stat.math.ethz.ch
> > Subject: RE: [R] Statistical computing
> >
> >
> > Thanks to all who have replied to this. I find the advice very encouraging.
> > I've been reading the recommended links on Sweave and I think it will answer
> > a major part of my goals.
> >
> > As for Perl vs. Python, I don't know which would be best. I've started out
> > in Perl because someone got me started with a little Perl program, but I've
> > looked at Python, too. I'm working in Windows (and that's not likely to
> > change anytime soon--at the office, anyway) and I think WinEdt serves as a
> > good enhanced editor for the main applications--LaTeX, R and Perl--as well
> > as a way to organize the files for a project. The GUI for Python seems nice,
> > too, though.
> >
> > Saghir, why do you prefer Python?
> >
> > Is there a fairly easy way to become SAS-free for data management and
> > cleaning? I'm told R is really not ideal for data cleaning. Is this what
> > RODBC is about?
> >
> > Tanya
> >
> >
> > >===== Original Message From "Bashir Saghir (Aztek Global)" <Saghir.Bashir at UCB-Group.com> =====
> > >Dear Tanya,
> > >
> > >Have you considered using Python (www.python.org) instead of Perl? I use
> > >Python, LaTeX, and R for doing what you describe. My process is evolving and
> > >I cannot recommend it as being the best. Essentially I am moving towards a
> > >database approach, currently using dictionaries in Python. In the longer term
> > >I plan to switch to MySQL.
> > >
> > >In summary, I split the problem into bits that link into a relational
> > >database and use metadata to run my reports. So once the database is set
> > >up I only need to give the key information and my programs find all relevant
> > >information in the database, meaning that I never need to modify any programs
> > >to run a report with new data - just the database.
> > >
> > >I don't know of any references for this, but if you get any to your original
> > >query I would be interested.
> > >
> > >Best regards,
> > >Saghir
> > >
> > >> -----Original Message-----
> > >> From:      Tanya Murphy [SMTP:tmurph6 at po-box.mcgill.ca]
> > >> Sent:      Friday, 28 March, 2003 5:42 PM
> > >> To:        r-help
> > >> Subject:   [R] Statistical computing
> > >>
> > >> Hello,
> > >>
> > >> I've been trying to familiarize myself with the computing tools of the
> > >> trade (e.g. SAS, R, Perl, LaTeX) and I've been getting somewhere with the
> > >> individual programs, but I'm trying to get a better sense of how to
> > >> integrate these tools. I'd like to use scripts and create reports in a
> > >> more organized way. Can anyone recommend books or, better yet, free online
> > >> articles on this topic?
> > >>
> > >> Maybe I should be a little more specific about what I do: I'm a research
> > >> assistant in clinical epidemiology doing mainly data management and
> > >> analysis. I do a number of repetitive tasks like updating a research
> > >> database from the original clinic database and other sources, creating
> > >> reports, and creating graphical output for individual patients, as well as
> > >> work on individual research projects. Unfortunately I am not working
> > >> closely with 'real' statisticians who have probably developed good work
> > >> habits using these tools. Any advice on 'the big picture' would be greatly
> > >> appreciated.
> > >>
> > >> Thanks!
> > >>
> > >> Tanya Murphy
> > >>
> >
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> >
>
> ------------------------------------------------------------------------------
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
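
For anyone curious what the dictionary-based approach Saghir describes above
might look like, here is a minimal sketch in Python. It is only an
illustration under assumptions: the record layout, the identifiers (P001,
etc.), the report_specs table and the run_report function are all
hypothetical and do not come from anyone's actual programs. The idea shown is
that the "database" and the report definitions are both plain data, so a
report on new data needs a new key, not a new program.

```python
# Hypothetical sketch: none of these names or records come from the thread.

# The "database": plain dictionaries keyed by a patient identifier.
patients = {
    "P001": {"name": "Patient A", "clinic": "C1"},
    "P002": {"name": "Patient B", "clinic": "C2"},
}
visits = {
    "P001": [
        {"date": "2003-01-15", "weight_kg": 70.5},
        {"date": "2003-03-02", "weight_kg": 69.8},
    ],
    "P002": [
        {"date": "2003-02-10", "weight_kg": 82.0},
    ],
}

# Metadata: each report is described as data (a title and the fields to show).
report_specs = {
    "weight_history": {"title": "Weight history", "fields": ["date", "weight_kg"]},
}


def run_report(report_name, patient_id):
    """Build a text report from the key information alone."""
    spec = report_specs[report_name]
    patient = patients[patient_id]
    lines = ["%s: %s (%s)" % (spec["title"], patient["name"], patient_id)]
    # A patient with no recorded visits simply gets a header-only report.
    for visit in visits.get(patient_id, []):
        fields = ", ".join("%s=%s" % (f, visit[f]) for f in spec["fields"])
        lines.append("  " + fields)
    return "\n".join(lines)


print(run_report("weight_history", "P001"))
```

Adding a patient or a visit means adding an entry to the dictionaries;
run_report itself stays unchanged. Moving to MySQL later, as Saghir plans,
would presumably replace the dictionary lookups with queries while keeping the
same keyed design.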


