[R] perform t.test by rows and columns in data frame

Kara Przeczek przeczek at unbc.ca
Fri Feb 24 00:27:57 CET 2012


Sorry. I forgot to note that I am using R version 2.8.0.

________________________________________
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] on behalf of Kara Przeczek [przeczek at unbc.ca]
Sent: February 23, 2012 3:13 PM
To: r-help at r-project.org
Subject: [R] perform t.test by rows and columns in data frame

Dear R Help,

I have been struggling with this problem without making much headway. I am attempting to avoid using a loop, and would appreciate any suggestions you may have. I am not well versed in R and apologize in advance if I have missed something obvious.



I have a data set with multiple sites along a river where metal concentrations were measured. Three sites are located upstream of a mine and three sites are located downstream of the mine. I would like to compare the upstream and downstream metal levels using a t-test.



The data set looks something like this (but with more metals (25) and sites (6)):

TotalMetals    Mean    Site    Location
Al             6000    1       us
Sb             0.6     1       us
Ba             150     1       us
Al             6500    2       us
Sb             0.7     2       us
Ba             160     2       us
Al             5600    3       ds
Sb             0.8     3       ds
Ba             180     3       ds
Al             170     4       ds
Sb             0.8     4       ds
Ba             175     4       ds



I have tried several variations of by() and aggregate() and tapply() without much luck. I thought I had finally got what I wanted with:

by(mr2$Mean, mr2$TotalMetals, function (x) t.test(mr2$Mean[mr2$Location=="us"], mr2$Mean[mr2$Location=="ds"]))



However, the output, although grouped by metal, was identical for every metal: the reported means of "x" and "y" were the means of all metals pooled within each location (upstream and downstream), not the means for the individual metal.

mean(mr2$Mean[mr2$Location=="us"]) #gave the x mean from the output and,

mean(mr2$Mean[mr2$Location=="ds"]) #gave the same y mean from the output
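
The identical results are what one would expect if the anonymous function never uses its argument: by() hands it one metal's subset of the values as x, but the body indexes the full mr2 instead, so every group runs the same pooled test. A minimal sketch of one possible way around this, assuming the column names shown above and using toy data copied from the example table (the names d and res are illustrative, not from the original post):

```r
# Toy data copied from the example table in this post
# (sites 1-2 upstream "us", sites 3-4 downstream "ds").
mr2 <- data.frame(
  TotalMetals = rep(c("Al", "Sb", "Ba"), times = 4),
  Mean        = c(6000, 0.6, 150, 6500, 0.7, 160,
                  5600, 0.8, 180, 170, 0.8, 175),
  Site        = rep(1:4, each = 3),
  Location    = rep(c("us", "ds"), each = 6)
)

# Pass the whole data frame to by(): 'd' is then the block of rows
# belonging to a single metal, and the function uses 'd', not 'mr2'.
res <- by(mr2, mr2$TotalMetals, function(d) {
  t.test(d$Mean[d$Location == "us"], d$Mean[d$Location == "ds"])
})

# Each element is an ordinary t.test result for one metal.
res[["Al"]]
```

With only two sites per location in the toy data the tests themselves carry little weight; the point is only that each metal now gets its own upstream and downstream groups.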





I can get the answer I want by performing the t-test for each metal individually with:



y=mr2[mr2$TotalMetals=="Al",]

t.test(y$Mean[y$Location=="us"], y$Mean[y$Location=="ds"])



But it would be painstaking to do this for each metal, and the data set will be getting larger in the future.

It would also be nice to collect the results in a table or similar format for easy export, if possible.
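
One way the per-metal results might be gathered into a table, sketched under the same assumptions as the example data above: split the data frame by metal, run the test on each piece, and keep a few numbers from each result. The helper name one_test, the output column names, and the object results are hypothetical, chosen only for illustration.

```r
# Toy data copied from the example table in this post.
mr2 <- data.frame(
  TotalMetals = rep(c("Al", "Sb", "Ba"), times = 4),
  Mean        = c(6000, 0.6, 150, 6500, 0.7, 160,
                  5600, 0.8, 180, 170, 0.8, 175),
  Site        = rep(1:4, each = 3),
  Location    = rep(c("us", "ds"), each = 6)
)

# Hypothetical helper: run t.test() (Welch by default) on the rows for
# one metal and return selected pieces of the result as a named vector.
one_test <- function(d) {
  tt <- t.test(d$Mean[d$Location == "us"], d$Mean[d$Location == "ds"])
  c(mean_us = unname(tt$estimate[1]),
    mean_ds = unname(tt$estimate[2]),
    t_stat  = unname(tt$statistic),
    df      = unname(tt$parameter),
    p_value = tt$p.value)
}

# split() gives one data frame per metal; sapply() returns one column
# per metal, so transpose to get one row per metal.
results <- t(sapply(split(mr2, mr2$TotalMetals), one_test))
results
```

The resulting matrix has one row per metal and could be passed to write.csv() for export. Running 25 tests at once raises a multiple-comparison question; p.adjust() on the p_value column is one option to consider.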



I would greatly appreciate any help that you could provide!
Thank you,

Kara



Natural Resources and Environmental Studies, MSc

University of Northern B.C.



More information about the R-help mailing list