[R] Help using Cast (Text) Version

Sun Jan 17 19:39:09 CET 2010

Bingo,

I knew it was something simple and that I wasn't seeing the wood for the 
trees.

David, Ista

Apologies for the vague description, which I thought was clear enough. I now 
understand that I need to count the 1's and sum the total of 1's and 0's 
while ignoring the NA's, and, as David correctly identified, that information 
is in res. As you quite rightly said, by the time the data is melt-cast there 
is no way to distinguish between NA's and 0's.
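
As a rough illustration of counting directly on res rather than on the cast table, here is a minimal sketch in base R. The toy data frame, its lab ids and the NA in pf_zbw are made up for the example; only the column names lab, r and pf_zbw follow the code in this message.

```r
# Toy stand-in for 'res'; lab ids and values are invented for illustration.
res <- data.frame(
  lab    = c("4er66", "4er66", "4gcyi", "4gcyi", "5d3hh"),
  r      = c(1, 3, 1, 2, 1),
  pf_zbw = c(1, 1, 0, 0, NA)
)

# Number of fails (1's) per round, ignoring NA's.
fails <- tapply(res$pf_zbw, res$r, sum, na.rm = TRUE)

# Number of valid entries (1's and 0's together) per round.
valid <- tapply(!is.na(res$pf_zbw), res$r, sum)

summary_by_round <- data.frame(
  r     = names(fails),
  fails = as.vector(fails),
  valid = as.vector(valid)
)
print(summary_by_round)
```

Because a lab that did not take part in a round simply has no row in res, it is never counted as a 0 here, which is the distinction the cast table loses.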

Here is the original code so that you can see where res comes from; Ista I 
hope that this is now clearer for you.

 library(reshape)
# Enter file name to Read & Save data
FileName=readline("Enter File name:\n")
SampleName=readline("Enter Sample (A,B or C):\n")
#for (sname in 1 : 3) {
#if (sname == 1)
#  SampleName =  "A"
#  if (sname == 2)
#    SampleName =  "B"
#    if (sname == 3)
#      SampleName =  "C"
#for ( fname in 1 : 4) {
#if (fname == 1)
#  FileName = "SPC"
#  if (fname == 2)
#    FileName = "Coli"
#      if (fname == 3)
#      FileName = "Colif"
#       if (fname == 4)
#          FileName = "Ecoli"

# Find first occurrence of file
for ( rloop1 in 1 : 6) {
ReadFile=paste(rloop1,SampleName,"_",FileName,"_Stats.csv", sep="")
if (file.exists(ReadFile))
break
}
x = data.frame(read.csv(ReadFile, header=T),rnd=rloop1)
for ( rloop2 in (rloop1+1) : 6) {
ReadFile=paste(rloop2,SampleName,"_",FileName,"_Stats.csv", sep="")
if (file.exists(ReadFile)) {
    y = data.frame(read.csv(ReadFile, header=T),rnd = rloop2)
    if (rloop2 == (rloop1+1))
       z=merge(x,y,all=T)
       z=merge(y,z,all=T)
## The next piece of code is for when there are not two successive rounds
## of data.
## It must be modified for each year's summary
##
##if ( (FileName == "Coli") & (SampleName == "B")) {
##    if (rloop2 == (rloop1+3))
##        z=merge(x,y,all=T)
##        z=merge(y,z,all=T)
##        }
##
##
   }
}

results <- z
res = data.frame(lab=results[,"lab_id"], bw=results[,"ZBW"], wi=results[,"ZWI"],
                 pf_zbw=0, pf_zwi=0, r=results[,"rnd"])
#
# Establish no of samples recorded
nsmpls = length(res[,c("lab")])
#Evaluate Z_scores for Between Lab Results
for ( i in 1 : nsmpls) {
if (res[i,"bw"] > 3 | res[i,"bw"] < -3)
res[i,"pf_zbw"]=1
}
#Evaluate Z_scores for Within Lab Results
for ( i in 1 : nsmpls) {
if (res[i,"wi"] > 3 | res[i,"wi"] < -3)
res[i,"pf_zwi"]=1
}
# Melt and Cast the 'res' frame and then order it
bw = melt(res, id=c("lab","r"), "pf_zbw")
# b = cast(bw, lab ~ r)
# bw_eval = b[order(as.character(b$lab)),]

##### Code for summing the no of Fails for Between Results
#bsum = cast(bw, lab ~ r, margins=TRUE, sum)

## Save Summary of Between Results
## FileSaveBw=paste(SampleName,"_",FileName,"_2009Between.csv",sep="")
## write.csv(bw_eval,file=FileSaveBw)
##
##
##
####
# Melt and Cast the 'res' frame and then order it
wi = melt(res, id=c("lab","r"), "pf_zwi")
w = cast(wi, lab ~ r)
wi_eval = w[order(as.character(w$lab)),]

##### Code for summing the no of Fails for Within Results
#wsum = cast(wi, lab ~ r, margins=TRUE, sum)
##
## Save Summary of Within Results
## FileSaveWi=paste(SampleName,"_",FileName,"_2009Within.csv",sep="")
## write.csv(wi_eval,file=FileSaveWi)
###### cat ("File Name: ",FileName,"Sample Name: ",SampleName, "\n")

#  }

#}
end
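
For comparison, the file-finding and merging steps above could be sketched as a single loop over the rounds that skips any file that does not exist. This is only a sketch under assumptions: read_rounds and its dir argument are hypothetical names, every file is assumed to have the same columns, and rbind (unlike merge with all=T) does not drop duplicated rows. The demonstration writes two made-up files to a temporary folder so that it runs on its own.

```r
# Hypothetical helper: read every existing "<round><sample>_<file>_Stats.csv"
# for the given rounds and stack them, tagging each row with its round.
read_rounds <- function(SampleName, FileName, rounds = 1:6, dir = ".") {
  pieces <- list()
  for (rnd in rounds) {
    path <- file.path(dir, paste(rnd, SampleName, "_", FileName, "_Stats.csv", sep = ""))
    if (file.exists(path)) {
      pieces[[length(pieces) + 1]] <- data.frame(read.csv(path, header = TRUE), rnd = rnd)
    }
  }
  if (length(pieces) == 0) stop("no matching files found")
  do.call(rbind, pieces)
}

# Self-contained demonstration with two invented files (rounds 1 and 4).
demo_dir <- tempfile("rounds")
dir.create(demo_dir)
write.csv(data.frame(lab_id = c("a", "b"), ZBW = c(0.5, 3.2), ZWI = c(-1, 0.2)),
          file.path(demo_dir, "1A_SPC_Stats.csv"), row.names = FALSE)
write.csv(data.frame(lab_id = "a", ZBW = -3.5, ZWI = 0.1),
          file.path(demo_dir, "4A_SPC_Stats.csv"), row.names = FALSE)

z <- read_rounds("A", "SPC", dir = demo_dir)
print(z)
```

Written this way, gaps between rounds (such as rounds 1 and 4 here) need no special case.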

Once again thanks for your interest
Steve
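
A note on the two Z-score loops in the code above: the same pass/fail flags can be computed without a loop. The sketch below uses an invented four-row data frame with the same bw and wi column names; the values are made up. One difference from the loop version is that an NA score gives an NA flag here, whereas if() stops with an error when its condition is NA.

```r
# Toy scores; values are invented for illustration.
res <- data.frame(
  lab = c("a", "b", "c", "d"),
  bw  = c(0.5, 3.2, -3.5, NA),
  wi  = c(-1, 0.2, 4.1, 2.9)
)

# Flag = 1 when the Z-score lies outside [-3, 3], 0 otherwise, NA if missing.
res$pf_zbw <- as.integer(abs(res$bw) > 3)
res$pf_zwi <- as.integer(abs(res$wi) > 3)
print(res)
```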

----- Original Message ----- 
From: "David Winsemius" <dwinsemius at comcast.net>
To: "Steve Sidney" <sbsidney at mweb.co.za>
Cc: <r-help at r-project.org>
Sent: Sunday, January 17, 2010 7:36 PM
Subject: Re: [R] Help using Cast (Text) Version

>
> On Jan 17, 2010, at 11:56 AM, Steve Sidney wrote:
>
>> David
>>
>> Thanks, I'll try that......but no what I need is the total (1's) for 
>> each of the rows, labelled 1-6 at the top of each col in the table 
>> provided.
>
> Part of my confusion with your request (which remains unaddressed) is
> what you mean by "valid". The melt-cast operation has turned a bunch of
> NA's into 0's which are now indistinguishable from the original 0's. So I
> don't see any way that operating on "b" could tell you the numbers you
> are asking for. If you were working on the original data, "res", you
> might have gotten the column-wise "valid" counts of column 2 with
> something like:
>
>  sum( !is.na(res[,2]) )
>
>>
>> What I guess I am not sure of is how to identify the col after the melt
>> and cast.
>
> The cast object represents columns as a list of vectors. The i-th column
> is b[[i]] which could be further referenced as a vector. So the j-th row
> entry for the i-th column would be b[[i]][j].
>
>
>>
>> Steve
>>
>> ----- Original Message ----- From: "David Winsemius"
>> <dwinsemius at comcast.net>
>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>> Cc: <r-help at r-project.org>
>> Sent: Sunday, January 17, 2010 4:39 PM
>> Subject: Re: [R] Help using Cast (Text) Version
>>
>>
>>>
>>> On Jan 17, 2010, at 5:31 AM, Steve Sidney wrote:
>>>
>>>> Sorry to repeat the message, not sure if the HTML version has been
>>>> received - Apologies for duplication
>>>>
>>>> Dear list
>>>>
>>>> I am trying to count the no of occurrences in a column of a data frame
>>>> and there is missing data identified by NA.
>>>>
>>>> I am able to melt and cast the data correctly as well as sum the
>>>> occurrences using margins and sum.
>>>>
>>>> Here are the melt and cast commands
>>>>
>>>> bw = melt(res, id=c("lab","r"), "pf_zbw")
>>>> b = cast(bw, lab ~ r, sum, margins = T)
>>>>
>>>> Sample Data (before using sum and margins)
>>>>
>>>>      lab  1  2  3  4  5  6
>>>> 1  4er66  1 NA  1  0 NA  0
>>>> 2  4gcyi  0  0  1  0  0  0
>>>> 3  5d3hh  0  0  0 NA  0  0
>>>> 4  5d3wt  0  0  0  0  0  0
>>>> .
>>>> . lines deleted to save space
>>>> .
>>>> 69 v3st5 NA NA  1 NA NA NA
>>>> 70 a22g5 NA  0 NA NA NA NA
>>>> 71 b5dd3 NA  0 NA NA NA NA
>>>> 72 g44d2 NA  0 NA NA NA NA
>>>>
>>>> Data after using sum and margins
>>>>
>>>>      lab 1 2 3 4 5 6 (all)
>>>> 1  4er66 1 0 1 0 0 0     2
>>>> 2  4gcyi 0 0 1 0 0 0     1
>>>> 3  5d3hh 0 0 0 0 0 0     0
>>>> 4  5d3wt 0 0 0 0 0 0     0
>>>> 5  6n44r 0 0 0 0 0 0     0
>>>> .
>>>> .lines deleted to save space
>>>> .
>>>> 70 a22g5 0 0 0 0 0 0     0
>>>> 71 b5dd3 0 0 0 0 0 0     0
>>>> 72 g44d2 0 0 0 0 0 0     0
>>>> 73 (all) 5 2 4 3 5 7    26
>>>>
>>>> Using length just tells me how many total rows there are.
>>>
>>>
>>>> What I need to do is count how many rows have valid data, in this
>>>> case either a one (1) or a zero (0), in b
>>>
>>> I'm guessing that you mean to apply that test to the column in b
>>> labeled "(all)". If that's the case, then something like (obviously
>>> untested):
>>>
>>> sum( b$'(all)' == 1 | b$'(all)' == 0)
>>>
>>>
>>>
>>>>
>>>> I have a report to construct for tomorrow Mon so any help would be 
>>>> appreciated
>>>>
>>>> Regards
>>>> Steve
>>>
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
>