[R] Help using Cast (Text) Version

David Winsemius dwinsemius at comcast.net
Mon Jan 18 14:58:36 CET 2010


On Jan 18, 2010, at 8:53 AM, David Winsemius wrote:

>
> On Jan 18, 2010, at 7:58 AM, Steve Sidney wrote:
>
>> Hi David
>>
>> Thanks for your patience, as well as thanks to Dennis Murphy and  
>> James Rome for trying to help.
>>
>> I have tried all your suggestions but still no joy.
>>
>> In order to try and resolve the problem I am attaching the  
>> following files, hope the system allows this.
>>
>> 1) Test_data_res.txt (created with dput; this is all the data to be
>>    evaluated)
>> 2) Test_data_b.txt (after performing the melt-cast; see the code)
>> 3) Annual Results NLA WMS Ver1.r (the code for one of the parameters
>>    to be evaluated, in this case SPC)
>>
>> Background: the data is from a laboratory Proficiency Testing
>> Scheme, and z-scores with an absolute value greater than 3 are
>> identified as "fails". My code assigns a 1 or 0 depending on this
>> evaluation, and because not every lab participates in every round,
>> NA is assigned where there are no results.
>>
>> What I am looking for is the following for each round (1-6)
>> a) The total number of participants, which in this case are
>> represented by 1's and 0's per round
>
> > apply(b[,-1], 2, function(x) sum(is.na(x) ) )
> [1] 32 21 21 18 14 15

Ooops, forgot the negation operator to turn not(NA) into TRUE:

 > apply(b[,-1], 2, function(x) sum(!is.na(x) ) )
[1] 40 51 51 54 58 57
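
For anyone without the attachments, the same two counts can be checked on a small made-up data frame. This is only a sketch: b_toy, its lab names and its values are invented for illustration and are not the posted data, and colSums() is used here as an equivalent to the apply() calls.

```r
# Toy stand-in for the cast result "b": the first column is the lab id,
# the remaining columns are rounds. All values are made up.
b_toy <- data.frame(
  lab = c("labA", "labB", "labC", "labD"),
  r1  = c(1, 0, NA, 0),
  r2  = c(NA, 0, 0, NA),
  r3  = c(1, 1, 0, NA)
)

# Participants per round: count the entries that are not NA.
participants <- colSums(!is.na(b_toy[, -1]))

# Fails per round: add up the 1's, ignoring NA.
fails <- colSums(b_toy[, -1], na.rm = TRUE)

print(participants)
print(fails)
```

Both counts only make sense if the NA's are still present in the object, i.e. before a sum-based cast has replaced them with 0's.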

>
>
>
>> b) The total number of 1's, i.e. fails per round
>
> > apply(b[,-1], 2, sum, na.rm=TRUE )
> [1] 5 2 4 3 5 7
>
>>
>>
>>
>> Regards
>> Steve
>>
>>
>>
>> ----- Original Message ----- From: "David Winsemius" <dwinsemius at comcast.net>
>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>> Cc: <r-help at r-project.org>
>> Sent: Monday, January 18, 2010 12:38 AM
>> Subject: Re: [R] Help using Cast (Text) Version
>>
>>
>>>
>>> On Jan 17, 2010, at 4:37 PM, Steve Sidney wrote:
>>>
>>>> Well now I am totally baffled !!!!!!!!!!
>>>>
>>>> Using
>>>>
>>>> sum( !is.na(b[,3])) I get the count of all entries in col 3 except
>>>> those that are NA.
>>>> Great, that solves the first problem.
>>>>
>>>> What I can't seem to do is use the same logic to count all the 1's
>>>> in that col, which are there before I use the cast with margins.
>>>>
>>>> So it seems to me that somehow   is wrong and is the part of my  
>>>> understanding that's missing.
>>>>
>>>> My guess is that before using margins and sum in the cast
>>>> statement the col is a character type, and in order for == 1 to
>>>> work I need to convert this to an integer.
>>>
>>> You can test your theory with:
>>>
>>> sum(as.integer(b[,3]) == 1, na.rm = TRUE)
>>>
>>> Or you could post some reproducible data using dput ....
>>>
>>> -- 
>>> David.
>>>
>>>
>>>>
>>>> Hope this helps you to understand the problem.
>>>>
>>>> Regards
>>>> Steve
>>>>
>>>> Your help is much appreciated
>>>> ----- Original Message ----- From: "David Winsemius" <dwinsemius at comcast.net>
>>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>>> Cc: <r-help at r-project.org>
>>>> Sent: Sunday, January 17, 2010 7:36 PM
>>>> Subject: Re: [R] Help using Cast (Text) Version
>>>>
>>>>
>>>>>
>>>>> On Jan 17, 2010, at 11:56 AM, Steve Sidney wrote:
>>>>>
>>>>>> David
>>>>>>
>>>>>> Thanks, I'll try that... but no, what I need is the total (1's)
>>>>>> for each of the rows, labelled 1-6 at the top of each col in the
>>>>>> table provided.
>>>>>
>>>>> Part of my confusion with your request (which remains unaddressed)
>>>>> is what you mean by "valid". The melt-cast operation has turned a
>>>>> bunch of NA's into 0's which are now indistinguishable from the
>>>>> original 0's. So I don't see any way that operating on "b" could
>>>>> tell you the numbers you are asking for. If you were working on the
>>>>> original data, "res", you might have gotten the column-wise "valid"
>>>>> count of column 2 with something like:
>>>>>
>>>>> sum( !is.na(res[,2]) )
>>>>>
>>>>>>
>>>>>> What I guess I am not sure of is how to identify the col after
>>>>>> the melt and cast.
>>>>>
>>>>> The cast object represents columns as a list of vectors. The i-th
>>>>> column is b[[i]], which could be further referenced as a vector. So
>>>>> the j-th row entry for the i-th column would be b[[i]][j].
>>>>>
>>>>>
>>>>>>
>>>>>> Steve
>>>>>>
>>>>>> ----- Original Message ----- From: "David Winsemius" <dwinsemius at comcast.net>
>>>>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>>>>> Cc: <r-help at r-project.org>
>>>>>> Sent: Sunday, January 17, 2010 4:39 PM
>>>>>> Subject: Re: [R] Help using Cast (Text) Version
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Jan 17, 2010, at 5:31 AM, Steve Sidney wrote:
>>>>>>>
>>>>>>>> Sorry to repeat the message, not sure if the HTML version has
>>>>>>>> been received. Apologies for duplication.
>>>>>>>>
>>>>>>>> Dear list
>>>>>>>>
>>>>>>>> I am trying to count the number of occurrences in a column of a
>>>>>>>> data frame, and there is missing data identified by NA.
>>>>>>>>
>>>>>>>> I am able to melt and cast the data correctly, as well as sum
>>>>>>>> the occurrences using margins and sum.
>>>>>>>>
>>>>>>>> Here are the melt and cast commands
>>>>>>>>
>>>>>>>> bw = melt(res, id=c("lab","r"), "pf_zbw")
>>>>>>>> b = cast(bw, lab ~ r, sum, margins = T)
>>>>>>>>
>>>>>>>> Sample Data (before using sum and margins)
>>>>>>>>
>>>>>>>> lab  1  2  3  4  5  6
>>>>>>>> 1  4er66  1 NA  1  0 NA  0
>>>>>>>> 2  4gcyi  0  0  1  0  0  0
>>>>>>>> 3  5d3hh  0  0  0 NA  0  0
>>>>>>>> 4  5d3wt  0  0  0  0  0  0
>>>>>>>> .
>>>>>>>> . lines deleted to save space
>>>>>>>> .
>>>>>>>> 69 v3st5 NA NA  1 NA NA NA
>>>>>>>> 70 a22g5 NA  0 NA NA NA NA
>>>>>>>> 71 b5dd3 NA  0 NA NA NA NA
>>>>>>>> 72 g44d2 NA  0 NA NA NA NA
>>>>>>>>
>>>>>>>> Data after using sum and margins
>>>>>>>>
>>>>>>>> lab 1 2 3 4 5 6 (all)
>>>>>>>> 1  4er66 1 0 1 0 0 0     2
>>>>>>>> 2  4gcyi 0 0 1 0 0 0     1
>>>>>>>> 3  5d3hh 0 0 0 0 0 0     0
>>>>>>>> 4  5d3wt 0 0 0 0 0 0     0
>>>>>>>> 5  6n44r 0 0 0 0 0 0     0
>>>>>>>> .
>>>>>>>> .lines deleted to save space
>>>>>>>> .
>>>>>>>> 70 a22g5 0 0 0 0 0 0     0
>>>>>>>> 71 b5dd3 0 0 0 0 0 0     0
>>>>>>>> 72 g44d2 0 0 0 0 0 0     0
>>>>>>>> 73 (all) 5 2 4 3 5 7    26
>>>>>>>>
>>>>>>>> Using length just tells me how many total rows there are.
>>>>>>>
>>>>>>>
>>>>>>>> What I need to do is count how many rows have valid data, in
>>>>>>>> this case either a one (1) or a zero (0) in b
>>>>>>>
>>>>>>> I'm guessing that you mean to apply that test to the column in b
>>>>>>> labeled "(all)". If that's the case, then something like
>>>>>>> (obviously untested):
>>>>>>>
>>>>>>> sum( b$'(all)' == 1 | b$'(all)' == 0)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> I have a report to construct for tomorrow Mon so any help  
>>>>>>>> would be
>>>>>>>> appreciated
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Steve
>>>>>>>
>>>>>>> David Winsemius, MD
>>>>>>> Heritage Laboratories
>>>>>>> West Hartford, CT
>>>>>>>
>>>>>>
>>>>>
>>>>> David Winsemius, MD
>>>>> Heritage Laboratories
>>>>> West Hartford, CT
>>>>>
>>>>>
>>>>
>>>
>>> David Winsemius, MD
>>> Heritage Laboratories
>>> West Hartford, CT
>>>
>> <Test_data_b.txt><Test_data_res.txt><Annual Results NLA WMS Ver1.r>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


