[R] Help using Cast (Text) Version

Steve Sidney sbsidney at mweb.co.za
Mon Jan 18 16:26:45 CET 2010


David

Excellent! It is exactly what I was looking for.

Two very small questions to conclude:
1) I don't understand the significance of the -1 in the square brackets.
2) I'm not sure I really understand how function(x) works in this context.

If you can point me towards a doc that explains this in simple terms I would 
be obliged. Don't expect you to have to provide the answer.

Once again many thanks for your patience and help

Regards
Steve
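
On the two questions above, a minimal sketch on made-up data may help (the data frame b_toy and its values are hypothetical, not the attached Test_data_b.txt). A negative index in square brackets drops that position, so b[,-1] means "every column except the first", which here removes the lab names. function(x) defines an unnamed function, and apply(..., 2, ...) calls it once per column with x bound to that column's values. Both topics are covered in "An Introduction to R", which ships with R (the sections on index vectors and on writing your own functions), and in the help pages ?Extract and ?apply.

```r
# Hypothetical toy data: one label column plus three "round" columns.
b_toy <- data.frame(
  lab = c("a1", "b2", "c3", "d4"),
  `1` = c(1, 0, NA, 0),
  `2` = c(NA, 1, 0, 1),
  `3` = c(1, NA, NA, 0),
  check.names = FALSE
)

# Negative index: drop column 1 (the lab names) and keep the rest.
rounds <- b_toy[, -1]
print(names(rounds))

# apply() over margin 2 hands each column in turn to the anonymous
# function, where it is known as x.
participants <- apply(rounds, 2, function(x) sum(!is.na(x)))

# The same computation with a named function, to show that function(x)
# is just a function defined on the spot.
count_present <- function(x) sum(!is.na(x))
participants_named <- apply(rounds, 2, count_present)

# Extra arguments after the function (here na.rm = TRUE) are passed on to it.
fails <- apply(rounds, 2, sum, na.rm = TRUE)

print(participants)
print(fails)
```

The anonymous form is only a convenience: writing function(x) sum(!is.na(x)) inline saves naming a helper that is used once.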

----- Original Message ----- 
From: "David Winsemius" <dwinsemius at comcast.net>
To: "David Winsemius" <dwinsemius at comcast.net>
Cc: "Steve Sidney" <sbsidney at mweb.co.za>; <r-help at r-project.org>
Sent: Monday, January 18, 2010 3:58 PM
Subject: Re: [R] Help using Cast (Text) Version


>
> On Jan 18, 2010, at 8:53 AM, David Winsemius wrote:
>
>>
>> On Jan 18, 2010, at 7:58 AM, Steve Sidney wrote:
>>
>>> Hi David
>>>
>>> Thanks for your patience, as well as thanks to Dennis Murphy and James
>>> Rome for trying to help.
>>>
>>> I have tried all your suggestions but still no joy.
>>>
>>> In order to try and resolve the problem I am attaching the following
>>> files; I hope the system allows this.
>>>
>>> 1) Test_data_res.txt (created with dput; this is all the data to be
>>> evaluated)
>>> 2) Test_data_b.txt (after performing the melt-cast; see the code)
>>> 3) Annual Results NLA WMS Ver1.r (the code for one of the parameters
>>> to be evaluated, in this case SPC)
>>>
>>> Background: the data are from a laboratory Proficiency Testing Scheme,
>>> and z-scores outside the |3| range are identified as "fails". My
>>> code assigns a 1 or 0 depending on this evaluation, and because not
>>> every lab participates in every round, NA is assigned where there are
>>> no results.
>>>
>>> What I am looking for is the following for each round (1-6):
>>> a) The total number of participants, which in this case are represented
>>> by 1's and 0's per round
>>
>> > apply(b[,-1], 2, function(x) sum(is.na(x) ) )
>> [1] 32 21 21 18 14 15
>
> Oops, forgot the negation operator to turn not(NA) into TRUE:
>
> > apply(b[,-1], 2, function(x) sum(!is.na(x) ) )
> [1] 40 51 51 54 58 57
>
>>
>>
>>
>>> b) The total number of 1's, ie Fails per round
>>
>> > apply(b[,-1], 2, sum, na.rm=TRUE )
>> [1] 5 2 4 3 5 7
>>
>>>
>>>
>>>
>>> Regards
>>> Steve
>>>
>>>
>>>
>>> ----- Original Message ----- From: "David Winsemius"
>>> <dwinsemius at comcast.net>
>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>> Cc: <r-help at r-project.org>
>>> Sent: Monday, January 18, 2010 12:38 AM
>>> Subject: Re: [R] Help using Cast (Text) Version
>>>
>>>
>>>>
>>>> On Jan 17, 2010, at 4:37 PM, Steve Sidney wrote:
>>>>
>>>>> Well, now I am totally baffled!
>>>>>
>>>>> Using
>>>>>
>>>>> sum( !is.na(b[,3])) I get the count of all entries in col 3 except
>>>>> those that are NA. Great, that solves the first problem.
>>>>>
>>>>> What I can't seem to do is use the same logic to count all the 1's
>>>>> in that col, which are there before I use the cast with margins.
>>>>>
>>>>> So it seems to me that somehow   is wrong and is the part of my 
>>>>> understanding that's missing.
>>>>>
>>>>> My guess is that before using margins and sum in the cast
>>>>> statement the col is a character type, and in order for == 1 to work
>>>>> I need to convert this to an integer.
>>>>
>>>> You can test your theory with:
>>>>
>>>> sum(as.integer(b[,3]) == 1)
>>>>
>>>> Or you could post some reproducible data using dput ....
>>>>
>>>> -- 
>>>> David.
>>>>
>>>>
>>>>>
>>>>> Hope this helps you to understand the problem.
>>>>>
>>>>> Regards
>>>>> Steve
>>>>>
>>>>> Your help is much appreciated
>>>>> ----- Original Message ----- From: "David Winsemius"
>>>>> <dwinsemius at comcast.net>
>>>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>>>> Cc: <r-help at r-project.org>
>>>>> Sent: Sunday, January 17, 2010 7:36 PM
>>>>> Subject: Re: [R] Help using Cast (Text) Version
>>>>>
>>>>>
>>>>>>
>>>>>> On Jan 17, 2010, at 11:56 AM, Steve Sidney wrote:
>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>> Thanks, I'll try that... but no, what I need is the total (1's) for
>>>>>>> each of the rows, labelled 1-6 at the top of each col in the table
>>>>>>> provided.
>>>>>>
>>>>>> Part of my confusion with your request (which remains unaddressed) is
>>>>>> what you mean by "valid". The melt-cast operation has turned a bunch of
>>>>>> NA's into 0's which are now indistinguishable from the original 0's.
>>>>>> So I don't see any way that operating on "b" could tell you the numbers
>>>>>> you are asking for. If you were working on the original data, "res",
>>>>>> you might have gotten the column-wise "valid" counts of column 2 with
>>>>>> something like:
>>>>>>
>>>>>> sum( !is.na(res[,2]) )
>>>>>>
>>>>>>>
>>>>>>> What I guess I am not sure of is how to identify the col after the
>>>>>>> melt and cast.
>>>>>>
>>>>>> The cast object represents columns as a list of vectors. The i-th
>>>>>> column is b[[i]] which could be further referenced as a vector. So the
>>>>>> j-th row entry for the i-th column would be b[[i]][j].
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Steve
>>>>>>>
>>>>>>> ----- Original Message ----- From: "David Winsemius"
>>>>>>> <dwinsemius at comcast.net>
>>>>>>> To: "Steve Sidney" <sbsidney at mweb.co.za>
>>>>>>> Cc: <r-help at r-project.org>
>>>>>>> Sent: Sunday, January 17, 2010 4:39 PM
>>>>>>> Subject: Re: [R] Help using Cast (Text) Version
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Jan 17, 2010, at 5:31 AM, Steve Sidney wrote:
>>>>>>>>
>>>>>>>>> Sorry to repeat the message, not sure if the HTML version has been
>>>>>>>>> received - Apologies for duplication
>>>>>>>>>
>>>>>>>>> Dear list
>>>>>>>>>
>>>>>>>>> I am trying to count the number of occurrences in a column of a data
>>>>>>>>> frame, and there is missing data identified by NA.
>>>>>>>>>
>>>>>>>>> I am able to melt and cast the data correctly as well as sum the
>>>>>>>>> occurrences using margins and sum.
>>>>>>>>>
>>>>>>>>> Here are the melt and cast commands
>>>>>>>>>
>>>>>>>>> bw = melt(res, id=c("lab","r"), "pf_zbw")
>>>>>>>>> b = cast(bw, lab ~ r, sum, margins = TRUE)
>>>>>>>>>
>>>>>>>>> Sample Data (before using sum and margins)
>>>>>>>>>
>>>>>>>>>      lab  1  2  3  4  5  6
>>>>>>>>> 1  4er66  1 NA  1  0 NA  0
>>>>>>>>> 2  4gcyi  0  0  1  0  0  0
>>>>>>>>> 3  5d3hh  0  0  0 NA  0  0
>>>>>>>>> 4  5d3wt  0  0  0  0  0  0
>>>>>>>>> .
>>>>>>>>> . lines deleted to save space
>>>>>>>>> .
>>>>>>>>> 69 v3st5 NA NA  1 NA NA NA
>>>>>>>>> 70 a22g5 NA  0 NA NA NA NA
>>>>>>>>> 71 b5dd3 NA  0 NA NA NA NA
>>>>>>>>> 72 g44d2 NA  0 NA NA NA NA
>>>>>>>>>
>>>>>>>>> Data after using sum and margins
>>>>>>>>>
>>>>>>>>>      lab 1 2 3 4 5 6 (all)
>>>>>>>>> 1  4er66 1 0 1 0 0 0     2
>>>>>>>>> 2  4gcyi 0 0 1 0 0 0     1
>>>>>>>>> 3  5d3hh 0 0 0 0 0 0     0
>>>>>>>>> 4  5d3wt 0 0 0 0 0 0     0
>>>>>>>>> 5  6n44r 0 0 0 0 0 0     0
>>>>>>>>> .
>>>>>>>>> .lines deleted to save space
>>>>>>>>> .
>>>>>>>>> 70 a22g5 0 0 0 0 0 0     0
>>>>>>>>> 71 b5dd3 0 0 0 0 0 0     0
>>>>>>>>> 72 g44d2 0 0 0 0 0 0     0
>>>>>>>>> 73 (all) 5 2 4 3 5 7    26
>>>>>>>>>
>>>>>>>>> Using length just tells me how many total rows there are.
>>>>>>>>
>>>>>>>>
>>>>>>>>> What I need to do is count how many rows have valid data, in this
>>>>>>>>> case either a one (1) or a zero (0) in b
>>>>>>>>
>>>>>>>> I'm guessing that you mean to apply that test to the column in b
>>>>>>>> labeled "(all)". If that's the case, then something like (obviously
>>>>>>>> untested):
>>>>>>>>
>>>>>>>> sum( b$'(all)' == 1 | b$'(all)' == 0)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I have a report to construct for tomorrow (Mon), so any help would
>>>>>>>>> be appreciated
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Steve
>>>>>>>>
>>>>>>>> David Winsemius, MD
>>>>>>>> Heritage Laboratories
>>>>>>>> West Hartford, CT
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> David Winsemius, MD
>>>>>> Heritage Laboratories
>>>>>> West Hartford, CT
>>>>>>
>>>>>>
>>>>>
>>>>
>>>> David Winsemius, MD
>>>> Heritage Laboratories
>>>> West Hartford, CT
>>>>
>>> <Test_data_b.txt><Test_data_res.txt><Annual Results NLA WMS Ver1.r>
>>
>> David Winsemius, MD
>> Heritage Laboratories
>> West Hartford, CT
>>
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
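
The two per-round counts worked out in the quoted replies (participants via !is.na, fails via sum with na.rm=TRUE) can probably also be written without apply(). As a sketch on invented data (b_demo is hypothetical and stands in for a table that still contains its NA entries), base R's colSums() should give the same numbers:

```r
# Hypothetical stand-in for a lab-by-round table that still has its NAs.
b_demo <- data.frame(
  lab = c("lab1", "lab2", "lab3", "lab4", "lab5"),
  r1 = c(1, 0, 0, NA, NA),
  r2 = c(NA, 0, 0, 0, 1),
  r3 = c(1, 1, NA, NA, 0)
)

# Drop the label column so only the 0/1/NA scores remain.
scores <- b_demo[, -1]

# a) participants per round: count the entries that are not NA.
participants <- colSums(!is.na(scores))

# b) fails per round: add up the 1's, ignoring NA.
fails <- colSums(scores, na.rm = TRUE)

print(participants)
print(fails)
```

Either form only works while the NA entries are still present; once they have been turned into 0's, a non-participant can no longer be told apart from a pass.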
