[R] discrete ECDF

David Winsemius dwinsemius at comcast.net
Wed Aug 4 18:07:56 CEST 2010


On Aug 4, 2010, at 11:06 AM, Charles C. Berry wrote:

> On Wed, 4 Aug 2010, David Winsemius wrote:
>
>> Dear list;
>>
>> I just created a utility function that replicates what I have done
>> in the past with Excel or OO.org by putting a formula of the form
>> =sum($A1:A$1) in the upper-left corner of a section and then doing a
>> "fill" procedure by dragging the lower-right corner down and to the
>> right. When divided by the grand sum of the entries, the result is
>> a 2D discrete ECDF.
>>
>> I keep thinking I am missing the obvious, but I did try searching.  
>> Here is my effort at creating that functionality:
>>
>> ecdf.tbl <- function (.dat) {
>>      .dat <- data.matrix(.dat)  # speeds up calculations
>>      .sdat <- matrix(0, nrow(.dat), ncol(.dat) )
>>      .sdat[] <- sapply(1:ncol(.dat), function(x)
>>                     sapply(1:nrow(.dat),
>>                           function(y)  sum(.dat[1:y, 1:x], na.rm=TRUE )  ) )
>>      return(.sdat) }
>>
>
> ecdf.tbl3 <-
> 	function(mat) {
> 		mat[is.na(mat)] <- 0
> 		t( apply( apply( mat,2, cumsum ), 1, cumsum ))}

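Nice, ...  and I don't think the inner apply call is even necessary
when the input is a data frame such as tst, because cumsum() on a data
frame already works column by column. (On a plain matrix, cumsum()
returns a flattened vector, so matrix input still needs the inner
apply.)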

ecdf.tbl2 <-
	function(mat) {
		mat[is.na(mat)] <- 0
		t( apply( cumsum(mat), 1, cumsum ))}
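
A quick check of the two versions above, as a sketch rather than a definitive test. It assumes the test data from the original question; the names tst_df, tst_mat and the res*/same_on_df/fails_on_matrix variables are made up here for illustration. The point is that cumsum() behaves differently on a data frame (column by column) than on a matrix (flattened to a vector):

```r
# Functions as posted in this thread
ecdf.tbl3 <- function(mat) {
  mat[is.na(mat)] <- 0
  t(apply(apply(mat, 2, cumsum), 1, cumsum))
}
ecdf.tbl2 <- function(mat) {
  mat[is.na(mat)] <- 0
  t(apply(cumsum(mat), 1, cumsum))
}

# Test data from the original question (tst_df / tst_mat are illustrative names)
tst_df <- read.table(text = "NA 5 6
4 5 7
5 6 8
6 7 9
NA 8 NA")
tst_mat <- data.matrix(tst_df)

# Data frame input: cumsum() is applied to each column, so both versions agree
res2_df <- unname(ecdf.tbl2(tst_df))
res3_df <- unname(ecdf.tbl3(tst_df))
same_on_df <- isTRUE(all.equal(res2_df, res3_df))

# Matrix input: cumsum() drops the dimensions, so the apply() in ecdf.tbl2 errors
res2_mat <- tryCatch(ecdf.tbl2(tst_mat), error = function(e) NULL)
fails_on_matrix <- is.null(res2_mat)

# The two-apply version handles matrix input as well
res3_mat <- unname(ecdf.tbl3(tst_mat))
```

If matrix input is a possibility, keeping the inner apply() (or converting with as.data.frame() first) looks like the safer choice.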

-- 
David.

>
> HTH,
>
> Chuck
>
>
>>> tst <- read.table(textConnection("NA 5 6
>> 4	5	7
>> 5	6	8
>> 6	7	9
>> NA 8 NA")   )
>>
>>> tst
>> V1 V2 V3
>> 1 NA  5  6
>> 2  4  5  7
>> 3  5  6  8
>> 4  6  7  9
>> 5 NA  8 NA
>>
>>> ecdf.tbl(tst)
>>   [,1] [,2] [,3]
>> [1,]    0    5   11
>> [2,]    4   14   27
>> [3,]    9   25   46
>> [4,]   15   38   68
>> [5,]   15   46   76
>>
>>> ecdf.tbl(tst)/sum(tst, na.rm=TRUE)
>>         [,1]       [,2]      [,3]
>> [1,] 0.00000000 0.06578947 0.1447368
>> [2,] 0.05263158 0.18421053 0.3552632
>> [3,] 0.11842105 0.32894737 0.6052632
>> [4,] 0.19736842 0.50000000 0.8947368
>> [5,] 0.19736842 0.60526316 1.0000000
>>
>>
>> Did I miss a more compact vectorized or sweep()-ed solution? (I  
>> realize this is not really a function in the sense that ecdf() is.)  
>> I have seen prop.table and margin.table, but could not see how they  
>> would address this problem.
>>
>> -- 
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>>
>
> Charles C. Berry                            (858) 534-2098
>                                             Dept of Family/Preventive Medicine
> E mailto:cberry at tajo.ucsd.edu             UC San Diego
> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>
>

David Winsemius, MD
West Hartford, CT
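
On the original question of a more compact vectorized form: one further possibility, which nobody in the thread proposed, is to write the double cumulative sum as a product with two triangular 0/1 matrices. This is only a sketch under the same assumptions as the thread (NA counted as 0); the function name ecdf.tbl.mm and the variable names are hypothetical.

```r
# Hypothetical helper (not from the thread): 2D discrete ECDF via matrix products
ecdf.tbl.mm <- function(x) {
  m <- data.matrix(x)
  m[is.na(m)] <- 0                                # treat missing cells as zero counts
  n <- nrow(m)
  p <- ncol(m)
  lower <- lower.tri(diag(n), diag = TRUE) * 1    # n x n, 1 where row index >= column index
  upper <- upper.tri(diag(p), diag = TRUE) * 1    # p x p, 1 where row index <= column index
  csum <- lower %*% m %*% upper                   # cumulative sums down rows, then across columns
  unname(csum / sum(m))                           # divide by the grand total
}

# Test data from the original question
tst <- read.table(text = "NA 5 6
4 5 7
5 6 8
6 7 9
NA 8 NA")

F2d <- ecdf.tbl.mm(tst)
```

For tables of this size the apply/cumsum versions are probably just as readable; the matrix form mainly makes the "sum of everything above and to the left" structure explicit.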


