[R] create new matrix from user-defined function

arun smartpink111 at yahoo.com
Thu Jul 11 22:28:32 CEST 2013


Hi,
I'm not sure I understand you correctly.
I found it easier to index by number than to type out the lengthy column names, but names work just as well.
You could do it along the lines of the example below.

matNew <- matrix(dat3[rowSums(dat3[c("B_MW_EEsDue1","C_MW_EEsDue2")]) != dat3[["D_MW_EEsDueTotal"]], 1],
                 ncol=1, dimnames=list(NULL,"MW_EEsDue_ERRORS"))

 matNew
#     MW_EEsDue_ERRORS
#[1,]             1882
#[2,]             1884
#[3,]             1885
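
Since the subject line asks for a user-defined function, the same idea can be wrapped in one that takes column names as arguments. This is only a sketch: `find_sum_errors` and its argument names are made up for illustration and do not come from any package.

```r
# Hypothetical helper (not from the thread): returns a 1-column matrix of IDs
# for rows where the total column differs from the sum of the part columns.
find_sum_errors <- function(dat, id_col, part_cols, total_col,
                            label = "MW_EEsDue_ERRORS") {
  # drop = FALSE keeps a data frame even if only one part column is given
  mismatch <- rowSums(dat[, part_cols, drop = FALSE]) != dat[[total_col]]
  matrix(dat[[id_col]][mismatch], ncol = 1, dimnames = list(NULL, label))
}

# Example data copied from the original question
dat3 <- data.frame(A_CaseID = c(1881, 1882, 1883, 1884, 1885),
                   B_MW_EEsDue1 = c(2, 2, 1, 4, 6),
                   C_MW_EEsDue2 = c(5, 5, 4, 1, 6),
                   D_MW_EEsDueTotal = c(7, 9, 5, 6, 112))

errs <- find_sum_errors(dat3, "A_CaseID",
                        c("B_MW_EEsDue1", "C_MW_EEsDue2"),
                        "D_MW_EEsDueTotal")
errs
```

Passing the names as character strings means the function does not depend on column order, which should help if the real data frame has many more columns.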

If you have a very large dataset, you could also look at the data.table package (see ?data.table).


library(data.table)
dt3<- data.table(dat3)
dtNew<-subset(dt3[D_MW_EEsDueTotal!=B_MW_EEsDue1+C_MW_EEsDue2],select=1)
 dtNew
#   A_CaseID
#1:     1882
#2:     1884
#3:     1885
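
The same filter can also be written so that the row condition and the renamed output column go in a single call. A minimal sketch, assuming a reasonably recent data.table in which `.()` is available as an alias for `list()` inside `j`; `dtNew2` is just an illustrative name:

```r
library(data.table)

# Example data copied from the original question
dat3 <- data.frame(A_CaseID = c(1881, 1882, 1883, 1884, 1885),
                   B_MW_EEsDue1 = c(2, 2, 1, 4, 6),
                   C_MW_EEsDue2 = c(5, 5, 4, 1, 6),
                   D_MW_EEsDueTotal = c(7, 9, 5, 6, 112))
dt3 <- as.data.table(dat3)

# i: keep rows where the total disagrees with the sum of the parts
# j: return the case ID under the requested column name
dtNew2 <- dt3[D_MW_EEsDueTotal != B_MW_EEsDue1 + C_MW_EEsDue2,
              .(MW_EEsDue_ERRORS = A_CaseID)]
dtNew2
```

This avoids the separate subset() and setnames() steps, at the cost of spelling out the column names.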


#Some speed comparisons:
set.seed(1254)
datTest<- data.frame(A=sample(1000:15000,1e7,replace=TRUE),B= sample(1:10,1e7,replace=TRUE),C=sample(5:15,1e7,replace=TRUE),D=sample(5:25,1e7,replace=TRUE))

system.time(res1<- data.frame(MW_EEsDue_ERRORS=datTest[datTest[[4]] != datTest[[2]]+datTest[[3]],][[1]]))
# user  system elapsed 
#  2.256   0.000   2.145 

system.time(mat1<-matrix(datTest[rowSums(datTest[,2:3])!=datTest[,4],1],ncol=1,dimnames=list(NULL,"MW_EEsDue_ERRORS")))
 #  user  system elapsed 
 # 0.756   0.088   0.849 

system.time(res2<- data.frame(MW_EEsDue_ERRORS=datTest[addmargins(as.matrix(datTest[,2:3]),2)[,3]!=datTest[,4],1]))
#   user  system elapsed 
#115.740   0.000 105.778 

dtTest<- data.table(datTest)
system.time(res3<- subset(dtTest[D!=B+C],select=1))
 # user  system elapsed 
 # 0.508   0.000   0.477 

identical(res1,res2)
#[1] TRUE
setnames(res3,"A","MW_EEsDue_ERRORS")
 identical(res1,as.data.frame(res3))
#[1] TRUE
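
One caveat about the comparisons above: they all rely on exact `!=`, which is fine for whole-number counts like these. If the real columns hold fractional values or NAs, a tolerance-based check may be safer. The sketch below is an assumption on my part, not something tested on the real data; `sum_mismatch` and the default `tol` are hypothetical.

```r
# Hypothetical helper: TRUE where the parts do not add up to the total,
# allowing a small numerical tolerance.
sum_mismatch <- function(parts_sum, total, tol = 1e-8) {
  bad <- abs(parts_sum - total) > tol
  # Assumption: rows with missing values are treated as errors
  bad[is.na(bad)] <- TRUE
  bad
}

# 0.1 + 0.2 is not exactly 0.3 in floating point, so != would flag it
flags <- sum_mismatch(c(0.1 + 0.2, 7, NA), c(0.3, 9, 5))
flags
#[1] FALSE  TRUE  TRUE
```

The result can be used as the row index in any of the approaches above in place of the `!=` expression.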
A.K.




----- Original Message -----
From: bcrombie <bcrombie at utk.edu>
To: r-help at r-project.org
Cc: 
Sent: Thursday, July 11, 2013 3:54 PM
Subject: Re: [R] create new matrix from user-defined function

Dan and Arun, thank you very much for your replies.  They are both very helpful and I love to get different versions of an answer so I can learn more R code.  You both used indexing to refer to the columns needed in the function, but since my real data frame will be much larger I'm assuming I can replace the index numbers with the names of the columns in quotes instead?   I'll try this on my own if you're busy with other forum questions.  Thanks, again.

From: Nordlund, Dan (DSHS/RDA) [via R] [mailto:ml-node+s789695n4671267h89 at n4.nabble.com]
Sent: Wednesday, July 10, 2013 5:46 PM
To: Crombie, Burnette N
Subject: Re: create new matrix from user-defined function

> -----Original Message-----
> From: [hidden email] [mailto:r-help-bounces at r-project.org] On Behalf Of bcrombie
> Sent: Wednesday, July 10, 2013 12:19 PM
> To: [hidden email]
> Subject: [R] create new matrix from user-defined function
>
> #Let's say I have the following data set:
>
> dat3 = data.frame(A_CaseID = c(1881, 1882, 1883, 1884, 1885),
>                   B_MW_EEsDue1 = c(2, 2, 1, 4, 6),
>                   C_MW_EEsDue2 = c(5, 5, 4, 1, 6),
>                   D_MW_EEsDueTotal = c(7, 9, 5, 6, 112))
> dat3
> #   A_CaseID B_MW_EEsDue1 C_MW_EEsDue2 D_MW_EEsDueTotal
> # 1     1881            2            5                7
> # 2     1882            2            5                9
> # 3     1883            1            4                5
> # 4     1884            4            1                6
> # 5     1885            6            6              112
>
> # I want to:
> #CREATE A NEW 1-COLUMN MATRIX (of unknown #rows) LISTING ONLY "A"'s WHERE "D != B + C"
> #THIS COLUMN CAN BE LABELED "MW_EEsDue_ERRORS", and output for this example should be:
>
> #   MW_EEsDue_ERRORS
> # 1             1882
> # 2             1884
> # 3             1885
>
> #What is the best way to do this?  Thanks for your time.  BNC
>
>

Here is one option, there are many others.  Only you can decide what is "best".

data.frame(MW_EEsDue_ERRORS=dat3[dat3[[4]] != dat3[[2]]+dat3[[3]],][[1]])
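
For readers who want to see what the one-liner does, it can be unpacked into steps. In this sketch `[[` is given quoted column names instead of positions; the intermediate names (`total`, `parts`, `mismatch`, `res`, `by_position`) are illustrative only.

```r
# Example data copied from the original question
dat3 <- data.frame(A_CaseID = c(1881, 1882, 1883, 1884, 1885),
                   B_MW_EEsDue1 = c(2, 2, 1, 4, 6),
                   C_MW_EEsDue2 = c(5, 5, 4, 1, 6),
                   D_MW_EEsDueTotal = c(7, 9, 5, 6, 112))

total    <- dat3[["D_MW_EEsDueTotal"]]                      # same column as dat3[[4]]
parts    <- dat3[["B_MW_EEsDue1"]] + dat3[["C_MW_EEsDue2"]] # dat3[[2]] + dat3[[3]]
mismatch <- total != parts                                  # logical vector, one value per row

# Keep the case IDs of the mismatching rows under the requested label
res <- data.frame(MW_EEsDue_ERRORS = dat3[["A_CaseID"]][mismatch])

# The positional one-liner, for comparison
by_position <- data.frame(MW_EEsDue_ERRORS=dat3[dat3[[4]] != dat3[[2]]+dat3[[3]],][[1]])
identical(res, by_position)
```

Using names rather than positions keeps the code working if columns are later added or reordered.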


Hope this is helpful,

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204

______________________________________________
[hidden email] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





