[R] subset data using a vector

Michael Dewey lists at dewey.myzen.co.uk
Mon Nov 23 19:10:33 CET 2015


Try stepping through your function and work out what happens when the 
length of m1 is the value returned by the expression I suggested.

 >>       x <- length(plot.id)
 >>
 >>       for (i in (1:x)) {
 >>
 >>           m2[i] <- m1[[i]]

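So unless m1 has length at least x you are doomed: m1[[i]] fails with 
"subscript out of bounds" as soon as i exceeds length(m1). Your data 
frame has no column called ranges2use (the column is rangestouse), so 
dataset$ranges2use is NULL and m1 is an empty list.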
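
To see this for yourself, here is a minimal sketch using a tiny stand-in data frame rather than your full mydata. The name d and its three rows are made up for illustration; only the column names mirror yours.

```r
# Stand-in for mydata: same column names, only three rows (illustrative).
d <- data.frame(
  plotid      = c(289L, 384L, 385L),
  ranges      = c(1L, 2L, 3L),
  rangestouse = c("1,2", "1,2,3", "2,3,4"),
  stringsAsFactors = TRUE
)

# There is no column called ranges2use, so $ returns NULL ...
is.null(d$ranges2use)

# ... and splitting NULL gives an empty list, so m1[[1]] is out of bounds.
m1_wrong <- strsplit(as.character(d$ranges2use), ",")
length(m1_wrong)

# Using the real column name gives one element per row.
m1 <- strsplit(as.character(d$rangestouse), ",")
length(m1)
m1[[2]]
```

Note also that m2[i] <- m1[[i]] tries to store a whole vector in a single element, which R warns about and truncates to the first value; keeping the pieces in a list avoids that.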

On 23/11/2015 16:26, DIGHE, NILESH [AG/2362] wrote:
> Michael: I would like to use the actual range IDs listed in column "rangestouse" to subset my data, not the length of that vector.
>
> Thanks.
> Nilesh
>
> -----Original Message-----
> From: Michael Dewey [mailto:lists at dewey.myzen.co.uk]
> Sent: Monday, November 23, 2015 10:17 AM
> To: DIGHE, NILESH [AG/2362]; r-help at r-project.org
> Subject: Re: [R] subset data using a vector
>
> length(strsplit(as.character(mydata$ranges2use), ","))
>
> Was that what you expected? I think not.
>
> On 23/11/2015 16:05, DIGHE, NILESH [AG/2362] wrote:
>> Dear R users,
>> I would like to subset my data using a vector built from the variable "ranges".  For a given plotid, this vector holds the current range (ranges), the preceding range (ranges - 1), and the following range (ranges + 1).  If the preceding or following range falls outside the levels of ranges in the data set, I would like to drop it and keep only the ranges that are available.  The variable "rangestouse" lists all the ranges I want to use when subsetting for a given plotid.  After subsetting the data by these ranges, I would like to extract the yield of the checks in those ranges and adjust the yield of each plotid by dividing it by the check average for those ranges.
>>
>> I have created this function (fun1) but when I run it, I get the following error:
>>
>> Error in m1[[i]] : subscript out of bounds
>>
>> Any help will be highly appreciated!
>> Thanks, Nilesh
>>
>> Dataset:
>> dput(mydata)
>> structure(list(rows = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
>> 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>> 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
>> 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
>> 4L, 4L, 4L, 4L), .Label = c("1", "2", "3", "4"), class = "factor"),
>> cols = structure(c(1L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 2L, 3L, 4L,
>> 5L, 6L, 7L, 8L, 9L, 1L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 2L, 3L, 4L,
>> 5L, 6L, 7L, 8L, 9L, 1L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 2L, 3L, 4L,
>> 5L, 6L, 7L, 8L, 9L, 1L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 2L, 3L, 4L,
>> 5L, 6L, 7L, 8L, 9L), .Label = c("1", "2", "3", "4", "5", "6", "7",
>> "8", "9", "10", "11", "12", "13", "14", "15", "16"), class = "factor"),
>>       plotid = c(289L, 298L, 299L, 300L, 301L, 302L, 303L, 304L,
>>       290L, 291L, 292L, 293L, 294L, 295L, 296L, 297L, 384L, 375L,
>>       374L, 373L, 372L, 371L, 370L, 369L, 383L, 382L, 381L, 380L,
>>       379L, 378L, 377L, 376L, 385L, 394L, 395L, 396L, 397L, 398L,
>>       399L, 400L, 386L, 387L, 388L, 389L, 390L, 391L, 392L, 393L,
>>       480L, 471L, 470L, 469L, 468L, 467L, 466L, 465L, 479L, 478L,
>>       477L, 476L, 475L, 474L, 473L, 472L), yield = c(5.1, 5, 3.9,
>>       4.6, 5, 4.4, 5.1, 4.3, 5.5, 5, 5.5, 6.2, 5.1, 5.5, 5.2, 5,
>>       5.6, 4.7, 5.4, 4.8, 4.6, 3.9, 4.2, 4.4, 5.3, 5.5, 5.8, 4.6,
>>       5.8, 4.8, 5.3, 5.5, 5.6, 4.2, 4.6, 4.2, 4.2, 4, 3.9, 4.5,
>>       5, 4.8, 4.9, 5.2, 5.3, 4.6, 4.8, 5.3, 4.5, 4.5, 5.1, 4.9,
>>       5.2, 4.6, 4.8, 5.4, 5.9, 4.9, 5.8, 5.3, 4.8, 4.7, 5.2, 5.8
>>       ), linecode = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
>>       2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L,
>>       2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>>       1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>>       2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L), .Label = c("check",
>>       "variety"), class = "factor"), ranges = c(1L, 1L, 1L, 1L,
>>       1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
>>       2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
>>       3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L,
>>       4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L
>>       ), rangestouse = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
>>       1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,
>>       2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
>>       3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L,
>>       4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("1,2",
>>       "1,2,3", "2,3,4", "3,4"), class = "factor")), .Names = c("rows",
>> "cols", "plotid", "yield", "linecode", "ranges", "rangestouse"
>> ), class = "data.frame", row.names = c(NA, -64L))
>>
>> Function:
>>
>> fun1<- function (dataset, plot.id, ranges2use, control)
>> {
>>       m1 <- strsplit(as.character(dataset$ranges2use), ",")
>>       dat1 <- data.frame()
>>       m2 <- c()
>>       row_check_mean <- c()
>>       row_check_adj_yield <- c()
>>       x <- length(plot.id)
>>       for (i in (1:x)) {
>>           m2[i] <- m1[[i]]
>>           dat1 <- dataset[dataset$ranges %in% m2[i], ]
>>           row_check_mean[i] <- tapply(dat1$trait, dat1$control,
>>               mean, na.rm = TRUE)[1]
>>           row_check_adj_yield[i] <- ifelse(control[i] == "variety",
>>               trait[i]/dataset$row_check_mean[i], trait[i]/trait[i])
>>       }
>>       data.frame(dataset, row_check_adj_yield)
>> }
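
For what it is worth, here is one way the function could be reworked. It is only a sketch under my own assumptions: the name adjust_by_checks and its check_level argument are mine and not anything from your code; I assume the yield column plays the role of "trait", that linecode marks checks with the level "check", and that check plots themselves get an adjusted value of 1 (as your trait[i]/trait[i] suggests). The toy data set is made up so the result can be checked by hand.

```r
# Hypothetical rework of fun1; function and argument names are illustrative.
adjust_by_checks <- function(dataset, check_level = "check") {
  # One character vector of range ids per row, e.g. c("1", "2").
  wanted <- strsplit(as.character(dataset$rangestouse), ",")
  is_check <- dataset$linecode == check_level

  # Mean yield of the check plots lying in each row's ranges.
  check_mean <- vapply(wanted, function(r) {
    in_range <- dataset$ranges %in% as.integer(r)
    mean(dataset$yield[in_range & is_check], na.rm = TRUE)
  }, numeric(1))

  # Varieties are divided by the check mean; checks are set to 1.
  adj <- ifelse(is_check, 1, dataset$yield / check_mean)
  data.frame(dataset, row_check_mean = check_mean, row_check_adj_yield = adj)
}

# Tiny made-up data set with the same column names as mydata.
toy <- data.frame(
  plotid      = 1:6,
  yield       = c(4, 5, 6, 3, 2, 8),
  linecode    = c("check", "variety", "check", "variety", "check", "variety"),
  ranges      = c(1L, 1L, 2L, 2L, 3L, 3L),
  rangestouse = c("1,2", "1,2", "1,2,3", "1,2,3", "2,3", "2,3")
)
res <- adjust_by_checks(toy)
res[, c("plotid", "row_check_mean", "row_check_adj_yield")]
```

With your data the call would be adjust_by_checks(mydata). The ranges are kept in a list (wanted) rather than squeezed into a vector, so each row can carry two or three range ids without truncation.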
>>
>> Apply function:
>> fun1(mydata, plot.id=mydata$plotid, ranges2use =
>> mydata$rangestouse,control=mydata$linecode)
>>
>> Error:
>>
>> Error in m1[[i]] : subscript out of bounds
>>
>> Session info:
>> R version 3.2.1 (2015-06-18)
>> Platform: i386-w64-mingw32/i386 (32-bit)
>> Running under: Windows 7 x64 (build 7601) Service Pack 1
>>
>> locale:
>> [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
>> [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
>> [5] LC_TIME=English_United States.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> loaded via a namespace (and not attached):
>>    [1] magrittr_1.5    plyr_1.8.3      tools_3.2.1     reshape2_1.4.1  Rcpp_0.12.1     stringi_1.0-1
>>    [7] grid_3.2.1      agridat_1.12    stringr_1.0.0   lattice_0.20-31
>>
>>
>> Nilesh Dighe
>> (806)-252-7492 (Cell)
>> (806)-741-2019 (Office)
>>
>>
>> This e-mail message may contain privileged and/or confidential
>> information, and is intended to be received only by persons entitled
>> to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and all attachments from any servers, hard drives or any other media. Other use of this e-mail by you is strictly prohibited.
>>
>> All e-mails and attachments sent and received are subject to
>> monitoring, reading and archival by Monsanto, including its subsidiaries. The recipient of this e-mail is solely responsible for checking for the presence of "Viruses" or other "Malware".
>> Monsanto, along with its subsidiaries, accepts no liability for any
>> damage caused by any such code transmitted by or accompanying this e-mail or any attachment.
>>
>>
>> The information contained in this email may be subject to the export
>> control laws and regulations of the United States, potentially
>> including but not limited to the Export Administration Regulations
>> (EAR) and sanctions regulations issued by the U.S. Department of Treasury, Office of Foreign Asset Controls (OFAC).  As a recipient of this information you are obligated to comply with all applicable U.S. export laws and regulations.
>>
>>
>
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html
>

-- 
Michael
http://www.dewey.myzen.co.uk/home.html


