[R] sapply to bind columns, with repeat?

Fri Aug 12 20:28:19 CEST 2011

Katrina,

Try this:

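reorg <- function(x) {
  # x is one row of coop.dat: 8 header fields, then repeating 6-field day groups
  mat <- matrix(x[9:length(x)], ncol = 6, byrow = TRUE)  # one row per day group
  # repeat the 8 header fields once for every day group
  rem.col <- matrix(rep(x[1:8], nrow(mat)), byrow = TRUE, ncol = 8)
  return(data.frame(cbind(rem.col, mat)))
}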

co <- do.call('rbind', apply(coop.dat, 1, reorg))

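You may need to tweak it a bit to get exactly what you want. For
example, the result has default column names, and because apply()
converts each row to character, every column comes back as text
(character or factor) rather than as numbers.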
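
One possible tweak, sketched on a made-up toy data frame rather than
the real coop.dat: looping over the repeating column groups with
lapply() instead of over rows with apply() keeps each column's type
and lets you set the column names as you go. The names toy, n.head,
n.grp, n.rep, grp.names and long are hypothetical and only for
illustration; for the COOP data n.head would be 8 and n.grp would be 6.

```r
# Toy stand-in for coop.dat (made-up values): 2 leading columns,
# then 2 repeating groups of 3 columns each.
toy <- data.frame(id = c("A1", "B2"), year = c(2010, 2011),
                  day_1 = c(1, 1), dat_1 = c(49, 30), fl_1 = c("B", "B"),
                  day_2 = c(2, 2), dat_2 = c(62, 31), fl_2 = c("B", "C"),
                  stringsAsFactors = FALSE)

n.head    <- 2                             # leading columns to repeat
n.grp     <- 3                             # columns per repeating group
n.rep     <- (ncol(toy) - n.head) / n.grp  # number of repeating groups
grp.names <- c("day", "dat", "fl")         # names for one group

# Build one block per group: leading columns bound to that group's columns.
long <- do.call(rbind, lapply(seq_len(n.rep), function(i) {
  cols <- n.head + (i - 1) * n.grp + seq_len(n.grp)
  out <- cbind(toy[seq_len(n.head)], toy[cols])
  names(out) <- c(names(toy)[seq_len(n.head)], grp.names)
  out
}))

# Blocks are stacked group by group; reorder so that all groups of the
# first input row come first, then all groups of the second, and so on.
long <- long[order(rep(seq_len(nrow(toy)), n.rep)), ]
rownames(long) <- NULL
```

Here long has one row per input row and group, with id and year
repeated, and dat is still numeric. Selecting only the leading columns
you want (for example dropping "rect" and "fill") can be done on the
result afterwards.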

Weidong Gu

On Fri, Aug 12, 2011 at 2:35 AM, Katrina Bennett <kebennett at alaska.edu> wrote:
> Hi R-help,
>
> I am working with US COOP network station data and the files are
> concatenated in single rows for all years, but I need to pull these
> apart into rows for each day. To do this, I need to extract part of
> each row such as station id, year, mo, and repeat this against other
> variables in the row (days). My problem is that there are repeated
> values for each day, and the files are fixed width field without
> order.
>
> Here is an example of just one line of data.
>
> coop.raw <- c("DLY09752806TMAX F20100199990620107 00049 20107 00062
> B0207 00041 20207 00049 B0307 00040 20307 00041 B0407 00042 20407
> 00040 B0507 00041 20507 00042 B0607 00043 20607 00041 B0707 00055
> 20707 00043 B0807 00039 20807 00055 B0907 00037 20907 00039 B1007
> 00038 21007 00037 B1107 00048 21107 00038 B1207 00050 21207 00048
> B1307 00051 21307 00050 B1407 00058 21407 00051 B1507 00068 21507
> 00058 B1607 00065 21607 00068 B1707 00068 21707 00065 B1807 00067
> 21807 00068 B1907 00068 21907 00067 B2007 00069 22007 00068 B2107
> 00057 22107 00069 B2207 00048 22207 00057 B2307 00051 22307 00048
> B2407 00073 22407 00051 B2507 00062 22507 00073 B2607 00056 22607
> 00062 B2707 00053 22707 00056 B2807 00064 22807 00053 B2907 00057
> 22907 00064 B3007 00047 23007 00057 B3107 00046 23107 00047 B")
> write.csv(coop.raw, "coop.tmp", row.names=F, quote=F)
> coop.dat <- read.fwf("coop.tmp", widths =
> c(c(3,8,4,2,4,2,4,3),rep(c(2,2,1,5,1,1),62)), na.strings=c("9999"),
> skip=1, as.is=T)
> rep.name <- rep(c("day","hr","met","dat","fl1","fl2"), 62)
> rep.count <- rep(c(1:62), each=6, 1)
> names(coop.dat) <- c("rect", "id", "elem", "unt", "year", "mo",
> "fill", "numval", paste(rep.name, rep.count, sep="_"))
>
> I would like to generate output that contains, in one row, the columns
> "id", "elem", "unt", "year", "mo", and "numval". Bound to these
> initial columns, I would like only "day_1", "hr_1", "met_1", "dat_1",
> "fl1_1", and "fl2_1". Then, in the next row, I would like the initial
> columns "id", "elem", "unt", "year", "mo", and "numval" repeated and
> bound to "day_2", "hr_2", "met_2", "dat_2", "fl1_2", and "fl2_2", and
> so on until all the day groups in that row have been allocated. Then,
> move on to the next row of the input and repeat.
>
> I think I should be able to do this with some sort of sapply or lapply
> function, but I'm struggling with the format for repeating the initial
> columns, and then skipping through the next columns.
>
> Thank you,
>
> Katrina