[R] Split fixed width data in R

Clint Bowman clint at ecy.wa.gov
Wed Oct 22 17:54:04 CEST 2014


?read.fortran
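
A minimal sketch of how that might look for the sample records quoted below, assuming the list has been flattened to a character vector (for example with unlist()) and that every record is exactly 18 characters wide. The column names are illustrative. As for the lost digit in the original script: the greedy ".*G" in the second gsub() pattern runs to the last "G", so ".{3}" then consumes "1 2" and only "3.00" is captured.

```r
# Sample records from the question; each line is 18 characters wide.
lst1Sub <- c("20131124GGG1 23.00",
             "20131125GGG1 15.00",
             "201312 1GGG1  0.00",
             "201312 7GGG1 10.00")

# Column layout: Year (4), Month (2), Day (2), Site (4), Precipitation (6).
# "F6" omits the decimal count because the values already carry an
# explicit decimal point.
con <- textConnection(lst1Sub)
dat <- read.fortran(con,
                    format = c("I4", "I2", "I2", "A4", "F6"),
                    col.names = c("Year", "Month", "Day", "Site", "Precipitation"))
close(con)  # the connection was opened here, so it is closed here

print(dat)
```

read.fwf() with widths = c(4, 2, 2, 4, 6) should do the same job if a Fortran-style format string is not wanted.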

Clint Bowman			INTERNET:	clint at ecy.wa.gov
Air Quality Modeler		INTERNET:	clint at math.utah.edu
Department of Ecology		VOICE:		(360) 407-6815
PO Box 47600			FAX:		(360) 407-7534
Olympia, WA 98504-7600

         USPS:           PO Box 47600, Olympia, WA 98504-7600
         Parcels:        300 Desmond Drive, Lacey, WA 98503-1274

On Wed, 22 Oct 2014, Zilefac Elvis wrote:

> Hi,
> I have fixed-width data that I would like to split into columns. Here is a snapshot of the data (the actual data is a list object):
> lst1Sub<-
> "20131124GGG1 23.00"
> "20131125GGG1 15.00"
> "20131128GGG1  0.00"
> "201312 1GGG1  0.00"
> "201312 4GGG1  0.00"
> "201312 7GGG1 10.00"
> "20131210GGG1  0.00"
> "20131213GGG1  0.00"
> "20131216GGG1  0.00"
> "20131219GGG1  0.00"
> "20131222GGG1  0.00"
> "20131225GGG1  0.00"
> "20131228GGG1  0.00"
>
> The following script will split the data into [Year Month Day Site Precipitation]
> ------------------------------------------------------------------------------------------------------
> library(stringr)
> dateSite <- gsub("(.*G.{3}).*","\\1",lst1Sub);
> dat1 <- data.frame(Year=as.numeric(substr(dateSite,1,4)), Month=as.numeric(substr(dateSite,5,6)),
>                   Day=as.numeric(substr(dateSite,7,8)),Site=substr(dateSite,9,12),Rain=substr(dateSite,13,18),stringsAsFactors=FALSE);
> lst3 <- lapply(lst1Sub,function(x) {dateSite <- gsub("(.*G.{3}).*","\\1",x);
>                                    dat1 <- data.frame(Year=as.numeric(substr(dateSite,1,4)), Month=as.numeric(substr(dateSite,5,6)),Day=as.numeric(substr(dateSite,7,8)),Site=substr(dateSite,9,12),stringsAsFactors=FALSE);
>                                    Sims <- str_trim(gsub(".*G.{3}\\s?(.*)","\\1",x));Sims[grep("\\d+-",Sims)] <- gsub("(.*)([-][0-9]+\\.[0-9]+)","\\1 \\2",gsub("^([0-9]+\\.[0-9]+)(.*)","\\1 \\2", Sims[grep("\\d+-",Sims)]));
>                                    Sims1 <- read.table(text=Sims,header=FALSE); names(Sims1) <- c("Precipitation");dat2 <- cbind(dat1,Sims1)})
> ------------------------------------------------------------------------------------------------------------------------------------------
>
> Problem: the above script drops the first digit of the precipitation values. For example, after splitting, "20131124GGG1 23.00" becomes
> 2013 11 24 GGG1 3.00 instead of the correct 2013 11 24 GGG1 23.00.
>
> Is something wrong with the string trimming? Is there another way to arrive at the right answer?
>
> Thanks,
> AT.


