[R] Help with how to process multiple column variable in a read.table

arun smartpink111 at yahoo.com
Thu May 16 17:58:24 CEST 2013


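Hi,
The file is tab-delimited, so use sep="\t". With sep="", any run of whitespace counts as a single separator, so an empty footnote field disappears and the rows end up with different numbers of fields. Setting na.strings="" turns the empty footnotes into NA.
Try this: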
unemp.wy <- read.table("ftp://ftp.bls.gov/pub/time.series/la/la.data.59.Wyoming", header=TRUE, sep="\t",stringsAsFactors=FALSE,na.strings="") 
dim(unemp.wy)
#[1] 46692     5
head(unemp.wy)
#          series_id year period value footnote_codes
#1 LASST56000003     1976    M01   4.2           <NA>
#2 LASST56000003     1976    M02   4.1           <NA>
#3 LASST56000003     1976    M03   4.0           <NA>
#4 LASST56000003     1976    M04   3.9           <NA>
#5 LASST56000003     1976    M05   3.9           <NA>
#6 LASST56000003     1976    M06   3.9           <NA>
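
To see why the separator matters, count.fields() reports how many fields each line has under a given sep. The sketch below uses a small made-up sample in the same layout as the BLS file rather than the FTP file itself. It assumes that rows without a footnote still end with a tab (an empty fifth field), which is what the successful read above suggests; sample_txt and tmp are names used only for this illustration.

```r
# Hypothetical sample in the layout of the BLS file. Assumption: rows without
# a footnote still end with a tab, i.e. the fifth field is present but empty.
sample_txt <- paste(
  "series_id\tyear\tperiod\tvalue\tfootnote_codes",
  "LASST56000003    \t1976\tM01\t         4.2\t",
  "LAUST56000006    \t2012\tM11\t      305820\tD",
  sep = "\n"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample_txt, tmp)

# Fields per line when any run of whitespace is one separator (sep = "")
ws_counts <- count.fields(tmp, sep = "")
# Fields per line when only a tab is a separator
tab_counts <- count.fields(tmp, sep = "\t")
print(ws_counts)
print(tab_counts)

# With sep = "\t" every line has the same number of fields, so the read works
demo <- read.table(tmp, header = TRUE, sep = "\t",
                   stringsAsFactors = FALSE, na.strings = "")
print(demo)
unlink(tmp)
```

Running count.fields() on the real file with both separators should be a quick way to confirm whether the same thing is happening there.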
str(unemp.wy)
#'data.frame':    46692 obs. of  5 variables:
# $ series_id     : chr  "LASST56000003    " "LASST56000003    " "LASST56000003    " "LASST56000003    " ...
# $ year          : int  1976 1976 1976 1976 1976 1976 1976 1976 1976 1976 ...
# $ period        : chr  "M01" "M02" "M03" "M04" ...
# $ value         : num  4.2 4.1 4 3.9 3.9 3.9 4 4.1 4.1 4 ...
# $ footnote_codes: chr  NA NA NA NA ...
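
Note that series_id keeps its trailing spaces, which can get in the way of comparisons such as series_id == "LASST56000003". One possible fix, sketched on a one-row made-up sample (sample_txt, padded and trimmed are illustrative names, and the trailing tab on the data row is an assumption about the file layout), is to pass strip.white=TRUE to read.table:

```r
# Hypothetical one-row sample; the data row is assumed to end with a tab
sample_txt <- paste(
  "series_id\tyear\tperiod\tvalue\tfootnote_codes",
  "LASST56000003    \t1976\tM01\t         4.2\t",
  sep = "\n"
)

# Default behaviour: character fields keep their padding
padded <- read.table(text = sample_txt, header = TRUE, sep = "\t",
                     stringsAsFactors = FALSE, na.strings = "")

# strip.white = TRUE removes leading and trailing white space from
# unquoted character fields (it only applies when sep is specified)
trimmed <- read.table(text = sample_txt, header = TRUE, sep = "\t",
                      stringsAsFactors = FALSE, na.strings = "",
                      strip.white = TRUE)

print(nchar(padded$series_id))
print(nchar(trimmed$series_id))
```

On a data frame that has already been read, trimws(unemp.wy$series_id) should give the same result in R versions that provide trimws().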
tail(unemp.wy)
#              series_id year period  value footnote_codes
#46687 LAUST56000006     2012    M11 305820              D
#46688 LAUST56000006     2012    M12 304293              D
#46689 LAUST56000006     2012    M13 306064              D
#46690 LAUST56000006     2013    M01 305150           <NA>
#46691 LAUST56000006     2013    M02 304918           <NA>
#46692 LAUST56000006     2013    M03 305556              P
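
As for reading each line as a text string and parsing it yourself: that is possible too, although read.table with sep="\t" is less work. A rough sketch with readLines() and strsplit() follows, again on a made-up sample that assumes footnote-less rows end with a tab; sample_txt, tmp and parsed are names used only here.

```r
# Hypothetical sample in the layout of the BLS file
sample_txt <- paste(
  "series_id\tyear\tperiod\tvalue\tfootnote_codes",
  "LASST56000003    \t1976\tM01\t         4.2\t",
  "LAUST56000006    \t2012\tM11\t      305820\tD",
  sep = "\n"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample_txt, tmp)

# Read raw lines, then split each data line on tabs
lines  <- readLines(tmp)
fields <- strsplit(lines[-1], "\t", fixed = TRUE)

# strsplit() drops a trailing empty field, so pad short rows with NA
fields <- lapply(fields, function(x) { length(x) <- 5; x })

# Bind the rows into a character matrix and convert to a data frame
parsed <- as.data.frame(do.call(rbind, fields), stringsAsFactors = FALSE)
names(parsed) <- strsplit(lines[1], "\t", fixed = TRUE)[[1]]

# Convert types by hand, which read.table would otherwise do for you
parsed$series_id <- trimws(parsed$series_id)
parsed$year      <- as.integer(parsed$year)
parsed$value     <- as.numeric(parsed$value)
str(parsed)
unlink(tmp)
```

The manual route mainly pays off when lines need cleaning that read.table cannot express through its arguments.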
A.K.


>I am new to R. I am trying to read a table from the BLS FTP site. The
>file has 5 columns, but the 5th column is not always populated, so
>read.table throws an error. Here is my code:
>
>unemp.wy <- read.table("ftp://ftp.bls.gov/pub/time.series/la/la.data.59.Wyoming", header=FALSE, sep="", skip=2 )
>
>Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>  line 384 did not have 4 elements
>
>Here is the structure of the text. At about row 384 the footnote
>column starts to appear as well, which seems to throw off read.table. Is it
>possible to just read each line as a text string and then parse it, or is
>there a better way to approach this problem?
>series_id	year	period	value	footnote_codes 
>LASST56000003    	1976	M01	         4.2 
>LASST56000003    	1976	M02	         4.1 
>LASST56000003    	1976	M03	         4.0 
>LASST56000003    	1976	M04	         3.9 
>LASST56000003    	1976	M05	         3.9 
>
>I am using R after having used SAS for years, so I am unsure of the
>best way to move past a program vector approach to data cleansing.
>
>Thanks


