[R] read a irregular text file data into dataframe()

Stephen Tucker brown_emu at yahoo.com
Sat Mar 10 07:50:48 CET 2007


I don't know of any canned function to do this, but you can write your own
function (see below) that will:

(1) open a file connection
(2) count the number of fields on each line with count.fields()
(3) create an empty matrix with one row per line and as many columns as the
longest line has fields
(4) rewind to the beginning of the file
(5) scan line by line and fill the matrix (cells that are not filled stay NA)
(6) close the file connection
(7) convert the matrix to a data frame
(8) use type.convert() to automatically convert numeric columns to mode
numeric (scan(), as I've specified it, reads everything in as mode
character, which changes the holding matrix's mode from its default of
logical to character).
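
To make steps (3), (5), (7) and (8) concrete, here is a small sketch that fills the holding matrix by hand instead of reading from a file (the values are copied from the first two lines of the example data). It also shows a side effect worth knowing about: type.convert() turns "0010" into the integer 10, so leading zeros are lost. If the codes need to stay as text, the type.convert() step would have to be left out.

```r
# Holding matrix: all NA, so its mode starts out as logical
mat <- matrix(nrow = 2, ncol = 3)
mode(mat)                                  # "logical"

# Filling it with character data (as scan(what = "") returns) coerces
# the whole matrix to character; the unfilled cell stays NA
mat[1, 1:3] <- c("0010", "0028", "0061")
mat[2, 1:2] <- c("0010", "0042")
mode(mat)                                  # "character"

# Convert to a data frame, then let type.convert() pick a type per column
df <- as.data.frame(mat, stringsAsFactors = FALSE)
df[] <- lapply(df, type.convert, as.is = TRUE)
sapply(df, class)                          # every column is "integer"
df$V1                                      # 10 10 (leading zeros are gone)
```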

The function below will work for your example data set. To make it more
general, you can add arguments such as 'what' for scan() and 'sep' for both
count.fields() and scan(); if your file has column names, you can modify the
function to handle them as well.

# call the function like this (once it has been defined)
df <- read.irregular("c:\\test.txt")

# this is the function

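read.irregular <- function(filenm) {
  # Open a text-mode connection so that successive scan() calls continue
  # from where the previous one stopped
  fileID <- file(filenm, open = "rt")
  on.exit(close(fileID))  # close the connection even if an error occurs
  # Number of fields on each line
  nFields <- count.fields(fileID)
  # Holding matrix of NA: one row per line, one column per field of the longest line
  mat <- matrix(nrow = length(nFields), ncol = max(nFields))
  # Rewind to the start of the file
  invisible(seek(fileID, where = 0, origin = "start", rw = "read"))
  for (i in seq_len(nrow(mat))) {
    # Read one line and store its fields in the leading columns of row i
    mat[i, seq_len(nFields[i])] <- scan(fileID, what = "", nlines = 1, quiet = TRUE)
  }
  df <- as.data.frame(mat, stringsAsFactors = FALSE)
  # Convert columns that hold only numbers from character to numeric
  df[] <- lapply(df, type.convert, as.is = TRUE)
  return(df)
}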
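
As a rough sketch of the generalization mentioned above, a variant with a 'sep' argument and optional column names might look like the following. The name read.irregular2 and the 'header' argument are made up for illustration. The sketch also makes two simplifications that are assumptions on my part: it counts fields by file name instead of rewinding the connection with seek(), and it assumes the header line, when present, names every column. It does not add a 'what' argument.

```r
# Hypothetical generalization of read.irregular(); not a tested, general reader
read.irregular2 <- function(filenm, sep = "", header = FALSE) {
  # Count fields by file name, so no rewind of the connection is needed
  nFields <- count.fields(filenm, sep = sep)
  fileID <- file(filenm, open = "rt")
  on.exit(close(fileID))
  mat <- matrix(nrow = length(nFields), ncol = max(nFields))
  for (i in seq_along(nFields)) {
    mat[i, seq_len(nFields[i])] <-
      scan(fileID, what = "", sep = sep, nlines = 1, quiet = TRUE)
  }
  if (header) {
    # Assumption: the first line holds one name per column
    colNames <- mat[1, ]
    mat <- mat[-1, , drop = FALSE]
  }
  df <- as.data.frame(mat, stringsAsFactors = FALSE)
  df[] <- lapply(df, type.convert, as.is = TRUE)
  if (header) names(df) <- colNames
  df
}

# Usage with a small comma-separated file written to a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b,c", "1,2", "3,4,5"), tmp)
res <- read.irregular2(tmp, sep = ",", header = TRUE)
print(res)
```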

Hope this helps.
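
P.S. For comparison, it may be worth trying read.table() itself with fill=TRUE, which pads short rows. The sketch below is an alternative to the function above, not a description of it. It assumes whitespace-separated data, and it passes col.names explicitly so that the number of columns comes from the longest line rather than from the first few lines. The temporary file and the four sample lines (copied from the example data) are only there to keep the sketch self-contained.

```r
# Write a few lines of the example data to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("0010 0028 0061 0088",
             "0010 0042 0084",
             "0004 0010 0055",
             "0081"), tmp)

# Width of the widest line decides how many columns to ask for
nCols <- max(count.fields(tmp))

# fill = TRUE pads short rows; in numeric columns the padding shows up as NA
alt <- read.table(tmp, fill = TRUE, col.names = paste0("V", seq_len(nCols)))
print(alt)
```

As with type.convert(), the values come back as integers, so the leading zeros are not kept.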

--- "j.joshua thomas" <researchjj at gmail.com> wrote:

> I am using R 2.4.1 to read a text file that contains the data structure
> shown below.
> 
> When I read the file into R using
> 
> tData<-read.table("c:\\test.txt")
> 
> it gives an error saying the data set has irregular columns.
> However, I need to use data of the type shown below.
> 
> Is there an alternative in R?
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> 0010 0028 0061 0088
> 0010 0042 0084
> 0004 0010 0055
> 0010 0018 0040 0042
> 0010 0046 0059
> 0010 0016 0042 0055
> 0010 0012 0018 0054
> 0010 0034 0042 0102
> 0081
> 0001 0076 0085
> 0080 0086
> 0017 0032 0081
> 0004 0010 0055
> 0010 0042 0061 0080
> 0010 0017 0078 0084
> 0006 0010 0040 0042
> 0075 0080
> 0005 0028 0032
> 0006 0010 0040 0061
> -- 
> Lecturer J. Joshua Thomas
> KDU College Penang Campus
> Research Student,
> University Sains Malaysia