[R] How to parse text file into a table?

jim holtman jholtman at gmail.com
Sun Feb 22 17:18:27 CET 2009


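Here is one way of reading in your data. Note that I tidied the sample so
that every field, including Employment, is followed directly by a colon: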

> input <- readLines(textConnection("###---------Start of record----------------------
+
+ Name: John
+ Height: 170cm
+ Weight: 70kg
+ Age: 30
+
+ Status: Married
+ Children: 2
+
+ Employment:  Engineer
+
+ ###---------End of record-----------------------"))
> closeAllConnections()
> result <- list()
> recordNo <- 1
> for (i in input){
+     if (nchar(i) == 0) next
+     if (length(grep("Start of record", i)) != 0){
+         # initialize next element of the list
+         result[[recordNo]] <- c(Name=NA, Height=NA, Weight=NA,
+             Age=NA, Status=NA, Children=NA, Employment=NA)
+
+     }
+     else if (length(grep("End of record", i)) != 0) recordNo <- recordNo + 1
+     else {
+         # the following assumes consistent naming, with ":" separating each
+         # field name from its value; if not, add some error checking code
+         name <- sub("^(.*):.*", "\\1", i)
+         value <- sub(".*:\\s*(.*)", "\\1", i)
+         result[[recordNo]][name] <- value
+     }
+ }
>
> result
[[1]]
      Name     Height     Weight        Age     Status   Children Employment
    "John"    "170cm"     "70kg"       "30"  "Married"        "2" "Engineer"

>
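
The loop above expects a colon directly after every field name, whereas the
sample in the original post has "Name : John" (space before the colon) and
"Employment  Engineer" (no colon at all). Below is a minimal sketch of a more
tolerant line parser. The helper name parse_field and the rule "field name is
a run of letters at the start of the line" are my own assumptions, not part
of the code above, and it only accepts the field names listed in the
requested table header.

```r
# Field names we expect to see (taken from the requested table header)
fields <- c("Name", "Height", "Weight", "Age", "Status", "Children",
            "Employment")

# Hypothetical helper: split one line into a field name and its value.
# The separator may be a colon with optional spaces around it, or just
# whitespace. Returns NULL for lines that are not recognised fields.
parse_field <- function(line) {
    m <- regmatches(line,
        regexec("^\\s*([A-Za-z]+)\\s*:?\\s*(.*)$", line))[[1]]
    if (length(m) != 3 || !(m[2] %in% fields)) return(NULL)
    setNames(trimws(m[3]), m[2])
}

# The record exactly as it appears in the original post
input <- c("###---------Start of record----------------------", "",
           "Name : John", "Height: 170cm", "Weight  : 70kg", "Age: 30", "",
           "Status: Married", "Children: 2", "",
           "Employment  Engineer", "",
           "###---------End of record-----------------------")

# Marker and blank lines return NULL and are dropped by unlist()
parsed <- unlist(lapply(input, parse_field))
parsed
```

Inside the loop, parse_field(i) could stand in for the two sub() calls, with
a NULL result treated as a line to skip or report.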

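Since the goal is a table, the list of named vectors still has to be bound
into rows. Here is a minimal sketch of one way to do that, assuming each
record is a named character vector as in `result` above. The second record
is made up purely to show how a missing field is handled.

```r
# Records in the same shape as `result`: one named character vector each.
# The second record is hypothetical and deliberately lacks Employment.
result <- list(
    c(Name = "John", Height = "170cm", Weight = "70kg", Age = "30",
      Status = "Married", Children = "2", Employment = "Engineer"),
    c(Name = "Mary", Height = "160cm", Weight = "55kg", Age = "28",
      Status = "Single", Children = "0")
)

# Column order requested in the original post
fields <- c("Name", "Height", "Weight", "Age", "Status", "Children",
            "Employment")

# Indexing a named vector by a name it lacks yields NA, so every row
# ends up with the same columns in the same order.
rows <- lapply(result, function(rec) unname(rec[fields]))
tab <- as.data.frame(do.call(rbind, rows), stringsAsFactors = FALSE)
names(tab) <- fields
tab
```

All columns come out as character; Age and Children could be converted
afterwards with as.numeric(), and the units would need stripping from Height
and Weight before doing the same there.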

On Sun, Feb 22, 2009 at 8:45 AM, Daren Tan <darentan76 at gmail.com> wrote:
> I am given a text file of records to be converted into a table format.
> I have searched related topics or packages, but can't find any similar
> cases. Please help.
>
> Sample record is given below. Note that the last element doesn't have
> a colon.
>
> ###---------Start of record----------------------
>
> Name : John
> Height: 170cm
> Weight  : 70kg
> Age: 30
>
> Status: Married
> Children: 2
>
> Employment  Engineer
>
> ###---------End of record-----------------------
>
> Table format should have this header
> Name    Height  Weight  Age     Status  Children        Employment
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



