[R] Alignment of data sets

Sat Sep 7 04:08:19 CEST 2013

Hi,

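The question is not entirely clear. The code below first counts the spaces between the fields in your data, then rewrites each line with exactly three spaces after every comma.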

Lines1<- readLines(textConnection("Year, Day, Hour, Value
2010,  001,    0,    15.9
2010,  001,    1,    7.3
2010,  001,    2,    5.2
2010,  001,    3,    8.0
2010,  001,    4,    0.0
2010,  001,    5,    12.1
2010,  001,    6,    11.6
2010,  001,    7,    13.9
2010,  001,    8,    11.9
2010,  001,    9,    13.6
2010,  001,    10,    16.1
2010,  001,    11,    18.5"))

library(stringr)
# Count the spaces within each pair of adjacent fields.

str_count(gsub("(\\d+,\\s+\\d+).*","\\1",Lines1[-1])," ")
# [1] 2 2 2 2 2 2 2 2 2 2 2 2
str_count(gsub("^\\d+,\\s+(\\d+,\\s+\\d+).*","\\1",Lines1[-1])," ")
# [1] 4 4 4 4 4 4 4 4 4 4 4 4
str_count(gsub("\\d+,\\s+\\d+,\\s+(\\d+,\\s+\\d+)","\\1",Lines1[-1])," ")
# [1] 4 4 4 4 4 4 4 4 4 4 4 4

# Remove every space, insert exactly three after each comma, and drop the header line.
Lines2 <- gsub(",", ",   ", gsub(" ", "", Lines1))[-1]
str_count(Lines2, " ")
# [1] 9 9 9 9 9 9 9 9 9 9 9 9
str_count(gsub("(\\d+,\\s+\\d+).*","\\1",Lines2)," ")
# [1] 3 3 3 3 3 3 3 3 3 3 3 3
str_count(gsub("^\\d+,\\s+(\\d+,\\s+\\d+).*","\\1",Lines2)," ")
# [1] 3 3 3 3 3 3 3 3 3 3 3 3
str_count(gsub("\\d+,\\s+\\d+,\\s+(\\d+,\\s+\\d+)","\\1",Lines2)," ")
# [1] 3 3 3 3 3 3 3 3 3 3 3 3

write(Lines2,"capture2.txt")
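
If the goal is right-aligned columns rather than a fixed gap after each comma, a minimal sketch along the following lines might help. It assumes the fields can be split on commas, and the field widths (5, 5 and 8 characters for Day, Hour and Value) as well as the output file name aligned.txt are my assumptions, not something stated in the original post.

```r
# A few of the input lines, in the same layout as Lines1 above.
Lines1 <- c("Year, Day, Hour, Value",
            "2010,  001,    0,    15.9",
            "2010,  001,    1,    7.3",
            "2010,  001,    10,    16.1")

# Split each data line (header dropped) on commas and trim the padding.
fields <- lapply(strsplit(Lines1[-1], ","), trimws)
mat <- do.call(rbind, fields)   # character matrix, one row per line

# Right-align each field to an assumed fixed width.
# Day is kept as a string, so "001" retains its leading zeros.
aligned <- sprintf("%s,%5s,%5d,%8.1f",
                   mat[, 1], mat[, 2],
                   as.integer(mat[, 3]), as.numeric(mat[, 4]))

writeLines(aligned, "aligned.txt")
```

Because sprintf() pads on the left, every line has the same length and there are no trailing spaces. The widths would need adjusting if the exact number of spaces matters.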

A.K.

----- Original Message -----
From: "Mostafavipak, Nasrin" <Nasrin.Mostafavipak at stantec.com>
To: "r-help at R-project.org" <r-help at r-project.org>
Cc: 
Sent: Friday, September 6, 2013 3:42 PM
Subject: [R] Alignment of data sets

Hi all;

I have a data set with the format below:

Year, Day, Hour, Value

2010,  001,    0,    15.9
2010,  001,    1,    7.3
2010,  001,    2,    5.2
2010,  001,    3,    8.0
2010,  001,    4,    0.0
2010,  001,    5,    12.1
2010,  001,    6,    11.6
2010,  001,    7,    13.9
2010,  001,    8,    11.9
2010,  001,    9,    13.6
2010,  001,    10,    16.1
2010,  001,    11,    18.5

That should be converted to this format:

2010,  001,    0,    15.9
2010,  001,    1,      7.3
2010,  001,    2,      5.2
2010,  001,    3,      8.0
2010,  001,    4,      0.0
2010,  001,    5,    12.1
2010,  001,    6,    11.6
2010,  001,    7,    13.9
2010,  001,    8,    11.9
2010,  001,    9,    13.6
2010,  001,  10,    16.1
2010,  001,  11,    18.5

The number of spaces is important. I have tried justify, but it produces spaces at the end or at the beginning of the rows, depending on whether I choose right or left alignment. I also need three digits in the second column: when I use read.csv it gives me 1 instead of 001. So I use read.table instead, but one of its problems is that it produces row names that I don't want. I also need commas in my output file.

So far this is the best I could do:

mydata = read.table("C:/ozone3.txt", sep = "")

capture.output( print(mydata, sep = ",", print.gap=3), file="capture2.txt" )

and the output has all the unwanted row names and also there are no commas.
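
For reference, the two side problems described here (leading zeros lost on reading, row names added on writing) could be handled roughly as follows. This is only a sketch: it assumes the input is comma-separated with a header line, and it uses small temporary files in place of C:/ozone3.txt and capture2.txt.

```r
# Stand-in for the real input file (assumed comma-separated, with a header).
infile <- tempfile(fileext = ".txt")
writeLines(c("Year, Day, Hour, Value",
             "2010,  001,    0,    15.9",
             "2010,  001,    1,    7.3"), infile)

# Read every column as character so that "001" is not converted to 1;
# strip.white = TRUE removes the padding around each field.
mydata <- read.csv(infile, colClasses = "character", strip.white = TRUE)

# Write with commas, without row names and without quotes around strings.
outfile <- tempfile(fileext = ".txt")
write.table(mydata, outfile, sep = ",", row.names = FALSE, quote = FALSE)
```

This keeps 001 intact and produces comma-separated output with no row names, but it does not pad the fields; the spacing would still need a separate formatting step.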

Any suggestions?

Thank you
Nasrin
