[R] data formatting

arun smartpink111 at yahoo.com
Fri Feb 15 20:11:36 CET 2013



Dear Eliza,

Try this:

Lines1<-readLines(textConnection("1911.01.01       7.87
1911.01.02       9.26 
1911.01.03       8.06 
1911.01.04       8.13 
1911.01.05      12.90 
1911.02.06       5.45 
1911.02.07       3.26 
1911.03.08       5.70 
1911.03.09       9.24 
1911.04.10       7.60 
1911.05.11      14.82 
1911.05.12      14.10 
1911.06.13       7.87 
1911.06.14       9.26 
1911.07.15       8.06 
1911.07.16       8.13 
1911.08.17      12.90 
1911.08.18       5.45 
1911.09.19       3.26 
1911.09.20       5.70 
1911.10.21       9.24 
1911.10.22       7.60 
1911.11.23      14.82 
1911.12.24      14.10")) 

Lines2<-Lines1[Lines1!=""]
library(stringr)
 str_count(Lines2, " ")
# [1] 7 7 7 7 6 7 7 7 7 7 6 6 7 7 7 7 6 7 7 7 7 7 6 6


Lines2[str_count(Lines2," ")==7]<- str_replace(Lines2[str_count(Lines2," ")==7],"\\s+","     ") #7 spaces reduced to 5

Lines2[str_count(Lines2," ")==6]<- str_replace(Lines2[str_count(Lines2," ")==6],"\\s+","    ") #6 spaces reduced to 4
str_count(Lines2," ")
# [1] 5 5 5 5 4 5 5 5 5 5 4 4 5 5 5 5 4 5 5 5 5 5 4 4
#month and day are tested separately, so a date such as 1911.10.05 keeps the "1" of its month
substr(Lines2[substr(Lines2,6,6)=="0"],6,6)<-" " #leading zero of the month
substr(Lines2[substr(Lines2,9,9)=="0"],9,9)<-" " #leading zero of the day
str_count(Lines2," ") #this counts every space in the line, including the ones that replaced the zeros
# [1] 7 7 7 7 6 7 7 7 7 6 5 5 6 6 6 6 5 6 6 6 5 5 4 4
Lines2
# [1] "1911. 1. 1     7.87" "1911. 1. 2     9.26" "1911. 1. 3     8.06"
# [4] "1911. 1. 4     8.13" "1911. 1. 5    12.90" "1911. 2. 6     5.45"
# [7] "1911. 2. 7     3.26" "1911. 3. 8     5.70" "1911. 3. 9     9.24"
#[10] "1911. 4.10     7.60" "1911. 5.11    14.82" "1911. 5.12    14.10"
#[13] "1911. 6.13     7.87" "1911. 6.14     9.26" "1911. 7.15     8.06"
#[16] "1911. 7.16     8.13" "1911. 8.17    12.90" "1911. 8.18     5.45"
#[19] "1911. 9.19     3.26" "1911. 9.20     5.70" "1911.10.21     9.24"
#[22] "1911.10.22     7.60" "1911.11.23    14.82" "1911.12.24    14.10"
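
If counting spaces feels fragile, here is a minimal alternative sketch that rebuilds each line at a fixed width with sprintf() and then saves the result as a ".txt" file, which was also asked for. It is only a sketch under assumptions: every line holds a yyyy.mm.dd date followed by one number, the number is printed with two decimals right-aligned in 9 characters (which matches the layout above), and reformat_lines() and the output path are names made up for illustration.

```r
# Hypothetical helper (not part of the reply above): rebuild each line
# at fixed width instead of counting and replacing spaces.
reformat_lines <- function(x) {
  x <- trimws(x)                                  # strip leading/trailing blanks
  x <- x[x != ""]                                 # drop empty lines
  date <- sub("\\s.*$", "", x)                    # first field: yyyy.mm.dd
  val  <- as.numeric(sub("^\\S+\\s+", "", x))     # second field: the value
  y <- as.integer(substr(date, 1, 4))
  m <- as.integer(substr(date, 6, 7))
  d <- as.integer(substr(date, 9, 10))
  # %2d pads month and day with a space instead of a zero;
  # %9.2f right-aligns the value in 9 characters with two decimals
  sprintf("%4d.%2d.%2d%9.2f", y, m, d, val)
}

sample_lines <- c("1911.01.01       7.87",
                  "1911.01.05      12.90",
                  "1911.10.21       9.24",
                  "1911.12.24      14.10")
out <- reformat_lines(sample_lines)
out
# [1] "1911. 1. 1     7.87" "1911. 1. 5    12.90" "1911.10.21     9.24"
# [4] "1911.12.24    14.10"

# Save as plain text; the path is a placeholder, replace it with your own
out_file <- tempfile(fileext = ".txt")
writeLines(out, out_file)
```

Note that the value is re-parsed as a number, so anything beyond two decimals would be rounded; if the values must be kept character for character, the substr() approach above is the safer choice.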

A.K.
________________________________
From: eliza botto <eliza_botto at hotmail.com>
To: "smartpink111 at yahoo.com" <smartpink111 at yahoo.com> 
Sent: Friday, February 15, 2013 12:38 PM
Subject: data formatting



Dear Arun,
[text file is also attached in case the format is changed]
I need your data-managing expertise on the following issue.
I have data like the following table:

1911.01.01       7.87 ##(7 spaces between the columns)
1911.01.02       9.26 ##(7 spaces between the columns)
1911.01.03       8.06 ##(7 spaces between the columns)
1911.01.04       8.13 ##(7 spaces between the columns)
1911.01.05      12.90 ##(6 spaces between the columns)
1911.02.06       5.45 ##(7 spaces between the columns)
1911.02.07       3.26 ##(7 spaces between the columns)
1911.03.08       5.70 ##(7 spaces between the columns)
1911.03.09       9.24 ##(7 spaces between the columns)
1911.04.10       7.60 ##(7 spaces between the columns)
1911.05.11      14.82 ##(6 spaces between the columns)
1911.05.12      14.10 ##(6 spaces between the columns)
1911.06.13       7.87 ##(7 spaces between the columns)
1911.06.14       9.26 ##(7 spaces between the columns) 
1911.07.15       8.06 ##(7 spaces between the columns) 
1911.07.16       8.13 ##(7 spaces between the columns) 
1911.08.17      12.90 ##(6 spaces between the columns) 
1911.08.18       5.45 ##(7 spaces between the columns) 
1911.09.19       3.26 ##(7 spaces between the columns) 
1911.09.20       5.70 ##(7 spaces between the columns)
1911.10.21       9.24 ##(7 spaces between the columns)
1911.10.22       7.60 ##(7 spaces between the columns)
1911.11.23      14.82 ##(6 spaces between the columns)
1911.12.24      14.10 ##(6 spaces between the columns)
and I want it in the following format; afterwards I want to save that file in ".txt" format.
 1911. 1. 1     7.87 ##(5 spaces between the columns)
 1911. 1. 2     9.26 ##(5 spaces between the columns)
 1911. 1. 3     8.06 ##(5 spaces between the columns)
 1911. 1. 4     8.13 ##(5 spaces between the columns)
 1911. 1. 5    12.90 ##(4 spaces between the columns)
 1911. 2. 6     5.45 ##(5 spaces between the columns)
 1911. 2. 7     3.26 ##(5 spaces between the columns)
 1911. 3. 8     5.70 ##(5 spaces between the columns)
 1911. 3. 9     9.24 ##(5 spaces between the columns)
 1911. 4.10     7.60 ##(5 spaces between the columns)
 1911. 5.11    14.82 ##(4 spaces between the columns)
 1911. 5.12    14.10 ##(4 spaces between the columns)
 1911. 6.13     7.87 ##(5 spaces between the columns)
 1911. 6.14     9.26 ##(5 spaces between the columns)
 1911. 7.15     8.06 ##(5 spaces between the columns)
 1911. 7.16     8.13 ##(5 spaces between the columns)
 1911. 8.17    12.90 ##(4 spaces between the columns)
 1911. 8.18     5.45 ##(5 spaces between the columns)
 1911. 9.19     3.26 ##(5 spaces between the columns)
 1911. 9.20     5.70 ##(5 spaces between the columns)
 1911.10.21     9.24 ##(5 spaces between the columns)
 1911.10.22     7.60 ##(5 spaces between the columns)
 1911.11.23    14.82 ##(4 spaces between the columns)
 1911.12.24    14.10 ##(4 spaces between the columns)
You can see that the spaces between the columns need to be reduced in the output file, and that the leading zeros in the month and day parts of the date column need to be replaced with a space.
Thank you very much in advance,
Elisa


