[R] bug & paste (continues...)

Ott Toomet siim at obs.ee
Tue Oct 9 18:04:32 CEST 2001


Hello again,

There are some new facts:

1) if you save the image and restart R, paste behaves normally:
> paste( ce0, m, sep="", collapse="")
[1] "1985<1>9<2>2<3>2<4>1<5>A<6>1<7><8>NA<9>5<0>1999END"
> paste( ce0a, m, sep="", collapse="")
[1] "1985<1>9<2>2<3>2<4>1<5>A<6>1<7><8>NA<9>5<0>1999END"

But when I make a new subset (a bit shorter this time, and not a character vector):
> e2000 <- read.dta( "/home/siim/tyy/andmebaasid/etu0012.dta")
> e0 <- e2000[1,231:239]
> e0
  c01b c02 c03 cx c05a01 c05ak01 c05b01 c05bk01 c0601
1    9   2   2  1      A       1             NA     5

(c05b01 should be a string variable, an empty string in this case.
c05bk01 should be numerical, NA when empty.)

The problem arises again (m is correspondingly shorter now):
> paste( e0, m, sep="", collapse="")
[1] "9<1>2<2>2<3>1<4>A1<6>NA<8>5END"
> m
[1] "<1>" "<2>" "<3>" "<4>" "<5>" "<6>" "<7>" "<8>" "END"

paste without collapse gives a slightly different picture, but that is
not correct either:
> paste( e0, m, sep="")
[1] "9<1>"  "2<2>"  "2<3>"  "1<4>"  "A"     "1<6>"  ""      "NA<8>" "5END"
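
For comparison, here is a minimal sketch of what paste should return for
these values. It assumes a freshly typed character vector (e0_typed is a
hypothetical name) holding the same nine values as e0. It does not
reproduce the object that came out of read.dta.

```r
# Hypothetical stand-in for e0: the nine values retyped by hand.
e0_typed <- c("9", "2", "2", "1", "A", "1", "", NA, "5")
# The same separators as m in the message.
m <- c("<1>", "<2>", "<3>", "<4>", "<5>", "<6>", "<7>", "<8>", "END")

# Element-wise paste: every element of m should appear,
# including after "A" and after the empty string.
expected <- paste(e0_typed, m, sep = "")
print(expected)

# The same, collapsed into a single string.
expected_collapsed <- paste(e0_typed, m, sep = "", collapse = "")
print(expected_collapsed)
```

The fifth and seventh elements come out as "A<5>" and "<7>" here, which
are exactly the two positions where the output above loses the separator.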

nchar() does not show any hidden characters:
> nchar( e0)
[1] 1 1 1 1 1 1 0 2 1
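
A further check that could be tried is to list the raw bytes of each
value. The sketch below uses a hand-built data frame (e0_demo is a
hypothetical stand-in, not the read.dta result), and whether this check
would reveal anything on the real e0 is unknown.

```r
# Hypothetical stand-in for two of the string columns of e0.
e0_demo <- data.frame(c05a01 = "A", c05b01 = "", stringsAsFactors = FALSE)

# Bytes of each value after conversion to character.
# A clean "A" is the single byte 41; a clean empty string has no bytes.
bytes <- lapply(e0_demo, function(col) charToRaw(as.character(col)))
print(bytes)
```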

So it seems that R somehow remembers that e0 was taken from the big
data frame, but I do not know how that is possible.  This behaviour is
carried over by assignment:
> e1 <- e0
> paste( e1, m, sep="")
[1] "9<1>"  "2<2>"  "2<3>"  "1<4>"  "A"     "1<6>"  ""      "NA<8>" "5END"

but it vanishes when you save and reload the data:
> save( e0, file="jama.rd")
> load( "jama.rd")
> paste( e0, m, sep="")
[1] "9<1>"  "2<2>"  "2<3>"  "1<4>"  "A<5>"  "1<6>"  "<7>"   "NA<8>" "5END"
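
Since the problem disappears after save and load, that round trip could
be wrapped into a small helper as a stopgap. This is only a sketch under
assumptions: roundtrip is a hypothetical name, the data frame is a
hand-made stand-in, and nothing here addresses the underlying cause.

```r
# Hypothetical helper: write an object with save() and read it back with load().
roundtrip <- function(obj) {
  path <- tempfile(fileext = ".rd")
  on.exit(unlink(path))           # remove the temporary file afterwards
  save(obj, file = path)
  env <- new.env()
  load(path, envir = env)         # restores the object under the name "obj"
  get("obj", envir = env)
}

# Hand-made stand-in for a row with a string and an empty string.
demo <- data.frame(a = "A", b = "", stringsAsFactors = FALSE)
demo2 <- roundtrip(demo)
print(identical(demo, demo2))
```

With the real data the call would be something like e0 <- roundtrip(e0)
before using paste.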


If you have any more ideas...

Best wishes,

Ott



P.S. I do not know if this is related to the previous problem, but
when I remove the dataset:
> rm(e2000)

and then look at the memory:
> gc()
         used (Mb) gc trigger (Mb)
Ncells 220608  5.9     741108 19.8
Vcells  88222  0.7   11163343 85.2

The output shows memory usage of less than 10 MB.  However, the operating
system shows that R is still using more than 80 MB.
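
The figure reported by R can also be read off programmatically. A minimal
sketch, assuming only that gc() returns its usual matrix with the Ncells
and Vcells rows, where the second column is the "(Mb)" value next to
"used" (used_mb is a hypothetical name):

```r
mem <- gc()
# Column 2 is the "(Mb)" figure next to "used"; add up Ncells and Vcells.
used_mb <- sum(mem[, 2])
print(used_mb)
```

For the output above this sum is 5.9 + 0.7 = 6.6 Mb, while the two
"gc trigger" figures add up to 19.8 + 85.2 = 105 Mb.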



On Tue, 9 Oct 2001, Ott Toomet wrote:

> Hi,
>
> dput( ce0) gives a correct answer:
> > dput( ce0)
> c("1985", "9", "2", "2", "1", "A", "1", "", "NA", "5", "1999" )
>
> Plain print( ce0) does the same:
> > print( ce0)
>  [1] "1985" "9"    "2"    "2"    "1"    "A"    "1"    ""     "NA"   "5"
> [11] "1999"
>
> However, if I make a new similar vector ce0a:
> > ce0a <- c( 1985,9,2,2,1,"A",1,"",NA,5,1999)
>
> Then the paste works correctly:
> > paste( ce0a, m, sep="", collapse="")
> [1] "1985<1>9<2>2<3>2<4>1<5>A<6>1<7><8>NA<9>5<0>1999END"
>
> I had m as
> > m
>  [1] "<1>" "<2>" "<3>" "<4>" "<5>" "<6>" "<7>" "<8>" "<9>" "<0>" "END"
>
> So I have two apparently similar vectors which behave differently with
> paste:
> > paste( ce0a, m, sep="", collapse="")
> [1] "1985<1>9<2>2<3>2<4>1<5>A<6>1<7><8>NA<9>5<0>1999END"
> > paste( ce0, m, sep="", collapse="")
> [1] "1985<1>9<2>2<3>2<4>1<5>A1<7>NA<9>5<0>1999END"
> > ce0a
>  [1] "1985" "9"    "2"    "2"    "1"    "A"    "1"    ""     "NA"   "5"
> [11] "1999"
> > ce0
>  [1] "1985" "9"    "2"    "2"    "1"    "A"    "1"    ""     "NA"   "5"
> [11] "1999"
>
> I suspect there may be some hidden attributes somewhere in ce0 which I have
> not noticed (there seem to be no factors); the problem seems to arise with
> the non-numerical columns (ce0 is just part of one row of the big
> data frame).  Is it possible to find them, and possibly change them?  At
> least attributes() shows nothing:
> > attributes(ce0)
> NULL
> > attributes(ce0a)
> NULL
>
> The problem is actually that I cannot transform a Stata 7 dataset to ASCII.
> R seems to be the only program here which is able to open it, but I still
> have problems with saving.
>
