[R] subset question

Aimin Yan aiminy at iastate.edu
Thu Dec 14 07:02:33 CET 2006


I have a data set p1982; its structure is shown in the str() output at the end of this message.

I then take the first 20 observations from this data set and assign them to pr.

In p1982, p has 1982 levels; in pr, p should have only 1 level.

But when I run str(pr), it shows that p still has 1982 levels.

The same happens with aa:

 > pr$aa
  [1] ARG THR ASP CYS TYR ASN VAL ASN ARG ILE ASP THR THR ALA SER CYS LYS 
THR ALA LYS
Levels: ALA ARG ASN ASP CYS GLN GLU HIS ILE LEU LYS MET PHE PRO SER THR TRP 
TYR VAL

It seems pr$aa does not contain the value GLU, yet GLU is still listed as a level.

I don't understand this. Is there a reason for this behavior?

Thanks,



 > str(p1982)
'data.frame':   465979 obs. of  6 variables:
  $ p  : Factor w/ 1982 levels "154l_aa","1A0P_aa",..: 1 1 1 1 1 1 1 1 1 1 ...
  $ aa : Factor w/ 19 levels "ALA","ARG","ASN",..: 2 16 4 5 18 3 19 3 2 9 ...
  $ as : num  152.0  15.9  65.1  57.2  28.9 ...
  $ ms : num  108.8  28.3  59.2  49.9  31.8 ...
  $ cur: num  -0.1020  0.2564  0.0312 -0.0550  0.0526 ...
  $ sc : num   92.10 103.67   7.27  72.98  96.12 ...

 > pr<-p1982[1:20,]

 > str(pr)
'data.frame':   20 obs. of  6 variables:
  $ p  : Factor w/ 1982 levels "154l_aa","1A0P_aa",..: 1 1 1 1 1 1 1 1 1 1 ...
  $ aa : Factor w/ 19 levels "ALA","ARG","ASN",..: 2 16 4 5 18 3 19 3 2 9 ...
  $ as : num  152.0  15.9  65.1  57.2  28.9 ...
  $ ms : num  108.8  28.3  59.2  49.9  31.8 ...
  $ cur: num  -0.1020  0.2564  0.0312 -0.0550  0.0526 ...
  $ sc : num   92.10 103.67   7.27  72.98  96.12 ...

 > pr
          p  aa     as     ms         cur        sc
1  154l_aa ARG 152.04 108.83 -0.10201400  92.10410
2  154l_aa THR  15.86  28.32  0.25635600 103.67100
3  154l_aa ASP  65.13  59.16  0.03121370   7.27311
4  154l_aa CYS  57.20  49.85 -0.05495890  72.97930
5  154l_aa TYR  28.87  31.75  0.05264570  96.11660
6  154l_aa ASN  31.14  31.09  0.06327110  55.65980
7  154l_aa VAL   0.00   0.00  0.00000000 142.92100
8  154l_aa ASN  83.46  62.03 -0.10425800  78.38800
9  154l_aa ARG 156.02 111.52 -0.12303800  70.28280
10 154l_aa ILE   6.71  18.37  0.29933600 150.02100
11 154l_aa ASP  86.45  59.83 -0.15856600  73.52120
12 154l_aa THR  26.39  33.68  0.06101840 133.57200
13 154l_aa THR 107.61  70.48 -0.17145100  72.48660
14 154l_aa ALA   2.31   5.40  0.24000000  90.67890
15 154l_aa SER  30.16  30.08 -0.00753989  96.24600
16 154l_aa CYS  60.11  46.86 -0.09648100  32.19480
17 154l_aa LYS 127.05  95.48 -0.11545500  81.00930
18 154l_aa THR   5.74  18.45  0.27963100 164.13100
19 154l_aa ALA   0.00   0.00  0.00000000  68.85680
20 154l_aa LYS 113.58  81.72 -0.12914300  49.38620
 >
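The behavior can be reproduced on a much smaller scale. The sketch below is only an illustration: the data frame d and its values are made up and stand in for p1982, and s stands in for pr. It shows that row subsetting keeps each factor's full set of levels, and that calling factor() on the subsetted column again is one way to keep only the levels that actually occur.

```r
# Toy stand-in for p1982 (hypothetical data, not the original).
d <- data.frame(
  p  = factor(c("154l_aa", "154l_aa", "1A0P_aa", "1A0P_aa")),
  aa = factor(c("ARG", "THR", "GLU", "ALA"))
)

# Row subsetting keeps each factor's full level set.
s <- d[1:2, ]
nlevels(s$p)    # 2: "1A0P_aa" is still a level although it no longer occurs
table(s$aa)     # ALA and GLU are listed with a count of 0

# Re-creating the factors keeps only the levels that are present.
s$p  <- factor(s$p)
s$aa <- factor(s$aa)
nlevels(s$p)    # 1
levels(s$aa)    # "ARG" "THR"
```

In later versions of R (2.12.0 and newer), droplevels(s) should do the same for every factor column of a data frame in one call.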


