[R] Selecting names with regard to visit frequency

arun smartpink111 at yahoo.com
Wed Jul 24 00:39:00 CEST 2013


Hi Michael,
It could be due to some extra spaces in the pasted text.  If you use read.table(..., fill=TRUE), it should read, but there would then be missing values.  Using dput() (see ?dput) to share the data will be better.

 dput(df1)
structure(list(x = c(2L, 5L, 4L, 6L, 24L, 7L, 12L, 3L, 5L)), .Names = "x", class = "data.frame", row.names = c("A1", 
"A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"))
Now, try the code by assigning:
df1<- structure(list(x.....
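To illustrate why dput() is convenient for sharing data, here is a minimal sketch of a round trip. It rebuilds df1 from the values shown above, captures the dput() output as text, and evaluates it again. The name df1_copy is only for illustration.

```r
# Rebuild df1 with the values and row names posted above
df1 <- data.frame(x = c(2L, 5L, 4L, 6L, 24L, 7L, 12L, 3L, 5L),
                  row.names = paste0("A", 1:9))

# dput() writes R code that recreates the object; capture it as text
txt <- capture.output(dput(df1))

# A reader would paste that text after `df1 <-`; here it is evaluated directly
df1_copy <- eval(parse(text = txt))

# Check that the recreated object matches the original
all.equal(df1, df1_copy)
```

Because the structure(...) text carries the column types and row names, nothing depends on how spaces or separators survive copy and paste.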


It wouldn't work with decimals because here:
3:5
#[1] 3 4 5 #it will match only values that are exactly 3, 4, or 5

Trying this on another dataset:

df2<- structure(list(x = c(2, 5, 4.4, 6, 24, 7, 12, 3.6, 5)), .Names = "x", class = "data.frame", row.names = c("A1", 
"A2", "A3", "A4", "A5", "A6", "A7", "A8", "A9"))
vec2<- unlist(df2)
 names(vec2)<- row.names(df2)
vec2
#  A1   A2   A3   A4   A5   A6   A7   A8   A9 
# 2.0  5.0  4.4  6.0 24.0  7.0 12.0  3.6  5.0 
 names(vec2)[vec2%in% 3:5] #incorrect: misses A3 (4.4) and A8 (3.6)
#[1] "A2" "A9"

names(vec2)[vec2%in% seq(3,5,by=0.1)]
#[1] "A2" "A3" "A8" "A9"


#If I change
vec2[3]<- 4.46
 names(vec2)[vec2%in% seq(3,5,by=0.1)]
#[1] "A2" "A8" "A9"
 names(vec2)[round(vec2,1)%in% seq(3,5,by=0.1)]
#[1] "A2" "A3" "A8" "A9"

names(vec2)[vec2>=3 & vec2<=5] #should be better in such cases
#[1] "A2" "A3" "A8" "A9"
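The range comparison can also be wrapped in a small helper so that the same call works for whole numbers and decimals. This is only a sketch: in_range is a hypothetical name, not a base R function, and vec2 is rebuilt here with the modified value 4.46 from above.

```r
# vec2 as in the example above, after vec2[3] <- 4.46
vec2 <- c(A1 = 2, A2 = 5, A3 = 4.46, A4 = 6, A5 = 24,
          A6 = 7, A7 = 12, A8 = 3.6, A9 = 5)

# Hypothetical helper: TRUE where x lies in the closed interval [lo, hi]
in_range <- function(x, lo, hi) x >= lo & x <= hi

res_range <- names(vec2)[in_range(vec2, 3, 5)]
res_range
#[1] "A2" "A3" "A8" "A9"
```

For the percentage data, the same helper could be called with decimal bounds, for example in_range(vec1, 0.3, 0.5). Values that were computed rather than typed in can still fall just outside a boundary because of floating-point rounding.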


It is also worth checking R FAQ 7.31, which explains why floating-point numbers often do not compare as exactly equal.
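A short sketch of the effect the FAQ describes, and one possible way around it. The helper near_in and its tolerance of 1e-8 are assumptions made for illustration, not part of base R.

```r
# Decimal fractions are usually not exactly representable in binary floating point
a <- 0.1 + 0.2
a == 0.3                       # FALSE
isTRUE(all.equal(a, 0.3))      # TRUE: all.equal() compares with a small tolerance

# The same effect can hit %in% when the table comes from seq()
0.3 %in% seq(0, 1, by = 0.1)   # FALSE

# Hypothetical helper: a tolerance-based version of %in%
near_in <- function(x, table, tol = 1e-8) {
  vapply(x, function(v) any(abs(v - table) < tol), logical(1))
}
near_in(0.3, seq(0, 1, by = 0.1))  # TRUE
```

This is why exact matching with %in% or match() is fragile for decimals, while a range test or a tolerance-based comparison is usually safer.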

A.K.




Hi Arun, 
Perhaps these are dataframes I am working with, and have mistaken 
them for vectors (I am still very new at this and learning the data 
structures). 

I tried to read the text in as you have it here (copied and pasted), but it did not work. 
Error in read.table(text = " \n\"\",\"x\" \n\"A1\",2 \n\"A2\",5 
\n\"A3\",4 \n\"A4\",6 \n\"A5\",24 \n\"A6\",7 \n\"A7\",12 \n\"A8\",3 
\n\"A9\",5 \n",  : 
  more columns than column names 

I retried both: 
names(vec1)[vec1%in% 3:5] 

& 

names(vec1)[!is.na(match(vec1,3:5))] 

before and after converting my current dataframe to a vector, but
 I get a NULL return. I also get a NULL return if I unlist the dataframe
 and try to execute: 
names(vec1)[vec1>=3 & vec1<=5] 

All 3 do work if I keep the dataframe in its original form, instead of using: 
 vec1<-unlist(df1) 
 names(vec1)<- row.names(df1) 

 I discovered another issue, however. I am working with a couple
 of datasets: one of them has whole numbers, the other has percentages in 
place of visits, such as: 
"A1",0.2 
"A2",0.5 
 ... 
the two options: 

names(vec1)[vec1%in% 3:5] 
names(vec1)[!is.na(match(vec1,3:5))] 

do not seem to work with ranges given in decimals (and that is 
probably what I originally tested them on) but are fine with whole 
numbers. 

Thanks, 
steele


