# [R] Selecting rows and columns of a data frame using relational operators

Erich Subscriptions erich.subs at neuwirth.priv.at
Mon Feb 27 13:30:07 CET 2017

The answer is simple.

data[,4] == 1 produces a logical vector of length nrow(data)
and the subsetting mechanism for data frames in R needs a vector of the same length
as the data frame has rows.

data[1:20,4] == 1
produces a logical vector of length 20, which is not the number of rows of data.
So R applies its standard recycling rule: it repeats this vector as often as needed to get
a vector of length nrow(data). Rows beyond the first 20 are therefore selected as well,
following the repeated pattern.

The following code illustrates what is happening:

data <- data.frame(x=rnorm(100),y=rnorm(100),z=rnorm(100),a=rep(c(1,2,1,2),c(2,48,2,48)))

vec1 <- data[,4]==1
vec2 <- data[1:20,4]==1
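
To make the recycling visible, and to show one possible way of restricting the
comparison to the first 20 rows, here is a minimal sketch. The names recycled,
wrong, first20 and right, as well as the call to set.seed(), are additions for
illustration only. Taking the first 20 rows before filtering is just one of
several ways to do this.

```r
set.seed(1)  # assumption: fixed seed, only so the random columns are reproducible
data <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100),
                   a = rep(c(1, 2, 1, 2), c(2, 48, 2, 48)))

vec2 <- data[1:20, 4] == 1   # logical vector of length 20, TRUE only at 1 and 2

# What R effectively uses as the row index: vec2 repeated up to nrow(data)
recycled <- rep(vec2, length.out = nrow(data))
which(recycled)              # positions of the rows that will be selected

# The short index picks rows from the whole data frame, not only the first 20
wrong <- data[vec2, c(1, 2, 4)]
nrow(wrong)                  # 10 rows; most of them have a == 2

# One possible fix: take the first 20 rows first, then filter within them
first20 <- data[1:20, ]
right <- first20[first20[, 4] == 1, c(1, 2, 4)]
nrow(right)                  # 2 rows, both with a == 1
```

An equivalent approach would be to build a full-length index, for example
seq_len(nrow(data)) <= 20 & data[,4] == 1, and to use that as the row selector.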

> On 27 Feb 2017, at 13:07, Tunga Kantarcı <tungakantarci at gmail.com> wrote:
>
> Consider a data frame named data. data contains 4 columns and 1000
> rows. Say the aim is to bring together columns 1, 2, and 4, if the
> values in column 4 is equal to 1. We could use the syntax
>
> data(data[,4] == 1, c(1 2 4))
>
> for this purpose. Suppose now that the aim is to bring together
> columns 1, 2, and 4, if the values in column 4 is equal to 1, for the
> first 20 rows of column 4. We could use the syntax
>
> data(data[1:20,4] == 1, c(1 2 4))
>
> for this purpose. However, this does not produce the desired result.
> This is surprising at least for someone coming from MATLAB because
> MATLAB produces what is desired.
>
> Question 1: The code makes sense but why does it not produce what we
> expect it to produce?
>
> Question 2: What code is instead suitable?