[R] indexing in data frames

David L Carlson dcarlson at tamu.edu
Fri Aug 10 00:42:23 CEST 2012


Amazing. You have to create the data frame first and then add the list
as a new variable. Passed directly to data.frame(), the list is expanded
into one column per element, and R objects that the numbers of rows
differ.

This does not work:
data.frame(b = c(1988, 1989), c = list(c(1985, 1982, 1984), c(1988, 1980)))

Nor does this:
data.frame(a$b, a$c)
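
For reference, a minimal sketch of the construction that does work, using
base R only. The object names b, cc, ab and ab2 are illustrative. The I()
variant and the use of Map() are not from this thread; they are
alternatives I am assuming you may find convenient.

```r
# Illustrative data matching the values discussed in the thread.
b  <- c(1988, 1989)
cc <- list(c(1985, 1982, 1984), c(1988, 1980))

# Two-step construction: build the data frame, then assign the list
# as a new column. The list has one element per row.
ab <- data.frame(b = b)
ab$c <- cc

# One-step alternative (an assumption, not from the thread): I() marks
# the list "as is", so data.frame() keeps it as a single list column.
ab2 <- data.frame(b = b, c = I(cc))

# Subtract each element of c from the matching scalar in b.
# Map() always returns a list, so the result fits a list column even
# when the elements of c have different lengths.
ab$new <- Map(function(x, y) x - y, ab$b, ab$c)
str(ab)
```

Map() is used here instead of sapply() because sapply() simplifies its
result to a matrix whenever every element happens to have the same length,
which would change the shape of the new column.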

-------
David

> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: Thursday, August 09, 2012 5:17 PM
> To: dcarlson at tamu.edu
> Cc: 'jimi adams'; r-help at r-project.org
> Subject: Re: [R] indexing in data frames
> 
> 
> On Aug 9, 2012, at 2:43 PM, David L Carlson wrote:
> 
> > You have not defined a data frame since data frames cannot contain
> > lists,
> 
> Not true:
> 
>  > dput(a)
> structure(list(b = c(1988, 1989),
>                 c = list(c(1985, 1982, 1984),
>                          c(1988, 1980))), .Names = c("b", "c"))
> 
>  > ab <- data.frame(a$b)
>  > ab
>     a.b
> 1 1988
> 2 1989
>  > ab$cb <- a$c
>  > ab
>     a.b               cb
> 1 1988 1985, 1982, 1984
> 2 1989       1988, 1980
>  > str(ab)
> 'data.frame':	2 obs. of  2 variables:
>   $ a.b: num  1988 1989
>   $ cb :List of 2
>    ..$ : num  1985 1982 1984
>    ..$ : num  1988 1980
> 
> But it seems unlikely that the OP's "a" object is a dataframe since
> the console eval-print loop would not display a dataframe in that
> manner.
> 
> At any rate with the ab dataframe:
> 
>  > for( i in 1:NROW(ab) ) print(  ab$a.b[i] - ab$cb[[i]] )
> [1] 3 6 4
> [1] 1 9
> 
> The OP should note the need to use '[[' on a list-object to get
> commensurate classes to pass to the '-' operator.
> 
> --
> david.
> 
> 
> > but lists can contain data frames so you are asking about how to
> > process a list. I'm changing your object names to a, b, and d because
> > c() is the concatenation function and it can cause all kinds of
> > problems to use it as an object name.
> >
> >> a <- list(b=c(1988, 1989),
> >    d=list(c(1985, 1982, 1984), c(1988, 1980)))
> >> a
> > $b
> > [1] 1988 1989
> >
> > $d
> > $d[[1]]
> > [1] 1985 1982 1984
> >
> > $d[[2]]
> > [1] 1988 1980
> >
> >> a$b; a[[1]] # Two ways to refer to the first element of the list
> > [1] 1988 1989
> > [1] 1988 1989
> >
> >> a$d; a[[2]] # Two ways to refer to the second element of the list
> > [[1]]
> > [1] 1985 1982 1984
> >
> > [[2]]
> > [1] 1988 1980
> >
> > [[1]]
> > [1] 1985 1982 1984
> >
> > [[2]]
> > [1] 1988 1980
> >
> >> a[[2]][[1]]; a$d[[1]] # Two ways to refer to the 1st element of the 2nd element
> > [1] 1985 1982 1984
> > [1] 1985 1982 1984
> >
> >> a[[2]][[2]]; a$d[[2]] # Two ways to refer to the 2nd element of the 2nd element
> > [1] 1988 1980
> > [1] 1988 1980
> >
> >> a$new <- sapply(1:2, function(i) a$b[i] - a$d[[i]])
> >> a$new
> > [[1]]
> > [1] 3 6 4
> >
> > [[2]]
> > [1] 1 9
> >
> > You can do all this with a data.frame if you think about it
> > differently:
> >
> >> a <- data.frame(year = c(1988, 1989), group = c("G1988", "G1989"))
> >> b <- data.frame(group = c(rep("G1988", 3), rep("G1989", 2)),
> >    d = c(1985, 1982, 1984, 1988, 1980))
> >> ab <- merge(a, b)
> >> ab <- data.frame(ab, diff=ab$year-ab$d)
> >> new <- split(ab$diff, ab$group)
> >> new
> > $G1988
> > [1] 3 6 4
> >
> > $G1989
> > [1] 1 9
> >
> > ----------------------------------------------
> > David L Carlson
> > Associate Professor of Anthropology
> > Texas A&M University
> > College Station, TX 77843-4352
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> >> On Behalf Of jimi adams
> >> Sent: Thursday, August 09, 2012 3:43 PM
> >> To: r-help at r-project.org
> >> Subject: [R] indexing in data frames
> >>
> >> I'm still not fully understanding exactly how R is handling data
> >> frames, but am getting closer. Any help with this one will likely go
> >> a long way in getting me there. Let's say I have a data frame, let's
> >> call it "a". Within that data frame I have two variables, let's call
> >> them "b" and "c", where "b" is a single numeric value per
> >> observation, while "c" is a LIST of numeric values. What I want to be
> >> able to do is perform an operation on each element in "c" by the
> >> single element in "b".
> >>
> >> So, for example, say I wanted to subtract each element in "c" from
> >> the scalar in "b". If I had
> >>
> >>> a$b
> >> [1] 1988
> >> [2] 1989
> >> .
> >> &
> >>> a$c
> >> [[1]]
> >> [1] 1985 1982 1984
> >> [[2]]
> >> [1] 1988 1980
> >> .
> >>
> >> I'm looking for a result of:
> >> a$new
> >> [[1]]
> >> [1] 3 6 4
> >> [[2]]
> >> [1] 1 9
> >> .
> >>
> >> I've tried a few different things, none of which have the desired
> >> result. Any help appreciated.
> >> thanks!
> >>
> >> jimi adams
> >> Assistant Professor
> >> Department of Sociology
> >> American University
> >> e: jadams at american.edu
> >> w: jimiadams.com
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> 
> David Winsemius, MD
> Alameda, CA, USA


