[R] How to split a data.frame into its columns?

David Winsemius dwinsemius at comcast.net
Mon Aug 29 08:27:32 CEST 2016


> On Aug 28, 2016, at 11:14 PM, Marius Hofert <marius.hofert at uwaterloo.ca> wrote:
> 
> Hi,
> 
> I need a fast way to split a data.frame (and matrix) into a list of
> columns.

This is a bit of a puzzle since data.frame objects are by definition "lists of columns".

If you want a data.frame object (say its name is dat) to be _only_ a list of columns, then

dat <- unclass(dat)

The split.data.frame function splits by rows because that is the most commonly desired and expected behavior, and because the authors of S/R probably saw no point in making the split "by columns" when a data.frame is already stored that way.
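
To illustrate the point above, here is a minimal sketch, assuming a small made-up data.frame called dat and a made-up matrix m (neither is from the original question). It compares unclass() and as.list() with the explicit lapply() loop from the question, and shows one base-R option for a matrix, which is not a list internally and so still needs an actual split.

```r
# Hypothetical example data, not from the original post
dat <- data.frame(x = 1:3, y = c("a", "b", "c"), stringsAsFactors = FALSE)

# unclass() drops the "data.frame" class; the result is a named list of
# columns that still carries a "row.names" attribute
cols_unclass <- unclass(dat)

# as.list() also returns a named list of columns, without "row.names"
cols_aslist <- as.list(dat)

# The explicit loop from the question returns the same columns, unnamed
cols_loop <- lapply(seq_len(ncol(dat)), function(j) dat[, j])
same_columns <- identical(unname(cols_aslist), cols_loop)

# A matrix is a single vector with a dim attribute, so it has to be split;
# looping over the column index is one base-R way to do that
m <- matrix(1:6, nrow = 3)
m_cols <- lapply(seq_len(ncol(m)), function(j) m[, j])
```

If the names or the "row.names" attribute get in the way, unname() or as.list() should take care of them without any extra packages.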

-- 
David.

> For matrices, split(x, col(x)) works (which can then be done
> in C for speed-up, if necessary), but for a data.frame? split(iris,
> col(iris)) does not work as expected (?).
> The outcome should be lapply(seq_len(ncol(iris)), function(j)
> iris[,j]) and not require additional packages (if possible).
> 
> Thanks & cheers,
> Marius
> 
> PS: Below is the C code for matrices. Not sure how easy it would be to
> extend that to data.frames (?)
> 
> SEXP col_split(SEXP x)
> {
>    /* Setup */
>    int *dims = INTEGER(getAttrib(x, R_DimSymbol));
>    int n = dims[0], d = dims[1];
>    SEXP res = PROTECT(allocVector(VECSXP, d));
>    SEXP ref;
>    int i = 0, j, k;
> 
>    /* Distinguish int/real matrices */
>    switch (TYPEOF(x)) {
>    case INTSXP:
>        for(j = 0; j < d; j++) {
>            SET_VECTOR_ELT(res, j, allocVector(INTSXP, n));
>            int *e = INTEGER(VECTOR_ELT(res, j));
>            for(k = 0 ; k < n ; i++, k++) {
>                e[k] = INTEGER(x)[i];
>            }
>        }
>        break;
>    case REALSXP:
>        for(j = 0; j < d; j++) {
>            SET_VECTOR_ELT(res, j, allocVector(REALSXP, n));
>            double *e = REAL(VECTOR_ELT(res, j));
>            for(k = 0 ; k < n ; i++, k++) {
>                e[k] = REAL(x)[i];
>            }
>        }
>        break;
>    case LGLSXP:
>        for(j = 0; j < d; j++) {
>            SET_VECTOR_ELT(res, j, allocVector(LGLSXP, n));
>            int *e = LOGICAL(VECTOR_ELT(res, j));
>            for(k = 0 ; k < n ; i++, k++) {
>                e[k] = LOGICAL(x)[i];
>            }
>        }
>        break;
>    case STRSXP:
>        for(j = 0; j < d; j++) {
>            ref = allocVector(STRSXP, n);
>            SET_VECTOR_ELT(res, j, ref);
>            ref = VECTOR_ELT(res, j);
>            for(k = 0 ; k < n ; i++, k++) {
>                SET_STRING_ELT(ref, k, STRING_ELT(x, i));
>            }
>        }
>        break;
>    default:
>        error("Wrong type of 'x': %s", CHAR(type2str_nowarn(TYPEOF(x))));
>    }
> 
>    /* Return */
>    UNPROTECT(1);
>    return(res);
> }

David Winsemius
Alameda, CA, USA


