[R] Merging vector data into one file

jim holtman jholtman at gmail.com
Mon Feb 1 13:43:39 CET 2010


You never really said what your data structure looks like.  It appears
that each 'single row' might be a named vector.  It would be good to
follow the posting guide and supply sample data.  Here is one way of
doing it, depending on exactly what your data looks like:

> # create sample data: a list of named vectors (tables) of counts
> x <- replicate(5, table(sample(letters, 20, TRUE)), simplify = FALSE)
> x
[[1]]

a b c d e f h i j k o p q t y z
1 1 1 1 1 1 1 3 2 1 1 1 2 1 1 1

[[2]]

a b c d g h i j o q r s t u v w x y
1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1

[[3]]

a b c e g j k n p q r s t v y z
1 1 1 1 1 1 3 2 1 1 1 1 1 2 1 1

[[4]]

c d g j k l m o q t u v y z
1 1 1 2 1 1 2 1 2 1 1 2 2 2

[[5]]

b d f g i j k m n o q s w y
1 2 1 1 2 1 1 1 1 2 3 1 1 2

> # create a 'long' table of variables and counts
> x.long <- do.call(rbind, lapply(x, stack))
> head(x.long)
  values ind
1      1   a
2      1   b
3      1   c
4      1   d
5      1   e
6      1   f
> tapply(x.long$values, x.long$ind, sum)  # sum the counts for each name
a b c d e f h i j k o p q t y z g r s u v w x n l m
3 4 4 5 2 2 2 6 7 6 5 2 9 4 7 4 4 2 3 2 6 2 2 3 1 3
>
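
The same idea can be tried directly on data shaped like the example in the
question. This is only a sketch: it assumes 'data' is an unnamed list whose
elements are named numeric vectors, which is a guess, and the two elements
below are hypothetical stand-ins built from the values quoted in the question.
It uses unlist() rather than stack() to flatten the list.

```r
# Hypothetical stand-in for the poster's list (assumed structure):
# each element is a named numeric vector of counts.
data <- list(
  c("Column A-B" = 1, "Column Z-S" = 2, "Column A-S" = 5),
  c("Column Z-B" = 2, "Column A-S" = 1, "Column A-B" = 3)
)

# Flatten the list into one long named vector; because the list itself
# has no names, the element names are carried over unchanged.
all.counts <- unlist(data)

# Sum the counts that share a name; names that occur once pass through.
result <- tapply(all.counts, names(all.counts), sum)
result
```

Under that assumption the totals agree with the example in the question
(4 for Column A-B, 6 for Column A-S, 2 each for Column Z-B and Column Z-S),
and the same two lines should apply unchanged to a list of 235 elements.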


On Sun, Jan 31, 2010 at 11:22 PM, Mark Altaweel <maltaweel at anl.gov> wrote:
> Hi,
>
> I had another question. If you had say a vector (e.g., called data) with 235
> elements and each element looked like the following
>
> data[[1]]
> Column A-B    Column Z-S    Column A-S....
> 1             2             5 .......
>
>
> data[[2]]
> Column Z-B    Column A-S    Column A-B....
> 2             1             3 .......
>
>
>
> Anyway, each element consists of one row that lists the names of the columns
> and the second row is the counts of those columns. What I wanted to do is
> merge all the elements from the vector  so that I aggregate the counts for
> every column in the vector elements. So if a column name (e.g., Column A-B)
> is present in two elements, then I would want the counts for that column
> name to be added together; however, if the column name is unique then I
> simply want to include that column in the total. The
> example below shows the result of what I mean using the two elements above:
>
>
> result=data[[1]] + data[[2]].......
>
> Column A-B    Column Z-S    Column A-S    Column Z-B.........
> 4             2             6             2.......
>
>
>
> So what I want would take the 235 elements, aggregate the column names and
> counts, and produce one output  variable (e.g., called result) that has all
> the column names and counts present in the 235 elements. I tried using
> sapply, (e.g., sapply(data,function(.df){sum(.df)}) ), but this only
> provided aggregate counts without producing the column names.  I tried the
> aggregate() function, but that didn't aggregate my data exactly the way I
> wanted, perhaps I got the syntax wrong though. Anyway, is there a better and
> easier way to do this?
>
> Thanks in advance again.
>
> Mark
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?


