[R] creating summary functions for data frame

Karin Lagesen karin.lagesen at medisin.uio.no
Thu Oct 11 11:05:20 CEST 2007


I have a data frame that looks like this:


> gctablechromonly[1:5,]
     refseq geometry gccontent X60_origin X60_terminus  length  kingdom
1 NC_009484      cir    0.6799    1790000       773000 3389227 Bacteria
2 NC_009484      cir    0.6799    1790000       773000 3389227 Bacteria
3 NC_009484      cir    0.6799    1790000       773000 3389227 Bacteria
4 NC_009484      cir    0.6799    1790000       773000 3389227 Bacteria
5 NC_009484      cir    0.6799    1790000       773000 3389227 Bacteria
                  grp feature gene begin dir gc_content replicor LEADLAG
1 Alphaproteobacteria     CDS  CDS   261   +   0.654244    RIGHT    LEAD
2 Alphaproteobacteria     CDS  CDS  1737   -   0.651408    RIGHT     LAG
3 Alphaproteobacteria     CDS  CDS  2902   +   0.607843    RIGHT    LEAD
4 Alphaproteobacteria     CDS  CDS  3693   +   0.617647    RIGHT    LEAD
5 Alphaproteobacteria     CDS  CDS  4227   +   0.699208    RIGHT    LEAD
>

About half of these columns are factors, for instance refseq, kingdom,
grp and feature.

Now, I have seen that I can do 

by(gctablechromonly, gctablechromonly$feature, summary)

to get useful information.
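
A minimal sketch of how by() can take a user-written function instead of summary. The toy data frame and the helper name gc_summary are made up for illustration and are not part of the real gctablechromonly data; by() passes each per-feature subset to the function as a data frame.

```r
# Toy stand-in for the real data; values are invented for illustration.
toy <- data.frame(
  feature    = factor(c("CDS", "CDS", "rRNA", "rRNA")),
  gc_content = c(0.65, 0.61, 0.55, 0.53)
)

# Hypothetical helper: row count and mean gc_content for one subset.
gc_summary <- function(d) {
  c(n = nrow(d), mean_gc = mean(d$gc_content))
}

# by() splits toy by feature and calls gc_summary on each piece.
res <- by(toy, toy$feature, gc_summary)
print(res)
```

Individual results can then be pulled out by level name, e.g. res[["CDS"]].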

However, I am wondering how I can write my own functions to get what
I'd like. For instance, how could I get a table with grp as rows down
the right, feature on the top, and a count of each kind of feature
within each grp?
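
A minimal sketch of one way such a count table might be produced with base R's table() and xtabs(). The toy data frame below is an invented stand-in for gctablechromonly, with made-up feature values; only the column names grp and feature come from the data shown above.

```r
# Toy stand-in for gctablechromonly; the rows are invented for illustration.
toy <- data.frame(
  grp     = c("Alphaproteobacteria", "Alphaproteobacteria",
              "Gammaproteobacteria", "Gammaproteobacteria",
              "Gammaproteobacteria"),
  feature = c("CDS", "rRNA", "CDS", "CDS", "tRNA"),
  stringsAsFactors = TRUE
)

# Cross-tabulate: one row per grp, one column per feature, cells are counts.
counts <- table(toy$grp, toy$feature)
print(counts)

# The same counts via the formula interface, which keeps the variable names.
counts2 <- xtabs(~ grp + feature, data = toy)
print(counts2)
```

On the real data the equivalent call would presumably be table(gctablechromonly$grp, gctablechromonly$feature).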

I realize that this is probably pretty easy to do, but I do not know
enough R yet to know which words to look for in the mail archives...:)

TIA,

Karin
-- 
Karin Lagesen, PhD student
karin.lagesen at medisin.uio.no
http://folk.uio.no/karinlag


