[BioC] Howto annotate blast subject.id with AnnotationDbi

Arnaud Mounier arnaud.mounier at dijon.inra.fr
Thu May 30 08:28:21 CEST 2013


On 29/05/2013 19:54, Marc Carlson wrote:
> Unfortunately no. Those IDs are not present in the org.At.eg.db package
> as this is a gene-level annotation package. These kinds of IDs have
> never been included in this package, although I guess that we could
> consider adding them at some point in the future.
Indeed, that could be a good idea, because I think some use cases make
it hard to avoid.
Here is an example from a blastp run:

 > df.blast.report[df.blast.report$"query.id" == "medtr7g099680.1",]
           query.id  subject.id identity alignment.length mismatches gap.opens
99  medtr7g099680.1 AT1G79930.2    35.62              438        266         4
100 medtr7g099680.1 AT1G79930.1    35.62              438        266         4
101 medtr7g099680.1 AT2G32120.2    33.98              512        310         9
102 medtr7g099680.1 AT2G32120.1    33.98              512        310         9
103 medtr7g099680.1 AT1G79920.1    35.62              438        266         4
104 medtr7g099680.1 AT1G79920.2    35.62              438        266         4
    q.start q.end s.start s.end evalue bit.score
99       10   434       4   438  2e-85       289
100      10   434       4   438  2e-85       290
101       9   508      30   525  7e-85       282
102       9   508      30   525  7e-85       282
103      10   434       4   438  1e-84       288
104      10   434       4   438  2e-84       287

In each pair of rows (99-100, 101-102, 103-104) the query.id is the same,
and the two subject.id values differ only at the gene model level (the
.1 or .2 suffix). That information would be lost if the IDs were reduced
to the gene level.
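
As a possible workaround in the meantime, here is a minimal sketch in
base R that splits subject.id into a locus part and a gene model part, so
that the model number is kept in its own column rather than thrown away.
The data frame `df` and the new column names `locus` and `model` are only
illustrative (the values are copied from rows 99-104 above), and it
assumes every subject.id ends in ".<digits>".

```r
# Illustrative subset of the blast report (values from rows 99-104 above).
df <- data.frame(
  query.id   = "medtr7g099680.1",
  subject.id = c("AT1G79930.2", "AT1G79930.1", "AT2G32120.2",
                 "AT2G32120.1", "AT1G79920.1", "AT1G79920.2"),
  bit.score  = c(289, 290, 282, 282, 288, 287),
  stringsAsFactors = FALSE
)

# Locus: drop the trailing ".<digits>" (assumed to be the gene model suffix).
df$locus <- sub("\\.[0-9]+$", "", df$subject.id)

# Gene model: keep only what follows the last dot.
df$model <- as.integer(sub("^.*\\.", "", df$subject.id))

# How many gene models were hit for each locus.
table(df$locus)
```

The locus column could then presumably serve as the key for a gene-level
lookup, while subject.id and model stay available for the final report.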

Note that, despite the different gene models, almost all the other
values are the same within each pair. But here is another example, with
3 different gene models for the same locus and 3 different hits.

 > df.blast.report[df.blast.report$"query.id" == "medtr8g081490.1",]
           query.id  subject.id identity alignment.length mismatches gap.opens
188 medtr8g081490.1 AT4G13940.3    80.35              453         43         2
189 medtr8g081490.1 AT4G13940.2    89.43              331         34         1
190 medtr8g081490.1 AT4G13940.4    89.29              308         32         1
    q.start q.end s.start s.end evalue bit.score
188       1   452       1   408      0       734
189     123   452       2   332      0       622
190       1   307       1   308      0       559
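
To make the problem concrete, here is a small sketch of what collapsing
these hits to the locus level could look like: keep, for each query and
locus, only the gene model with the highest bit score. The data frame
`hits` and the `locus` column are illustrative names, the values are
copied from rows 188-190 above, and "best bit score" is just one
possible rule, not a recommendation.

```r
# Illustrative subset of the blast report (values from rows 188-190 above).
hits <- data.frame(
  query.id         = "medtr8g081490.1",
  subject.id       = c("AT4G13940.3", "AT4G13940.2", "AT4G13940.4"),
  identity         = c(80.35, 89.43, 89.29),
  alignment.length = c(453, 331, 308),
  bit.score        = c(734, 622, 559),
  stringsAsFactors = FALSE
)

# Locus: drop the trailing ".<digits>" (assumed to be the gene model suffix).
hits$locus <- sub("\\.[0-9]+$", "", hits$subject.id)

# Sort by decreasing bit score, then keep the first row per query/locus pair.
hits <- hits[order(-hits$bit.score), ]
best <- hits[!duplicated(hits[c("query.id", "locus")]), ]
best
```

Only AT4G13940.3 survives here; the two shorter alignments with higher
identity (AT4G13940.2 and AT4G13940.4) are dropped, which is exactly the
kind of information a gene-level annotation cannot keep.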


Thanks for your reply,
Ar.

-- 
"The sun filters through the branches of the trees in flashes, like
meaning through language."
Nancy Huston

Arnaud Mounier
INRA - UMR Agroécologie 1347
CNRS - ERL IPM 6300 (Plant-Microorganism Interaction)
17, rue Sully - BP 86510 - F-21065 Dijon Cedex - France
Work phone : +33 380 693 167 - Fax : +33 380 693 753

https://www6.dijon.inra.fr/umragroecologie/Personnel/IPM/ITA/MOUNIER-Arnaud


