[BioC] how to find the VALIDATED pair (miRNA, gene-3'UTR-sequence)

michael watson (IAH-C) michael.watson at bbsrc.ac.uk
Thu Jun 25 08:39:52 CEST 2009


They are predicted.
 
The only databases of experimentally validated targets are TarBase and miRecords, and when I last looked they had 1300 and 1135 records respectively.
 
Mick 

________________________________

From: bioconductor-bounces at stat.math.ethz.ch on behalf of mauede at alice.it
Sent: Thu 25/06/2009 4:57 AM
To: Sean Davis
Cc: bioconductor at stat.math.ethz.ch
Subject: [BioC] how to find the VALIDATED pair (miRNA, gene-3'UTR-sequence)



Thank you very much.
I believe I can use biomaRt functions to get the 3'UTR sequences by providing the chromosome name and the start/end sequence coordinates.
However, I am not sure whether the text file I downloaded from http://microrna.sanger.ac.uk/cgi-bin/targets/v5/download.pl,
namely "arch.v5.txt.homo_sapiens", contains (or points to) the VALIDATED miRNA <-> gene-3'UTR sequences (or their coordinates).
Since the prediction program "miRanda" is mentioned, my question is:
are the (miRNA, gene-3'UTR-sequence) pairs listed in the files downloadable from http://microrna.sanger.ac.uk/cgi-bin/targets/v5/download.pl
experimentally VALIDATED or computationally PREDICTED?
At the moment I definitely need the experimentally VALIDATED (miRNA, gene-3'UTR-sequence) pairs.
Please correct me if I am mistaken.
Thank you so much,
Maura

-----Original Message-----
From: Sean Davis [mailto:seandavi at gmail.com]
Sent: Wed 24/06/2009 18:28
To: mauede at alice.it
Cc: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] how to find the validated pair (miRNA, gene-3'UTR-sequence)

On Wed, Jun 24, 2009 at 11:45 AM, <mauede at alice.it> wrote:

> Sorry for my misuse of Biology nomenclature. I am still very confused.
>
> My first task (very trivial for you) is to generate a text file containing
> a list of Homo sapiens validated miRNAs (microRNA-identifier, sequence)
> and the related 3'UTR regions (gene-identifier, 3'UTR-sequence).


Hi, Maura.  See here:

http://microrna.sanger.ac.uk/cgi-bin/targets/v5/download.pl

If you download the text file for human, it looks like:

Similarity  hsa-miR-647   miRanda  miRNA_target  2  120824263  120824281  +  .  16.3205  3.701400e-06  ENST00000295228  INHBB
Similarity  hsa-miR-130a  miRanda  miRNA_target  2  120825363  120825385  +  .  16.5359  1.687830e-02  ENST00000295228  INHBB

From here, you have the miR name, the chromosome (2 in this case), the
chromosome start and end positions, and the strand.  You can use this to get
the sequence from the genome (the fasta sequence for those locations).  The
transcript name (ENST....) is from the Ensembl database, so there is plenty
of information via biomaRt, if necessary, but the HUGO gene symbol is given
in the last column.
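
As a minimal sketch of the field extraction described above (in Python purely for illustration; the same can be done in R), the snippet below reads tab-separated records like the two sample lines. The column labels in COLUMNS are assumptions inferred from those two records, not names taken from the download file, and the header/comment handling is a guess about the file layout.

```python
import csv
import io

# Column labels are assumptions inferred from the two sample records;
# the real download file may name or order its columns differently.
COLUMNS = [
    "group", "mirna", "method", "feature", "chrom", "start", "end",
    "strand", "phase", "score", "pvalue", "transcript_id", "gene_symbol",
]

# The two sample records quoted in this thread, as tab-separated text.
SAMPLE = (
    "Similarity\thsa-miR-647\tmiRanda\tmiRNA_target\t2\t120824263\t"
    "120824281\t+\t.\t16.3205\t3.701400e-06\tENST00000295228\tINHBB\n"
    "Similarity\thsa-miR-130a\tmiRanda\tmiRNA_target\t2\t120825363\t"
    "120825385\t+\t.\t16.5359\t1.687830e-02\tENST00000295228\tINHBB\n"
)


def parse_targets(handle):
    """Yield one dict per target record, skipping short or non-data rows."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(COLUMNS) or not row[5].isdigit():
            continue  # blank line, comment, or header row
        rec = dict(zip(COLUMNS, row))
        rec["start"] = int(rec["start"])
        rec["end"] = int(rec["end"])
        rec["score"] = float(rec["score"])
        rec["pvalue"] = float(rec["pvalue"])
        yield rec


records = list(parse_targets(io.StringIO(SAMPLE)))
for rec in records:
    print(rec["mirna"], rec["chrom"], rec["start"], rec["end"],
          rec["strand"], rec["transcript_id"], rec["gene_symbol"])
```

To run this on the downloaded file, pass an open file handle instead of the io.StringIO wrapper. Note that these records are miRanda predictions, so parsing them says nothing about experimental validation.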

Several of the code snippets you give below give similar information.  If
you are concerned about what a specific data source is giving you, you
should probably contact that data source directly via email.  Most websites
offer a "contact us" link.

If this isn't what you need, then perhaps you can show more specifically how
this information is not meeting your needs.  Know that you may have to do a
little bit of programming to get things into exactly the formats that you
like.

Sean


>
> I realize this is just a matter of retrieving all known information. The
> difficulty for me is where to find the pair (miRNA, gene-3'UTR) matching
> information.
> In the following I downloaded a lot of stuff but I do not know how to put
> the pieces together to fulfill my task.
> I think the 3'UTR sequences can be retrieved through the function "getSequence"
> from the package "biomaRt", if only I knew which parameters to pass to such
> a function to achieve my goal.
>
> 1) Function "hsSeqs" from package "microRNA" produces 677 miRNA entries
> ex.  hsa-let-7a   "UGAGGUAGUAGGUUGUAUAGUU"
>  Are these miRNAs validated?
>  If the answer is "yes", then how can I retrieve the corresponding
> gene-3'UTR regions?
>
> 2) Function "hsTargets" from package "microRNA" produces a 709015 x 6 matrix
> containing miRNA identifiers
>   and apparently some data from the paired gene.
> ex.:
>        name             target            chrom start       end         strand
>   [1,] "hsa-miR-647"    "ENST00000295228" "2"   "120824263" "120824281" "+"
>   [2,] "hsa-miR-130a"   "ENST00000295228" "2"   "120825363" "120825385" "+"
>
>
>  Again, how can I retrieve the corresponding gene-3'UTR regions from the
> above data?


Note my answer above.  The gene 3'UTR information is there, but you may need
to do some calculations, depending on what you want.  Also, note that
"genes" do not have 3'UTRs--only transcripts have that.
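
One way to keep the bookkeeping at the transcript level, as a rough sketch: group the predicted sites by Ensembl transcript ID rather than by gene symbol. The tuple layout and the function name sites_by_transcript are hypothetical; the values come from the two sample records in this thread, and treating the coordinates as 1-based inclusive is an assumption.

```python
from collections import defaultdict

# Already-parsed sites from the two sample records in this thread:
# (miRNA, Ensembl transcript ID, start, end). The tuple layout is illustrative.
SITES = [
    ("hsa-miR-130a", "ENST00000295228", 120825363, 120825385),
    ("hsa-miR-647", "ENST00000295228", 120824263, 120824281),
]


def sites_by_transcript(sites):
    """Map each transcript ID to its predicted sites, sorted by start position."""
    grouped = defaultdict(list)
    for mirna, transcript_id, start, end in sites:
        grouped[transcript_id].append((start, end, mirna))
    return {tid: sorted(hits) for tid, hits in grouped.items()}


grouped = sites_by_transcript(SITES)
for tid, hits in grouped.items():
    for start, end, mirna in hits:
        # Site length assumes 1-based inclusive coordinates (an assumption).
        print(tid, mirna, start, end, end - start + 1)
```

Keying on the transcript ID means two transcripts of the same gene with different 3'UTRs stay separate, which is the distinction made above.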


>
>
> 3) Function "s3utr" from package "microRNA" produces 112 3'UTR entries
> ex.
> "CCTGCCCGCCCGCATGGCCAGCCAGTGGCAAGCTGCCGCCCCCACTCTCCGGGCACCGTCTCCTGCCTGTGCGTCCGCCC
> ACCGCTGCCCTGTCTGTTGCGACACCCTCCCCCCCACATACACACGCAGCGTTTTGATAAATTATTGGTTTTCAACG"
>
>   Where do these 3'UTRs come from? Which (miRNA, gene) pair do they belong to?
>
> 4) I downloaded the file "mature.fa"  (Fasta format sequences of all mature
> miRNA sequences) from http://microrna.sanger.ac.uk/sequences/ftp.shtml
>    The file contains a number of records starting with the miRNA identifier.
> ex:  hsa-miR-943  miRanda  miRNA_target  9885484  9885504  15.6748  4.721740e-02  +  .  URL "http://www.ensembl.org/homo_sapiens/geneview?gene=ENST00000302092"
>      hsa-miR-944  miRanda  miRNA_target  9885188  9885209  16.602  1.659470e-03  +  .  URL "http://www.ensembl.org/homo_sapiens/geneview?gene=ENST00000302092"
>
>  Where are the 3'UTR regions indicated in the above records ?
>
>
> 5) I downloaded miRNA Validated Targets from
> http://mirecords.umn.edu/miRecords/download.php.
>    It generated a huge XLS file with a lot of data.
> ex:   Pubmed_id Target gene_species_scientific  Target gene_species_common
>      Target gene_name        Target gene_Refseq_acc  Target site_number
>  miRNA_species   miRNA_mature_ID miRNA_regulation        Reporter_target
> gene/region     Reporter link element   Test_method_inter       Target gene
> mRNA_level  Original description    Mutation_target region  Post
> mutation_method    Original description_mutation_region    Target
> site_position    A       Reporter_target site    Reporter link element
> Test_method_inter_site  Original description_inter_site Mutation_target site
>    Post mutation_method_site       Original description_mutation_site
>  Mutiple site mutation note      Additional note
> 12808467        Homo sapiens    human   Hes1    NM_198155.2     1
> Homo sapiens    hsa-miR-23a     mutation                        Western
> blotting                Next, to examine whether expression of the gene for
> Hes1 is regulated by miR-23, we introduced synthetic miR-23 or mutant miR-23
> (Fig. 2a) into undifferentiated NT2 cells. When synthetic miR-23 was
> introduced at 2 mMinto undifferentiated NT2 cells,the intracellular level of
> Hes1 fell significantly (Fig. 2b).By contrast,in the presence of synthetic
> mutant miR-23,the level of Hes1 in undifferentiated NT2 cells remained
> unchanged and similar to that in untreated wild-type NT2 cells (Fig. 2b).
>                        801     overexpression by mature miRNA transfection
>   luciferase      target site(five copies of the target sequence) activity
> assay  Furthermore, the luciferase activity of LucSTS23 in undifferentiated
> NT2 cells that had been treated with synthetic miR-23 was lower than that in
> untreated wild-type NT2 cells (Fig. 3c).      Yes     Luciferase activity
> assay       Furthermore, the luciferase activity of LucSTS23 in
> undifferentiated NT2 cells that had been treated with synthetic miR-23 was
> lower than that in untreated wild-type NT2 cells (Fig. 3c).
>
> Thank you in advance for helping me out of my misery.
> Maura
>
>        [[alternative HTML version deleted]]
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>



