[R] Unable to extract gene list from chromosome

pooja sinha
Fri Apr 9 15:34:08 CEST 2021


Hi David,

That's the only file I have for analysis, and I am also getting final_1
as 0 obs. of 6 variables. My problem is that I am not getting any output.
It seems I am missing something in the values code, but I don't know
what. As a hint, I googled and some people have suggested passing the
values as vectors, which I do not understand. Also, when I pick one row of
the start column and run it interactively, it gives a result, but it's not
possible to do this one by one because of the large number of rows. I
posted my problem on Biostars but am still waiting for someone to reply.

Thanks,
Puja
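
A possible way to avoid querying row by row, offered only as a rough sketch:
the three separate filters do not appear to be matched up row by row, so one
option is to collapse each row into a single "chr:start:end" string and pass
those strings to one region filter. The helper make_regions() and the toy data
below are hypothetical; the column names Chromosome_number, start and end are
taken from the thread. The sketch also assumes that the dataset offers a
"chromosomal_region" filter (worth confirming with listFilters(ensembl)) and
that Ensembl chromosome names carry no "chr" prefix.

```r
# Hypothetical helper: build one "chr:start:end" string per complete row.
make_regions <- function(df, chr_col = "Chromosome_number",
                         start_col = "start", end_col = "end") {
  needed <- c(chr_col, start_col, end_col)
  missing_cols <- setdiff(needed, names(df))
  # A misspelled column (e.g. df$chr) silently yields NULL, so fail loudly here.
  if (length(missing_cols) > 0) {
    stop("columns not found: ", paste(missing_cols, collapse = ", "))
  }
  # Keep only rows where all three coordinates are present.
  df <- df[complete.cases(df[, needed]), needed]
  # Assumption: the server expects "1", not "chr1".
  chr <- sub("^chr", "", as.character(df[[chr_col]]), ignore.case = TRUE)
  # Integer formatting avoids scientific notation such as 1.5e+07.
  sprintf("%s:%d:%d", chr,
          as.integer(df[[start_col]]), as.integer(df[[end_col]]))
}

# Toy data (made up for illustration; not the poster's file).
toy <- data.frame(Chromosome_number = c("chr1", "2", NA),
                  start = c(100, 5000, 42),
                  end   = c(200, 6000, 99))
regions <- make_regions(toy)
print(regions)

# Untested against the live server; attributes as defined in the original code:
# library(biomaRt)
# ensembl <- useMart("ensembl", dataset = "rnorvegicus_gene_ensembl")
# final_1 <- getBM(attributes = attributes, filters = "chromosomal_region",
#                  values = regions, mart = ensembl)
```

If the real file uses different column names, they can be passed through the
chr_col, start_col and end_col arguments. For a large number of rows it may
also help to send the regions in smaller batches.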

On Thu, Apr 8, 2021 at 7:28 PM David Winsemius <dwinsemius using comcast.net>
wrote:

>
> On 4/8/21 3:42 PM, pooja sinha wrote:
>
> Hi David,
>
> Sorry I forgot to attach the file. Now it's attached.
>
>
> Now when I go back and check the values of the setup variables after
> seeing an error on the last call,
>
> Error in .processResults(postRes, mart = mart, sep = sep, fullXmlQuery =
> fullXmlQuery,  :
>   Query ERROR: caught BioMart::Exception::Database: Error during query
> execution: You have an error in your SQL syntax; check the manual that
> corresponds to your MySQL server version for the right syntax to use near
> 'AND (main.seq_region_end_1020 >= '15108600' OR main.seq_region_end_1020 >=
> '9115' at line 1
>
> I now notice:
>
>
> AT_AC_Gene$chr
>
> #NULL
>
> Changing that to AT_AC_Gene$Chromosome_number gets at least a startup
> message from the server:
>
> Batch submitting query
> [==>-------------------------------------------------------------------]
> 5% eta:  1m
>
> Error in .processResults(postRes, mart = mart, sep = sep, fullXmlQuery =
> fullXmlQuery,  :
>   Query ERROR: caught BioMart::Exception::Database: Error during query
> execution: You have an error in your SQL syntax; check the manual that
> corresponds to your MySQL server version for the right syntax to use near
> 'AND (main.seq_region_end_1020 >= '15108600' OR main.seq_region_end_1020 >=
> '9115' at line 1
>
> But then I get the same SQL syntax error as before.
>
>
> Then I ran it with only complete cases and now get no error but again see
> no hits:
>
> str(final_1)
> 'data.frame':    0 obs. of  6 variables:
>  $ external_gene_name: logi
>  $ ensembl_gene_id   : logi
>  $ start_position    : logi
>  $ end_position      : logi
>  $ rgd_symbol        : logi
>  $ chromosome_name   : logi
>
>
> I also see a lot of NA's in that dataset and when I just send the first 10
> rows of the request, I get no error (but also no matches.)
>
>
> So you clearly are not giving us all the data or all the code, but I'm
> finally wondering if you just don't have any data that matches the external
> datasets in your chosen "biomart". Can you offer a smaller dataset that you
> know with certainty should produce a match?
>
>
> Alternatively, you might want to post this instead at the BioConductor
> mailing list. They are the people who have a better chance of spotting
> obvious errors. I've found two likely code-related errors but I'm not a
> computational biostatistician.
>
> David
>
>
>
> Thanks,
> Puja
>
> On Thu, Apr 8, 2021 at 6:01 PM David Winsemius <dwinsemius using comcast.net>
> wrote:
>
>>
>> On 4/8/21 2:30 PM, pooja sinha wrote:
>> > Hi All,
>> >
>> > I am trying to extract a gene list from chromosome number and position.
>> > For that I am using biomaRt in R, but I am getting error messages as
>> > shown below. Also below is the code I am using for extraction.
>> >
>> > library("biomaRt")
>> > listMarts()
>> > ensembl <- useMart("ensembl")
>> > datasets <- listDatasets(ensembl)
>> > ensembl = useDataset("rnorvegicus_gene_ensembl",mart=ensembl)
>> > AT_AC_Gene <- read.csv("AT-AC-methylkit_biomart-4-7-21.csv",header=T)
>>
>>
>> #--- at this point I get
>>
>> Error in file(file, "rt") : cannot open the connection
>> In addition: Warning message:
>> In file(file, "rt") :
>>    cannot open file 'AT-AC-methylkit_biomart-4-7-21.csv': No such file or directory
>>
>> > attributes <- c("external_gene_name","ensembl_gene_id","start_position",
>> >                 "end_position","rgd_symbol","chromosome_name")
>> > filters <- c("chromosome_name","start","end")
>> > values <- list(AT_AC_Gene$chr,AT_AC_Gene$start,AT_AC_Gene$end)
>> > final_1 <- getBM(attributes=attributes, filters=filters, values=values,
>> > mart=ensembl)
>> >
>> > The code runs without any error, but the final_1 output has 0
>> > observations of 6 variables. Why?
>> >
>> > Can anyone help me with this?
>>
>>
>> You are more likely to get a useful response on the BioC mailing list.
>> It appears you have a dependency on a CSV file that you have not told
>> us about.
>>
>>
>> --
>>
>> David
>>
>> >
>> >
>> > Thanks,
>> >
>> > Puja
>> >
>> >       [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>
