[BioC] AnnBuilder and ftp problems

Martin Morgan mtmorgan at fhcrc.org
Fri Aug 31 17:39:41 CEST 2007


Hi Pedro --

Here's my advice; maybe others will have better ideas.

Get readLines to work, and do not worry about AnnBuilder until that is
figured out.

To get readLines to work, I suggest making only changes that are
essential. So remove the environment variables you mention, and the
options you set in R. What does

> readLines("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info")

produce? It sounds like it will fail to connect to the ftp server. So
then try

% export ftp_proxy=http://12.16.105.41:8080
% export ftp_proxy_user="anonymous"
% export ftp_proxy_password="plopez at cnic.es"

HOWEVER, make sure that these proxy settings are correct (this depends
on your specific site; we cannot help you here). In particular,
ftp_proxy should probably be ftp://...:21. Port 21 is the default ftp
port, though yours could differ; 8080 is commonly used for http proxies
and is unlikely to be correct for ftp. Using an http:// URL for an ftp
proxy doesn't sound right to me, either.
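
As a rough illustration of the two checks just described (scheme and
port), here is a minimal shell sketch. The function name check_ftp_proxy
and the host proxy.example.org are made up for this example; nothing here
is part of R or AnnBuilder. It only inspects the string and does not
contact the proxy, so "ok" means "plausible shape", not "works".

```shell
# Hypothetical helper: report the scheme and port of a proxy URL and flag
# the combinations questioned above (non-ftp scheme, port 8080).
check_ftp_proxy() {
    url=$1
    case $url in
        *://*) scheme=${url%%://*} ;;   # text before the first "://"
        *)     scheme="" ;;
    esac
    rest=${url#*://}                    # host[:port][/path]
    hostport=${rest%%/*}
    case $hostport in
        *:*) port=${hostport##*:} ;;    # text after the last ":"
        *)   port="" ;;
    esac
    status=ok
    if [ "$scheme" != "ftp" ]; then status=suspicious; fi
    if [ "$port" = "8080" ]; then status=suspicious; fi
    echo "scheme=${scheme:-none} port=${port:-none} status=$status"
}

check_ftp_proxy "http://12.16.105.41:8080"
check_ftp_proxy "ftp://proxy.example.org:21"
```

The first call uses the value from the original message; the second uses
a placeholder host. Which scheme and port are actually right still
depends on your site.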

In R, set

> options(internet.info=0)

and try

> readLines("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info")

again. Now what is the output? If it looks like you saw earlier, e.g.,

Error in file(con, "r") : unable to open connection
In addition: Warning messages:
1: using FTP proxy ' http_proxy=http://12.16.105.41:8080' in: file(con, "r")
2: RxmlNanoFTPGetMore : read 0 [0 - 0] in: file(con, "r")
3: failed to get response from server in: file(con, "r")

then likely it means that your ftp_proxy is incorrect.
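
One thing that may be worth checking, and this is only a guess from the
text of the first warning: the proxy is reported as
' http_proxy=http://12.16.105.41:8080', with a leading space and a
variable name inside the value. That would be consistent with the
exported value itself being mangled. A minimal sketch for checking this
from the shell, assuming a POSIX shell; proxy_value_is_clean is a
hypothetical name, not an existing tool:

```shell
# Hypothetical helper: succeed only if the value looks like scheme://host...
# with no whitespace and no embedded "name=" text.
proxy_value_is_clean() {
    case $1 in
        "")            return 1 ;;   # unset or empty
        *[[:space:]]*) return 1 ;;   # leading, trailing, or inner whitespace
        *=*)           return 1 ;;   # e.g. a variable name pasted into the value
        *://*)         return 0 ;;
        *)             return 1 ;;
    esac
}

# Print the current value in brackets so stray spaces are visible.
printf 'ftp_proxy=[%s]\n' "${ftp_proxy:-}"
if proxy_value_is_clean "${ftp_proxy:-}"; then
    echo "ftp_proxy looks well-formed"
else
    echo "ftp_proxy is empty or malformed"
fi
```

The '=' test is deliberately strict and would also reject a URL with a
query string, which a proxy setting should not normally have.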

This is really a bit of a guess on my part.

Please also arrange to send your email as plain text, as the 'helpful'
formatting (e.g., parsing URLs) makes the message difficult to read.

Hope that helps, and please let us know how it goes.

Martin


Pedro López-Romero <plopez at cnic.es> writes:

>  
>
> Dear All,
>
>  
>
> This question could probably have been posted to another list, but since it
> affects AnnBuilder, I guessed that people used to working with the AnnBuilder
> package could help me.
>
>  
>
> Basically, I am having problems with ftp connections, so I cannot use
> AnnBuilder.
>
> I have configured my computer following several instructions posted on the
> BioC mailing list and in the various R help pages, but I still have
> problems with some AnnBuilder functions, which I describe next.
>
>  
>
> First I will give some details of my computer and R configuration. 
>
>  
>
> I am working with SUSE 10.1
>
>  
>
> My R session Info is: 
>
>  
>
>> sessionInfo()
>
> R version 2.5.1 (2007-06-27)
>
> i686-pc-linux-gnu
>
>  
>
> attached base packages:
>
> [1] "tools"     "stats"     "graphics"  "grDevices" "utils"     "datasets"
>
> [7] "methods"   "base"
>
>  
>
> other attached packages:
>
> AnnBuilder   annotate        XML    Biobase
>
>   "1.14.0"   "1.14.1"  "0.99-93"   "1.14.1"
>
>  
>
>  
>
> I set the following environment variables, as described in ?download.file
> and
>
> http://article.gmane.org/gmane.science.biology.informatics.conductor/647/match=proxy+settings
>
>  
>
> unset no_proxy
>
>  
>
> export http_proxy=http://12.16.105.41:8080
>
> export ftp_proxy=http://12.16.105.41:8080
>
> export https_proxy=http://12.16.105.41:8080
>
>  
>
> export ftp_proxy_user="anonymous"
>
> export ftp_proxy_password="plopez at cnic.es"
>
>  
>
>  
>
>  
>
> I have also set the following R options (after reading ?download.file) 
>
>  
>
> options(timeout=86400)
>
> options(download.file.method="wget")
>
> options(internet.info=0)
>
>  
>
>  
>
> Doing all this, I cannot get an ftp connection from R. However, when I use ftp
> from a shell window, I have no problems at all with ftp sites.
>
>  
>
> I will describe the problem in detail below (related to the use of
> AnnBuilder)
>
>  
>
> FIRST, the function AnnBuilder:::loadFromUrl gave me an error message, because
> it calls download.file(..., method="internal"). However, download.file(...)
> worked with method="wget", so I changed the code of loadFromUrl to make
> download.file(...) use method="wget".
>
>  
>
> Here is what I changed in loadFromUrl:
>
>  
>
>  options(show.error.messages = FALSE)
>
>     if (.Platform$OS.type == "unix") {
>
>         tryMe <- try(download.file(srcUrl, fileName, method = "wget",
>
>             quiet = TRUE))
>
>  
>
> DOING THIS, I CAN DOWNLOAD ANY FILE USING loadFromUrl(.), as I show here
> below:
>
>  
>
>> myDir= tempdir( )
>
>> loadFromUrl("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info",destDir =myDir,verbose=T)
>
>  
>
> loading from URL: ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info
>
> [1] "/tmp/RtmpgGc8B2/file37f57062Hs.info"
>
>  
>
>  
>
> So it seems that the proxy problem (if this is the problem) is solved and
> I can access ftp sites.
>
>  
>
> However, after solving this, I got other error messages at other points in
> the execution of the ABPkgBuilder function.
>
>  
>
> Here is the whole code that I am using and the error message:
>
>  
>
>> library(AnnBuilder)
>
>> 
>
>> myDir=tempdir()
>
>> fromWeb=TRUE
>
>> 
>
>> Mapfile="HgbAG.txt"             
>
>> myBase=file.path(Mapfile)      
>
>> read.table(myBase,sep="\t",header=FALSE,as.is=TRUE)
>
>             V1         V2
>
> 1  A_24_P66027  NM_004900
>
> 2  A_24_P66028   AA085955
>
> 3  A_24_P66029  NM_014616
>
> 4  A_24_P66030   AK092846
>
> 5  A_24_P66031  NM_001539
>
> 6  A_24_P66032 THC2450799
>
> 7  A_24_P66033  NM_006709
>
> 8  A_24_P66034  NM_000978
>
> 9  A_24_P66035     T12590
>
> 10 A_24_P66037  NM_001017
>
> 11 A_24_P66038   AK021474
>
> 12 A_24_P66039  NM_198527
>
> 13 A_24_P66040  NM_000311
>
> 14 A_24_P66041   AK091028
>
> 15 A_24_P66042   AK057596
>
> 16 A_24_P66044   AY358648
>
> 17 A_24_P66045   AK026647
>
> 18 A_24_P66046  NM_032445
>
> 19 A_24_P66047  NM_004886
>
>> 
>
>> 
>
>> myBaseType="gbNRef"           # RefSeq & Genbank
>
>> 
>
>> 
>
>> 
>
>> myChip="htest"         
>
>> myOrg="Homo sapiens"
>
>> myVersion="0.0.1"
>
>> 
>
>> 
>
>> ABPkgBuilder(
>
> +         baseName=myBase,
>
> +         baseMapType=myBaseType,
>
> +         pkgName=myChip,
>
> +         pkgPath=myDir,
>
> +         organism=myOrg,
>
> +         version=myVersion,
>
> +         otherSrc=NULL,
>
> +         author=list(authors ="P. Lopez-Romero",maintainer="plopez at cnic.es"),
>
> +         fromWeb=TRUE)
>
>  
>
> Attaching package: 'GO'
>
>  
>
>  
>
>         The following object(s) are masked from package:AnnBuilder :
>
>  
>
>          GO
>
>  
>
>  
>
>  
>
> Error in readURL(infoUrl) : Can't read from url:
> ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info
>
>  
>
>  
>
>  
>
>  
>
> This new error in readURL is really caused by the readLines(...) function,
> as a consequence of the function file(con, "r") (both functions belong to
> the base package). If I execute:
>
>  
>
>> readLines("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info")
>
>  
>
> I get the following error: 
>
>  
>
> Error in file(con, "r") : unable to open connection In addition: Warning
> messages:
>
> 1: using FTP proxy ' http_proxy=http://12.16.105.41:8080' in: file(con, "r")
>
> 2: RxmlNanoFTPGetMore : read 0 [0 - 0] in: file(con, "r")
>
> 3: failed to get response from server in: file(con, "r")
>
>  
>
> But if instead of using readLines(.) directly I use the following code, I
> can read the file: 
>
>  
>
>> tmp=tempdir()
>
>> tmp2=loadFromUrl("ftp://ftp.ncbi.nih.gov/repository/UniGene/Homo_sapiens/Hs.info",destDir=tmp)
>
>  
>
>> readLines(tmp2)
>
>  
>
>  [1] "UniGene Build #204  Homo sapiens"                   
>
>  [2] ""                                                   
>
>  [3] "Sequences Included in UniGene"                      
>
>  [4] "============================="                      
>
>  [5] ""                                                   
>
>  [6] "Known genes are from GenBank 10 Jul 2007"           
>
>  [7] "ESTs are from dbEST through 10 Jul 2007"
>
>  
>
>  
>
> So I DECIDED TO MODIFY the code of readURL  in /AnnBuilder/R/getSrcBuilt.R
> as I show below: 
>
>  
>
>> AnnBuilder:::readURL
>
> function (url)
>
> {
>
>     con <- url(url)
>
>     options(show.error.messages = FALSE)
>
>     tmp <- tempdir()
>
>     tmp2 <- loadFromUrl(con, destDir = tmp)
>
>     temp <- try(readLines(tmp2))
>
>     close(con)
>
>     options(show.error.messages = TRUE)
>
>     if (!inherits(temp, "try-error")) {
>
>         return(temp)
>
>     }
>
>     else {
>
>         stop(paste("Can't read from url:", url))
>
>     }
>
> }
>
> <environment: namespace:AnnBuilder>
>
>  
>
>  
>
>  
>
> THEN, readURL(.) WORKS, BUT NOT THE OTHERS THAT USE the readLines, as for
> example parseKEGGGenome(url = kegggenomeUrl)
>
>  
>
>  
>
>  
>
> The only solution that I have figured out is to modify the code in all the
> AnnBuilder functions that make use of readLines, but this can be a
> cumbersome task, especially when I have to update the package.
>
>  
>
> So far, I have tried as much as I could, but the problem is still there and
> I do not know whether it lies in an R option, the SUSE configuration, or the
> proxy configuration. What puzzles me most is that the loadFromUrl(...)
> function works (I had to modify the code a bit, though) but readLines(...)
> does not.
>
>  
>
> I would very much appreciate any help, since I am completely stuck at this
> point and I do not know what else I can try.
>
>  
>
>  
>
> Thanks a lot.
>
>  
>
> Pedro 
>
>  
>
>  
>
>  
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org
