[R] Antw: [R-sig-Geo] RCurl: using ls instead of NLST

Matteo Mattiuzzi matteo.mattiuzzi at boku.ac.at
Wed May 29 19:44:37 CEST 2013


Dear Jonathan,

In the MODIS package I use the following function to list the files within an HTTP or FTP folder. I'm not very experienced with XML, but I got it working somehow.
The LP DAAC has changed from FTP to HTTP, so I'm not sure it is a good idea to use the FTP protocol anymore. I also had a lot of problems with the ftplistonly=TRUE option, so I decided to use the safer setting, FALSE, and split the listing myself.
Matteo


MODIS:::filesUrl


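filesUrl <- function(url)
{
    require(RCurl)

    # Make sure the URL ends with a slash so it is treated as a directory
    if (substr(url,nchar(url),nchar(url))!="/")
    {
       url <- paste0(url,"/")
    }

    # Silence warnings during the call and restore the old setting on exit
    iw   <- options()$warn
    options(warn=-1)
    on.exit(options(warn=iw))

    # Fetch the listing; return FALSE if the request fails.
    # (Testing the try() result avoids picking up an unrelated global 'co'.)
    co <- try(getURLContent(url),silent=TRUE)

    if (inherits(co,"try-error")) {return(FALSE)}

    if (substring(url,1,4)=="http")
    {
        if(!require(XML))
        {
            stop("You need to install the 'XML' package from 'Omegahat' repository")
        }

        # Parse the HTML index page and collect the href of every link
        co     <- htmlTreeParse(co)
        co     <- co$children[[1]][[2]][[2]]
        co     <- sapply(co$children, function(el) xmlGetAttr(el, "href"))
        co     <- as.character(unlist(co))
        # Drop the column-sorting links of the index page
        co     <- co[!co %in% c("?C=N;O=D", "?C=M;O=A", "?C=S;O=A", "?C=D;O=A")]
        # The first remaining link is dropped as well (assumed to be the parent directory)
        fnames <- co[-1]

     } else
     {
        # The line ending depends on the server, not on the local platform
        co <- strsplit(co, "\r?\n")[[1]]

        # Drop the "total ..." summary line, if present
        co <- co[!grepl("^total",co)]

        # The file name is the last space-separated field of each line
        co <- strsplit(co," ")
        fnames <- basename(sapply(co,function(x){x[length(x)]}))
     }
     # Remove slashes left over from directory entries
     fnames <- gsub(fnames,pattern="/",replacement="")

    return(fnames)
}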
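
As a minimal sketch of what the FTP branch above does, the following parses a long-format directory listing without any network access. The helper name parseFtpListing and the sample listing are made up for illustration; they are not part of the MODIS package or of RCurl, and a real server may format its listing differently.

```r
# Hypothetical helper (not part of MODIS or RCurl):
# extract file names from a long-format FTP directory listing.
parseFtpListing <- function(listing)
{
    # Split into lines; servers may send "\r\n" or "\n"
    lines <- strsplit(listing, "\r?\n")[[1]]
    # Drop empty lines and the optional "total ..." summary line
    lines <- lines[nzchar(lines) & !grepl("^total", lines)]
    # The file name is taken as the last space-separated field of each line
    fields <- strsplit(lines, " +")
    fnames <- vapply(fields, function(x) x[length(x)], character(1))
    # Remove the trailing slash some servers append to directory entries
    gsub("/", "", fnames, fixed = TRUE)
}

# Made-up sample listing, for illustration only
listing <- paste("total 8",
                 "drwxr-xr-x 2 ftp ftp 4096 Jan 01 2012 2001.01.01",
                 "drwxr-xr-x 2 ftp ftp 4096 Jan 01 2012 2002.01.01/",
                 sep = "\r\n")
fnames <- parseFtpListing(listing)
print(fnames)
```

Note that taking the last field truncates names that contain spaces, so this only suits folders whose file names have none.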





>>> Jonathan Greenberg  29.05.13 18:25 >>>
R-helpers:

I'm trying to retrieve the contents of a directory from an ftp site
(ideally, the file/folder names as a character vector):
"ftp://e4ftl01.cr.usgs.gov/MOTA/MCD12C1.005/"
# (MODIS data)

Where I get the following error via RCurl:
require("RCurl")
url <- "ftp://e4ftl01.cr.usgs.gov/MOTA/MCD12C1.005/"
filenames = getURL(url,ftp.use.epsv=FALSE,ftplistonly=TRUE)
> Error in function (type, msg, asError = TRUE)  : RETR response: 550

Through some sleuthing, it turns out the ftp site does not support NLST
(which RCurl is using), but will use "ls" to list the directory contents --
is there any way to use "ls" remotely on this site?  Thanks!

--j

-- 
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007

_______________________________________________
R-sig-Geo mailing list
R-sig-Geo at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-geo


