[R] Help for loop

jim holtman jholtman at gmail.com
Sat Dec 18 02:54:18 CET 2010


Not sure about getting the file names, but you are 'extending' the
data structure on each iteration, which is inefficient because every
rbind copies everything accumulated so far; try 'lapply' with a single
rbind at the end instead:

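small.data <- do.call(rbind, lapply(mysites, function(.file){
    # return NULL on failure (rbind ignores NULL); a bare try() would
    # return a "try-error" object that rbind cannot combine
    tryCatch(read.table(.file, sep=";", header=TRUE, as.is=TRUE,
        fileEncoding="windows-1252"), error=function(e) NULL)
}))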
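
To try the same lapply-then-rbind idea without hitting the web server, here
is a minimal sketch that uses local temporary files as stand-ins for the web
addresses. The helper name read_one, the file names and the column names are
made up for illustration; they are not from the original post. It also keeps
track of which inputs failed, which may be handy with 23 thousand addresses.

```r
# Hypothetical helper: read one ";"-separated file and return NULL if the
# file is missing, empty, or otherwise unreadable.
read_one <- function(path) {
    tryCatch(
        suppressWarnings(
            read.table(path, sep = ";", header = TRUE, as.is = TRUE,
                       fileEncoding = "windows-1252")
        ),
        error = function(e) NULL
    )
}

# Stand-ins for the remote CSV files: two good files, one empty file,
# and one path that is never created.
dir <- tempfile("sites")
dir.create(dir)
good1   <- file.path(dir, "a.csv")
good2   <- file.path(dir, "b.csv")
empty   <- file.path(dir, "empty.csv")
missing <- file.path(dir, "missing.csv")
writeLines(c("id;valor", "1;10", "2;20"), good1)
writeLines(c("id;valor", "3;30"), good2)
file.create(empty)

paths <- c(good1, empty, good2, missing)

# Read everything first, then bind once at the end.
pieces <- lapply(paths, read_one)
ok     <- !vapply(pieces, is.null, logical(1))

small.data <- do.call(rbind, pieces[ok])  # one rbind over the good pieces
failed     <- paths[!ok]                  # inputs that could not be read
```

With the real data, paths would simply be the mysites vector, since
read.table accepts URLs as well as file names.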



On Fri, Dec 17, 2010 at 10:15 AM, Daniel <dmsilv at gmail.com> wrote:
> Hello all,
> Is there any way to get each file from a list of web addresses and
> aggregate them in a data frame?
> Otherwise I have to type 23 thousand web addresses into a long script like this:
>
> base1 <- read.table("site 1", sep=";", header=T,
> fileEncoding="windows-1252")
> base2 <- read.table("site 2", sep=";", header=T,
> fileEncoding="windows-1252")
>
> I need to download each .CSV file from each address in the list vector and
> row bind them all into a big data frame.
> Also I need to decode each object to UTF-8. Of course, many of the web sites
> in the list may be empty, so my loop needs to skip to the next
> address.
>
> My first attempt seems to work, but after one night and half a day
> it still had not finished. That is, it takes too much time for the task.
> Can somebody help me?
>
> Example, few address:
>
> mysites <- c(
> "http://spce2010.tse.gov.br/spceweb.consulta.receitasdespesas2010/exportaReceitaCsvCandidato.action?sqCandidato=40000000613&sgUe=AM&cpfCnpjDoador=",
> "http://spce2010.tse.gov.br/spceweb.consulta.receitasdespesas2010/exportaReceitaCsvCandidato.action?sqCandidato=40000000620&sgUe=AM&cpfCnpjDoador=",
> "http://spce2010.tse.gov.br/spceweb.consulta.receitasdespesas2010/exportaReceitaCsvCandidato.action?sqCandidato=40000000259&sgUe=AM&cpfCnpjDoador=",
> "http://spce2010.tse.gov.br/spceweb.consulta.receitasdespesas2010/exportaReceitaCsvCandidato.action?sqCandidato=250000002241&sgUe=SP&cpfCnpjDoador=",
> "http://spce2010.tse.gov.br/spceweb.consulta.receitasdespesas2010/exportaReceitaCsvCandidato.action?sqCandidato=250000002438&sgUe=SP&cpfCnpjDoador=",
> "http://spce2010.tse.gov.br/spceweb.consulta.receitasdespesas2010/exportaReceitaCsvCandidato.action?sqCandidato=40000000257&sgUe=AM&cpfCnpjDoador=",
> "http://spce2010.tse.gov.br/spceweb.consulta.receitasdespesas2010/exportaReceitaCsvCandidato.action?sqCandidato=120000000162&sgUe=MS&cpfCnpjDoador="
> )
>
> big.data <- NULL
> for (i in mysites) {
>   base <- NULL  # reset so a failed read does not re-append the previous file
>   try(base <- read.table(i, sep=";", header=T, as.is=T,
>       fileEncoding="windows-1252"), TRUE)
>   if(!is.null(base)) big.data <- rbind(big.data, base)
> }
>
> --
> Daniel Marcelino
> Skype: dmsilv
> http://marcelino.pbworks.com/
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


