[Rd] Objects created by more than one data call?

William Dunlap wdunlap at tibco.com
Wed May 22 23:12:32 CEST 2013


I used svn to copy the current version of Ecdat from Rforge to my PC
  C:\temp\packages>svn checkout svn://r-forge.r-project.org/svnroot/ecdat/
then fired up R to look at the rda files in it.

> setwd("c:/temp/packages/Ecdat/ecdat/pkg/data")
> read.dcf("../DESCRIPTION")[, c("Package","Version")]
Package Version
"Ecdat" "0.2-3"
> dir.rda <- function(rdaFile) { e <- new.env() ; load(rdaFile, envir=e) ; objects(e, all=TRUE)}
> dir.rda("VietNamH.rda")
[1] "MedExp"
> rdas <- dir(pattern="\\.rda$")
> names(rdas) <- rdas
> z <- lapply(rdas, dir.rda)
> tab <- table(unlist(z))
> tab[tab>1]

Hstarts  MedExp
      3       2
> z[sapply(z, function(zi)"Hstarts" %in% zi)]
$Hstarts.rda
[1] "Hstarts"

$Intratesm.rda
[1] "Hstarts"

$Intratesq.rda
[1] "Hstarts"

> z[sapply(z, function(zi)"MedExp" %in% zi)]
$MedExp.rda
[1] "MedExp"

$VietNamH.rda
[1] "MedExp"

It looks some files don't contain what their names suggest:
> dir.rda("VietNamH.rda")
[1] "MedExp"

The two versions of MedExp are quite different:
> load("VietNamH.rda", envViet <- new.env(parent=emptyenv()))
> load("MedExp.rda", envMed <- new.env(parent=emptyenv()))
> objects(envViet)
[1] "MedExp"
> objects(envMed)
[1] "MedExp"
> all.equal(envViet$MedExp, envMed$MedExp)
 [1] "Names: 11 string mismatches"                                 
 [2] "Length mismatch: comparison on first 11 components"          
 [3] "Component 1: 'current' is not a factor"                      
 ...
[18] "Component 10: Numeric: lengths (5999, 5574) differ"          
[19] "Component 11: 'current' is not a factor"                     

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -----Original Message-----
> From: Spencer Graves [mailto:spencer.graves at prodsyse.com]
> Sent: Wednesday, May 22, 2013 1:27 PM
> To: William Dunlap
> Cc: r-devel at r-project.org
> Subject: Re: [Rd] Objects created by more than one data call?
> 
> On 5/21/2013 3:03 PM, William Dunlap wrote:
> > If you look at
> >     data(package="Ecat")$results[,"Item"]
> > you will see the items "Hstarts", "Hstarts (Intratesm)", and "Hstarts (Intratesq)"
> > which I think means that the dataset Hstarts is found in 3 .rda files, "Hstarts.rda",
> > "Intratesq.rda", and "Intratesm.rda".  There are duplicate, modulo (filename),
> > items for "MedExp" as well.
> 
> 
>        Thanks for this.  I may get me closer, but I still don't see it:
> (data(Intratesm)) imports only the object Intratesm, etc.  For more
> details, see below.
> 
> 
>        Any other suggestions?
> 
> 
>        Thanks again,
>        Spencer
> 
> 
>  > Ecdat.data <- data(package="Ecdat")$results
>  > (Hstarts2 <- grep('Hstarts', Ecdat.data[, 'Item']))
> [1] 47 48 49
>  > (MedExp2 <- grep('MedExp', Ecdat.data[, 'Item']))
> [1] 67 68
>  > Ecdat.data[Hstarts2, ]
>       Package LibPath Item
> [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts"
> [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesm)"
> [3,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "Hstarts (Intratesq)"
>       Title
> [1,] "Housing Starts"
> [2,] "Housing Starts"
> [3,] "Housing Starts"
>  > Ecdat.data[MedExp2,]
>       Package LibPath Item
> [1,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp"
> [2,] "Ecdat" "C:/Users/sgraves/pgms/R/R-3.0.0/library" "MedExp (VietNamH)"
>       Title
> [1,] "Structure of Demand for Medical Care"
> [2,] "Structure of Demand for Medical Care"
>  > library(Ecdat)
>  > (data(Intratesm))
> [1] "Intratesm"
>  > (data(Intratesq))
> [1] "Intratesq"
> 
> 
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >
> >> -----Original Message-----
> >> From: r-devel-bounces at r-project.org [mailto:r-devel-bounces at r-project.org] On
> Behalf
> >> Of Spencer Graves
> >> Sent: Tuesday, May 21, 2013 12:21 PM
> >> To: Prof Brian Ripley
> >> Cc: r-devel at r-project.org
> >> Subject: Re: [Rd] Objects created by more than one data call?
> >>
> >> On 5/21/2013 9:03 AM, Prof Brian Ripley wrote:
> >>> On 21/05/2013 16:51, Spencer Graves wrote:
> >>>> On 5/21/2013 7:47 AM, Prof Brian Ripley wrote:
> >>>>> On 21/05/2013 15:28, Spencer Graves wrote:
> >>>>>> On 5/20/2013 10:10 PM, Prof Brian Ripley wrote:
> >>>>>>> On 21/05/2013 00:12, Spencer Graves wrote:
> >>>>>>>> Hello, All:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>         If I use LazyData with the Ecdat package on R-Forge, "R CMD
> >>>>>>>> check" reports "no visible binding for global variable
> >>>>>>>> 'nonEnglishNames'", where 'nonEnglishNames' is a dataset in Ecdat
> >>>>>>>> used
> >>>>>>>> as the default argument for a function.  With LazyData, that NOTE
> >>>>>>>> disappears.  However, then I get, "Warning: objects 'Hstarts',
> >>>>>>>> 'Hstarts', 'MedExp' are created by more than one data call".
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>         What do you suggest I do to fix this problem?
> >>>>>>> Not create the objects in more than one data() call.
> >>>>>>>
> >>>>>>> Check what each of your data() calls produces.
> >>>>>>
> >>>>>>         Thanks.  How do I do that?
> >>>>> Call data() on each in turn, and see what files get added to an empty
> >>>>> workspace.
> >>>>
> >>>> Like the following?
> >>> You missed the 'empty'.  Look at tools:::data2LazyLoadDB to see how
> >>> this is checked.
> >>
> >>         Thanks for the suggestion.  Unfortunately, I tried that function,
> >> including stepping through it line by line, fixing references to other
> >> functions not exported from tools, without enlightenment;  see below.
> >>
> >>
> >>         Thanks again,
> >>         Spencer
> >>
> >>
> >>   > lib.loc = NULL
> >>   > package='Ecdat'
> >>   > pkgpath <- find.package(package, lib.loc, quiet = TRUE)
> >>   > pkgpath
> >> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat"
> >>   > dataDir <- file.path(pkgpath, "data")
> >>   > dataDir
> >> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data"
> >>   > enc <- tools:::.read_description(file.path(pkgpath,
> >> "DESCRIPTION"))["Encoding"]
> >>   > enc
> >> <NA>
> >>     NA
> >>   > if (!is.na(enc)) {
> >> +         op <- options(encoding = enc)
> >> +         on.exit(options(encoding = op[[1L]]))
> >> +     }
> >>   > file_test("-d", dataDir)
> >> [1] TRUE
> >>   > file.path(dataDir, "Rdata.rds")
> >> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data/Rdata.rds"
> >>   > (file.exists(file.path(dataDir, "Rdata.rds")) &&
> >> file.exists(file.path(dataDir,
> >> +             paste(package, "rdx", sep = "."))) &&
> >> file.exists(file.path(dataDir,
> >> +             paste(package, "rdb", sep = "."))))
> >> [1] FALSE
> >>   > file.exists(file.path(dataDir,
> >> +             paste(package, "rdx", sep = ".")))
> >> [1] FALSE
> >>   > file.path(dataDir,
> >> +             paste(package, "rdx", sep = "."))
> >> [1] "C:/Users/sgraves/pgms/R/R-3.0.0/library/Ecdat/data/Ecdat.rdx"
> >>   > dataEnv <- new.env(hash = TRUE)
> >>   > tmpEnv <- new.env()
> >>   > f0 <- files <- list_files_with_type(dataDir, "data")
> >> Error: could not find function "list_files_with_type"
> >>   > f0 <- files <- tools:::list_files_with_type(dataDir, "data")
> >>   > files <- unique(basename(file_path_sans_ext(files,
> >> +                 TRUE)))
> >> Error in basename(file_path_sans_ext(files, TRUE)) :
> >>     could not find function "file_path_sans_ext"
> >>   > files <- unique(basename(tools:::file_path_sans_ext(files,
> >> +                 TRUE)))
> >>   >             dlist <- vector("list", length(files))
> >>   > files
> >> character(0)
> >>   >  names(dlist) <- files
> >>   >             loaded <- character(0L)
> >>   > loaded
> >> character(0)
> >>   > for (f in files) {
> >> +                 utils::data(list = f, package = package, lib.loc =
> >> lib.loc,
> >> +                   envir = dataEnv)
> >> +                 utils::data(list = f, package = package, lib.loc =
> >> lib.loc,
> >> +                   envir = tmpEnv)
> >> +                 tmp <- ls(envir = tmpEnv, all.names = TRUE)
> >> +                 rm(list = tmp, envir = tmpEnv)
> >> +                 dlist[[f]] <- tmp
> >> +                 loaded <- c(loaded, tmp)
> >> +             }
> >>   >             dup <- duplicated(loaded)
> >>   > dup
> >> logical(0)
> >>   > if (any(dup))
> >> +                 warning(sprintf(ngettext(sum(dup), "object %s is
> >> created by more than one data call",
> >> +                   "objects %s are created by more than one data call"),
> >> +                   paste(sQuote(loaded[dup]), collapse = ", ")),
> >> +                   call. = FALSE, domain = NA)
> >>   >             if (length(loaded)) {
> >> +                 dbbase <- file.path(dataDir, "Rdata")
> >> +                 makeLazyLoadDB(dataEnv, dbbase, compress = compress)
> >> +                 saveRDS(dlist, file.path(dataDir, "Rdata.rds"),
> >> +                   compress = compress)
> >> +                 unlink(f0)
> >> +                 if (file.exists(file.path(dataDir, "filelist")))
> >> +                   unlink(file.path(dataDir, c("filelist", "Rdata.zip")))
> >> +             }
> >>   >
> >>
> >>>>
> >>>>   > library(Ecdat)
> >>>>   > objects()
> >>>> character(0)
> >>>>   > (data(Hstarts))
> >>>> [1] "Hstarts"
> >>>>   > (data(MedExp))
> >>>> [1] "MedExp"
> >>>>   > objects()
> >>>> [1] "Hstarts" "MedExp"
> >>>>   > sessionInfo()
> >>>> R version 3.0.0 (2013-04-03)
> >>>> Platform: i386-w64-mingw32/i386 (32-bit)
> >>>>
> >>>> locale:
> >>>> [1] LC_COLLATE=English_United States.1252
> >>>> [2] LC_CTYPE=English_United States.1252
> >>>> [3] LC_MONETARY=English_United States.1252
> >>>> [4] LC_NUMERIC=C
> >>>> [5] LC_TIME=English_United States.1252
> >>>>
> >>>> attached base packages:
> >>>> [1] stats     graphics  grDevices utils     datasets  methods base
> >>>>
> >>>> other attached packages:
> >>>> [1] Ecdat_0.2-3
> >>>>
> >>>> loaded via a namespace (and not attached):
> >>>> [1] tools_3.0.0
> >>>>
> >>>>
> >>>>         Thanks,
> >>>>         Spencer
> >>>>
> >>>>>>         In the "man" directory, I just did "grep 'data(MedExp' *.Rd",
> >>>>>> which identified only "MedExp.Rd:\usage{data(MedExp)}"; "grep
> >>>>>> 'data(Hstarts *.Rd" similarly returned only
> >>>>>> "Hstarts.Rd:\usage(data(Hstarts)}".
> >>>>>>
> >>>>>>
> >>>>>>         Thanks again for the reply.
> >>>>>>         Spencer
> >>>>>>>>         Thanks,
> >>>>>>>>         Spencer Graves
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>   > sessionInfo()
> >>>>>>>> R version 3.0.0 (2013-04-03)
> >>>>>>>> Platform: i386-w64-mingw32/i386 (32-bit)
> >>>>>>>>
> >>>>>>>> locale:
> >>>>>>>> [1] LC_COLLATE=English_United States.1252
> >>>>>>>> [2] LC_CTYPE=English_United States.1252
> >>>>>>>> [3] LC_MONETARY=English_United States.1252
> >>>>>>>> [4] LC_NUMERIC=C
> >>>>>>>> [5] LC_TIME=English_United States.1252
> >>>>>>>>
> >>>>>>>> attached base packages:
> >>>>>>>> [1] stats     graphics  grDevices utils     datasets methods base
> >>>>>>>>
> >>>>>>>> other attached packages:
> >>>>>>>> [1] Ecdat_0.2-3
> >>>>>>>>
> >>>>>>>> loaded via a namespace (and not attached):
> >>>>>>>> [1] tools_3.0.0
> >>>>>>>>
> >>>>>>>> ______________________________________________
> >>>>>>>> R-devel at r-project.org mailing list
> >>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel



More information about the R-devel mailing list