[BioC] placement of DTD files in a package?

Douglas Bates bates@stat.wisc.edu
20 Mar 2002 17:55:08 -0600


Robert Gentleman <rgentlem@jimmy.harvard.edu> writes:

> On Wed, Mar 20, 2002 at 02:19:56PM -0800, Anthony Rossini wrote:
> > On Wed, 20 Mar 2002, Vincent Carey 525-2265 wrote:
> > 
> > > 
> > > > So, do DTD files get placed under data or in a completely separate location, for installation purposes?  (i.e. ../package/data, ../package/inst/xml,  or ../package/inst/dtd, or other??)
> > > >
> > > 
> > > i know of no convention on this.  we may not need one.
> > > package code that uses the DTD will have to be explicit
> > > about its location.  any of the choices you list may
> > > be appropriate depending on the visibility and separateness
> > > of resources desired by the package designer.
> > >
> > > does this lead to cacophony in package structure?
> > > i don't think so.
> > 
> > I think I agree with you.  I don't have strong feelings on the matter, other than if a standard workflow for determination exists, that I might as well use it.  The context is the DTD describing the XML format for a dataset.  I'm tempted to stick it in ../package/inst/dtd, but was wondering how others have dealt with it.  I sent the question here, since the number of package developers using R XML outside of this particular mailing list seems small.
> > 
>   Me either, somehow I think of it (at least a bit) as data so I like
>  package/inst/data
>   but almost anything is fine
>   we just need to ensure it gets copied over to the installation
>   directory so it can get found automatically.

I think eventually you would find that it is better to separate the
DTD from the data -- i.e. use Tony's original idea of a
../package/inst/dtd directory.

One reason for not mixing the DTD and the data is that the DTD
tends to be more permanent than the data.  You can add or modify
data sets, but the DTD, because it describes a data format, should
change much less often.

Also, once you have established and more or less finalized the DTD,
it is handy to make it available from an HTTP server so you can
begin the XML file with

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE foo SYSTEM "http://www.bioconductor.org/dtd/foo.dtd">
<foo>
 ...
</foo>

and a validating parser will have access to the DTD independently of
the file's location.  If you are going to create a collection of DTDs
under, say, www.bioconductor.org/dtd/, it would be handy to have the
DTDs within packages separately accessible and identifiable.
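
To make the point about the system identifier concrete, here is a
minimal sketch (in Python, using only the standard library's expat
bindings) of how a parser sees the DTD location declared in the
DOCTYPE line.  The helper name doctype_system_id is made up for
illustration, and expat itself does not validate or fetch the DTD; it
only reports the declared location, which a validating parser would
then retrieve.

```python
import xml.parsers.expat


def doctype_system_id(xml_bytes):
    """Return (root_name, system_id) from the DOCTYPE declaration,
    or None if the document has no DOCTYPE.

    Hypothetical helper for illustration only; expat reports the
    declaration but does not download or validate against the DTD.
    """
    found = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Called once when the parser reaches <!DOCTYPE ...>
        found.append((name, system_id))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_bytes, True)  # True marks the final chunk of input
    return found[0] if found else None


# The example document from above, with an empty <foo> element.
doc = b"""<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE foo SYSTEM "http://www.bioconductor.org/dtd/foo.dtd">
<foo>
</foo>
"""

print(doctype_system_id(doc))
```

Because the identifier is an absolute URL, the same value comes back
wherever the XML file happens to be stored, which is what lets a
validating parser find the DTD independently of the file's location.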