[Rd] Re: [Omega-devel] StatDataML Description element

Kevin Little klittle@iecodesign.com
Tue, 7 Mar 2000 14:22:09 -0600


  [from your draft]
  The description element
  <!ELEMENT description (title?, source?, date?, version?, comment?)>

  <!ELEMENT title (#PCDATA) >
  <!ELEMENT source (#PCDATA) >
  <!ELEMENT date (#PCDATA) >
  <!ELEMENT version (#PCDATA) >
  <!ELEMENT comment (#PCDATA) >

  The description element itself consists of five elements (title, source,
date, comment, version) which are simple strings and include no other
elements. It is used to provide meta-information about a dataset that is
typically not needed for computations on the data itself.
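
For illustration, here is a minimal sketch of how an application might read
such a description element, using Python's standard xml.etree.ElementTree
module. The sample instance and all of its field values are invented, and
read_description is a hypothetical helper name; none of this comes from the
StatDataML draft itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance of the quoted DTD; every field value is invented.
SAMPLE = """<description>
  <title>Example data set</title>
  <source>Example Agency</source>
  <date>2000-03-07</date>
  <version>1.0</version>
  <comment>Illustrative values only.</comment>
</description>"""

# The five optional child elements named in the quoted DTD.
FIELDS = ("title", "source", "date", "version", "comment")


def read_description(xml_text):
    """Return a dict mapping each field name to its text, or None if absent."""
    root = ET.fromstring(xml_text)
    return {name: root.findtext(name) for name in FIELDS}


info = read_description(SAMPLE)
print(info["source"])
```

Because every child is optional in the quoted content model, the sketch
returns None for any element that is missing rather than raising an error.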

One of the key issues I worry about is "data set provenance". I encourage my
clients to trace a data set back to one or more sources so that they can judge
its quality (how much reliance they can place on what they are about to see or
manipulate, etc.).

1. Easy links to an originator would be good. For example, is there any way to
use hypertext in any of the arguments for email or web site referrals?

2. What limits (if any) are there typically on the length of the strings in
the arguments of DESCRIPTION?

3. In S or R or Omega, a statistical dataset's description is always
available by command or menu item (I'm guessing).  Are the arguments
typically available now for any kind of search, filtering, or action by my
statistical application? For example, can I ask the application to only allow
me to work on datasets from a certain source? I guess this is possible and
would be addressed by the application developers, not the developers of the
XML standard, but maybe something is required of the XML standard to make it
possible to hook to applications.
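
As a rough sketch of the kind of application-side filtering asked about in
item 3, the following Python fragment keeps only those documents whose source
element matches a requested value. It assumes, purely for illustration, that
the description element sits somewhere inside a wrapper element called
dataset; that wrapper name, the sample values, and the helper names source_of
and from_source are all hypothetical and are not taken from the draft.

```python
import xml.etree.ElementTree as ET


def source_of(xml_text):
    """Return the stripped text of description/source, or None if missing."""
    root = ET.fromstring(xml_text)
    # iter() also visits the root, so a bare <description> document works too.
    desc = next(root.iter("description"), None)
    if desc is None:
        return None
    text = desc.findtext("source")
    return text.strip() if text else None


def from_source(documents, wanted):
    """Keep only the documents whose declared source equals `wanted`."""
    return [doc for doc in documents if source_of(doc) == wanted]


# Invented sample documents; <dataset> is an assumed wrapper element.
DOCS = [
    "<dataset><description><source>Agency A</source></description></dataset>",
    "<dataset><description><source>Agency B</source></description></dataset>",
    "<dataset><description><title>No source given</title></description></dataset>",
]

selected = from_source(DOCS, "Agency A")
print(len(selected))
```

Nothing here requires special support in the XML format beyond the source
element being present and consistently filled in; the filtering itself lives
entirely in the application.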

4. Is the list of arguments fixed at five? Or could one allow for multiple
comments (comment1, comment2, ...)?

[If one can search or otherwise operate on the string in the "comment" field,
then I guess you don't need to extend the list of arguments.  That leads to
another question:  how extensible or upgradeable is StatDataML envisioned to
be?  I can imagine a "Data Quality Stamp or Certification" being relevant
within certain communities, and it would be nice to have that in the
meta-data description, perhaps as a separate argument.]
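
To make the idea of searching the comment text concrete, here is a small
Python sketch. It assumes a relaxed content model in which comment may repeat
(comment* instead of the comment? in the quoted DTD), so the sample below
would not validate against the draft as quoted. The sample text and the
helper name comments_matching are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample; repeated <comment> elements assume a relaxed DTD (comment*).
SAMPLE = """<description>
  <title>Example data set</title>
  <comment>Collected by field staff.</comment>
  <comment>Quality stamp: reviewed.</comment>
</description>"""


def comments_matching(xml_text, keyword):
    """Return the text of every comment containing `keyword` (case-insensitive)."""
    root = ET.fromstring(xml_text)
    needle = keyword.lower()
    return [
        c.text
        for c in root.findall("comment")
        if c.text and needle in c.text.lower()
    ]


hits = comments_matching(SAMPLE, "quality")
print(hits)
```

A dedicated element for a quality stamp would make this kind of lookup more
reliable than free-text matching, which is the trade-off the question above
points at.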

Thanks again for your draft proposal on StatDataML; this is a potentially
very important contribution.

Regards
Kevin Little, Ph.D.
Informing Ecological Design, LLC
2213 West Lawn Avenue
Madison, WI  53711
tel 608.251.4355 fax 608.251.0399
email klittle@iecodesign.com


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._