Value units in XML Files:


 - Units for Quantities:

Four possible methods for handling the units of quantitative values in XML 
come to mind:

   1. Do not notate units in the XML, and allow only one unit to be used
	that is specified in separate documentation.
	Example:	altitude="30000"


   2. Allow (or require) units to be specified with each value.
 	Example:   <length> 1.2 miles </length>
		   <param altitude="30000 ft" 
		          length="23 km"
		          height="1509 meters" />
        (Hereafter the angle-bracket (<, >) parts are omitted for brevity.)

   3. Define separate tag and/or attribute names for each combination of quantity and
      possible unit:
	Example:    altitude_ft="30000"
		    altitude_meters="9144"
		    altitude_miles="5.68"
		    altitude_km="9.144"
		    length_miles="1.2"
		    length_ft="6336"
		    length_km="1.93"
		    length_meters="1931"
		    length_yards="2112"
		    length_inches=".."
		    length_cm=".."
		    ...

   4. Allow only one unit to be used, and specify it as part of the tag or attribute name.
	Example:	altitude_ft="30000"



Next, let's weigh these methods against some of the primary principles
of XML: a file should be self-contained, recording enough information for
both humans and computers to decode and process it consistently and
unambiguously, while still allowing much freedom of organization and
content.


Comparative Pros and Cons:

Method 1:

 Pro:
  Simpler to develop code for, since there is no dynamic unit detection or conversion.

 Con:
  This method has the disadvantage that units are implied, and can
  easily be misunderstood or misinterpreted, leading to serious but possibly
  undetected errors.  It is also more difficult for anyone to find out what the
  units are, since they would have to look elsewhere for documentation, which
  may be out of date, lost, or unavailable later or at other sites.
  This is basically the "old way" of recording computer data.


Method 2:

 Pro:
  This method is natural for humans to read, write, and interpret.
  It corresponds to how values are noted in general literature, dashboards,
  verbal communications, and scientific and technical documentation.
  The units are clear and contained within the XML, so they can be correctly
  interpreted without separate documentation.  Being self-contained, the
  units are clearly communicated with the values to anyone at any time, even 
  if the documentation is separated or lost.  The potential for 
  misinterpretation is minimized.

  A single keyword is used for each distinct quantity, and any of many units
  can be used with each quantity.  This permits a high degree
  of code re-use, is easy for computers to decode, and easy for programmers
  to write and maintain.  It is consistent with the modern object-oriented
  concept of polymorphism.  For example, the SI (Système International,
  the International System of Units) is built on 7 base units.  Therefore a
  small number of unit converters, one per kind of quantity, can convert
  units for hundreds or thousands of values.  Consider that a given file may
  have hundreds or thousands of distance or power measurements.
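
  As a rough illustration of the converter re-use described above, the
  following Python sketch parses "value unit" strings and converts them to
  meters.  The names LENGTH_TO_METERS and parse_length are hypothetical, the
  set of accepted unit spellings is an assumption, and only length is
  handled; a real system would need one such table per kind of quantity.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: factor from each accepted unit spelling to meters.
LENGTH_TO_METERS = {
    "m": 1.0, "meters": 1.0, "km": 1000.0, "cm": 0.01, "mm": 0.001,
    "ft": 0.3048, "feet": 0.3048, "miles": 1609.344,
    "yards": 0.9144, "inches": 0.0254,
}

def parse_length(text):
    """Convert a 'value unit' string such as '30000 ft' to meters."""
    parts = text.split()
    if len(parts) != 2:
        raise ValueError(f"expected '<value> <unit>', got {text!r}")
    value, unit = parts
    if unit not in LENGTH_TO_METERS:
        raise ValueError(f"unknown length unit {unit!r}")
    return float(value) * LENGTH_TO_METERS[unit]

# The same converter serves every length-like attribute in the element.
elem = ET.fromstring(
    '<param altitude="30000 ft" length="23 km" height="1509 meters" />'
)
lengths_m = {name: parse_length(value) for name, value in elem.attrib.items()}
print(lengths_m)
```

  Adding a new length quantity to the file requires no new code here; adding
  a new unit requires only one more table entry.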

  It also accommodates a great degree of flexibility in the units that
  can be accepted.  For example, one supplier of data may have all their data in
  English units, while another has all theirs in metric, and each might
  prefer to keep it that way.  It may be difficult or impractical to get
  multiple organizations to agree on units.  Already existing data is what
  it is, unless separately converted.

 Con:
  Requiring units may seem "wordy".  Allowing default units reintroduces the
  potential for miscommunication, as in method 1.
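
  One way to avoid the default-unit problem is for the reader to reject any
  value that arrives without a unit, rather than guessing.  The sketch below
  is only an illustration: the pattern VALUE_WITH_UNIT and the function
  split_value_unit are hypothetical names, and the assumed syntax (a number,
  whitespace, then an alphabetic unit) is stricter than some files may need.

```python
import re

# Assumed syntax: optional sign, decimal number with optional exponent,
# at least one space, then a purely alphabetic unit name.
VALUE_WITH_UNIT = re.compile(
    r"^\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\s+([A-Za-z]+)\s*$"
)

def split_value_unit(text):
    """Return (value, unit); raise ValueError if the unit is missing."""
    match = VALUE_WITH_UNIT.match(text)
    if match is None:
        raise ValueError(f"value must carry an explicit unit: {text!r}")
    return float(match.group(1)), match.group(2)

print(split_value_unit("30000 ft"))

try:
    split_value_unit("30000")  # no unit, so no default is assumed
except ValueError as err:
    print("rejected:", err)
```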


Method 3:

 Pro:
  This method minimizes the potential for miscommunication and
  records all units within the single self-contained file, just as
  method 2 above does.

 Con:
  It produces an explosion in the number of keyword tags needed,
  and/or limits the units that can be used.  It does not exploit re-use
  of conversion code.  Every quantity-unit combination has to be documented,
  coded, interpreted, and converted separately.

  For example, consider a system containing 100 length/distance quantities
  and 20 power quantities.  Length can be specified in 8 units:
  km, meters, cm, mm, miles, yards, feet, inches.  Power can be
  specified in 6 units: dB, watts, mW, uW, kW, MW.
  So a total of 920 keyword tags would be required (8 * 100 + 6 * 20).
  But method 2 would require only 120 quantity names and two unit converters,
  which would be simpler to specify in XML as well as to develop code for.
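
  The arithmetic in this example can be checked with a few lines of Python.
  The dictionary names are hypothetical; the counts and unit lists are the
  ones given in the example.

```python
# Number of distinct quantities of each kind in the example system.
quantities = {"length": 100, "power": 20}

# Units accepted for each kind of quantity.
units = {
    "length": ["km", "meters", "cm", "mm", "miles", "yards", "feet", "inches"],
    "power": ["dB", "watts", "mW", "uW", "kW", "MW"],
}

# Method 3: one keyword per (quantity, unit) combination.
method3_keywords = sum(quantities[kind] * len(units[kind]) for kind in quantities)

# Method 2: one keyword per quantity, plus one converter per kind.
method2_keywords = sum(quantities.values())
method2_converters = len(units)

print(method3_keywords, method2_keywords, method2_converters)  # 920 120 2
```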


Method 4:

 Pro:
  No confusion.  Unit is stated where used, similar to methods (2) and (3) above.

 Con:
  This method has the disadvantage that only one unit can be accepted for each
  quantity, so it lacks flexibility.  It forces data suppliers to convert their units,
  which may require processing database files or writing custom export conversion
  code for each measurement requiring conversion.
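
  To give a sense of the export burden, here is a minimal sketch of the kind
  of conversion code a supplier holding altitudes in meters would have to
  write to emit the single permitted attribute altitude_ft.  The function
  name export_altitude_ft, the <param> element, and the rounding to whole
  feet are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

FEET_PER_METER = 1.0 / 0.3048  # exact by definition of the international foot

def export_altitude_ft(altitude_m):
    """Serialize an altitude stored in meters as <param altitude_ft=... />."""
    elem = ET.Element("param")
    elem.set("altitude_ft", f"{altitude_m * FEET_PER_METER:.0f}")
    return ET.tostring(elem, encoding="unicode")

print(export_altitude_ft(9144.0))
```

  A similar function would be needed for every quantity whose stored unit
  differs from the one the format mandates.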



Summary:
 All methods have pros and cons.  If communicating data correctly, with
 greater flexibility and the easiest interpretation by the widest audience, is
 weighted more heavily than other concerns such as minimizing file size, then
 methods (2) and (3) appear better than methods (1) and (4).

 Of methods (2) and (3), method (2) has the same advantages as (3), while
 having fewer serious disadvantages.  Therefore, method (2) is generally recommended,
 as long as default units are discouraged.