The geospatial technology community has a
long history of trying to retrofit georeferencing information onto
existing imaging formats—sometimes successfully (notably the
GeoTIFF extensions to TIFF), but all too often in ways that lead
to user confusion, divergent implementations and ultimately
non-interoperability (see "The
Dangers of Non-Interoperability").
JPEG 2000 (often called JP2 when referring to the file format) is
the emerging standard for high-quality image compression in the
geospatial arena, but there already are signs that it too could
suffer the same fate as other imaging formats. However, within the
collaborative standards development processes of Open Geospatial
Consortium Inc. (OGC), several vendors are developing a standard
for using JPEG 2000 imagery in geographic information system (GIS)
workflows. Using the OGC Geography Markup Language (GML) standard
to describe coordinate reference systems and accurate geographic
positions, users can describe geographic features, handle multiple
images and benefit from additional advanced features of JPEG 2000.
JPEG 2000 Benefits
During the last few years, experts from a variety of imaging
fields have developed the JPEG 2000 standard (ISO/IEC 15444-1) as
a high-end alternative to the popular JPEG image format. Based on
the same wavelet compression technologies already used by the GIS
industry, such as LizardTech’s MrSID, JPEG 2000 offers
high-quality lossy and lossless image compression in a
multi-resolution (pyramidal) format that is internal to the file
structure. JPEG 2000 is highly scalable in several dimensions: It
supports file sizes into the gigabyte range and beyond,
multispectral and hyperspectral datasets with increased
bit-depths, and selective decompression of scenes within the image
at user-controllable qualities.
However, JPEG 2000 is designed for more than
traditional image compression. A JP2 file, containing an archival image
compressed only moderately, can have scenes extracted and further
compressed on the fly for use by remote or bandwidth-constrained
users—without the overhead of decoding and re-encoding. In a network
environment, JPEG 2000 images can be streamed from server to client
while still in compressed form, allowing viewers to access only the data
(pixels) they need and at only the resolution and quality they require.
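The resolution arithmetic behind this can be sketched briefly. The example below is a minimal illustration, not part of any JPEG 2000 library: it assumes an image whose origin is at (0, 0) and whose dimensions halve (rounding up) at each wavelet decomposition level, and the function names and sample sizes are hypothetical.

```python
def level_dimensions(width, height, levels):
    """Return (width, height) for each resolution reduction 0..levels.

    Each reduction halves both dimensions, rounding up; -(-n // d) is
    integer ceiling division.
    """
    return [(-(-width // 2**r), -(-height // 2**r)) for r in range(levels + 1)]


def pick_reduction(width, height, levels, view_w, view_h):
    """Pick the coarsest reduction whose image is still at least as large
    as the viewer's window, so no more data is requested than is needed."""
    best = 0
    for r, (w, h) in enumerate(level_dimensions(width, height, levels)):
        if w >= view_w and h >= view_h:
            best = r
    return best


# Hypothetical 10000 x 8000 scene stored with 5 decomposition levels,
# viewed in a 1024 x 768 window.
sizes = level_dimensions(10000, 8000, 5)
reduction = pick_reduction(10000, 8000, 5, 1024, 768)
```

In this sketch a client with a 1024 x 768 window would ask for reduction 3 (1250 x 1000 pixels) rather than the full-resolution scene.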
By design, JPEG 2000 offers no specific support
for any particular application domain, such as georeferencing metadata
for geospatial imaging. This means JPEG 2000 doesn’t specify mechanisms
for georeferencing the image, describing the sensor model used to
collect the data, or correlating features within the imagery to other
GIS datasets. For good reasons, the standards committee decided not to
support such domain-specific requirements. Instead, the committee left
room within the JP2 file format for “boxes” containing arbitrary XML
data that can refer to the image data within the file.
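To make the box idea concrete, the following is a minimal sketch of walking the top-level boxes of a JP2 byte stream and collecting any XML payloads. It assumes the standard box header layout (a 4-byte big-endian length, a 4-byte type, an optional 8-byte extended length) and the "xml " box type; it does not descend into superboxes, and the function names are illustrative only.

```python
import struct

XML_BOX = b"xml "  # box type used for XML boxes in a JP2 file


def iter_boxes(data):
    """Yield (box_type, payload) for each top-level box in a JP2 byte string."""
    pos = 0
    while pos + 8 <= len(data):
        length, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if length == 1:
            # A length of 1 means a 64-bit extended length follows the type.
            (length,) = struct.unpack(">Q", data[pos + 8:pos + 16])
            header = 16
        elif length == 0:
            # A length of 0 means the box runs to the end of the file.
            length = len(data) - pos
        if length < header or pos + length > len(data):
            raise ValueError("malformed box at offset %d" % pos)
        yield box_type, data[pos + header:pos + length]
        pos += length


def xml_payloads(data):
    """Return the decoded contents of every top-level XML box."""
    return [payload.decode("utf-8")
            for box_type, payload in iter_boxes(data)
            if box_type == XML_BOX]
```

A reader built this way can skip any box type it does not understand, which is what lets domain-specific XML travel inside the file without affecting ordinary decoders.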
Enter GML
The Geography Markup Language (GML) is an XML
grammar used to describe geographic data such as coordinate reference
systems and positioning, geographic features, sensor models, annotations
and styling. Like JPEG 2000, GML version 3.1 will be an ISO
standard (ISO 19136) in its own right.
Despite its richness, GML doesn’t provide for
“metadata” in the normal sense of the term, such as Federal Geographic
Data Committee (FGDC) metadata. GML is a language used to construct
definitions for features, geometries, etc.; it doesn’t define any
concrete features on its own. GML is used to construct XML application
schemas for use within a given application or system. GML does allow for
inclusion of other XML data formats and referencing of other entities,
possibly external to the file itself; for example, FGDC metadata is
explicitly allowed.
Consider the simple case of an aerial image with an associated
coordinate system and position information. One can easily envision the
coordinate system and position being represented in GML and stored
within the JP2 file using the allowed XML boxes (see Figure 1). On the
other hand, at least the positioning data could be represented with a
traditional 6-line world file (.wld, or perhaps .j2w). This wouldn’t
define the coordinate system, although other header file formats could
certainly be used or invented for use with JP2 files.
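For comparison, a world file holds just six numbers defining an affine transform from pixel (column, row) to map coordinates. The sketch below assumes the conventional line order (x scale, y-axis rotation term, x-axis rotation term, y scale, then the x and y map coordinates of the center of the upper-left pixel); the function names and sample values are hypothetical.

```python
def parse_world_file(text):
    """Parse the six numeric lines of a world file into (A, D, B, E, C, F)."""
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("a world file must contain exactly six numbers")
    return tuple(values)


def pixel_to_map(params, col, row):
    """Apply the affine transform to a pixel position."""
    a, d, b, e, c, f = params
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical half-meter imagery; note the negative y scale, because
# row numbers increase downward while northings increase upward.
sample = "0.5\n0.0\n0.0\n-0.5\n500000.25\n4200000.75\n"
params = parse_world_file(sample)
corner = pixel_to_map(params, 0, 0)      # (500000.25, 4200000.75)
inside = pixel_to_map(params, 10, 20)    # (500005.25, 4199990.75)
```

Nothing in these six numbers says which coordinate reference system the map coordinates belong to, which is the gap noted above.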
Why Use GML?
Expressing image positioning with a world file seems simple enough. So why should
users go all the way to using GML? Consider three more complex
situations, which show how the power of JPEG 2000 can be exploited using
GML.
Sensor Models
Original satellite and aerial imagery are now much more than just three
8-bit RGB bands, and JPEG 2000 supports many of the imagery requirements
users now see, such as bit-depths of 16 or higher and hyperspectral
bands. These images typically have rich sets of associated metadata. For
example:
• Full descriptions of the cameras may include sensor characteristics
such as the number and wavelengths of the spectral bands, precision and
calibration information, and type of sensor (pushbroom, etc.).
• Positioning information may include the usual coordinate system and
position of the image relative to Earth, but also may include camera
positioning information such as camera angle, date and time, orbit
track, etc.
• Image quality information might include cloud cover estimates, NIIRS
rating and air quality at time of collection.
Using GML, an application schema can be written that captures this
metadata structure; this could be defined and provided by the imagery
vendor (see Figure 2). Within their supplied JP2 imagery, GML instance
data would provide the actual metadata
content. The application schema doesn’t even have to be resident in the
file, but could be hosted at the vendor’s or satellite company’s Web
site via the standard XML linking and referencing mechanisms.
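As a rough illustration of what such instance data might look like, the sketch below builds and reads back a small metadata document with Python's standard XML library. Only the GML namespace URI is real; the vendor namespace, the element names (ImageMetadata, sensorType, bitDepth, band, cloudCoverPercent) and the sample values are all hypothetical stand-ins for whatever a vendor's application schema would actually define.

```python
import xml.etree.ElementTree as ET

GML = "http://www.opengis.net/gml"   # GML 3.1 namespace
EX = "http://example.com/sensor"     # hypothetical vendor namespace


def build_metadata(name, sensor_type, bit_depth, bands_nm, cloud_cover_pct):
    """Serialize sensor metadata as XML using hypothetical vendor elements."""
    ET.register_namespace("gml", GML)
    ET.register_namespace("ex", EX)
    root = ET.Element(f"{{{EX}}}ImageMetadata")
    ET.SubElement(root, f"{{{GML}}}name").text = name
    ET.SubElement(root, f"{{{EX}}}sensorType").text = sensor_type
    ET.SubElement(root, f"{{{EX}}}bitDepth").text = str(bit_depth)
    for band_name, wavelength in bands_nm:
        band = ET.SubElement(root, f"{{{EX}}}band", {"name": band_name})
        band.text = str(wavelength)   # center wavelength in nanometers
    ET.SubElement(root, f"{{{EX}}}cloudCoverPercent").text = str(cloud_cover_pct)
    return ET.tostring(root, encoding="unicode")


def read_bands(xml_text):
    """Return {band name: center wavelength} from a metadata document."""
    root = ET.fromstring(xml_text)
    return {band.get("name"): float(band.text)
            for band in root.findall(f"{{{EX}}}band")}


doc = build_metadata("scene-1", "pushbroom", 16,
                     [("red", 660.0), ("nir", 850.0)], 12.5)
bands = read_bands(doc)
```

A real application schema would also constrain element order and types, so a production workflow would validate instance documents against it rather than rely on ad hoc parsing.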
Multiple Images and Feature Identification
JPEG 2000 permits multiple images to be contained within the same file.
Consider a workflow in which images of a particular region are to be
captured and analyzed during a period of months:
• Multiple sets of stereopair imagery are to be stored.
• Each pair has associated metadata, including date and time stamps.
• The images all share a common coordinate system, but all are at
slightly different coordinate positions.
• Any individual image may have specific features identified on that
image.
All of this information may be contained in a single JP2 file. In
addition to the data corresponding to the individual images, the file
may contain “shared” GML data describing the coordinate system and
feature descriptions for all images and “private” GML data for each
image describing specific feature instances and offsets within the image
(see Figure 3).
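One way to picture the split between shared and private data is as a small data model. The sketch below is only an in-memory illustration under assumed names (ImageCollection, ImageEntry and their fields are hypothetical); it does not represent how the GML is actually laid out in a JP2 file.

```python
from dataclasses import dataclass, field


@dataclass
class ImageEntry:
    """Per-image ("private") data: capture time, position, feature instances."""
    captured: str                                  # ISO 8601 timestamp
    origin: tuple                                  # (x, y) in the shared CRS
    features: list = field(default_factory=list)   # (feature_type, col, row)


@dataclass
class ImageCollection:
    """File-level ("shared") data: one CRS and the feature types in use."""
    srs_name: str
    feature_types: set
    images: list = field(default_factory=list)

    def add_feature(self, index, feature_type, col, row):
        """Record a feature instance on one image, checked against the
        shared feature descriptions."""
        if feature_type not in self.feature_types:
            raise ValueError(f"unknown feature type: {feature_type}")
        self.images[index].features.append((feature_type, col, row))

    def features_of_type(self, feature_type):
        """List (image index, col, row) for every instance of a feature type."""
        return [(i, col, row)
                for i, image in enumerate(self.images)
                for ftype, col, row in image.features
                if ftype == feature_type]


# Hypothetical two-image series sharing one coordinate reference system.
series = ImageCollection("urn:ogc:def:crs:EPSG::32613", {"road", "building"})
series.images.append(ImageEntry("2005-03-01T10:00:00Z", (500000.0, 4200000.0)))
series.images.append(ImageEntry("2005-04-01T10:00:00Z", (500002.5, 4199998.0)))
series.add_feature(0, "road", 120, 340)
series.add_feature(1, "road", 118, 338)
roads = series.features_of_type("road")
```

Keeping the coordinate system and feature descriptions in one shared place means each image only needs to carry what differs: its timestamp, its position and its own feature instances.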
The Spatial Web
Increasingly, geospatial data are being used in the “Spatial Web”
environment, where Web services are used to transparently perform
operations such as:
• Providing catalog access to large archives of geospatial data.
• Exporting the data itself, based on query parameters.
• Performing mosaicking and layering operations.
• Performing simple feature classification, description and extraction.
• Styling the data for presentation.
Such workflows are already in use today with
vector data; GML is the “language” of the GeoWeb used to describe
regions and extents, define and label features and express queries. As
users add raster imagery to this system, they must be able to use GML for
characterizing the imagery and its associated features. Furthermore,
they need to use an imaging format that is highly standardized, capable
of supporting geospatial-sized imagery, and—often overlooked, but
critical—bandwidth efficient. The JPEG 2000
standard provides for a rich set of primitives for transporting
compressed image data in a Web services environment.
Consider Department of the Interior (DoI) imagery and two agencies that
rely on the imagery: the Forest Service and the National Park Service.
The Forest Service may wish to view an image with forested regions
accentuated through false color imaging, while the Park Service may
wish to view only the visible spectrum of the same image. Using JPEG 2000,
DoI creates and stores a single image—annotated with GML for the
classifications—that can be styled by a Web service according to
particular viewer needs (see Figure 4).
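The styling step can be reduced to a choice of which stored bands feed the red, green and blue display channels. The sketch below is a simplified illustration, assuming a four-band image with a near-infrared band and the common convention of showing near-infrared as red in a false color composite; the style names and the function are hypothetical and do not correspond to any particular Web service interface.

```python
# Which stored band drives each display channel (R, G, B) for each style.
STYLES = {
    "natural": ("red", "green", "blue"),
    "false_color": ("nir", "red", "green"),  # vegetation shows as bright red
}


def style_pixel(bands, style):
    """Map one multispectral pixel (band name -> value) to a display triple."""
    return tuple(bands[name] for name in STYLES[style])


# One hypothetical vegetated pixel: strong near-infrared response.
pixel = {"red": 10, "green": 20, "blue": 30, "nir": 200}
natural_view = style_pixel(pixel, "natural")       # (10, 20, 30)
forest_view = style_pixel(pixel, "false_color")    # (200, 10, 20)
```

Both views come from the same stored pixel values, so only one copy of the image needs to be archived.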
Bridging the Gap
JPEG 2000 is coming on fast—vendors are investigating its use, and some
already support the standard—but so far JP2 is viewed simply as a “new
compression format.” JPEG 2000 is much more than that; it offers the
geospatial community the opportunity for a single, common,
well-standardized format for most imagery needs during the coming
decade. And with its support for networked environments, it’s perfectly
aligned with the vision that groups such as OGC have for future
networked GISs.
Still, getting there will require some work. The JPEG 2000 standard
doesn’t say how to mosaic an image with other data projected to some
common ellipsoid, nor does it help users label a region of the image as
“a two-lane, paved road.” Fortunately, the standard does allow users to
leverage another new ISO standard, GML, to bridge the gap between raw
imagery and existing geospatial systems.
Publisher’s note: To read more about GML and JPEG 2000, visit the
following Web sites:
• www.jpeg.org: the Joint Photographic Experts Group
• xml.coverpages.org/geographyML.html: a good GML Web site hosted by the
Organization for the Advancement of Structured Information Standards
(OASIS), a non-profit, international consortium that drives the
development, convergence and adoption of e-business standards.
• www.opengeospatial.org: Open Geospatial Consortium Inc.