HYDRO1k Documentation
Table of Contents
- 1.0. Introduction
- 2.0. Data Layers
- 3.0. Data Set Development
  - 3.1. Data Processing Procedures
    - 3.1.1. Project the DEM
    - 3.1.2. Identify Natural Sink Features
    - 3.1.3. Filling the DEM
    - 3.1.4. Verification of the DEM
  - 3.2. Generation of Derivative Raster Data Sets
    - 3.2.1. Aspect
    - 3.2.2. Flow Directions
    - 3.2.3. Flow Accumulations
    - 3.2.4. Slope
    - 3.2.5. Compound Topographic Index
  - 3.3. Generation of Derivative Vector Data Sets
    - 3.3.1. Drainage Basin Boundaries
    - 3.3.2. Stream Lines
- 4.0. Data Formats
  - 4.1. Vector Data Formats
  - 4.2. Raster Data Formats
    - 4.2.1. Image File (.bil)
    - 4.2.2. Header File (.hdr)
    - 4.2.3. World File (.blw)
    - 4.2.4. Statistics File (.stx)
- 5.0. Data Distribution
- 6.0. Notes and Hints for HYDRO1k Users
- 7.0. Summary
- 8.0. References
- 9.0. Disclaimers
1.0. Introduction
HYDRO1k, developed at the U.S. Geological Survey's (USGS) EROS Data Center, is a geographic
database providing comprehensive and consistent global coverage of
topographically derived data sets. Developed from the USGS' recently released
30 arc-second digital elevation model (DEM) of the world (GTOPO30),
HYDRO1k provides a standard suite of geo-referenced data sets (at a resolution
of 1 km) that will be of value for all users who need to organize, evaluate,
or process hydrologic information on a continental scale.
Constructive comments from users of the HYDRO1k data sets are welcomed.
Please send your comments to kverdin@edcmail.cr.usgs.gov or
sgreenlee@edcmail.cr.usgs.gov.
2.0. Data Layers
The HYDRO1k data sets are being developed on a continent-by-continent
basis, for all landmasses of the globe with the exception of Antarctica and
Greenland. The HYDRO1k package provides, for each continent, a suite of six
raster and two vector data sets. These data sets cover many of the common
derivative products used in hydrologic analysis. The raster data sets are the
hydrologically correct DEM, derived flow directions, flow accumulations,
slope, aspect, and a compound topographic (wetness) index. The derived
streamlines and basins are distributed as vector data sets.
3.0. Data Set Development
The HYDRO1k data sets are the result of a cooperative project at the U.S.
Geological Survey's (USGS) EROS Data Center. The goal of the project is
the development of a globally consistent hydrologic derivative data set. The
effort has been led by USGS scientists in collaboration with the United
Nations Environment Programme/Global Resource Information Database
(UNEP/GRID) located in Sioux Falls, South Dakota.
Development of the HYDRO1k database was made possible by the completion of
the 30 arc-second digital elevation model at the EROS Data Center in 1996,
entitled GTOPO30. This data set, with its nominal cell size of 1 km, has been
and will continue to be applied by many scientists and researchers to
hydrologic and land form studies. Inevitably, these studies require
development, at a minimum, of a standard suite of derivative products. In the
past, users would obtain the DEM data, process the data, extract the
derivative information, use the derived products in their studies and,
perhaps, share the derived information with others. To reduce repetition of
these procedures by every user of the data set, the HYDRO1k database aims to
provide these standard products, developed in a consistent fashion for the
entire globe, and to make them available to the entire user community.
3.1. Data Processing Procedures
The basis of all of the data layers available in the HYDRO1k database is
the hydrologically correct DEM. This DEM is, of course, based on the GTOPO30
data set. However, to ensure that the DEM is able to reproduce the correct
movement of water across its surface, the DEM is processed to remove
elevation anomalies that can interfere with hydrologically correct flow. The
procedures followed in development of this DEM are iterative. Some of the
techniques used in the DEM development are documented in Danielson (1996).
3.1.1. Project the DEM
In order to properly perform area calculations on the DEM, the data are
projected into an equal area projection. The Lambert Azimuthal Equal Area
projection was selected for this database (Steinwand et al., 1995). The cell
size for all continents is 1,000 meters and the radius of the sphere of
reference is 6,370,997 meters. Projection parameters that vary by continent
are given in the following table. Other geo-referencing information is
available in the projection file that is included with each continental data
set.
| Continent | Longitude of Origin | Latitude of Origin |
| Africa | 20° 00' 00"E | 5° 00' 00"N |
| Asia | 100° 00' 00"E | 45° 00' 00"N |
| Australasia | 135° 00' 00"E | 15° 00' 00"S |
| Europe | 20° 00' 00"E | 55° 00' 00"N |
| North America | 100° 00' 00"W | 45° 00' 00"N |
| South America | 60° 00' 00"W | 15° 00' 00"S |
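As an illustration of how these parameters are used, the following is a minimal sketch of the forward Lambert Azimuthal Equal Area projection in its standard spherical form. The function and dictionary names are hypothetical, and the sketch assumes no false easting or northing; the projection file shipped with each continent remains the authoritative source for the parameters.

```python
import math

SPHERE_RADIUS_M = 6_370_997.0  # sphere radius in meters, as stated above

# (longitude, latitude) of origin in decimal degrees, from the table above.
# West longitudes and south latitudes are negative.
ORIGINS = {
    "Africa": (20.0, 5.0),
    "Asia": (100.0, 45.0),
    "Australasia": (135.0, -15.0),
    "Europe": (20.0, 55.0),
    "North America": (-100.0, 45.0),
    "South America": (-60.0, -15.0),
}


def laea_forward(lon_deg, lat_deg, continent):
    """Project geographic degrees to projected meters (spherical LAEA).

    Assumes zero false easting and northing (an assumption of this sketch).
    The formula is undefined at the point antipodal to the origin.
    """
    lon0, lat0 = (math.radians(v) for v in ORIGINS[continent])
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    dlon = lon - lon0
    denom = (1.0 + math.sin(lat0) * math.sin(lat)
             + math.cos(lat0) * math.cos(lat) * math.cos(dlon))
    k = math.sqrt(2.0 / denom)
    x = SPHERE_RADIUS_M * k * math.cos(lat) * math.sin(dlon)
    y = SPHERE_RADIUS_M * k * (math.cos(lat0) * math.sin(lat)
                               - math.sin(lat0) * math.cos(lat) * math.cos(dlon))
    return x, y
```

The projection origin of each continent maps to (0, 0), and a point on the central meridian maps to x = 0.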
3.1.2. Identify Natural Sink Features
All continents contain some closed basins: drainage basins with no
natural outlet to the sea. In processing the HYDRO1k DEM to replicate
natural flow patterns, techniques were developed to (1) identify which sink
features in the DEM are, indeed, natural features and (2) preserve these
sink features during the processing. Identification of the natural sinks in
the DEM began with the creation of a "sink layer" of all sink features
found in the projected GTOPO30 DEM. This sink layer was then thresholded
to extract only sinks with a surface area greater than a specified minimum.
This was used as a "first cut" at identifying the natural sink
features.
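The thresholding step can be sketched as follows. This is an illustrative pure-Python version, not the ARC/INFO procedure that was actually used; the function name, the 8-connected grouping of sink cells, and the `min_cells` parameter are assumptions, since the minimum area is not stated here.

```python
from collections import deque


def large_sinks(sink_mask, min_cells):
    """Keep only 8-connected sink regions with more than min_cells cells.

    sink_mask is a list of rows of truthy/falsy values. With 1 km cells,
    the cell count of a region equals its area in square kilometers.
    """
    rows, cols = len(sink_mask), len(sink_mask[0])
    keep = [[False] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if not sink_mask[r0][c0] or seen[r0][c0]:
                continue
            # Flood-fill one connected sink region.
            region, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and sink_mask[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            if len(region) > min_cells:
                for r, c in region:
                    keep[r][c] = True
    return keep
```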
3.1.3. Filling the DEM
To allow filling of the DEM using standard GIS techniques while still
maintaining the sinks identified in section 3.1.2, the identified sinks are
"seeded" by placing a NODATA point at the bottom of each sink. Since the
standard GIS implementation of the hydrologic filling technique allows flow
only off the edge of the DEM or to NODATA points, this procedure "tricks"
the GIS into letting water flow to the sink. All spurious sinks, those not
identified as potential natural features in section 3.1.2, are removed.
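The effect of seeding can be demonstrated with a small depression-filling sketch. Note that this uses the priority-flood algorithm, which is a swapped-in technique chosen for brevity and is not the ARC/INFO fill implementation; the function name and the use of `None` as NODATA are assumptions. As in the description above, water may leave only at the grid edge or at a NODATA cell.

```python
import heapq

NODATA = None  # stand-in for the GIS NODATA value (assumption of this sketch)

NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
             (0, 1), (1, -1), (1, 0), (1, 1)]


def fill_dem(dem):
    """Raise each pit until it drains to the grid edge or to a NODATA cell."""
    rows, cols = len(dem), len(dem[0])
    filled = [row[:] for row in dem]
    done = [[False] * cols for _ in range(rows)]
    heap = []

    def is_outlet(r, c):
        if r in (0, rows - 1) or c in (0, cols - 1):
            return True
        return any(dem[r + dr][c + dc] is NODATA for dr, dc in NEIGHBORS)

    # Start from every cell where water is allowed to leave the grid.
    for r in range(rows):
        for c in range(cols):
            if dem[r][c] is NODATA:
                done[r][c] = True
            elif is_outlet(r, c):
                heapq.heappush(heap, (dem[r][c], r, c))
                done[r][c] = True

    # Work inward from the lowest outlet, lifting cells that sit below
    # the level needed to drain through the cell they were reached from.
    while heap:
        z, r, c = heapq.heappop(heap)
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not done[nr][nc]:
                filled[nr][nc] = max(filled[nr][nc], z)
                done[nr][nc] = True
                heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled
```

In a DEM with an enclosed depression, the unseeded depression is raised to its rim elevation, while the same depression with a NODATA seed at its bottom keeps its original elevations.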
3.1.4. Verification of the DEM
Following filling of the DEM, initial streamline and basin data sets are
generated for use in the verification of the DEM. Flow direction and flow
accumulation grids are generated and the vector stream lines and basin
boundaries are produced. The streamlines and basins thus derived are
compared against existing digital data. In most cases, the Digital Chart of
the World (DCW) drainage cover was used for comparison (Defense Mapping
Agency, 1992; Danko, 1992). However, all available map sources were used.
Comparison of the generated streamlines with mapped hydrography allows
identification of essentially two types of errors in the DEM:
(1) Errors of omission or inclusion of natural sink features.
Examination of mapped hydrography often serves to identify whether or not
the first-pass identification of the natural sink features was adequate. In
the case of an error of omission, the newly identified sink feature is
"seeded" in the DEM; in the case of an error of inclusion, the "seeded" sink
is removed ("unseeded").
(2) Errors in the DEM that prevent proper flow across its surface.
These errors can be caused by the DEM generation or resampling techniques, or
simply by the 1-km horizontal or 1-m vertical resolution of the DEM.
Comparison with mapped hydrography serves to identify locations where the
generated streamlines or basin boundaries deviate from the mapped features.
If the source of the discrepancy proves to be the DEM, the DEM is edited to
guarantee that flow progresses in the required direction. These types of DEM
edits usually involve only small changes in the elevation of one or two
pixels.
The procedures in sections 3.1.3 and 3.1.4 are repeated until the DEM is able to
produce streamlines and basins that adequately match mapped hydrography.
3.2. Generation of Derivative Raster Data Sets
Following generation of the hydrologically correct DEM, the final
versions of the additional derivative data layers are produced. Along with
the hydrologically correct DEM, the following five raster data layers are
developed using standard GIS techniques. All derivative raster data layers
were produced using ARC/INFO’s GRID module (ESRI, 1992).
3.2.1. Aspect
The aspect data set describes the direction of maximum rate of change in
the elevations between each cell and its eight neighbors. It can essentially
be thought of as the slope direction. It is expressed in positive integer
degrees from 0 to 360, measured clockwise from north. Aspects of cells of
zero slope (flat areas) are assigned values of -1.
3.2.2. Flow Directions
The flow direction data layer defines the direction of flow from each
cell in the DEM to its steepest down-slope neighbor. Values of flow
direction vary from 1 to 255. Defined flow directions follow the convention
adopted by ARC/INFO's flow direction implementation (1 = east, 2 = southeast,
4 = south, 8 = southwest, 16 = west, 32 = northwest, 64 = north,
128 = northeast).
Cells with undefined direction of flow represent sinks and have flow
directions that are simple combinations of their neighbors' flow direction
values.
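A minimal sketch of the steepest-descent rule for a single cell is shown below. It uses the standard ARC/INFO direction codes (1 = east, doubling clockwise to 128 = northeast). The function name is hypothetical, and the handling of ties and sinks is deliberately simplified: the first steepest neighbor wins, and 0 is returned when no neighbor is lower, whereas the distributed layer encodes such cells as combinations of codes.

```python
import math

# (row offset, column offset) -> direction code; row 0 is the northern row.
D8_CODES = {
    (0, 1): 1, (1, 1): 2, (1, 0): 4, (1, -1): 8,
    (0, -1): 16, (-1, -1): 32, (-1, 0): 64, (-1, 1): 128,
}


def d8_direction(window, cellsize=1000.0):
    """Return the direction code for the center of a 3x3 elevation window.

    The drop to each neighbor is divided by the center-to-center distance,
    so diagonal neighbors are sqrt(2) cells away. Returns 0 if no neighbor
    is lower than the center (simplification of this sketch).
    """
    center = window[1][1]
    best_code, best_drop = 0, 0.0
    for (dr, dc), code in D8_CODES.items():
        distance = cellsize * math.hypot(dr, dc)
        drop = (center - window[1 + dr][1 + dc]) / distance
        if drop > best_drop:
            best_code, best_drop = code, drop
    return best_code
```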
3.2.3. Flow Accumulations
The flow accumulation data layer defines the amount of upstream area
draining into each cell. It is essentially a measure of the upstream
catchment area. The flow direction layer is used to define which cells flow
into the target cell. Since the cell size of the HYDRO1k data set is 1 km,
the flow accumulation value translates directly into drainage areas in
square kilometers. Values range from 0 at topographic highs to very large
numbers (on the order of millions of cells) at the mouths of large rivers.
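The relationship between the two layers can be sketched as follows: each cell's accumulation is the number of cells upstream of it, found by following the flow direction codes. This is an illustrative implementation with hypothetical names, not the GIS function used to build the data set; it assumes the standard ARC/INFO direction codes and a flow direction grid without loops. Cells with any other code (for example sinks) are treated as having no downstream neighbor.

```python
# Direction code -> (row offset, column offset); row 0 is the northern row.
OFFSETS = {1: (0, 1), 2: (1, 1), 4: (1, 0), 8: (1, -1),
           16: (0, -1), 32: (-1, -1), 64: (-1, 0), 128: (-1, 1)}


def flow_accumulation(fdir):
    """Count the upstream cells draining into each cell of a D8 grid."""
    rows, cols = len(fdir), len(fdir[0])

    def downstream(r, c):
        offset = OFFSETS.get(fdir[r][c])
        if offset is None:
            return None
        nr, nc = r + offset[0], c + offset[1]
        return (nr, nc) if 0 <= nr < rows and 0 <= nc < cols else None

    # Number of neighbors flowing directly into each cell.
    inflow = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            target = downstream(r, c)
            if target is not None:
                inflow[target[0]][target[1]] += 1

    # Process cells from the divides downward, so that a cell is passed
    # on only after all of its upstream cells have been counted.
    acc = [[0] * cols for _ in range(rows)]
    stack = [(r, c) for r in range(rows) for c in range(cols)
             if inflow[r][c] == 0]
    while stack:
        r, c = stack.pop()
        target = downstream(r, c)
        if target is None:
            continue
        nr, nc = target
        acc[nr][nc] += acc[r][c] + 1
        inflow[nr][nc] -= 1
        if inflow[nr][nc] == 0:
            stack.append((nr, nc))
    return acc
```

With 1 km cells, each returned count can be read directly as a drainage area in square kilometers.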
3.2.4. Slope
The slope data layer describes the maximum rate of change in elevation
between each cell and its eight neighbors. The slope is expressed in integer
degrees between 0 and 90.
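The slope and aspect definitions can be illustrated for one 3x3 window. The sketch below uses the common third-order finite difference (Horn) estimator over the eight neighbors; that this matches the GRID implementation exactly is an assumption, and the function name is hypothetical. Flat cells receive an aspect of -1, as described in section 3.2.1.

```python
import math


def slope_aspect(window, cellsize=1000.0):
    """Return (slope, aspect) in integer degrees for a 3x3 window.

    The window rows run north to south. Aspect is the down-slope
    direction, clockwise from north; flat cells return an aspect of -1.
    """
    (a, b, c), (d, _, f), (g, h, i) = window
    # Elevation change per meter toward the east and toward the south.
    dzdx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8.0 * cellsize)
    dzdy = ((g + 2 * h + i) - (a + 2 * b + c)) / (8.0 * cellsize)
    if dzdx == 0 and dzdy == 0:
        return 0, -1
    slope = math.degrees(math.atan(math.hypot(dzdx, dzdy)))
    angle = math.degrees(math.atan2(dzdy, -dzdx))
    # Convert from a mathematical angle to a compass bearing.
    aspect = 90.0 - angle if angle <= 90.0 else 450.0 - angle
    return round(slope), round(aspect)
```

For example, a surface that rises 1,000 m per cell toward the east has a slope of 45 degrees and faces west (aspect 270).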
3.2.5. Compound Topographic Index
The Compound Topographic Index (CTI), commonly referred to as the Wetness
Index, is a function of the upstream contributing area and the slope of the
landscape. The implementation used in the HYDRO1k data set is based on Moore
et al. (1991). The CTI is calculated using the flow accumulation (FA) layer
along with the slope as:
CTI = ln ( FA / tan (slope) )
In areas of no slope, a CTI value is obtained by substituting a slope of
0.001. This value is smaller than the smallest slope obtainable from a 1,000 m
data set with a 1 m vertical resolution.
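The CTI formula can be written out as a short function. This is a sketch under stated assumptions: the slope is taken to be in degrees and the substitute value of 0.001 is also treated as degrees, which the text above does not state explicitly. The handling of cells with a flow accumulation of zero is not described here, so the function simply requires a positive value. All names are hypothetical.

```python
import math

FLAT_SLOPE = 0.001  # substitute slope for flat cells; degrees is an assumption


def cti(flow_acc, slope_deg):
    """Compute ln(FA / tan(slope)) for one cell.

    flow_acc must be positive; how zero accumulation is treated in the
    distributed layer is not covered by this sketch.
    """
    if flow_acc <= 0:
        raise ValueError("flow_acc must be positive")
    slope = slope_deg if slope_deg > 0 else FLAT_SLOPE
    return math.log(flow_acc / math.tan(math.radians(slope)))
```

For a fixed flow accumulation, the index grows as the slope decreases, so flat cells with large contributing areas receive the highest values.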
3.3. Generation of Derivative Vector Data Sets
The stream line and basin data in the HYDRO1k data set are distributed as
vector layers.
3.3.1. Drainage Basin Boundaries
The drainage basins distributed with the HYDRO1k data set are derived
using the vector streamlines along with the flow direction layer. The basins
are seeded following procedures first articulated by Otto Pfafstetter, a
Brazilian engineer, and adapted for use in the HYDRO1k data set (Verdin,
1997). Each polygon in the basin data set has been tagged with a
Pfafstetter code uniquely identifying each sub-basin. The six-digit
Pfafstetter code assigned to each basin carries basin linkage information.
This permits determination of basin interconnectedness through simple
examination of the Pfafstetter code.
The drainage basin polygons carry the following attributes:
Level1 to Level6 = Pfafstetter units of each polygon
Slope_mean = Mean value of the slopes within the subbasin (degree)
Slope_stdev = Standard deviation of the slopes within the subbasin (degree)
Aspect_mean = Mean value of the aspects within the subbasin (degree from N)
Aspect_stdev = Standard deviation of the aspects within the subbasin (degree from N)
Dem_mean = Mean elevation value within the subbasin (m)
Dem_stdev = Standard deviation of the elevations within the subbasin (m)
3.3.2. Stream Lines
The stream line data layer distributed with the HYDRO1k data set is
derived from the flow accumulation and flow direction layers. Cells with
upstream drainage areas greater than 1,000 km² are selected from
the flow accumulation layer and processed through the STREAMLINK function.
The resulting links are attributed with the maximum flow accumulation
occurring within that link and the result is vectorized using the STREAMLINE
function. These procedures result in a vector data layer of streamlines with
each segment of stream attributed with the upstream contributing drainage
area. The vector streamlines carry the following attributes:
Flowacc = The maximum flow accumulation value of the stream segment. This value corresponds directly with the upstream watershed contributing area. (10-3 km2)
Pf_type = The Pfafstetter level at which the stream segment is considered "main stem".
Level1 to Level6 = The Pfafstetter units in which the stream segment lies.
Frmelevation = The elevation value of the stream segment's from-node (m)
Toelevation = The elevation value of the stream segment's to-node (m)
Strorder = Strahler stream order of the segment
Gradient = Gradient of the stream segment, calculated as the difference of the from-node and to-node elevations divided by the length of the segment
Frmup_flowlen = The upstream flowlength from the from-node. Calculated using ARC/INFO's FLOWLENGTH function, it is the longest path from the from-node to the drainage basin divide. (m)
Toup_flowlen = The upstream flowlength from the to-node. (m)
Frmdn_flowlen = The downstream flowlength from the from-node. Again from ARC/INFO's FLOWLENGTH function, it is the length from the from-node to the ocean or a terminal sink. (m)
Todn_flowlen = The downstream flowlength from the to-node. (m)
4.0. Data Formats
4.1. Vector Data Formats
The vector data sets, stream lines and basins, distributed with HYDRO1k
are being made available in ARC/INFO Export format (.E00 extension).
4.2. Raster Data Formats
The six raster data layers for each continent are distributed as simple
binary raster data. Each raster data layer is provided as four files, with
the extension of each file defining the file type.
| File Extension | File Type |
| .bil | Raster Data File |
| .hdr | Header File |
| .blw | World File |
| .stx | Statistics File |
4.2.1. Image File (.bil)
The raster data for each layer are provided as signed integer data in a
simple binary raster format. All the data layers are 16-bit data with the
exception of the flow accumulation layer, which, due to the range of values
needed, is 32-bit. There are no header or trailer bytes embedded in the
image. The data are stored in row major order (all the data for row 1,
followed by all the data for row 2, etc.).
4.2.2. Header File (.hdr)
The raster data header file is an ASCII text file containing size and
coordinate information for the layer. Many standard software packages
require the .hdr file to provide important geo-referencing information for
the image. The following keywords are used in the header file:
| BYTEORDER: | Byte order in which image pixel values are stored; M = Motorola byte order (most significant byte first) |
| LAYOUT: | Organization of the bands in the file; BIL = band interleaved by line (note: the raster layers are all single band images) |
| NROWS: | Number of rows in the image |
| NCOLS: | Number of columns in the image |
| NBANDS: | Number of spectral bands in the image (1) |
| NBITS: | Number of bits per pixel (16 or 32) |
| BANDROWBYTES: | Number of bytes per band per row (twice the number of columns for a 16-bit image; four times for the 32-bit image) |
| TOTALROWBYTES: | Total number of bytes of data per row (twice the number of columns for a single band 16-bit image; four times for the 32-bit image) |
| BANDGAPBYTES: | Number of bytes between bands in a BSQ format image (0) |
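For users without a package that reads the .hdr file directly, the following sketch shows one way to read a layer using the keywords above. It assumes that each header line consists of a keyword followed by whitespace and a value (a trailing colon on the keyword is tolerated); the function names are hypothetical.

```python
import struct


def parse_hdr(text):
    """Parse header text into a dict of keyword -> value (as strings)."""
    hdr = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            hdr[parts[0].rstrip(":").upper()] = parts[1]
    return hdr


def read_bil(bil_path, hdr):
    """Read a single-band .bil file into a list of rows of integers."""
    nrows, ncols = int(hdr["NROWS"]), int(hdr["NCOLS"])
    # Signed 16-bit ("h") or signed 32-bit ("i") values.
    code = {16: "h", 32: "i"}[int(hdr["NBITS"])]
    # "M" is Motorola order (big endian); anything else is read as Intel.
    order = ">" if hdr.get("BYTEORDER", "M").upper() == "M" else "<"
    row_format = f"{order}{ncols}{code}"
    row_bytes = struct.calcsize(row_format)
    rows = []
    with open(bil_path, "rb") as fh:
        for _ in range(nrows):
            rows.append(list(struct.unpack(row_format, fh.read(row_bytes))))
    return rows
```

Because the byte order is given explicitly in the format string, the same code works on both big-endian and little-endian machines.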
4.2.3. World File (.blw)
The world file is an ASCII text file containing coordinate information.
It is used by some packages for geo-referencing of image data.
| XDIM: | X-dimension of a pixel (1000) |
| Rotation term: | Always zero |
| Rotation term: | Always zero |
| Negative YDIM: | Negative Y-dimension of a pixel (-1000) |
| XMIN: | X-location of center of upper-left pixel (projected meters) |
| YMAX: | Y-location of center of upper-left pixel (projected meters) |
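The six world file values are enough to locate any cell in projected coordinates. The sketch below assumes the file holds the six numbers in the order listed above, one per line, and uses hypothetical function names.

```python
def parse_world_file(text):
    """Return (xdim, neg_ydim, xmin, ymax) from world file text.

    The two rotation terms are read and discarded, since they are
    always zero for these layers.
    """
    values = [float(token) for token in text.split()]
    if len(values) != 6:
        raise ValueError("expected six values in a world file")
    xdim, _rot1, _rot2, neg_ydim, xmin, ymax = values
    return xdim, neg_ydim, xmin, ymax


def cell_center(row, col, xdim, neg_ydim, xmin, ymax):
    """Projected (x, y) of the center of a cell; row and col start at 0."""
    return xmin + col * xdim, ymax + row * neg_ydim
```

Row 0, column 0 returns the XMIN and YMAX values themselves, because both refer to the center of the upper-left pixel.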
4.2.4. Statistics File (.stx)
The statistics file is an ASCII text file that lists the band number,
minimum value, maximum value, mean value, and standard deviation of the
values in the raster data file.
5.0. Data Distribution
HYDRO1k data for each continent are distributed electronically as tar
files. The data files are identified by the two-letter continental identifier
according to the following scheme:
| Two-letter Identifier | Continent |
| AF | Africa |
| AS | Asia |
| AU | Australasia |
| EU | Europe |
| NA | North America |
| SA | South America |
Users have the option of obtaining the entire HYDRO1k data set for a
continent (all eight data layers) or selectively choosing layers for download.
In either case, the data are distributed as tar files. In the case of raster
data sets, the .bil files have been compressed with the gzip function before
creation of the tar file. The vector data export files have been compressed
(gzipped) as well prior to creation of the tar file. As an example of the
naming convention used, the North American data sets that are available are:
| Na.tar | A tar file containing all the North American data layers along with README |
| Na_asp.tar | Tar file containing the aspect data layer (compressed bil file, three ancillary files and README) |
| Na_bas.tar | Vector basin data layer in compressed ARC/INFO Export format along with README |
| Na_cti.tar | Tar file with CTI data layer (compressed bil file, three ancillary files and README) |
| Na_dem.tar | Tar file with DEM data layer (compressed bil file, three ancillary files and README) |
| Na_fd.tar | Tar file with flow direction data layer (compressed bil file, three ancillary files and README) |
| Na_fa.tar | Tar file with flow accumulation data layer (compressed bil file, three ancillary files and README) |
| Na_slope.tar | Tar file with slope data layer (compressed bil file, three ancillary files and README) |
| Na_str.tar | Vector streams data layer in compressed ARC/INFO Export format along with README |
As well as being available via a web page interface, the HYDRO1k data sets
are available electronically through an Internet anonymous File Transfer
Protocol (FTP) account at the EROS Data Center (at no cost).
To access this account:
- 1. FTP to edcftp.cr.usgs.gov
- 2. Enter anonymous at the Name prompt.
- 3. Enter your email address at the Password prompt.
- 4. Change to the /pub/data/gtopo30hydro subdirectory
- 5. Enter binary to set the transfer type.
- 6. Use get or mget to retrieve the desired files.
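The steps above can also be scripted. The following sketch uses Python's standard `ftplib` module with the host and directory given in the steps; the function name and the `ftp_factory` parameter (included so the function can be exercised without a network connection) are hypothetical, and whether the server is reachable is not something this sketch can guarantee.

```python
import ftplib
import os

HOST = "edcftp.cr.usgs.gov"            # from step 1
REMOTE_DIR = "/pub/data/gtopo30hydro"  # from step 4


def fetch_hydro1k(filenames, email, dest_dir=".", ftp_factory=ftplib.FTP):
    """Download the named files by anonymous FTP in binary mode."""
    ftp = ftp_factory(HOST)
    try:
        ftp.login("anonymous", email)  # steps 2 and 3
        ftp.cwd(REMOTE_DIR)
        for name in filenames:
            target = os.path.join(dest_dir, name)
            with open(target, "wb") as out:
                # retrbinary transfers in binary mode (step 5).
                ftp.retrbinary(f"RETR {name}", out.write)
    finally:
        ftp.quit()
```

A call such as `fetch_hydro1k(["Na_dem.tar"], "user@example.org")` would correspond to retrieving the North American DEM layer with `get`.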
6.0. Notes and Hints for HYDRO1k Users
To use the HYDRO1k data files, the individual data files must first be
extracted from the tar files. Within the tar files, the image data files
(.bil) are compressed. These files, along with the compressed vector export
files, must be uncompressed. If you do not have the gzip and tar utilities,
they can be obtained from the following locations:
- Unix gzip:
  - ftp://prep.ai.mit.edu/pub/gnu
  - ftp://wuarchive.wustl.edu/systems/gnu
- Macintosh gzip and tar:
  - ftp://mirrors.aol.com/pub/mac/util/compression
    - macgzip0.3b2.sit.hqx
    - suntar2.03.cpt.hqx
- DOS gzip and tar:
  - ftp://prep.ai.mit.edu/pub/gnu
    - gzip-1.2.4.tar
  - ftp://ftp.uu.net/systems/ibmpc/msdos/pcroute
    - tar.exe
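Where Python is available, both steps can be done with its standard `tarfile` and `gzip` modules instead of separate utilities. This is a minimal sketch with a hypothetical function name; it assumes compressed members carry a `.gz` suffix and writes every file directly into the destination directory.

```python
import gzip
import os
import shutil
import tarfile


def unpack_hydro1k(tar_path, dest_dir):
    """Extract a tar file, uncompressing any gzipped members on the way.

    Returns the sorted list of files written to dest_dir.
    """
    os.makedirs(dest_dir, exist_ok=True)
    written = []
    with tarfile.open(tar_path) as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue
            name = os.path.basename(member.name)
            source = tar.extractfile(member)
            if name.endswith(".gz"):
                # Strip the suffix and decompress while copying.
                name = name[:-3]
                source = gzip.GzipFile(fileobj=source, mode="rb")
            target = os.path.join(dest_dir, name)
            with open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            written.append(target)
    return sorted(written)
```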
Because the image (.bil) data are stored in a 16-bit or 32-bit binary format, users
must be aware of how the bytes are addressed on their computers. The data are
provided in Motorola byte order, which stores the most significant byte first
("big endian"). Systems such as Sun SPARC and Silicon Graphics workstations
use the Motorola byte order. The Intel byte order, which stores the least
significant byte first ("little endian"), is used on DEC Alpha systems and
most PCs. Users with systems that address bytes in the Intel byte order may
have to "swap bytes" of the BIL data unless their application software
performs the conversion during ingest. The statistics file (.stx) provided for
each data set gives the range of values in the image file, so that users can
check if they have the correct values stored on their system.
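The byte swap and the check against the statistics file can be sketched with Python's standard `array` module. The function names are hypothetical, the sketch covers only the 16-bit layers, and the minimum and maximum are assumed to have been read from the .stx file by the caller.

```python
import sys
from array import array


def load_int16(raw_bytes):
    """Interpret Motorola-order bytes as signed 16-bit integers."""
    values = array("h")
    values.frombytes(raw_bytes)
    if sys.byteorder == "little":
        # Intel-order machine: swap the two bytes of every value.
        values.byteswap()
    return values


def matches_stx_range(values, stx_min, stx_max):
    """True if the data span exactly the range listed in the .stx file."""
    return min(values) == stx_min and max(values) == stx_max
```

If the swap is skipped on a little-endian machine, the range check fails, which is a quick way to detect values that were read in the wrong byte order.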
Users of ARC/INFO or ArcView can display the image data directly. However,
if a user needs access to the actual pixel values for analysis in ARC/INFO the
image must be converted to an ARC/INFO grid with the command IMAGEGRID.
IMAGEGRID does not support conversion of signed image data; therefore, the
negative 16-bit image values will not be interpreted correctly. After running
IMAGEGRID, an easy fix can be accomplished using the following formula in
GRID:
out_grid = con(in_grid >= 32768, in_grid - 65536, in_grid)
The converted grid will then have the negative values properly represented,
and the statistics of the grid should match those listed in the .stx file. If
desired, the -9999 ocean mask values in the grid could then be set to NODATA
with the SETNULL function.
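For users working outside ARC/INFO, the same correction can be expressed in a few lines of Python. This sketch mirrors the `con` expression above for values that were read as unsigned 16-bit integers and then masks the -9999 ocean value; the function names and the use of `None` for NODATA are assumptions.

```python
NODATA = None  # stand-in for the GRID NODATA value (assumption)


def fix_signed16(value):
    """Same logic as the GRID expression: values >= 32768 are negative."""
    return value - 65536 if value >= 32768 else value


def fix_grid(grid, ocean=-9999):
    """Apply the sign fix to every cell, then mask the ocean value."""
    fixed_rows = []
    for row in grid:
        fixed = [fix_signed16(v) for v in row]
        fixed_rows.append([NODATA if v == ocean else v for v in fixed])
    return fixed_rows
```

An unsigned value of 55537, for example, becomes -9999 after the sign fix and is then set to NODATA.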
7.0. Summary
The HYDRO1k data set provides many of the derivative products useful in
earth science applications. The hydrologically correct DEM and ancillary data
layers are useful in studies of earth systems including watershed analysis,
landform studies and global change scenarios. Development of a standard set of
data layers minimizes duplication of effort and will provide consistent global
coverage.
8.0. References
Danielson, J.J., 1996. Delineation of drainage basins from 1 km African
digital elevation data. In: Pecora Thirteen, Human Interactions with the
Environment - Perspectives from Space, Sioux Falls, South Dakota, August
20-22, 1996.
Danko, D.M., 1992. The digital chart of the world. GeoInfo Systems,
2:29-36.
Defense Mapping Agency, 1992. Development of the Digital Chart of the
World. Washington, D.C., U.S. Government Printing Office.
ESRI, 1992. Cell based modeling with GRID. ESRI, Inc., Redlands,
California.
Moore, I.D., Grayson, R.B., and Ladson, A.R., 1991. Digital Terrain
Modelling: A Review of Hydrological, Geomorphological and Biological
Applications. Hydrological Processes, January - March, 1991, pp. 3-30.
Steinwand, D.R., Hutchinson, J.A., and Snyder, J.P., 1995. Map projections
for global and continental data sets and an analysis of pixel distortion
caused by reprojection. Photogrammetric Engineering and Remote Sensing, v. 61,
p. 1,487-1,497.
Verdin, K.L., and Greenlee, S.K., 1996. Development of continental scale
digital elevation models and extraction of hydrographic features. In:
Proceedings, Third International Conference/Workshop on Integrating GIS and
Environmental Modeling, Santa Fe, New Mexico, January 21-26, 1996. National
Center for Geographic Information and Analysis, Santa Barbara, California.
Verdin, K.L., 1997. A System for Topologically Coding Global Drainage Basins
and Stream Networks. In: Proceedings, 17th Annual ESRI Users
Conference, San Diego, California, July 1997.
9.0. Disclaimers
Any use of trade, product, or firm names is for descriptive purposes only
and does not imply endorsement by the U.S. Government. Please note that some
U.S. Geological Survey (USGS) information contained in this data set and
documentation may be preliminary in nature and presented prior to final review
and approval by the Director of the USGS. This information is provided with
the understanding that it is not guaranteed to be correct or complete and
conclusions drawn from such information are the sole responsibility of the
user.