The Geographical Ontology Page
In the geographic discipline, in both the physical and human sectors, there is a need for interoperability between researchers working at different sites and on different computer systems. The development of a definitive and authoritative nomenclature of the geospatial domain, that is, an ontology of the geospatial domain, is one way to achieve this interoperability. Ontology design is
grounded on the idea that a knowledge base can be defined through the
development of a set of unique, domain-specific concepts for objects and
processes. A concept is an idea
or notion that we apply to classify those things around us. For instance, if we were to list all those
objects to which the concept of “being mortal” applies, we would form the set
of all mortals. If we were to list all
of those objects that relate to geospatial information, we would form the set
of all things geographic. The following
discussion of ontology design principles sets the groundwork for the
development of just such a data structure – objects and concepts describing
geospatial information. Because the final product takes the form of a hierarchical taxonomy and, ultimately, an associated formal thesaurus, this data structure will be referred to as the Visual Objects Taxonomy/Thesaurus (VOTT).
The essence of any
ontology development project is to form a set of link-node relationships
between categories in category space and their corresponding concept nodes in
concept space. Figure 1 illustrates the
relationship between category space and concept space. In category space, there exists a unique set
of categories within each different classification schema. In concept space, a set of domain specific
concepts exists, as an interlinked network of nodes between and within
domains. Based on existing equivalency
between concept definitions and category meanings, each node in category space
can be linked to its corresponding concept node in concept space. By explicitly defining those links between
category space and concept space, a formal ontological data structure can be
created.

![Text Box: Figure 1. The Concept of Ontological Links and Nodes. In category space, there exists a set of categories within each different classification schema. In concept space, a set of domain specific concepts exist, as a network of nodes. Each node in category space is linked to its corresponding concept node in concept space [After Ng, 1998].](VOTT_desc_files/image016.gif)
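The link-node idea in Figure 1 can be sketched as a simple mapping from nodes in category space to nodes in concept space. This is only an illustrative sketch: the schema names (`LegendA`, `LegendB`), the category names, and the helper `categories_by_concept` are hypothetical and do not come from the VOTT itself.

```python
from collections import defaultdict

# Hypothetical link table: (schema, category) in category space -> concept node
# in concept space. Schema and category names are invented for illustration.
category_to_concept = {
    ("LegendA", "Gully"): "Gulch",
    ("LegendB", "Gulch"): "Gulch",
    ("LegendB", "Lake"): "Lake",
}


def categories_by_concept(links):
    """Invert the links so each concept lists every heritage category mapped to it."""
    grouped = defaultdict(list)
    for category, concept in links.items():
        grouped[concept].append(category)
    return dict(grouped)


concept_index = categories_by_concept(category_to_concept)
print(concept_index["Gulch"])
```

Storing the links explicitly, rather than renaming categories in place, keeps each heritage schema intact while still letting several categories resolve to one shared concept.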
In the case of geospatial
information, category space can be thought of as the set of all heritage
classification schemas. These heritage
classification schemas take the form of a wide variety of existing map legends,
word lists, dictionaries, thesauri, taxonomies, and ontologies. By reconciling all of these different forms
of classification schema into a single, integrated, non-redundant
ontological structure, a unified knowledge base of geospatial information can
be achieved. Once mediated, this unified
knowledge base would then represent all of those objects that are currently
addressed in the broad expanse of geospatial information data handling. However, Patrick Hayes has estimated that
this comprehensive list of both scientific and common sense objects would be in
excess of 10,000 unique conceptualizations [Hayes, 1985]. Arriving at this list of 10,000 unique
objects would require the systematic analysis of over 30,000 categories – all
the accumulated categories from each of the available heritage classification
schema. The systematic transformation of these 30,000 heritage categories into a logical, hierarchical, non-redundant form, that is, the actual mediation of this large data set, is the subject of this research effort.
The Visual Objects Taxonomy/Thesaurus (VOTT) was created using existing classification schemas as the basis for its hierarchical structure. Table 3, at the end of this document, lists the 173 heritage classification schemas used so far in this effort. The list covers a wide variety of classification schemas, including systems for land cover/land use inventory, property management, urban planning, facility inventorying, wetlands mapping, mapping symbologies, industrial and occupational codes, standard data models, and ontologies of various types.
The
primary intent in developing the VOTT has been to devise a system that allows
the efficient and logical inventorying of natural and cultural features to such
a level of detail that all those cultural and natural features seen while
walking through a typical natural or cultural landscape could be
inventoried. These features, for lack of a better term, are called tangible features because they have substance and occupy visible space on the landscape. A
secondary intent has been to allow the exploitation of many other forms of
readily available geospatial information generated outside the traditional
mapping and charting areas of interest.
The widespread use of Geographic Information System (GIS) technology has
introduced a wealth of geospatial data into the public and private
sectors. However, the GIS community is
concerned with a more diverse set of objects than just those objects that are
visible on or above the terrain.
Routinely, GIS practitioners collect information on less tangible objects than the mapping community does. These
intangible features, i.e. events, situations, phenomena, and objects that are
hidden from view, are an essential part of the overall human and physical
geography discipline [Bitters, 2002].
The VOTT includes both tangible and intangible features. It was created by merging a wide variety of
existing classification schema (Table 3) using a semantic integration process.
The VOTT, in its current state, is an extensive list of concepts that have been, or are currently, used in a broad range of disparate classification schemas. The VOTT defines a geospatial nomenclature in
the form of defined concepts of objects that span much of the physical and
human geographic discipline. It contains
more than 13,000 different entries.
Approximately half of these concepts include explicit natural language
definitions, associations, and relationships to other concepts within the data
set.
Each VOTT concept has not only a unique textual name but also a unique short name in the form of an eight-digit, hierarchical, numeric code. Figure 2 illustrates the generalized hierarchical structure of the VOTT and provides an example of the naming convention for the numeric short names. The two left-most digits indicate the top-level VOTT group. The next two digits to the right indicate the class, the following two digits the subclass, and the right-most two digits the unit. In the Figure 2 example, the concept “Gulch” has the VOTT short name 13020470.
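The eight-digit short name can be split into its four two-digit fields mechanically. The following is a minimal sketch, assuming the code is always exactly eight ASCII digits; the names `VottCode` and `parse_short_name` are illustrative and not part of the VOTT distribution.

```python
from typing import NamedTuple


class VottCode(NamedTuple):
    """Four two-digit fields of a VOTT short name (field names are illustrative)."""
    group: int
    vott_class: int
    subclass: int
    unit: int


def parse_short_name(code: str) -> VottCode:
    """Split an eight-digit short name into group, class, subclass, and unit."""
    if len(code) != 8 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected eight digits, got {code!r}")
    # Read the string two characters at a time, left to right.
    return VottCode(*(int(code[i:i + 2]) for i in range(0, 8, 2)))


gulch = parse_short_name("13020470")
print(gulch)
```

For the “Gulch” example, this yields group 13, class 2, subclass 4, and unit 70.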

The Visual Objects Taxonomy (VOT) Data Structure
The working version of the Visual Objects Taxonomy (VOT) is stored as a relational database in Microsoft Access™. Figure 3 shows the current structure of the VOT working database. The Heritage Description Document is a database of all heritage data structures that have been used during this project. It contains bibliographic references to each heritage data structure and a unique identifier for each record. Heritage Master Databases have been created for each of the heritage taxonomic structures used in this project, and each contains the original classification schema used to describe that heritage data set. For heritage data structures that contained definitions, those heritage definitions have been stored in the VOT Definitions database and used as a starting point in developing VOTT-compliant definitions. The same is true for heritage data that contained explicit association and relationship data. The final VOT hierarchical structure is stored in the Master VOT database, with relational links to the VOT Definitions, VOT Associations, and VOT 3-D Models databases. Provisions have been made for the future addition of relational links to data models that support several GIS and modeling and simulation software packages: SEDRIS™, Terrain Experts (TerraVista™), MultiGen® (Creator Terrain Studio™), and ESRI (ARC/INFO and ArcGIS™). Provisions have also been made to allow the future generation of Heritage Conversion Tables that capture the affiliation of each record in these heritage taxonomic structures with its counterpart record in the VOT. This will allow a near “lossless” data conversion capability for future implementations of the VOT data structure.
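The relational layout described above can be sketched with a few linked tables. This is a hedged illustration only: it uses SQLite (through Python's standard `sqlite3` module) in place of Microsoft Access so that it is self-contained, and every table name, column name, heritage identifier, and definition string below is an assumption made for the example, not the actual VOT schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Master table: one row per concept, keyed by the eight-digit short name.
CREATE TABLE master_vot (
    short_name   TEXT PRIMARY KEY,
    concept_name TEXT NOT NULL UNIQUE
);
-- Definitions linked to the master table.
CREATE TABLE vot_definitions (
    short_name TEXT NOT NULL REFERENCES master_vot(short_name),
    definition TEXT NOT NULL
);
-- Hypothetical heritage conversion table: heritage record -> VOT counterpart.
CREATE TABLE heritage_conversion (
    heritage_id       INTEGER NOT NULL,
    heritage_category TEXT NOT NULL,
    short_name        TEXT NOT NULL REFERENCES master_vot(short_name),
    PRIMARY KEY (heritage_id, heritage_category)
);
""")

# Placeholder rows; the heritage id, category, and definition text are invented.
conn.execute("INSERT INTO master_vot VALUES (?, ?)", ("13020470", "Gulch"))
conn.execute("INSERT INTO vot_definitions VALUES (?, ?)",
             ("13020470", "Placeholder definition text."))
conn.execute("INSERT INTO heritage_conversion VALUES (?, ?, ?)",
             (999, "GULCH_CATEGORY", "13020470"))

# Resolve a heritage category to its VOT concept and definition.
row = conn.execute("""
    SELECT m.concept_name, d.definition
    FROM heritage_conversion AS h
    JOIN master_vot AS m ON m.short_name = h.short_name
    JOIN vot_definitions AS d ON d.short_name = m.short_name
    WHERE h.heritage_id = ? AND h.heritage_category = ?
""", (999, "GULCH_CATEGORY")).fetchone()
print(row)
```

Keeping the conversion links in their own table is what would make the conversion close to lossless: each heritage record keeps a pointer to its VOT counterpart instead of being overwritten.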


The distribution version of the VOTT contains only valid VOTT classes; their definitions, associations, and relationships; associated 3-D models; and bibliographic references for the derivation of each concept definition. The geometry and texture files for the associated 3-D models from the VOTT 3-D model library are also available as a separate directory structure. This version of the VOTT is available in the data formats shown in Table 1. The taxonomy is distributed as a single relational database in Microsoft Access™ in standard .mdb format. It will be distributed with its own user interface, the VOTT Browser, which will allow the entire database to be searched for keywords and phrases. The taxonomy database will also be available in standard eXtensible Markup Language (XML). The VOTT thesaurus hierarchical and full file listings will be available in ASCII text, XML, and HTML formats. The 3-D model library is composed of a file directory containing model geometry files, each in standard OpenFlight™ format [MultiGen-Paradigm, 2000], and a file directory containing model texture files in standard SGI image format.
Table 1. Formats Used to Distribute the VOTT Data.

| Data Set | Format |
|---|---|
| TAXONOMY | MS Access™ database in .mdb format |
| TAXONOMY | MS Access™ database in .xml format (Not Yet Available) |
| THESAURUS | ASCII text format (Not Yet Available) |
| THESAURUS | .xml format (Not Yet Available) |
| THESAURUS | HTML format (Not Yet Available) |
| 3-D MODELS | OpenFlight™ format |
| MODEL TEXTURES | SGI image format (.rgb .rgba .int .inta) |

VOTT Top-Level Groups
Table
2 identifies the 48 top-level groups used in this version of the VOTT. Of the 48 top-level groups in the taxonomy,
the first 15 groups represent those broad top-level categories of cultural and natural
features that have been traditionally used in mapping and charting
classification schema – in particular in the Digital Feature Analysis Data
(DFAD), Feature and Attribute Coding Catalogue
(FACC), and most recently in the SEDRISTM EDCS classification
schema. Groups 16 through 18 are an
extension to these traditional groups and address the categorization of all
forms of vehicles, human forms, and animal forms. The objects within the first 18 broad
top-level groups represent the preponderance of feature classes for those
tangible objects that would be encountered on the Earth’s surface. Groups 19 through 47 contain a set of classes
for various forms of tangible and intangible feature data that can be encountered
in the GIS community. Group 50 contains
a set of standard units of measure and group 70 contains a set of non-feature
concepts and their related definitions.
Table 2. The Visual Objects Taxonomy/Thesaurus (VOTT) Top-Level Groups.

| ID | Top-Level Class | Status | Number of Valid Concepts | Number of 3-D Models |
|---|---|---|---|---|
| 1 | | | 1867 | 59 |
| | | | 1343 | 334 |
| | | | 124 | |
| | | | 460 | 32 |
| | | | 377 | 124 |
| | | | 330 | 25 |
| | | | 128 | 14 |
| | | | 234 | 30 |
| | | | 428 | 58 |
| | | | 571 | 68 |
| | | | 229 | 26 |
| | | | 885 | 16 |
| | | | 592 | 3 |
| | | | 233 | 37 |
| | | | 62 | 5 |
| | | | 1246 | 37 |
| | | | 0 | |
| | | | 105 | 1 |
| 19 | Demarcation | In-Work | 96 | 0 |
| 20 | Map Symbology | In-Work | 12 | 0 |
| | | | 0 | |
| | | | 160 | |
| | | | 0 | |
| | | | 26 | 0 |
| | | | 20 | 0 |
| | | | 60 | 0 |
| | | | 130 | 0 |
| | | | 10 | 0 |
| | | | 27 | 20 |
| | | | 207 | 0 |
| | | | 301 | 0 |
| | | | 44 | 0 |
| | | | 31 | 0 |
| | | | 704 | 0 |
| | | | 10 | 0 |
| | | | 7 | 0 |
| | | | 5 | 0 |
| | | | 4 | 0 |
| | | | 18 | 0 |
| | | | 70 | 0 |
| | | | 35 | 0 |
| | | | 78 | 0 |
| | | | 31 | 0 |
| | | | 75 | 0 |
| | | | 67 | 0 |
| | | | 52 | 0 |
| | | | 210 | N/A |
| 70 | Non-Object Definitions | Complete | 909 | N/A |

As an example of the specificity and granularity of this data structure, Figure 4 identifies the top-level classes within the Physiography Group. This group includes concepts for the types of natural material surfaces that may be encountered, the types of physiographic features (landforms), and the topographic symbologies commonly used to portray the Earth’s surface.


As a demonstration of the increased granularity and specificity of the VOTT compared to other heritage classification systems, Table 4 compares the number of feature concepts for the Physiography group in the VOTT with the number of Physiography concepts in SEDRIS, FACC, DFAD, and SDTS. The VOTT contains roughly 8 to 28 times as many unique Physiography concepts as any of the heritage systems; more precisely, the VOTT count is 831% of the SEDRIS count, 1364% of the FACC count, 2800% of the DFAD count, and 1716% of the SDTS count (percent increases of 731%, 1264%, 2700%, and 1616%, respectively). In the VOTT data set, this increase in granularity can be seen across the entire spectrum of traditional top-level groups. The increase arises primarily because the VOTT was created by merging all heritage categories and therefore contains a superset of the physiographic categories used within the different heritage classification schemes.
Table 4. Physiography concepts in the VOTT and Physiography categories in selected heritage classification systems.

| Taxonomic Structure | Total No. of Categories | No. of Surface Categories | % of Whole | VOTT Surface Concepts as % of Heritage Count |
|---|---|---|---|---|
| VOTT | 12,500 | 532 | 4.2 | |
| SEDRIS | 1225 | 64 | 5.2 | 831% |
| FACC | 550 | 39 | 7.0 | 1364% |
| DFAD | 309 | 19 | 6.1 | 2800% |
| SDTS | 201 | 31 | 15.5 | 1716% |

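The percentages in the last column of Table 4 can be checked against the category counts. The sketch below uses only the counts given in the table; the helper names `percent_of` and `percent_increase` are illustrative. It shows that the tabulated percentages correspond to the VOTT count expressed as a percentage of each heritage count, while the strict percent increase is 100 points lower in each case.

```python
# Surface (Physiography) category counts taken from Table 4.
vott_count = 532
heritage_counts = {"SEDRIS": 64, "FACC": 39, "DFAD": 19, "SDTS": 31}


def percent_of(new: int, old: int) -> float:
    """New count expressed as a percentage of the old count."""
    return 100.0 * new / old


def percent_increase(new: int, old: int) -> float:
    """Growth from old to new, as a percentage of the old count."""
    return 100.0 * (new - old) / old


summary = {
    name: (round(percent_of(vott_count, n)), round(percent_increase(vott_count, n)))
    for name, n in heritage_counts.items()
}
for name, (ratio_pct, increase_pct) in summary.items():
    print(f"{name}: {ratio_pct}% of heritage count, {increase_pct}% increase")
```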
Concept names are not written as natural language text; instead, they use a concatenated form without spaces, with an initial uppercase character on each word. Although this format is harder for humans to read, it is readily machine-recognizable. This convention was adopted to ensure that naming was compatible with existing upper-level ontologies. To ensure that this ontology could be merged with other existing ontologies, all concept names were compared to those used in the top-level and mid-level Suggested Upper Merged Ontology (SUMO). When a VOTT concept was found to correspond exactly to a SUMO concept, the SUMO name was adopted in the VOTT.
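The concatenated naming convention can be illustrated with a pair of small helpers. This is a sketch under stated assumptions: the function names and the example phrase are hypothetical, and the split rule assumes each word starts with exactly one uppercase letter, so acronyms written in all capitals would need extra handling.

```python
import re


def to_concept_name(phrase: str) -> str:
    """Join words without spaces, capitalizing the first letter of each word."""
    return "".join(word[:1].upper() + word[1:] for word in phrase.split())


def to_natural_language(concept_name: str) -> str:
    """Split a concatenated name at each uppercase letter and lowercase the words."""
    words = re.findall(r"[A-Z][^A-Z]*", concept_name)
    return " ".join(word.lower() for word in words)


name = to_concept_name("dry lake bed")   # hypothetical example phrase
phrase = to_natural_language(name)
print(name, "->", phrase)
```

Being able to expand the concatenated form back into words is what makes it practical to attach a natural language version of each name to its definition.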
Concept Definitions
Developing explicit definitions has been a very challenging component of this dissertation project. One would expect that, when establishing any classification schema, the subject matter expert would have generated natural language definitions for each class. During the initial conceptualization of this project, we assumed that explicit definitions would be available for the preponderance of the categories within the heritage schemas. However, in many of the older heritage systems used here, the original authors depended on stand-alone class names to define their concepts, and for this reason many heritage systems contained no explicit definitions. The subtle nuances in meaning that can be expressed in a natural language definition were therefore missing, which has made it exceedingly difficult to decipher the precise meaning of many heritage class concepts.
However, in the IT ontology environment it
is commonplace to create very large ontological structures based purely on
words devoid of definitions. In these
situations, lexical differences and association differences are the only
factors that can be used in the creation of a hierarchy of concepts. When available, definitions provided the
following factors in the overall structure of the VOTT data set:
Throughout
the semantic integration, our approach was to use definitions of categories,
whenever they were available, as the primary determinant of class
structure. Further, because definitions
could provide concept-meaning clarification through a linkage to other terms,
definitions have been considered a critical element in the overall VOTT
design.
Figure 5 provides an example of a formal VOTT definition. Definitions are composed of two parts: the explicit natural language definition and, as a suffix, an explicit natural language form of the concept name (highlighted in blue). The unconcatenated, natural language concept name is appended to the definition to ensure that the expanded form of the concatenated computer name is fully understood. Embedded within each definition are XML-tagged keywords (highlighted in red). The opening (<KW>) and closing (</KW>) XML keyword tags identify those words within the definition text that are further defined within the taxonomic data structure. This will allow for future automated identification of keywords that can be used to expand the definitions into a broad networked knowledge base.
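Automated keyword identification of this kind could be sketched with a regular expression over the `<KW>` tags. The sample definition string and the function names below are invented for illustration and are not taken from the VOTT; the sketch also assumes that keyword tags are never nested.

```python
import re

# Non-greedy match so adjacent keywords are captured separately.
KW_PATTERN = re.compile(r"<KW>(.*?)</KW>", re.DOTALL)


def extract_keywords(definition: str) -> list[str]:
    """Return the tagged keywords in the order they appear."""
    return [keyword.strip() for keyword in KW_PATTERN.findall(definition)]


def strip_keyword_tags(definition: str) -> str:
    """Remove the tags but keep the keyword text, giving plain prose."""
    return KW_PATTERN.sub(r"\1", definition)


# Hypothetical definition text, used only to exercise the functions.
sample = "A small, steep-sided <KW>valley</KW> cut by running <KW>water</KW>."
keywords = extract_keywords(sample)
plain = strip_keyword_tags(sample)
print(keywords)
print(plain)
```

Each extracted keyword could then be looked up as a concept in its own right, which is how the definitions would link into a networked knowledge base.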

Table 3. Heritage Taxonomic Data Structures Used in this Research Effort
The following 173 heritage taxonomic structures have been used to varying extents during the development of the VOTT data structure. This list includes formal ontological domain studies, taxonomies, thesauri, classification schemes, formal map legends, geospatial data models, and basic word lists developed purely for information technology purposes.

| ID No. | Data Set Name | Short Name | Responsible Agency |
|---|---|---|---|
| 1 | Land Cover/Land-use Determinants, Impact | HUD_lulc | |
| 2 | Land Cover Classification System | UNFAO | United Nations Food and Agriculture Organization |
| 3 | | USDOE | |
| 4 | | | |
| 5 | | | |
| 6 | | AuroraCO | |
| 7 | Barton Aschman and Associates: Summary of Recommended Activity Coding System | Barton | Barton Aschman Associates |
| 8 | | CodSLU | |
| 9 | | CodCC | |
| 10 | | CodPTC | |
| 11 | | Chicago_FH | |
| 12 | | ClarkCO_LU | |
| 13 | | ClarkCO_EL | |
| 14 | | ClarkCO_AC | |
| 15 | | ClevelandCC | |
| 17 | International Association of Assessing Officers - Standard on Property Use Codes | IAAO_PU | International Association of Assessing Officers |
| 18 | | ITE_Man | |
| 19 | | CobbCO | |
| 20 | | Corine | European Commission |
| 21 | | DechCC | Pratt Institute |
| 22 | Color Coding of Land Uses from DeChiara, Simplified Scheme | DechSCC | Pratt Institute |
| 23 | Land-Use Inventory Categories | | |
| 24 | | DonaAC | |
| 25 | | DonaEL | |
| 26 | | DonaPD | |
| 27 | | DuPageCO | |
| 28 | Eagle Point Software: Graphic Database Structure Color Codes | EagleCC | Local Affairs Inc. |
| 29 | Eagle Point Software: Graphic Database Structure Hatch Style | EagleH | Local Affairs Inc. |
| 30 | | EauClaire | |
| 31 | | FairfaxLU | |
| 32 | | FairfaxELU | |
| 33 | | FairfaxPLU | |
| 34 | FEMA: HAZUS, Airport System Classifications | HAZUS | |
| 35 | FEMA: Rapid Visual Screening of Buildings for Potential Seismic Hazards, Building Structure Codes | Screen | |
| 36 | Federal Geographic Data Committee Cadastral Data Content Standard for Parcel Type, Parcel Area Type, and Restriction Type | FGDC-P | |
| 37 | Federal Geographic Data Committee Types of Facilities (Informative) | FGDC_Fac | |
| 38 | Federal Geographic Data Committee Ground Transportation Network and Attributes | FGDC_G | |
| 39 | Federal Geographic Data Committee Utilities Feature Classes | FGDC_U | |
| 40 | Federal Geographic Data Committee National Vegetation Classification Standard | FGDC_V | |
| 41 | Guttenberg’s Multiple Land Use Classification System | Gut | |
| 42 | Coastal Change Analysis Program (C-CAP) | CCAP | |
| 43 | Land-Use Compatibility Zones Coding | AFCC | |
| 44 | Hacienda & | LAC_HAPC | |
| 45 | Santa Clarita Valley Area Plan Classification | LAC_SAPC | |
| 46 | | LincNE | |
| 47 | | MAC_LCC | |
| 48 | Water Resources Administration: Wetlands Mapping | MDDNR | |
| 49 | | MassGIS | |
| 50 | | MichDNR | Michigan Department of Natural Resources |
| 51 | | MissLC | |
| 52 | Région d'Île-de-France Categories for Land-Use Modes | RFR-LU | Institut d'Aménagement et d'Urbanisme |
| 53 | North American Industry Classification System (NAICS) | NAICS | |
| 54 | | | France Ministry of the Environment |
| 55 | | NCSAIP | APA - |
| 56 | | NCLUC | |
| 57 | | NCDNR | |
| 58 | | NCLCC | |
| 59 | | NKAPC | |
| 60 | | OHDNR | |
| 61 | | OCCA | |
| 62 | | PBFL-CPAC | |
| 63 | | PBFL-ELUC | |
| 64 | SANDAG: 1968 Standard Land-Use Codes | Sandag-68 | |
| 65 | SANDAG: Generalized Land Ownership Map Categories | Sandag- | |
| 66 | SANDAG: Property Use Codes | Sandag-PUC | |
| 67 | | SCAG | |
| 68 | Standard Industrial Classification Manual (SIC) | SIC | |
| 69 | Standard Land-Use Coding Manual (SLUCM) | SLUCM | |
| 70 | | StLoMO | St. Louis Community Development Agency |
| 71 | Classification System for Substandardness Criteria | UrbIL | |
| 72 | | USAF_Sugg | |
| 73 | | USAFLUC | |
| 74 | | USAFFac | |
| 75 | | USAFPark | |
| 76 | | USARPC | |
| 77 | | USDAFarm | |
| 78 | | DLG | |
| 79 | | Washoe | |
| 80 | | | |
| 81 | WISCLAND: Land Cover Classes | WISCLAND | Wisconsin Department of Natural Resources |
| 82 | SEDRIS: Environmental Data Coding System (EDCS) | EDCS_CLASS | www.SEDRIS.org |
| 83 | | TDI_cat | |
| 84 | | SSCM_Clas | |
| 85 | FACC | FACC_Code | DIGEST |
| 86 | MDA Railroads | MDA-R | MDA |
| 87 | National Wetland Inventory (NWI) | NWI | |
| 88 | Digital Line Graph (DLG) | DLG | |
| 89 | TigerLine Files | TIGER | |
| 90 | National Vegetation Classification System | NATVEG | |
| 91 | Landuse Compatibility Codes | AFCC | |
| 92 | Standard Occupational Classification System | SOC | |
| 93 | Coastal Change Analysis Program | CCAP | |
| 94 | Industry and Occupational Codes | IOC | |
| 95 | EuroRegionalMap | ERM | EuroGraphics.org |
| 96 | Digital Feature Analysis Data (DFAD) | DFAD | National Geospatial Intelligence Agency (NGA) |
| 97 | | DDDS | |
| 98 | EuroGlobalMap | EGM | EuroGraphics.org |
| 99 | | CWC | |
| 100 | | FGDC | |
| 101 | Geosym | GEOSYM | National Geospatial Intelligence Agency (NGA) |
| 102 | Geographic Names Information System | GNIS | |
|
103 |
ISO-DIS-19110 |
ISO19110 |