Published on GeoInt MOOC (https://www.e-education.psu.edu/geointmooc)

Lesson 2

GEOINT Data, Data Sources, & Collection Strategies

Introduction

GEOINT data has been said to be "any data used to create GEOINT." Geospatial information, imagery, and imagery intelligence are the primary sources from which GEOINT data are extracted. However, there is a more foundational perspective. These data are records of the human and physical qualities at a location that together create a unique place. Signs of human activity and measures of the land become GEOINT Data. Traditionally, data are thought of as fixed statements of "what is"; however, GEOINT Data are part of a conversation driven by the need to explain what is recorded in the data in the context of a place. GEOINT Data are chunks of information extracted from other intelligence and geographic products. Using these data requires knowledge of the data themselves and of how spatial features might be organized on the landscape. Using the data and understanding spatial organization allow us to frame patterns of human activity and make predictions. GEOINT Data fitted into a frame also guide the search for additional information and ultimately provide a plausible story or map to account for what the data show.

Learning Objectives

This lesson addresses the fundamental idea that the analyst works to complete the picture provided by geospatial intelligence data. Throughout this lesson, we will demonstrate the concepts of Geospatial Intelligence. By the end of this lesson, you will be able to:

  • Introduce the fundamental concepts of locating points on the Earth, scale, and the representation of geographic features in a digital form.
  • List examples of the major categories of GEOINT Data
  • Discuss the concept of Techne and its importance to GEOINT Data
  • Describe the role of GEOINT Data in sensemaking

L2.02: Video Lecture 2.1

Please view my video lecture discussing the concepts of GEOINT data.

L2.03: Locating Things on the Earth

I begin this lesson with a brief introduction of the concepts of locating points on the Earth, scale, and the representation of geographic features in a digital form. With these fundamentals, the lesson then discusses GEOINT data.

Coordinate Systems

Essential to creating geospatial data is locating something on the Earth. The following discussion of coordinate systems was drawn from the Penn State course, GEOG 482, The Nature of Geographic Information [1].

A Cartesian coordinate system
Figure 2.1: A Cartesian coordinate system.
Source: Bacastow

As you might know, locations on the Earth's surface are measured and represented in terms of coordinates. A coordinate is a set of two or more numbers that specifies the position of a point, line, or other geometric figure in relation to some reference system. The simplest system of this kind is a Cartesian coordinate system (named for the 17th century mathematician and philosopher René Descartes). A Cartesian coordinate system is simply a grid formed by juxtaposing two measurement scales—one horizontal (x) and one vertical (y). The point at which both x and y equal zero is called the origin of the coordinate system. In Figure 2.1, above, the origin (0,0) is located at the center of the grid. All other positions are specified relative to the origin. The coordinate of the upper right-hand corner of the grid is (6,3). The lower left-hand corner is (-6,-3).

Cartesian and other two-dimensional (plane) coordinate systems are handy due to their simplicity. Because the Earth's surface is curved rather than flat, however, they are not perfectly suited to specifying geospatial positions. The geographic coordinate system is designed specifically to define positions on the Earth's roughly spherical surface. Instead of the two linear measurement scales, x and y, it juxtaposes two curved measurement scales termed longitude and latitude.

Longitude specifies positions east and west as the angle between the prime meridian and a second meridian that intersects the point of interest. Longitude ranges from +180° (or 180° E) to -180° (or 180° W). The 180° East and 180° West meridians are the same line, which serves as the basis for the International Date Line.

Latitude specifies positions north and south in terms of the angle subtended at the center of the Earth between two imaginary lines, one that intersects the equator and another that intersects the point of interest. Latitude ranges from +90° (or 90° N) at the North pole to -90° (or 90° S) at the South pole. A line of latitude is also known as a parallel.

At higher latitudes, the length of parallels decreases to zero at 90° North and South. Lines of longitude are not parallel, but converge toward the poles. Thus, while a degree of longitude at the equator is equal to a distance of about 111 kilometers, that distance decreases to zero at the poles.
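
The shrinking length of a degree of longitude can be sketched in a few lines of Python. This is a minimal illustration, assuming a perfectly spherical Earth (so the length falls off with the cosine of latitude) and using the approximate 111-kilometer figure quoted above; the function name is hypothetical and not part of any GIS package.

```python
import math

# Approximate length of one degree of longitude at the equator (from the text)
KM_PER_DEGREE_AT_EQUATOR = 111.0


def km_per_degree_longitude(latitude_deg: float) -> float:
    """Approximate length (km) of one degree of longitude at a given latitude.

    Assumption: a spherical Earth, so the length scales with cos(latitude).
    """
    if not -90.0 <= latitude_deg <= 90.0:
        raise ValueError("latitude must be between -90 and +90 degrees")
    return KM_PER_DEGREE_AT_EQUATOR * math.cos(math.radians(latitude_deg))


# Compare the equator, two mid-latitudes, and the pole
for lat in (0, 30, 60, 90):
    km = km_per_degree_longitude(lat)
    print(f"{lat:>2} degrees latitude: {km:6.1f} km per degree of longitude")
```

Under this spherical assumption, a degree of longitude at 60° latitude is about half as long as at the equator, and the value approaches zero at the poles.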

Geodetic coordinate system
Figure 2.2: The geographic (or "geodetic") coordinate system.
Source: Bacastow
Geographic coordinate system
Figure 2.3: The geographic coordinate system.
Source: Bacastow

Try This!

Geographic Coordinate System Practice Application

Nearly everyone learned latitude and longitude as a kid. But how well do you understand the geographic coordinate system, really? My experience is that while everyone who enters this class has heard of latitude and longitude, only about half can point to the location on a map that is specified by a pair of geographic coordinates. The Flash application linked below lets you test your knowledge. The application asks you to click locations on a globe as specified by randomly generated geographic coordinates.

You will notice that the application lets you choose between "easy problems" and "hard problems." Easy problems are those in which latitude and longitude coordinates are specified in 30° increments. Since the resolution of the graticule (the geographic coordinate system grid) used in the application is also 30°, the solution to every "easy" problem occurs at the intersection of a parallel and a meridian. The "easy" problems are good warm-ups.

"Hard" problems specify coordinates in 1° increments. You have to interpolate positions between grid lines. You can consider yourself to have a good working knowledge of the geographic coordinate system if you can solve at least six "hard" problems consecutively and on the first click.

Click here to download and launch the Geographic Coordinate System practice application (5.7 Mb) [2]. (If the globe doesn't appear after the Flash application has loaded, right-click and select "Play" from the pop-up menu.)

Screenshot of the Geographic Coordinate System practice application
Figure 2.4: Geographic Coordinate Practice Tool.
Note: You will need to have the Adobe Flash player installed in order to complete this exercise. If you do not already have it, you can download the Flash Player for free at the Adobe website. [3]
 

Datums

We said that all positions on the Earth are specified relative to an origin. So, how is the origin established? A geodetic datum is a spatial reference system that describes the shape and size of the Earth and establishes an origin for coordinate systems. Datums come in two main types: horizontal datums and vertical datums. Horizontal datums are used to describe what we typically think of as x and y coordinates. Vertical datums describe position in the vertical direction and are often based on height above sea level.

People have created hundreds of datums that are in use around the world today. The main reason for developing different datums for different places is to choose an ellipsoid that best matches the shape of the Earth in an area of local interest (usually a country). Datums can also be classified as local or geocentric. In local datums, a point on the ellipsoid is matched to a point on the Earth’s surface (e.g., the North American Datum of 1927 intersects the surface of the Earth at Meades Ranch in Kansas, while the Australian Geodetic Datum intersects with the Johnston Geodetic Station in the Northern Territory). Geocentric datums, on the other hand, are based on the Earth’s center of mass. Our knowledge of where that center of mass is located has improved with modern satellite data. Many countries are now shifting to geocentric datums because GPS measurements are based on a geocentric datum. This switch avoids the need to transform GPS-collected data from one coordinate system to another. It is important to know which datum was used when your data were created, because the position of a feature may differ depending on the datum. In some cases, the positional discrepancy may be as much as one kilometer! Such shifts are especially important in large-scale mapping applications, where they can be much larger than any projection-induced error.

L2.04: Scale

You hear the word "scale" often when you work around people who produce or use geographic information. If you listen closely, you'll notice that the term has several different meanings, depending on the context in which it is used. You'll hear talk about the scales of geographic phenomena and about the scales at which phenomena are represented on maps and aerial imagery. You may even hear the word used as a verb, as in "scaling a map" or "downscaling." The goal of this section is to help you learn to tell these different meanings apart and to be able to use concepts of scale to help make sense of geographic data.

Scale as Scope

Often "scale" is used as a synonym for "scope" or "extent." For example, the title of an international research project called The Large Scale Biosphere-Atmosphere Experiment in Amazonia (1999) uses the term "large scale" to describe a comprehensive study of environmental systems operating across a large region. This usage is common not only among environmental scientists and activists, but also among economists, politicians, and the press. Those of us who specialize in geographic information usually use the word "scale" differently, however.

Map Scale

When people who work with maps and aerial images use the word "scale," they usually are talking about the sizes of things that appear on a map or aerial photo relative to the actual sizes of those things on the ground.

Map scale is the proportion between a distance on a map (Dm) and a corresponding distance on the ground (Dg): (Dm / Dg).

By convention, the proportion is expressed as a "representative fraction" in which map distance (Dm) is reduced to 1. The proportion, or ratio, is also typically expressed in the form 1 : Dg rather than 1 / Dg.

The representative fraction 1:100,000, for example, means that a section of road that measures 1 unit in length on a map stands for a section of road on the ground that is 100,000 units long.

If we were to change the scale of the map such that the length of the section of road on the map was reduced to, say, 0.1 units in length, we would have created a smaller-scale map whose representative fraction is 0.1:100,000, or 1:1,000,000. When we talk about large- and small-scale maps and geographic data, then, we are talking about the relative sizes and levels of detail of the features represented in the data. In general, the larger the map scale, the more detail is shown. This tendency is illustrated below in Figure 2.5.
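
The arithmetic of representative fractions can be sketched as follows. This is a minimal illustration of the Dm / Dg proportion described above; the function names are hypothetical, and both distances are assumed to be in the same unit.

```python
def ground_distance(map_distance: float, scale_denominator: float) -> float:
    """Ground distance for a map distance on a 1:scale_denominator map.

    The result is in the same unit as map_distance.
    """
    return map_distance * scale_denominator


def representative_fraction(map_distance: float, ground_dist: float) -> str:
    """Express the proportion Dm / Dg in the conventional form 1:Dg."""
    if map_distance <= 0 or ground_dist <= 0:
        raise ValueError("distances must be positive")
    return f"1:{round(ground_dist / map_distance):,}"


# A road 1 unit long on a 1:100,000 map is 100,000 units long on the ground
print(ground_distance(1, 100_000))

# Shrinking the same road to 0.1 map units yields a smaller-scale map
print(representative_fraction(0.1, 100_000))
```

Note that the "smaller" scale has the larger denominator: the fraction 1/1,000,000 is smaller than 1/100,000.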

Three maps of different scales showing the same location
Figure 2.5: Geographic data are generalized according to scale in these maps of the town of Gorham. (Adapted from Thompson, 1988.)
Source: GEOG 482

One of the defining characteristics of topographic maps is that scale is consistent across each map and within each map series. This isn't true for aerial imagery, however, except for images that have been orthorectified. Large-scale maps are typically derived from aerial imagery. One of the challenges associated with using aerial photos as sources of map data is that the scale of an aerial image varies from place to place as a function of the elevation of the terrain shown in the scene. Assuming that the aircraft carrying the camera maintains a constant flying height (which pilots of such aircraft try very hard to do), the distance between the camera and the ground varies along each flight path. This causes aerial photo scale to be larger where the terrain is higher and smaller where the terrain is lower. An "orthorectified" image is one in which variations in scale caused by variations in terrain elevation (among other effects) have been removed.

You can calculate the average scale of an unrectified aerial photo by solving the equation S_p = f / (H - h_avg), where f is the focal length of the camera, H is the flying height of the aircraft above mean sea level, and h_avg is the average elevation of the terrain. You can also calculate aerial photo scale at a particular point by solving the equation S_p = f / (H - h), where f is the focal length of the camera, H is the flying height of the aircraft above mean sea level, and h is the elevation of the terrain at a given point.
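
To see how terrain elevation changes photo scale, the point-scale equation can be sketched in Python. The focal length, flying height, and elevations below are hypothetical values chosen only for illustration, and the function name is my own; all inputs are assumed to be in meters.

```python
def photo_scale_denominator(focal_length_m: float,
                            flying_height_m: float,
                            terrain_elevation_m: float) -> float:
    """Return Dg in the photo scale 1:Dg, from scale = f / (H - h).

    H and h are both measured from mean sea level, in the same unit as f.
    """
    height_above_terrain = flying_height_m - terrain_elevation_m
    if focal_length_m <= 0 or height_above_terrain <= 0:
        raise ValueError("focal length and height above terrain must be positive")
    return height_above_terrain / focal_length_m


FOCAL_LENGTH_M = 0.15     # hypothetical 150 mm camera lens
FLYING_HEIGHT_M = 3150.0  # hypothetical flying height above mean sea level

# Same flight, two terrain elevations (both hypothetical)
for label, elevation_m in (("valley", 150.0), ("ridge", 1650.0)):
    denominator = photo_scale_denominator(FOCAL_LENGTH_M, FLYING_HEIGHT_M, elevation_m)
    print(f"{label}: about 1:{round(denominator):,}")
```

With these assumed values, the ridge is imaged at a larger scale (a smaller denominator) than the valley, which is the variation that orthorectification removes. Passing the average terrain elevation instead of a point elevation gives the average photo scale.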

Graphic Map Scale

Another way to express map scale is with a graphic (or "bar") scale. Unlike representative fractions, graphic scales remain true when maps are shrunk or magnified.

Example of a bar scale (left) and a variable scale
Figure 2.6: Graphic scales
Source: GEOG 482

If they include a scale at all, most maps include a bar scale like the one shown above left (Figure 2.6). Some also express map scale as a representative fraction. Either way, the implication is that scale is uniform across the map. In fact, except for maps that show only very small areas, scale varies across every map. As you probably know, this follows from the fact that positions on the nearly-spherical Earth must be transformed to positions on two-dimensional sheets of paper. Systematic transformations of this kind are called map projections. As we will discuss in greater depth later in this chapter, all map projections are accompanied by deformation of features in some or all areas of the map. This deformation causes map scale to vary across the map. Representative fractions may, therefore, specify map scale along a line at which deformation is minimal (nominal scale). Bar scales denote only the nominal or average map scale. Variable scales, like the one illustrated above right, show how scale varies, in this case by latitude, due to deformation caused by map projection.

Scale as a Verb

The term "scale" is sometimes used as a verb. To scale a map is to reproduce it at a different size. For instance, if you photographically reduce a 1:100,000-scale map to 50 percent of its original width and height, the result would be one-quarter the area of the original. Obviously, the map scale of the reduction would be smaller too: 1/2 x 1/100,000 = 1/200,000.
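
The reduction example above can be checked with exact fractions. This is a small sketch using Python's standard fractions module; the variable and function names are hypothetical.

```python
from fractions import Fraction


def rescale(map_scale: Fraction, linear_factor: Fraction) -> Fraction:
    """Scale of a map after its width and height are multiplied by linear_factor."""
    return map_scale * linear_factor


original_scale = Fraction(1, 100_000)
linear_factor = Fraction(1, 2)  # reduce width and height to 50 percent

reduced_scale = rescale(original_scale, linear_factor)
area_ratio = linear_factor ** 2  # area changes with the square of the linear factor

print(f"reduced map scale: 1:{reduced_scale.denominator:,}")
print(f"area relative to the original: {area_ratio}")
```

The linear factor applies to the scale, while the area changes by the square of that factor, which is why a 50 percent reduction gives a map one-quarter the area of the original.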

Because of the inaccuracies inherent in all geographic data, particularly in small-scale maps, scrupulous geographic information specialists avoid enlarging source maps. To do so is to exaggerate generalizations and errors. The original map used to illustrate areas in Pennsylvania disqualified from consideration for low-level radioactive waste storage shown below (Figure 2.7), for instance, was printed with the statement "Because of map scale and printing considerations, it is not appropriate to enlarge or otherwise enhance the features on this map."

Map of areas in Pennsylvania disqualified as potential sites for a low-level radioactive waste storage facility
Figure 2.7: Areas (in gray) disqualified as potential sites for a low-level radioactive waste storage facility, as depicted on a small-scale map (original 1:1,500,000), mask small suitable areas large enough to contain the 500-acre facility (Chem-Nuclear Systems, Inc., 1994).
Source: GEOG 482

Some or all of the above content is used with permission from GEOG 482—The Nature of Geographic Information [4], Penn State's College of Earth and Mineral Sciences and licensed for use under CC BY 3.0 [5].

L2.05: Rasters and Vectors

Digital geospatial data are encoded as alphanumeric symbols that represent locations and attributes of locations measured at or near Earth's surface. No geographic data set represents every possible location, of course. The Earth is too big, and the number of unique locations is too great. In much the same way that public opinion is measured through polls, geographic data are constructed by measuring representative samples of locations. Just as serious opinion polls are based on sound principles of statistical sampling, geographic data represent reality by measuring carefully chosen samples of locations. Vector and raster data are, in essence, two distinct sampling strategies.

The vector approach involves sampling locations at intervals along the length of linear entities (like roads), or around the perimeter of areal entities (like property parcels). When they are connected by lines, the sampled points form line features and polygon features that approximate the shapes of their real-world counterparts.

Figure 2.8: Aerial photograph of reservoir and highway using a "Vector Approach."
Source: Penn State, GEOG 482 [6]

The aerial photograph in Figure 2.8 shows two entities, a reservoir and a highway. The graphic in Figure 2.8 illustrates how the entities might be represented with vector data. The small squares are nodes—point locations specified by latitude and longitude coordinates. Line segments connect nodes to form line features. In this case, the line feature colored red represents the highway. Series of line segments that begin and end at the same node form polygon features. In this case, two polygons (filled with blue) represent the reservoir.
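
A vector encoding of this kind can be sketched with plain Python lists. The coordinates below are hypothetical (longitude, latitude) pairs invented for illustration, not the actual nodes in Figure 2.8, and the helper names are my own.

```python
# Each feature is an ordered list of nodes; each node is (longitude, latitude).
# All coordinates are hypothetical.
highway = [
    (-77.90, 40.80),
    (-77.88, 40.81),
    (-77.85, 40.81),
    (-77.83, 40.83),
]

reservoir = [
    (-77.87, 40.78),
    (-77.85, 40.78),
    (-77.84, 40.79),
    (-77.86, 40.80),
    (-77.87, 40.78),  # repeats the first node, closing the ring
]


def segment_count(nodes):
    """Number of line segments connecting consecutive nodes."""
    return max(len(nodes) - 1, 0)


def is_polygon(nodes):
    """True when the segments begin and end at the same node (a closed ring)."""
    return len(nodes) >= 4 and nodes[0] == nodes[-1]


for name, feature in (("highway", highway), ("reservoir", reservoir)):
    kind = "polygon" if is_polygon(feature) else "line"
    print(f"{name}: {kind} with {segment_count(feature)} segments")
```

The only structural difference between the line and the polygon is whether the node sequence returns to its starting point.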

The vector data model is consistent with how surveyors measure locations at intervals as they traverse a property boundary. Computer-aided drafting (CAD) software used by surveyors, engineers, and others stores data in vector form. CAD operators encode the locations and extents of entities by tracing maps mounted on electronic drafting tables, or by key-entering location coordinates, angles, and distances. Instead of graphic features, CAD data consist of digital features, each of which is composed of a set of point locations.

The vector strategy is well suited to mapping entities with well-defined edges, such as highways or pipelines or property parcels. Many of the features shown on paper maps, including contour lines, transportation routes, and political boundaries, can be represented effectively in digital form using the vector data model.

Figure 2.9: Aerial photograph of reservoir and highway using a "Raster Approach."
Source: Penn State, GEOG 482 [6]

The raster approach involves sampling attributes at fixed intervals. Each sample represents one cell in a checkerboard-shaped grid. Figure 2.9 illustrates a raster representation of the same reservoir and highway as shown in the vector representation. The area covered by the aerial photograph has been divided into a grid. Every grid cell that overlaps one of the two selected entities is encoded with an attribute that associates it with the entity it represents. Actual raster data would not consist of a picture of red and blue grid cells, of course; they would consist of a list of numbers, one number for each grid cell, each number representing an entity. For example, grid cells that represent the highway might be coded with the number "1" and grid cells representing the reservoir might be coded with the number "2."
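
The "list of numbers" idea can be sketched as a small grid in Python. The codes 1 (highway) and 2 (reservoir) follow the example above; the code 0 for cells that overlap neither entity, and the cell values themselves, are assumptions made for illustration.

```python
from collections import Counter

# One number per grid cell: 0 = neither entity (assumed), 1 = highway, 2 = reservoir
grid = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 1, 0, 2, 2, 0],
    [1, 0, 2, 2, 2, 0],
    [0, 0, 0, 2, 0, 0],
]

LABELS = {0: "background", 1: "highway", 2: "reservoir"}


def cell_counts(raster):
    """Count how many cells carry each attribute code."""
    return Counter(value for row in raster for value in row)


counts = cell_counts(grid)
for code in sorted(counts):
    print(f"{LABELS[code]}: {counts[code]} cells")
```

Unlike the vector encoding, no coordinates are stored per cell; a cell's location is implied by its row and column position in the grid.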

The raster strategy is a smart choice for representing phenomena that lack clear-cut boundaries, such as terrain elevation, vegetation, and precipitation. Digital airborne imaging systems, which are replacing photographic cameras as primary sources of detailed geographic data, produce raster data by scanning the Earth's surface pixel by pixel and row by row.

Both the vector and raster approaches accomplish the same thing—they allow us to caricature the Earth's surface with a limited number of locations. What distinguishes the two is the sampling strategies they embody. The vector approach is like creating a picture of a landscape with shards of stained glass cut to various shapes and sizes. The raster approach, by contrast, is more like creating a mosaic with tiles of uniform size. Neither is well suited to all applications, however. Several variations on the vector and raster themes are in use for specialized applications, and the development of new object-oriented approaches is underway.

Note: Some or all of the above content is used with permission from GEOG 482—The Nature of Geographic Information [7], Penn State's College of Earth and Mineral Sciences and licensed for use under CC BY 3.0 [5].

L2.06: GEOINT Data

NGA doctrine [8] says that GEOINT data "is any data used to create GEOINT." The doctrine goes on to say that geospatial information, imagery, and imagery intelligence are the main sources of data for GEOINT. We take a more general view that GEOINT data are data and information collected to understand the human and physical qualities of a location. So, what is GEOINT data? GEOINT data, including geospatial information, imagery, and imagery intelligence, fundamentally consists of:

  • structured and unstructured geospatial data, and
  • information about the physical and human qualities of a location on the Earth.

Table 2.1 illustrates these general data types and the relation between them:

Table 2.1: Data Content versus Geospatial Data Organization

Data Content       | Structured Geospatial Data                            | Unstructured Geospatial Data
-------------------|-------------------------------------------------------|--------------------------------------------------------------------------------------
Physical Geography | Digital imagery of a land feature                     | Report describing facts about a land feature
Human Geography    | Geospatial data of incidences of bacterial infections | Scholarly article describing facts associated with bacterial infections at a location

Structured GEOINT Data are geospatial data organized to be immediately usable by technologies, such as Geographic Information Systems (GIS). A formal definition is:

Structured geospatial data is information about locations and shapes of geographic features and the relationships between them. It is usually stored as coordinates [9] and topology [10] with a high degree of organization to be readily searchable.

What often goes unstated and unappreciated is a broad class of data known as “unstructured geospatial data,” a catchall term because much of the data included under that term actually has elements of structure. Email, for instance, may contain a street address, senders, times, and the like. A definition for unstructured geospatial data is:

Unstructured geospatial data refers to geographic information that either does not have a predefined data model or is not organized in a predefined manner. Unstructured geospatial data may be text containing geographic information such as street addresses and site descriptions. Unstructured data is not readily searchable.

While unstructured geospatial data may be organized into a digital file, the data are still "unstructured" because they cannot be easily accessed for mapping. However, unstructured data are extremely important when completing an analysis and help to complete the partial picture provided by heavily structured data. Some suggest that between 50 and 80 percent of the data in an organization is unstructured. If this is correct, then most of the data that an analyst might encounter is unstructured. However, extracting geographic features from unstructured data into defined fields for analysis poses a challenge. To illustrate, I will compare three common datasets encountered in developing GEOINT:

  • a vector data file of streets with building addresses,
  • an orthorectified [11] satellite image with a known resolution stored as a raster image, and
  • a text report discussing construction types of the individual buildings, including the building addresses.

The vector data file and satellite image are structured data. With the vector file, we can use a GIS to geocode a list of street addresses, plot these as points on the satellite image, and determine the extent of vegetation around each house. However, the report of construction types is unstructured geospatial data. We cannot immediately import it into a GIS and map the building construction types on the street map and satellite image.
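
To illustrate what "extracting geographic features into defined fields" might involve, here is a minimal sketch that pulls addresses and construction types out of free text with a regular expression. The report text, street names, and construction vocabulary are all hypothetical, and real reports are far less regular than this, so treat it as an illustration of the idea rather than a workable extraction method.

```python
import re

# Hypothetical unstructured report text
report = (
    "The house at 112 Oak Street is wood frame. "
    "Next door, 114 Oak Street is brick, while 20 Elm Avenue is concrete block."
)

# Assumed pattern: "<number> <Name> Street|Avenue is <construction type>"
PATTERN = re.compile(
    r"(\d+ [A-Z][a-z]+ (?:Street|Avenue)) is (wood frame|brick|concrete block)"
)


def extract_records(text):
    """Turn matching sentences into structured records with defined fields."""
    return [
        {"address": address, "construction": construction}
        for address, construction in PATTERN.findall(text)
    ]


records = extract_records(report)
for record in records:
    print(record["address"], "->", record["construction"])
```

Once the text is reduced to records like these, the address field could be geocoded against the street file in the same way as any other address list.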

GEOINT Data can be divided into two other major data content categories of physical and human geospatial data. Physical geospatial data is a record of the spatial characteristics of the various natural phenomena associated with the Earth's hydrosphere, biosphere, atmosphere, and lithosphere. Examples are:

  • Landforms that are a result of the natural processes
  • Vegetation
  • Climate and weather

Human geospatial data record the imprint of human activity on Earth. Examples include:

  • Cities, buildings, roads, bridges, power/utility lines, and airfields
  • Farms, orchards, irrigation channels, deforestation, dams, and mining
  • Birth and death rates, population clusters, media accessibility, political culture, medical facilities, education, tribal boundaries, and other cultural details.

The World-Wide Human Geography Data Working Group (WWHGD WG [12]) is a voluntary partnership focused on making appropriate human geography data available to promote human security. These data help us understand the behavior of people at different times and in different places, that is, why people do what they do and where they do it. The Working Group builds partnerships around geospatial datasets for human geography to support human security, humanitarian assistance, disaster relief, emergency preparedness, and response and recovery efforts globally. It has catalogued more than 1200 data sources and links. This global mapping community has more than 1500 members, with participating organizations representing the Department of Defense (DoD), civil agencies, academia, non-governmental organizations (NGOs), international organizations, and private corporations.

Commercial firms also amass human geography data and transform it into a structured form for easy access and use by Geographic Information Systems. Their products are made possible by the fact that the original data exist in digital form, and because the companies have developed systems that enable them to structure the data efficiently.

Try This!

Try out the demo of what Claritas used to call the "You Are Where You Live" tool. The Nielsen Company has acquired Claritas, and the tool is now called "MyBestSegments." Use the following link to access the My Best Segments - ZIP Code Look-up [13] page. Unfortunately, this tool only works for locations within the United States; if you don't live in the United States, consider entering a ZIP code for a town or city you are familiar with, or try Penn State's ZIP code: 16802.

If you do live in the US, enter the ZIP code for your home town, enter the security code provided on the page, and click the Submit button. A list of lifestyle segments will appear on the left. Click on some of the lifestyle segment names to see if they seem accurate for the community you selected.

Does the market segmentation match your expectations?

The key point is that human, physical, structured, and unstructured data are used together to create a picture of place. The use of human geography data and imagery in the response to Ebola is an example [14]. Analysts utilize this data to better understand where infrastructure is located, where the disease has the greatest risk of transmission, and what populations are most at risk.

L2.07: Baking the Cake

I know this will sound a little strange, but GEOINT Data is not really data in the strict definition of the term. GEOINT Data are extracted chunks of information from other intelligence and geographic products such as imagery and maps. Let's examine this statement by first viewing a few definitions, and then we'll explore this as an analogy of baking a cake. The definitions of data, information, and insights are:

  • Data: Facts and statistics collected together for reference or analysis
    (Source: Oxford Dictionaries: Data [15]).
  • Information: What is conveyed or represented by a particular arrangement or sequence of things
    (Source: Oxford Dictionaries: Information [16]).
  • Insight: The capacity to gain an accurate and deep intuitive understanding of a person or thing
    (Source: Oxford Dictionaries: Insight [17]).

Now for our cake baking analogy:

A chef has a request to bake a carrot cake. The carrots were previously picked from the ground, cleaned, packaged, and transported to the grocery store. The carrots are purchased by a chef at the store to be shredded, mixed with other ingredients, and baked into the cake. The carrots could also be sliced for a salad. It is the chef's knowledge, skill, and wisdom that determines how well the ingredients are made into a cake.

The analogy is that GEOINT data are previously collected (picked), cleared of inconsistencies (cleaned), structured (packaged), and distributed (transported). The analyst acquires the data from a warehouse (grocery store), subsets the data (shreds the carrots), and combines (mixes) it with other information to create insights (the cake). The same data might be used for other analytic purposes (making a salad). It is the analyst's (chef's) knowledge, skill, and wisdom that determines how well the information is converted into insights.

Satellite imagery is one of the key ingredients of GEOINT. A satellite image is processed data. A satellite image is made up of raw data (thousands of pixels) that are arranged in a particular way. This raw data is processed, organized, structured, and presented so as to make it useful as a satellite image. Importantly, this satellite imagery can then be used to create other data. For example, the height of a particular building extracted from the imagery is new data.

Mapmaking uses data in the creation of information and has been an integral part of human history for possibly up to 8,000 years. Maps and imagery are representations of the Earth. Before remote sensing technologies, if we wanted to know something about a location on the Earth, we would have had to visit the location. In those days, our knowledge was limited by our direct experiences. Information, such as maps and satellite imagery, allows us to expand our insights beyond the range of our experiences. Information can be shared and used by others at different times and places.

Primary data are raw information collected for a specific purpose. For example, the direct measurement of a building's height would be primary data. Secondary data are extracted from information developed by others. The advantage of primary data is the opportunity to tailor them to our need; we "know" the data. The disadvantage is that collecting primary data is costly and time consuming. The main advantages of secondary data are that they can be quicker and less expensive to obtain, and that information collected over a long period of time, such as imagery, is easier to examine for changes. However, the information may be outdated, inaccurate, or too vague. That is to say, we might not "know" the data well enough to fully articulate our insights. Figure 2.10 below illustrates this data-information-insights process.

Figure 2.10: Data-information-insights process
Source: Bacastow

People create data as a means to help understand how natural and human systems work. Such systems can be hard to analyze because they're made up of many interacting phenomena that are often difficult to observe directly, plus they tend to change over time. We attempt to make systems and phenomena easier to study by measuring their characteristics at certain times. We measure selectively because it's not practical to measure everything, everywhere, all the time. How accurately data reflect the phenomena they represent depends upon how, when, where, and what aspects of the phenomena were measured. Thus, all measurements contain a certain amount of error that the wisdom of the analyst must take into account.

L2.08: Video Lecture 2.2

Please view my second lecture video discussing the concepts of GEOINT data.

L2.09: The Craft

Data without the understanding brought by geospatial thinking and reasoning are really just meaningless symbols. This is because GEOINT data are separated from knowledge of the place, that is, from the physical and human aspects of the location the data represent. GEOINT data are fragmented and incomplete. We use the crafts of geospatial thinking and reasoning to help make them whole. Reginald Golledge makes this point in his Thinking Spatially [18] article in Directions Magazine:

Our knowledge about a geographic area is never perfect, but we still make effective decisions in that area because we use mental processes of perceptual closure (interpolation), or overlay (aggregation), or dissolve (disaggregation), and summarization. When we start to get overwhelmed with detail, we spatially classify (as by proximity) or cluster (as in "next to") so as to collapse lots of separate bits of information into meaningful "clumps" or "chunks." Sometimes, we make gross classifications ("all cats are gray in the night," or "all these trees are the same" when looking at a eucalyptus forest). We mentally cluster food stores, clothing stores, bars, beaches, and other phenomena into largely undefined generic classes and then give place-specific identifiers to single out particular members (e.g., "Albertson's is the supermarket that has fresh Maine lobsters;" "Google's is the beach with the bad undertow"). And we all realize that geographic data can be perceived at a variety of scales. We might use the same thought processes to reason about a colony of ants or bees as we do to think about people's activities in cities. We may use the same concepts when looking at our neighborhood as we would when studying San Francisco or Sydney (Australia). And, often, we use the vaguest of principles to guess about where things might be found (e.g., from trying to find a missing glove to searching for a bus stop in an unfamiliar area).

Thinking Spatially [18] - Directions Magazine, January 12, 2003

The human craft of bringing meaning to the models and data is called techne. Techne is a term derived from Greek that means "craftsmanship, craft, or art." Applied to GEOINT data, it might be termed GEOINT data techne. Much of it is implicit and ambiguous, and is acquired largely by experience. The geospatial analyst uses techne when handling the data and forming judgments about a place. GEOINT data techne includes:

  • The judicious application of knowledge. This is the ability to think and act using knowledge, experience, understanding, common sense, and insight. It implies possession of the knowledge needed to use data in a given situation. It involves an understanding of people, things, events, and situations, and the willingness as well as the ability to apply perception, judgment, and action. It often requires control of one's cognitive biases. In short, data wisdom is a disposition to make optimum judgments about the nature of things on the Earth to deliver the highest quality outcome.
  • The application of know-how. Apprentices, for example, work with their mentors and learn craftsmanship not through language but by observation, imitation, and practice. The key to acquiring tacit knowledge about using data is experience. Without some form of shared experience, it is extremely difficult for people to share each other's thinking processes. This has been described as "know-how" as opposed to "know-what" (facts), "know-why" (science), or "know-who" (networking).

L2.10: Bridging the Gap Between Data, Information, and Insights

Models, both mental and otherwise, play a critical role in bridging the gap between data, information, and insights. They are an idealized representation of how to use data to solve problems. Recalling the table from Lesson 1, let's further explore this relationship between data, models, and insights.

Table 2.2: Data Certainty versus Model Certainty

                          Model Certainty: Low    Model Certainty: High
  Data Certainty: High    Model Building          Puzzle Solving
  Data Certainty: Low     Mystery Solving         Data Foraging

Models represent our understanding of how the world works. We construct models for:

  • simplification and organization
  • communication with oneself or others
  • prediction
  • manipulation

Models also give us the ability to overcome incomplete data by mentally filling in gaps, making an intuitive leap with only the sparsest of data. This is a sophisticated form of geospatial reasoning. Expertise in geospatial reasoning increases with experience because as we learn or experience additional models, our mind expands to accommodate them.

You can "preload" your mental models with typical understandings of place; these include conceptual models of how a place is organized and works. For example, why individuals use a store at a particular location is related to the number of people in the surrounding community. Theories of spatial organization can be a shortcut in our attempts to model an unfamiliar pattern. A few important geographic models are:

  • Gravity Model [19]: The model states that the potential use of a service at a particular location is directly related to the number of people in a location and inversely related to the distance people must travel to reach the service. This explains why stores are clustered in malls.
  • Christaller's Central Place Theory [20]: Christaller's theory explains the distribution of services, based on the fact that settlements serve as market area centers for services. Larger settlements, e.g., large cities, are fewer and farther apart than smaller settlements, e.g., small towns, and provide services for a larger number of people who are willing to travel farther.
  • Weber Model of Industrial Location [21]: Weber developed a least-cost theory of industrial location, which explains the pattern of industry at a macro scale. It is based on the notion that firms seek the site with minimum transport and labor costs. It considers the cost of transporting goods, the cost of labor, and capital.
  • Von Thunen's Agricultural Model [22]: The Von Thunen model arranges agricultural activities in rings around a central market. The best location for each activity depends on how long the good will last before becoming unusable, the weight of the product, and the transportation network. For example, farmers nearest to a city produce milk, since it is perishable and commands a higher price.
  • Core Periphery Model [23]: A model that describes how economic, political, and/or cultural power is spatially distributed between dominant core regions, and more marginal or dependent semi-peripheral and peripheral regions.
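
The verbal statement of the gravity model can be made concrete with a small calculation. The following is a minimal sketch, not the formulation given in the linked reference [19]. The function name potential_use, the distance exponent beta, and the population and distance figures are all illustrative assumptions.

```python
def potential_use(population: float, distance_km: float, beta: float = 2.0) -> float:
    """Relative potential use of a service by one community.

    Directly proportional to population and inversely proportional to
    distance raised to `beta` (an assumed friction-of-distance exponent).
    """
    if distance_km <= 0:
        raise ValueError("distance_km must be positive")
    return population / distance_km ** beta


# Hypothetical communities: (name, population, distance to the store in km)
communities = [
    ("Town A", 20_000, 5.0),
    ("Town B", 20_000, 10.0),
    ("Town C", 80_000, 10.0),
]

scores = {name: potential_use(pop, dist) for name, pop, dist in communities}
for name, score in scores.items():
    print(f"{name}: {score:.0f}")
```

With the assumed exponent of 2, doubling the distance cuts the potential use to one quarter, so Town C needs four times the population of Town A to match it from twice as far away. A different exponent would change that trade-off, which is why the value should be treated as something to calibrate rather than a given.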

Using mental models is a normal, everyday human activity essential for geographic problem solving. To make the point, you will predict part of a map by literally "using the data you see to predict the data beyond what is given."

Try This!

Predict the missing part of a map by using the data you see and models you assume to predict the data beyond what is given. Using a pen and paper, sketch the missing half of the map using the symbols for roads and mountains. Remember that you will have to go beyond the information that is given. As you complete this exercise, think about the following:

  • What assumptions (models) are you applying about the area that is depicted in the map?
  • What strategies are you using for completing the map?
  • How sure are you of the answers that you have produced?
Figure 2.11: Complete the bottom half of the map
Source: Bacastow

You can compare your results with this solution map [24], which is hosted on Coursera.

A word of caution is in order. While models provide a useful way of understanding the world around us, blind adherence to a model can be disastrous. When we close our mind to disconfirming evidence and the possibility of alternative outcomes, we fail to see the weaknesses of our model and we will fail. History is replete with examples of people adhering stubbornly to an outdated paradigm (model) despite overwhelming evidence that a new way of thinking (a new model) is necessary.

Models, Frames, and Insights

Models and theories provide a basis for a structure that is called a frame. A frame helps us understand the world and may be influenced by a model or theory. For example, an individual's day-to-day interactions with family and associates are termed their "Pattern of Life." The family goes to work at a particular time using a particular route. The Central Place Theory may help to form a narrative that explains the Pattern of Life. Here, the theory suggests that our family may travel to larger settlements for services not available in smaller settlements. Models and theories may also direct our attention toward the information we seek.

In summary, we frame things to make sense of what we see in the real world and to fill in missing data. The frame may be influenced by an existing model or theory. A frame can take the following forms:

  • A narrative explaining the chronology of events and the causal relationships between them.
  • A map showing distances, directions, and connections of things such as landmarks, routes, and destinations.

Insights result from a narrative that accounts for what the data reveals within the frame; this human process of explanation is termed sensemaking. There are three primary outcomes for GEOINT sensemaking:

  1. To identify patterns
  2. To describe patterns
  3. To predict future patterns

As you can see, GEOINT data are a key part of a discourse to make sense of indefinite and ambiguous situations. Fitting the GEOINT data to the frame involves cognitive work to understand the relationships among the data and the sequence of events.

L2.11: Assignment

Questions About Geographic Entities

GEOINT addresses questions that technology alone is not particularly good at answering. These are explanatory questions that help you make sense of what you see. Examples are:

  • Why entities are located where they are.
  • Why they have the attributes they do.
  • Why they have changed as they have.

In addition, GEOINT is often concerned with predictive questions, such as "What will happen at this location if this happens at that location?" In general, GEOINT data and software packages cannot be expected to provide clear-cut answers to explanatory and predictive questions right out of the box. Typically, analysts must turn to specialized statistical packages and simulation routines. Information produced by these analytical tools may then be re-introduced into the GEOINT database. It is always important to keep in mind that decision support tools are not substitutes for human experience and judgment.

Most of us are interested in data only to the extent that they can be used to help understand the world around us and to make better decisions. Analytic processes vary a lot from one organization to another. In general, however, the first steps in making an analysis are to articulate the questions that need to be answered and to gather and organize the data needed to answer the questions. The following are examples of the kinds of questions GEOINT asks about how space is organized.

Questions concerning individual geographic entities. Such questions include:

  • Questions about space
    • Where is the entity located?
    • What is its extent?
  • Questions about attributes
    • What are the attributes of the entity located there?
    • Do its attributes match one or more criteria?
  • Questions about time
    • When were the entity's location, extent, or attributes measured?
    • Has the entity's location, extent, or attributes changed over time?

Questions concerning multiple geographic entities. Such questions include:

  • Questions about spatial relationships
    • Do the entities contain one another?
    • Do they overlap?
    • Are they connected?
    • Are they situated within a certain distance of one another?
    • What is the best route from one entity to the others?
    • Where are entities with similar attributes located?
  • Questions about attribute relationships
    • Do the entities share attributes that match one or more criteria?
    • Are the attributes of one entity influenced by changes in another entity?
  • Questions about temporal relationships
    • Have the entities' locations, extents, or attributes changed over time?
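
Several of the spatial relationship questions above (containment, overlap, and distance) reduce to simple geometric tests. The following is a minimal sketch under a strong simplifying assumption: each entity's extent is stood in for by an axis-aligned bounding box in arbitrary map units. The Extent class, the function names, and the example entities are hypothetical and do not come from any particular GIS package.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Extent:
    """Axis-aligned bounding box standing in for an entity's extent."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def contains(a: Extent, b: Extent) -> bool:
    """True if b lies entirely inside a."""
    return (a.xmin <= b.xmin and a.ymin <= b.ymin
            and a.xmax >= b.xmax and a.ymax >= b.ymax)


def overlaps(a: Extent, b: Extent) -> bool:
    """True if the two extents share any interior area."""
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def gap(a: Extent, b: Extent) -> float:
    """Shortest distance between two extents (0 if they touch or overlap)."""
    dx = max(b.xmin - a.xmax, a.xmin - b.xmax, 0.0)
    dy = max(b.ymin - a.ymax, a.ymin - b.ymax, 0.0)
    return hypot(dx, dy)


def within_distance(a: Extent, b: Extent, limit: float) -> bool:
    """True if the extents are no farther apart than `limit`."""
    return gap(a, b) <= limit


# Hypothetical entities
compound = Extent(0, 0, 10, 10)
building = Extent(2, 2, 4, 4)
road_buffer = Extent(8, 8, 14, 12)
depot = Extent(13, 14, 15, 16)

print(contains(compound, building))           # building inside compound?
print(overlaps(compound, road_buffer))        # shared area?
print(within_distance(compound, depot, 5.0))  # within 5 map units?
```

Real features are rarely rectangles, so tests like these are best read as a first filter. They tell the analyst which pairs of entities deserve a closer look, not why the relationship exists.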

The ability to identify and make sense of the patterns we see is a critical skill in GEOINT. Maps are a fundamental source of information about spatial patterns. I will teach you a beginning approach to pattern sensemaking. After a while, the approach becomes natural. I caution you not to treat this as a recipe to be followed in a rote-learning fashion. It is a guide to start you on the sensemaking process. There are two general steps in the process:

Step 1: Referring to the above questions, describe what you see to answer "what is the pattern and where is the pattern?" Do this by describing the major parts or components of the pattern:

  • Are there contiguous clusters of roughly similar values for places in an extended geographic area?
  • What are the trends? That is, do the patterns extend in a general direction?
  • Are there discontinuities? That is, are there abrupt and significant spatial changes in map values for places separated by relatively short distances?
  • Are there exceptions? That is, are there isolated places that stand out from either the surrounding regional grouping or the spatial trend?
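
Two of these components, discontinuities and exceptions, can be tried on a toy example. The sketch below is an illustration only, assuming a one-dimensional transect of mapped values read in order along a line of places. The threshold jump, the function names, and the values themselves are hypothetical, and a real map would need a two-dimensional treatment.

```python
def discontinuities(values, jump):
    """Indices i where the change between place i and place i+1 exceeds `jump`."""
    return [i for i in range(len(values) - 1)
            if abs(values[i + 1] - values[i]) > jump]


def exceptions(values, jump):
    """Interior places that differ sharply from both neighbours
    while the neighbours themselves are similar to each other."""
    found = []
    for i in range(1, len(values) - 1):
        left, mid, right = values[i - 1], values[i], values[i + 1]
        if (abs(mid - left) > jump and abs(mid - right) > jump
                and abs(left - right) <= jump):
            found.append(i)
    return found


# Hypothetical map values along a transect
transect = [10, 11, 12, 30, 13, 14, 40, 41, 42]
jump = 10  # assumed threshold for an "abrupt" change

isolated = exceptions(transect, jump)
breaks = discontinuities(transect, jump)

# An isolated exception also produces a jump on each side of it, so drop
# those jumps to keep only the breaks between neighbouring clusters.
boundaries = [i for i in breaks if i not in isolated and i + 1 not in isolated]

print("exceptions at places:", isolated)
print("cluster boundaries after places:", boundaries)
```

Here the value 30 stands out from a run of low values, while the step from 14 to 40 separates a low cluster from a high one. The code only locates these components. Explaining why they occur where they do remains the analyst's task in Step 2.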

Step 2: Explain the pattern to answer how and why it occurred. Explain the pattern by:

  • Listing all of the factors that can affect the value of the phenomenon being mapped at a particular location.
  • Including in the list the effect each factor has. Is the factor likely to increase or decrease the value of the phenomenon?

Apply the factors in order to explain the components of the pattern that you described in the first step. Remember that:

  • You are trying to explain why the groupings, spatial trends, discontinuities, and exceptions exist where they do.
  • You must explain why the exceptions exist.

Before we try to make sense of a pattern, let's practice.

Practice Question A:
Examine the map of earthquakes in Figure 2.12. (You can click on the map to open the ArcGIS Online earthquake webmap.) The earthquakes are obviously geological. But, why does this pattern occur? You can find an answer here [25].

ArcGIS Online map of earthquakes around the world. [26]
Figure 2.12: Earthquakes.
Source: ArcGIS Online webmap [26]

Practice Question B:
What causes the global pattern of malaria cases in Figure 2.13? (You can click on the map to open the ArcGIS Online malaria webmap.) You can find an answer to Question B here [27], which is hosted on Coursera.

ArcGIS Online map of the distribution of malaria around the world. [28]
Figure 2.13: Malaria cases.
Source: ArcGIS Online webmap [28]

L2.12: Discussion

In this week's discussion, I'm focusing on the craft of making sense of patterns. This involves techne, which is acquired largely by experience. It is the knowledge that allows the use of data in a given situation, together with an understanding of people, things, events, and situations and the ability to apply perception and judgment.

You might not have had the knowledge, experience, or understanding to correctly answer the practice questions in L2.11. This was intentional, to make the point that knowledge and experience are necessary. Let's now examine a pattern about which you have some background knowledge. Figure 2.14 is a map of the students in this MOOC. You can click on the map to open the ArcGIS Online student location webmap. Post a short paragraph to the Lesson 2 Discussion Forum explaining:

  1. The pattern of students participating in the course. Be sure to answer how and why the pattern occurred.
  2. How your knowledge of the data collection process and your understanding of other students in the MOOC allowed you to make a more informed judgment as to the causes of the pattern. You might consider reviewing the Lesson 1 Discussion Forum to get a sense of whether individuals were deceptive in where they placed their points.
ArcGIS Online map showing student location pins. [29]
Figure 2.14: ArcGIS Online webmap showing student location pins.
Source: ArcGIS Online webmap [29]

Credits
The ArcGIS Online capabilities used here were developed by Joseph Kerski [30], Esri Education Manager.

L2.13: Summary

This lesson focused on how the analyst's craft completes the picture provided by GEOINT data. GEOINT data are an incomplete record of the human and physical qualities at a location that uniquely created a place. These data are chunks of information extracted from other intelligence and geographic products. Using GEOINT data requires knowledge of the data and of how geospatial features might be organized on the landscape. Using the data within the analyst's craft allows us to frame patterns of human activities to develop insights and make predictions.

Reminder

Don't forget to complete the Lesson 2 Quiz!


Source URL: https://www.e-education.psu.edu/geointmooc/node/1977

Links
[1] https://www.e-education.psu.edu/geog482fall2/node/1405
[2] https://www.e-education.psu.edu/natureofgeoinfo/sites/www.e-education.psu.edu.natureofgeoinfo/files/flash/coord_practice_geo_v22.swf
[3] http://www.adobe.com/shockwave/download/index.cgi?P1_Prod_Version=ShockwaveFlash
[4] https://www.e-education.psu.edu/natureofgeoinfo/
[5] http://creativecommons.org/licenses/by-nc-sa/3.0/us/
[6] https:=
[7] https://www.e-education.psu.edu/geog482fall2/
[8] https://www.fas.org/irp/agency/nga/doctrine.pdf
[9] http://support.esri.com/en/knowledgebase/GISDictionary/term/coordinate%20system
[10] http://support.esri.com/en/knowledgebase/GISDictionary/term/topology
[11] http://support.esri.com/en/knowledgebase/GISDictionary/term/orthorectification
[12] http://www.geoplatform.gov/wwhgd-home
[13] http://www.claritas.com/MyBestSegments/Default.jsp?ID=20
[14] http://investor.digitalglobe.com/phoenix.zhtml?c=70788&p=irol-newsArticle_print&ID=2004765
[15] http://www.oxforddictionaries.com/us/definition/american_english/data
[16] http://www.oxforddictionaries.com/us/definition/american_english/information
[17] http://www.oxforddictionaries.com/us/definition/american_english/insight
[18] http://www.directionsmag.com/articles/thinking-spatially/123985
[19] https://people.hofstra.edu/geotrans/eng/methods/ch5m1en.html
[20] http://www.csiss.org/classics/content/67
[21] http://www.csiss.org/classics/content/51
[22] http://www.csiss.org/classics/content/9
[23] http://siteresources.worldbank.org/DEC/Resources/84797-1251813753820/6415739-1251813951236/krugman.pdf
[24] https://class.coursera.org/geoint-001/wiki/L2_Try_This_Solution
[25] https://class.coursera.org/geoint-001/wiki/L2_Practice_Answer_A
[26] http://education.maps.arcgis.com/home/webmap/viewer.html?webmap=167402e476f84e2e95b52e2b19cd10f7
[27] https://class.coursera.org/geoint-001/wiki/L2_Practice_Answer_B
[28] http://education.maps.arcgis.com/home/webmap/viewer.html?webmap=9c2e94d9f4c14a23844c9e5b4ee5c8a1
[29] http://education.maps.arcgis.com/apps/Viewer/index.html?appid=d6a1a0b213b04172878ef7db395dfdc4
[30] http://www.josephkerski.com/