
Robert Allison PhD Dissertation - Chapter 3 - RESEARCH OBJECTIVES


The objectives of this research are divided into three major categories which are described in the following three sections: Section 3.1 Creation of a Unique Data Warehouse of Econometric and Demographic Information Relating to Textiles and Apparel, Section 3.2 Creation of the Textile and Apparel Business Information System (TABIS) to provide user-friendly access to the data, and Section 3.3 Utilization of TABIS's Unique Data Analysis, Graphical Presentation, and Networking Capabilities.

The remainder of this section describes the objectives in greater detail; Section 4, Section 5, and Section 6 then describe, respectively, how each of the three objectives was met. Although all aspects of the TABIS project are described for completeness, the major emphasis is placed on the new, unique, and innovative aspects.

3.1 Creation of a Unique Data Warehouse of Econometric and Demographic Information Relating to Textiles and Apparel

The first research objective entails the creation of a unique data warehouse of textile- and apparel-related econometric and demographic data. The number and variety of data sources in the proposed system, and the number of years covered (see Section 2.4.2), should far exceed those of any existing system, thereby making it unique in content. With the data converted and stored in an integrated structure, and with meta data to make programmatic access possible, the proposed system should also be unique in the capabilities it can provide.

Traditionally, most database activities involved transaction processing of "live" (operational) data into and out of a single database. Data warehouses, on the other hand, primarily contain historical snapshots of data, and the data from multiple sources can be easily integrated. Data warehouses are particularly useful for comparison reporting, trend analysis, forecasting, and ad-hoc queries and reports. [Comm, p.35]

Although there is no standard definition of a data warehouse, the following general data warehouse characteristics are essential to this research: 1) A data warehouse draws its data from a disparate collection of unintegrated legacy applications spanning a long time horizon; 2) the data must be transformed into an integrated structure and format; and 3) "meta data" (or, data about the data) must be encoded into the system, describing how the data is stored and integrated, thereby allowing queries to access the data programmatically and automatically. [Inmon] [Comm]

Most of the existing disparate systems for analyzing textile- and apparel-related econometric and demographic data contain data from only a single source and for only a single year, so integrating data from multiple sources is extremely difficult and, in some cases, practically impossible. Even integrating data across several years within the same data set is very difficult in the existing systems, because most contain data only for the current year -- to perform time series analyses, a copy of the data set must be obtained for each desired year, and the copies must then somehow be integrated across all the years. Also, the variable names and definitions change from year to year, which makes this integration harder still.
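The year-to-year integration problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in TABIS: the variable names (EMP, EMPLOY, ST), the per-year rename tables, and all sample values are hypothetical and are not taken from any actual data source.

```python
# Hypothetical per-year rename tables: each maps that year's variable
# names onto one common set of names (illustrative only).
RENAMES = {
    1990: {"EMP": "EMPLOYMENT", "ST": "STATE"},
    1991: {"EMPLOY": "EMPLOYMENT", "STATE": "STATE"},
}


def harmonize(year, rows):
    """Return rows with common variable names and an added YEAR variable."""
    mapping = RENAMES[year]
    harmonized = []
    for row in rows:
        new_row = {mapping[name]: value for name, value in row.items()}
        new_row["YEAR"] = year
        harmonized.append(new_row)
    return harmonized


def stack_years(yearly):
    """Combine {year: rows} into a single list ordered by year."""
    combined = []
    for year in sorted(yearly):
        combined.extend(harmonize(year, yearly[year]))
    return combined


# Made-up single-year extracts, as they might arrive from one source.
yearly_extracts = {
    1991: [{"EMPLOY": 118, "STATE": "NC"}],
    1990: [{"EMP": 120, "ST": "NC"}],
}
series = stack_years(yearly_extracts)
```

Once the rename tables exist, adding another year only requires one more table entry; the stacking logic itself does not change.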

Meta data can exist in many forms, probably the simplest being mnemonic variable names and labels (for example, YEAR, AGE, RACE, and SEX) in the data sets themselves. Other meta data, such as how the variables in one data set relate to those in another data set, are often more difficult to encode. For example, CONSUMER=MEN in data set A might relate to SEX=MALE, AGE=21_and_over, and RACE=ALL in data set B. The meta data in the proposed system should be stored programmatically, and should be transparent to the users -- the users should only have to make simple selections from the menu-based interface (as described in the second objective, in Section 3.2).
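One possible way to encode the cross-data-set relationship from the example above is a simple lookup table, sketched here in Python. The table layout, the function names, and the sample rows (including the POP variable) are assumptions made for illustration; the source does not specify how TABIS stores its meta data.

```python
# Hypothetical meta data: a category in data set A expressed as the
# equivalent variable selections in data set B.
A_TO_B = {
    ("CONSUMER", "MEN"): {"SEX": "MALE", "AGE": "21_and_over", "RACE": "ALL"},
}


def translate(variable, value):
    """Look up the data set B selection for a data set A category."""
    return A_TO_B[(variable, value)]


def select(rows, criteria):
    """Return the rows that match every variable=value criterion."""
    return [
        row for row in rows
        if all(row.get(name) == value for name, value in criteria.items())
    ]


# Made-up rows standing in for data set B.
dataset_b = [
    {"SEX": "MALE", "AGE": "21_and_over", "RACE": "ALL", "POP": 100},
    {"SEX": "FEMALE", "AGE": "21_and_over", "RACE": "ALL", "POP": 105},
]
matches = select(dataset_b, translate("CONSUMER", "MEN"))
```

With the mapping held in the table rather than in the query, a menu selection such as "Men" can be turned into the corresponding data set B query without the user seeing the underlying variable names.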

3.2 Creation of the Textile and Apparel Business Information System (TABIS)

A menu-based Textile and Apparel Business Information System (TABIS) should be designed to provide convenient, user-friendly, and highly flexible programmatic access to all of the information stored in the data warehouse.

The system should allow novice users with no programming knowledge to easily select data from any source, integrate data from multiple sources or years where applicable, create publication-quality plots and maps, and easily perform useful "canned" analyses. To accomplish this, meta data will need to be encoded into the interface so it will have the "intelligence" to perform these functions automatically. Although most of the existing systems do have a menu-based interface, most do not provide all of the desired basic functionality.

In addition to the basic functionality, the proposed system should provide advanced users with the flexibility and extensibility described below. These features are not generally found in any of the existing systems.

The proposed system should provide advanced users with the flexibility to add their own "extensions" (i.e., the proposed system should be "extensible" or "open"). For example, users should be able to customize the queries, and make modifications to the graphical output and analyses, or even write entirely new programs to run against the data warehouse. The system should also allow advanced users to integrate their own data with the TABIS data, while protecting the TABIS data from accidental (or malicious) modification.

In addition to the high degree of functionality, the TABIS interface should also be versatile enough to allow access from several different computing environments (such as Unix workstations from various vendors, PCs, Macintoshes, dial-up connections, remote logins, etc.), allowing it to be accessed by as many users as possible. Portable programming methodology and networking technology should be utilized to provide this versatility, which is missing in most of the existing systems because they run only on a single, standalone hardware platform.

To accomplish these objectives, the interface should be custom-written using a programming language which provides total flexibility, instead of relying on the built-in interface capabilities of an existing database or information delivery system.

3.3 Utilization of TABIS's Unique Data Analysis and Graphical Presentation Capabilities

The final objective will be to utilize TABIS's unique capabilities, both to obtain useful results, and to demonstrate how TABIS can be used.

Perhaps the most important and unique aspect of TABIS is its large collection of data from disparate sources, all contained in a unified system that allows users to easily integrate data from multiple sources. This unique characteristic should be utilized to provide information not available in the existing single-source systems. For example, data should be integrated to perform transformations such as constant-dollar, per-capita, and per-square-mile, thereby improving the quality and forecastability of the data by reducing the effects of inflation, population increases, and differing population densities. Such transformations can be used to gain new insights which were not available using the existing single-source systems.
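The transformations named above each combine a series from one source with a series from another. The following Python sketch uses the conventional definitions (deflating by a price index, dividing by population, dividing by land area); these formulas are assumed here rather than taken from the TABIS implementation, and the sales, price index, and population figures are made up for illustration.

```python
def constant_dollars(nominal, price_index, base_index):
    """Restate a nominal dollar amount in base-period dollars."""
    return nominal * base_index / price_index


def per_capita(value, population):
    """Divide a total by the population it applies to."""
    return value / population


def per_square_mile(value, area_sq_miles):
    """Divide a total by the land area it applies to."""
    return value / area_sq_miles


# Made-up series, each standing in for a different data source.
sales = {1990: 200.0, 1991: 220.0}        # nominal dollars
price_index = {1990: 100.0, 1991: 110.0}  # 1990 is the base year
population = {1990: 40.0, 1991: 50.0}

# Integrate the three sources year by year.
real_sales = {
    year: constant_dollars(sales[year], price_index[year], price_index[1990])
    for year in sales
}
real_sales_per_capita = {
    year: per_capita(real_sales[year], population[year]) for year in real_sales
}
```

In this made-up example the nominal series rises from 1990 to 1991, the constant-dollar series is flat, and the constant-dollar per-capita series falls, which is the kind of difference that a single-source system cannot reveal.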

In addition to using mathematical transformations, TABIS's highly flexible graphical capabilities should be utilized to graphically integrate the data from previously disparate sources. For example, plots from multiple sources could be overlaid, allowing the users to identify trends and differences in data collected over time. Also, maps could be generated using both the raw and integrated data, and then compared to determine the differences.

Most existing systems cannot produce detailed, customizable, publication-quality, high-resolution graphics, but this capability should be developed and utilized in TABIS to summarize voluminous data such as Census population projections and Textile and Apparel employment by county.

Much of the TABIS data will be in the form of time series, so it will be important to provide analytical tools that allow the users to study the dynamic properties of the data over time. Graphical animations, as well as other techniques, should be utilized. In particular, animations that show the projected shifts in the U.S. population structure, such as the aging "baby boomers," should be created.

With the ability to easily integrate time series data from several sources, and to programmatically analyze that data, TABIS will be a valuable tool for forecasting. Sample code should be written to demonstrate this capability, but research into using TABIS for forecasting is beyond the scope of this dissertation.

Unlike the standalone PC database systems, TABIS will run on computers directly connected to the Internet. This will provide an opportunity to evaluate sharing a data warehouse with graphical capabilities on a network similar to the evolving national information infrastructure, a research effort encouraged by the Demand Activated Manufacturing Architecture (DAMA) Center. The data warehouse should be accessed using a variety of methods, to evaluate whether the speed and functionality will be acceptable. In addition to having researchers run TABIS interactively through the TABIS interface, World Wide Web (WWW) viewers should be evaluated as a possible means of providing remote access to the data.

Finally, TABIS should be utilized to satisfy real-world information needs, both in academia and in industry. These real-world trials will serve several purposes: 1) as a "proof of concept" test to show that TABIS can produce results, 2) to test TABIS and find areas where it needs enhanced functionality, and 3) to provide examples to help guide others in using TABIS to solve similar problems.
