www.BusinessGeomatics.com

DATA VISUALIZATION AND ANALYSIS OF COMMERCIAL ACTIVITIES

By Weislaw Michalak, CSCA

The development of clusters of commercial enterprises is a major part of the process of economic evolution in a knowledge-based economy. Given the fundamental significance of this sector to local economies, it is remarkable that there has been so little research into the changing location of clusters of commercial activity. The goal of this presentation, therefore, is to capture spatial and temporal indicators of supply-side change by constructing a visual representation of multiple georeferenced variables at several time intervals. These interactive visual representations will provide invaluable decision support information to all types of business.

Visualization provides additional insight into results that would otherwise be displayed as text or numbers. It is a universal form of communication, able to abstract the real world into a graphical representation that is comprehensible to a wide range of people.

Data or information visualization is crucial for the success of the so-called 'information revolution'. Visualization in computer science terms involves both the conversion of 'abstract' data into 'concrete' visual representations and the creation of user interfaces to support tasks such as:

- searching, data mining (DM)

- exploratory data analysis (EDA)

- analysis and modeling of data

- representation and display of data

There has been a long history of visualization in the social sciences (e.g. John Snow's maps of the 1854 cholera outbreak in Soho and Charles Booth's maps of poverty in London, 1889). Figure 1 demonstrates a traditional mapping technique used to visualize complex data. The choropleth map summarizes a number of attributes on a static, flat surface to represent change. However, as the number of units of measurement increases (in this case over 3,500 FSA units), the representation becomes increasingly difficult to understand. More importantly, a significant amount of the variance within the data is lost due to generalization.

Figure 1. Greater Toronto Area retail sales change, 1989-1997: choropleth method.
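The loss of variance through generalization can be made concrete with a small sketch. The example below assumes an equal-interval classification (one common way of classing a choropleth map; the method behind Figure 1 is not stated) and measures how much of the total variance survives once each area is represented only by its class. The function names and the eight percentage-change values are hypothetical and serve only as an illustration.

```python
from statistics import mean

def equal_interval_classes(values, k):
    """Assign each value to one of k equal-width classes (0 .. k-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in class k, so clamp it into the top class.
    return [min(int((v - lo) / width), k - 1) for v in values]

def variance_retained(values, classes):
    """Share of total variance left when each value is replaced by its class mean."""
    grand = mean(values)
    total = sum((v - grand) ** 2 for v in values)
    if total == 0:
        return 1.0
    groups = {}
    for v, c in zip(values, classes):
        groups.setdefault(c, []).append(v)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    return between / total

# Hypothetical percentage changes in retail sales for eight areas (illustrative only).
change = [-12.0, -3.5, 0.8, 2.1, 4.4, 9.7, 15.2, 48.0]
classes = equal_interval_classes(change, 4)
retained = variance_retained(change, classes)
print(classes, round(retained, 3))
```

A retained share below 1 means that areas with different values are drawn in the same shade. With equal intervals, a single extreme value can also push most areas into the lowest classes, as happens here.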

Changes in visualization technology in the last few decades are profoundly affecting the way in which the social sciences are researched, and in which studies are communicated. These changes have been largely initiated by the rapid development of computer technology since the 1980s, resulting in the availability of powerful and affordable computing. With respect to the social sciences, four distinct visualization technologies have evolved:

- advanced computer graphics

- multimedia

- World Wide Web

- Virtual Reality

The principal area of development of visualization tools and technologies within the spatial sciences has been the domain of GIS, specifically the integration of GIS with different software packages and environments. Traditional uses of GIS in this field have been to visualize the spatial aspect of the data, particularly with respect to error visualization (e.g. Cockings et al., 1997) and spatial associations (e.g. Anselin et al., 1996). When integrated with advanced visualization tools, however, GIS can become very effective in the analysis and presentation of complex data in a wide range of disciplines such as planning and resource management (e.g. Conners, 1996; Bishop & Karadaglis, 1997; Davis & Keller, 1997).

Increasingly, these integration strategies have been 'tight', using software packages written in the C programming language to build directly within the GIS. For instance, SimLand (Wu, 1998) is a prototype model that simulates land use conversion based on cellular automata (CA) and multi-criteria evaluation [...]

[...] been with respect to VR techniques on the Web. A number of tools have been developed to facilitate data fusion and visualization in a VR environment. Iris Explorer (from Numerical Algorithms Group) and MineSet (from Silicon Graphics International) are used at the CSCA. Figures 2 and 3 show screen snapshots of a dynamic 3D visualization of sales data for FSAs in the GTA.

Figure 2. Greater Toronto Area retail sales in 1989 by FSA: random colour method.

Figure 3. Greater Toronto Area retail sales in 1995 by FSA: equal interval method.
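As a rough illustration of how sales by area can be turned into a 3D scene for the Web, the sketch below writes a minimal VRML 2.0 file with one box per area, its height proportional to sales. It is not the output format of Iris Explorer or MineSet, and it is not how Figures 2 and 3 were produced. The function name `vrml_bars`, the layout along a single axis, the colour, and the sales values are all assumptions made for the example.

```python
def vrml_bars(sales, max_height=10.0, spacing=2.0):
    """Build a VRML 2.0 scene with one box per area; box height is proportional to sales."""
    top = max(sales.values())
    lines = ["#VRML V2.0 utf8"]
    for i, (code, value) in enumerate(sorted(sales.items())):
        height = max_height * value / top
        # A VRML Box is centred on its own origin, so it is lifted by half its
        # height to make every bar stand on the ground plane (y = 0).
        lines.append(
            f"Transform {{ translation {i * spacing:.2f} {height / 2:.2f} 0 "
            f"children [ Shape {{ "
            f"appearance Appearance {{ material Material {{ diffuseColor 0.2 0.4 0.8 }} }} "
            f"geometry Box {{ size 1 {height:.2f} 1 }} }} ] }}  # {code}"
        )
    return "\n".join(lines)

# Placeholder area codes and sales values, for illustration only.
sales = {"M5V": 120.0, "M4W": 60.0, "L4B": 30.0}
scene = vrml_bars(sales)
print(scene)
```

In a real application the bars would be positioned at the map coordinates of each FSA rather than along a line. That step is left out here to keep the sketch short.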

Parallel to these visualization developments in GIS has been a radical transformation within cartography (Grelot, 1994; Kraak et al., 1995; Krygier, 1995). Consequently, a significant amount of 'cutting edge' GIS visualization research and development on the Web is actually computerised cartography, recently re-labelled as scientific visualization. Scientific visualization is a growing area of computing whose underlying philosophy is that displaying visual representations of data helps researchers generate ideas and hypotheses about the data (Fisher et al., 1993). Accordingly, Dykes (1996) suggests that cartographic visualization systems may represent the principal technology for the scientific visualization of digital spatial information. He argues that many statistical and GIS software programmes do not regard the map as a real-time tool for analyzing data, or as an interface to access the underlying information. Cartographic visualization systems, however, can provide intelligent assistance to GIS users by allowing data mining and/or exploratory data analysis.

The more dramatic changes in cartographic visualization have come not from automating earlier mechanical and manual techniques but from developments in computer graphics. For instance, cartograms (Dorling, 1995) are increasingly being recognised as a major solution to many spatial visualization problems of human societies. The gross misrepresentation of many groups of people on conventional topographic maps has long been seen as a major problem of thematic cartography, highlighting difficulties such as the modifiable areal unit problem. Cartograms are now being used in the visualization of high-resolution spatial social structures and in the mapping of long-run historic changes in society.

Figure 4. Greater Toronto Area retail sales in 1996 by FSA: centroid method.
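The core idea of a cartogram, sizing each area by a variable rather than by land area, can be sketched in a few lines. The example assumes a circular cartogram in which each area is drawn as a circle whose area is proportional to its value. It covers only the sizing step; placing the circles so that they do not overlap is a separate problem and is not attempted. The function name and the input values are hypothetical.

```python
import math

def circle_radii(values, max_radius=1.0):
    """Radii chosen so that circle AREA (not radius) is proportional to each value."""
    top = max(values.values())
    return {key: max_radius * math.sqrt(v / top) for key, v in values.items()}

# Placeholder values, for illustration only.
radii = circle_radii({"a": 100.0, "b": 25.0})
print(radii)
```

Scaling the radius by the square root matters. If the radius itself were proportional to the value, an area with four times the value would appear sixteen times larger.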


In terms of complex visualization techniques, however, one of the leading uses in business statistics has been in exploratory data analysis (EDA). EDA is an inductive approach to statistical analysis, and can be extremely useful for investigating complex relationships within datasets. This is becoming more essential as typical business datasets become more complex. Converting these data into useful, meaningful information can be extremely difficult and haphazard (Ondrechen, 1997). Visualization is being used increasingly as a method to overcome these difficulties, with recent software developments providing new tools for visualising multivariate data (Colet & Aaronson, 1995). For instance, Levin & Mitra (1994) describe a curve-fitting visualization programme designed to generate initial parameter estimates for non-linear equations, illustrating the process by modelling mortality data. Non-linear equations are notoriously difficult to solve, since a given equation can have an infinite number of often quite dramatically different solutions, all meeting the same specified goodness-of-fit criteria. Advanced visualization techniques can remove some of the inevitable trial and error involved in solving such equations.
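The role of initial parameter estimates can be illustrated with a small sketch. It does not reproduce the programme described by Levin & Mitra. It swaps in a plain coarse grid search, and it assumes a Gompertz-type mortality curve, rate = a * exp(b * age), purely as an example of a non-linear equation. The data are synthetic, generated from known parameters so that the result can be checked. All names are hypothetical.

```python
import math

def gompertz(age, a, b):
    """Gompertz-type curve: the mortality rate rises exponentially with age."""
    return a * math.exp(b * age)

def sse(params, ages, rates):
    """Sum of squared errors between the curve and the observed rates."""
    a, b = params
    return sum((gompertz(x, a, b) - y) ** 2 for x, y in zip(ages, rates))

def grid_start(ages, rates, a_grid, b_grid):
    """Return the (a, b) pair on a coarse grid with the smallest squared error."""
    candidates = ((a, b) for a in a_grid for b in b_grid)
    return min(candidates, key=lambda p: sse(p, ages, rates))

# Synthetic rates generated from known parameters (not real mortality data).
ages = [40, 50, 60, 70, 80, 90]
true_a, true_b = 1e-4, 0.09
rates = [gompertz(x, true_a, true_b) for x in ages]

a_grid = [1e-6, 1e-5, 1e-4, 1e-3]
b_grid = [0.03, 0.06, 0.09, 0.12, 0.15]
start = grid_start(ages, rates, a_grid, b_grid)
print(start)
```

The grid point returned here is meant only as a starting value for an iterative fitting routine. Plotting the candidate curves against the data, as a visualization tool would, serves the same purpose of ruling out poor starting values before the fit is run.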

Figure 5. Close-up of downtown Toronto retail sales data in 1995 by FSA.

Visualization in the social sciences continues to grow at a fast pace. However, this growth is relatively uncoordinated. The activity does not fall easily within the remit of any particular discipline, and publishing results in the traditional form is very problematic. Without greater coordination, the future of visualization in the social sciences is likely to be much like the past, but more diffuse and more ephemeral. This coordination is likely to arise only from direct funding for exemplar projects and centres from the research funding councils.

Further research at the CSCA focuses on how to integrate the methods for dynamic manipulation presented here into geographical visualization tools (e.g. merging exploratory data analysis methods with map animation) and how to integrate VR technology with geographic data and principles of geographic representation (e.g. adding geofunctions to VRML, or using immersive VR to explore abstract georeferenced data such as output from the model of GTA retail sales by FSA). In addition to research on the technical problems, research is also continuing into the implications of new representational forms. Questions here relate to the semiotics of extended representational environments, the relative merits of abstract versus realistic representations, and what the concept of representation means in visualizing commercial data. One component of the approach adopted at the CSCA involves considering cognitive aspects of visualization.