Tuesday, February 17, 2015

Classifying Data for a Map that Makes Sense

Appropriate data classification is one of those things that, though we are mostly unconscious of it, can really mean the difference between a map that displays data logically and accurately and one that is imprecise and misleading. When one creates a map meant to display some characteristic phenomenon, like the percentage of the population in a specified age group, by division of some administrative boundary, like state, county, or census tract, one must choose carefully the category that each administrative unit is placed in. The technical term for this type of map is choropleth, and there are several different ways the data categories, or classes, can be decided.


Through my creation of the four maps above, I personally discovered some of the challenges in deciding which classification method is most appropriate for accurately depicting the phenomenon being mapped. The 4 maps all display the same raw data: census information on the percentage of the population aged 65 and older in Escambia County, by census tract. They differ in how they display that data, namely in how the percent values for each tract are grouped into classes that share the same shade, so that the viewer gets a general idea of how this population characteristic is distributed throughout the county.

The 2 bottom maps classify the data with the quantile and equal interval methods, which divide the values without considering where natural groupings of values occur within the range. The upper right map classifies values according to their distance from the mean percentage, as in a standard deviation classification, and the upper left uses an algorithm to determine the most accurate "natural breaks" in the values, in order to minimize the difference between values in the same class and maximize the difference between values in different classes.

Upon close inspection, in my opinion, there aren't many huge differences between the maps, and yet I was asked, as part of the assignment, to choose and defend which method is superior for accurately displaying the actual data. This was a bit of a challenge. I concluded that the superior method is natural breaks, but a decent case could be made for any of the other 3. This type of challenge becomes especially politically and emotionally loaded when one is mapping a characteristic like race, political affiliation, or crime statistics, as the decision to place an area in one category or another can be a cause of consternation for some. In certain situations, one can also present the raw data in a misleading fashion through the way it is categorized, which can be used to champion a particular cause. All in all, it's a very important lesson for a nascent cartographer to learn, as these issues may not be immediately evident in the process of map creation or assessment.
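
To make the difference between two of these methods concrete, here is a minimal Python sketch of how equal interval and quantile class breaks could be computed. The tract values are made up for illustration (they are not the Escambia County data), the function names are my own, and this is not necessarily how GIS software computes its breaks. Natural breaks and standard deviation are left out to keep it short.

```python
from bisect import bisect_left

# Hypothetical percentages of residents aged 65+ for ten census tracts.
# These are made-up values for illustration, not the Escambia County data.
values = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 22.0, 40.0]


def equal_interval_breaks(data, k):
    """Upper class limits that split the data range into k equal-width classes."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / k
    # Use the true maximum as the last limit to avoid floating-point drift.
    return [lo + width * i for i in range(1, k)] + [hi]


def quantile_breaks(data, k):
    """Upper class limits that put about the same number of values in each class."""
    ordered = sorted(data)
    n = len(ordered)
    # ceil(n * i / k) - 1 is the index of the last value in class i.
    return [ordered[(n * i + k - 1) // k - 1] for i in range(1, k + 1)]


def class_counts(data, breaks):
    """Count how many values land in each class defined by the upper limits."""
    counts = [0] * len(breaks)
    for v in data:
        # The class is the first one whose upper limit is >= the value.
        counts[bisect_left(breaks, v)] += 1
    return counts


if __name__ == "__main__":
    for name, fn in [("equal interval", equal_interval_breaks),
                     ("quantile", quantile_breaks)]:
        breaks = fn(values, 5)
        print(name, [round(b, 1) for b in breaks], class_counts(values, breaks))
```

With these made-up values, equal interval puts 4, 4, 1, 0, and 1 tracts in its five classes, because the single tract at 40% stretches the range, while quantile puts exactly 2 tracts in every class. Same data, two quite different maps.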
