Tuesday, November 29, 2016

Spatial Aggregation & Gerrymandering Political Districts

It might be fair to say our country is at a bit of a crossroads in terms of political parties, and one might contend that gerrymandering on the part of one party may have unduly influenced the 2016 election's results.  What we do know for sure, though, is that spatial data analyses can be greatly influenced by both the size and shape of the boundaries drawn to delineate zones or districts.  Political districts in the U.S. are ideally somewhat standardized: they follow existing administrative boundaries (states, counties, census blocks, etc.), include roughly the same number of people, and encompass compact, contiguous areas.  In real life, however, compact and contiguous districts can be broken up by natural features like water and mountains, but they can also be dismantled and haphazardly reassembled through gerrymandering, that is, redrawing political boundaries with the intent of benefiting a party or candidate through the changed voting districts.  This process leads to some fairly strangely shaped Congressional Districts in this country, which, because aggregated results change with the scale and shape of the zones used, can have a rather disproportionate effect on election results.

One measure of voting district "compactness," or the extent to which a district's boundary is regularly shaped, is called the Polsby-Popper measure.  This formula gives the ratio of the district's area to the area of a circle whose circumference equals the district's perimeter, so scores fall between 0 and 1.  The idea is that districts with odd, convoluted boundaries will produce a lower "score" on this measure, and those with smooth, regular boundaries will produce one closer to 1.  A second consideration is whether districts respect existing political boundaries, namely counties, which gives an idea of the extent to which these political districts divide existing communities.  A direct measure of this is a count of the counties that are split between Congressional Districts, along with a count of the districts that split the greatest number of counties.
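As a rough illustration of the Polsby-Popper measure described above, here is a minimal Python sketch.  It assumes each district is a simple polygon given as a list of (x, y) vertices in planar (projected) units; the function names and the two example shapes are made up for illustration and are not real districts.

```python
import math

def polygon_area_perimeter(vertices):
    """Return (area, perimeter) of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]      # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1     # shoelace formula term
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter

def polsby_popper(vertices):
    """Ratio of polygon area to the area of a circle with the same perimeter."""
    area, perimeter = polygon_area_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2

# Hypothetical shapes: a compact square and a long, thin sliver of equal area.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
sliver = [(0, 0), (100, 0), (100, 1), (0, 1)]

print("square:", round(polsby_popper(square), 3))
print("sliver:", round(polsby_popper(sliver), 3))
```

Both shapes cover the same area, but the sliver's much longer perimeter pulls its score toward 0, which is the behavior one would expect from a stretched-out district.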





The images above represent two different types of oddly-shaped, non-contiguous districts: the ones on the left result from the natural shape of the areas in question, and the ones on the right are the result of gerrymandering.  As one might imagine, the districts on the right scored very low on the Polsby-Popper measure.  The haphazard construction and re-drawing of these boundaries changes the nature of the population contained, and, as mentioned, can greatly influence election results.  And, if one party does it, the other must necessarily follow suit, and the cycle continues, seemingly ad infinitum... as is apparently the nature of American democracy today.




Monday, November 21, 2016

LiDAR vs. SRTM: Resolution, Accuracy & Scale

LiDAR, or light detection and ranging, and SRTM, or the Shuttle Radar Topography Mission, are two different sources of wide-area DEMs, or digital elevation models.  SRTM is a NASA initiative that provides (relatively) accurate elevation data for (most of) the world, remotely sensed by radar from the space shuttle.  LiDAR collects surface data, typically from aircraft, by timing laser pulses reflected from the ground (it uses light, not radar), and provides a far more detailed, or higher resolution, image than most other remote sensing methods.  Resolution, which we can derive from the size of the grid cells that compose a raster image, is an important consideration in choosing a DEM, as it can dictate the results of many terrain analyses one might perform using that elevation model.


One terrain derivative one might require from a DEM is slope; a comparison of slope derived from the SRTM and the LiDAR data is shown above.  One can immediately recognize the lower resolution of the SRTM image and the finer detail captured in the LiDAR data.  The scale one is working with clearly influences the results of any spatial analysis performed: the kind of generalization present in the SRTM data may be acceptable for projects covering very large areas at small map scales, but it obviously wouldn't be appropriate for anything requiring more localized detail.
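To make the effect of cell size on slope a little more concrete, here is a small NumPy sketch.  Everything in it is an assumption for illustration: the terrain is synthetic, the 1 m and 30 m cell sizes are stand-ins for "LiDAR-like" and "SRTM-like" grids rather than values from the data used in this post, and block averaging is only a crude substitute for how a coarse DEM is actually produced.

```python
import numpy as np

def slope_degrees(dem, cell_size):
    """Slope in degrees from a DEM array, using finite differences."""
    dz_dy, dz_dx = np.gradient(dem, cell_size)   # elevation change per map unit
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def coarsen(dem, factor):
    """Average factor x factor blocks of cells into one larger cell."""
    rows = dem.shape[0] - dem.shape[0] % factor  # trim to a multiple of factor
    cols = dem.shape[1] - dem.shape[1] % factor
    blocks = dem[:rows, :cols].reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

# Synthetic terrain on a 1 m grid: a gentle regional tilt plus 10 m wide ridges.
x, _ = np.meshgrid(np.arange(300.0), np.arange(300.0))
fine_dem = 0.1 * x + 2.0 * np.sin(2.0 * np.pi * x / 10.0)

coarse_dem = coarsen(fine_dem, 30)               # 30 m cells

fine_slope = slope_degrees(fine_dem, 1.0)
coarse_slope = slope_degrees(coarse_dem, 30.0)

print("max slope, 1 m cells: ", fine_slope.max())
print("max slope, 30 m cells:", coarse_slope.max())
```

The coarse grid keeps the regional tilt but loses the small ridges entirely, so the steepest slopes simply disappear from the coarser result.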

Tuesday, November 15, 2016

Geographically Weighted Regression (GWR) vs. Ordinary Least Squares (OLS)

If one is attempting to find any kind of statistically significant relationship between spatial variables, one might start with a global Ordinary Least Squares (OLS) model, which fits a single regression equation to the whole study area and tests whether change in one variable is associated with a significant amount of change in another.  Alternatively, if that relationship might vary from place to place, it might be appropriate to use a local Geographically Weighted Regression (GWR) model, which fits a separate equation at each location using nearby observations.  OLS is a general statistical technique, while GWR is specific to spatial analysis, where modelling requires a slightly different perspective on the phenomena being modelled, illustrated conveniently with the spatial autocorrelation assumption inherent in Waldo Tobler's famous quote "Everything is related to everything else, but near things are more related than distant things."

GWR, by definition, involves regression: the modelling of the relationship between dependent and independent variables.  Regular statistical regression needn't take into account variables like spatial distribution and physical proximity, though, and thus the addition of "geographically weighted": observations closer to a location count for more when the model for that location is fit.  OLS, on the other hand, is the simpler method, producing one set of coefficients for the entire dataset.  When the OLS model finds a statistically significant relationship between variables, that result can be used to justify performing further GWR analysis.  The appropriate model to use for spatial variables depends upon the context and the variables being examined; neither model is superior in every case, and some judgment is required in deciding which one to use in any given situation.
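The global versus local distinction can be sketched in a few lines of NumPy.  This is only an illustration under stated assumptions, not how ArcGIS implements either tool: the data are synthetic, and the Gaussian distance weighting and the fixed bandwidth of 2.0 are choices made here for simplicity.

```python
import numpy as np

def ols(x, y):
    """Global fit: one intercept and one slope for the whole study area."""
    design = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(design, y, rcond=None)[0]

def gwr(coords, x, y, bandwidth):
    """Local fits: one weighted regression per location (Gaussian kernel assumed)."""
    design = np.column_stack([np.ones(len(x)), x])
    betas = np.empty((len(y), 2))
    for i, centre in enumerate(coords):
        dist = np.linalg.norm(coords - centre, axis=1)
        weights = np.exp(-0.5 * (dist / bandwidth) ** 2)   # near points count more
        root_w = np.sqrt(weights)                          # weighted least squares
        betas[i] = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)[0]
    return betas

# Synthetic 10 x 10 grid where the true slope increases from west to east.
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(10.0), np.arange(10.0))
coords = np.column_stack([gx.ravel(), gy.ravel()])
x = rng.normal(size=len(coords))
true_slope = 0.5 + 0.2 * coords[:, 0]
y = 1.0 + true_slope * x + rng.normal(scale=0.1, size=len(coords))

global_beta = ols(x, y)
local_beta = gwr(coords, x, y, bandwidth=2.0)

west = local_beta[coords[:, 0] == 0, 1].mean()
east = local_beta[coords[:, 0] == 9, 1].mean()
print("global slope:", global_beta[1])
print("mean local slope, west edge:", west)
print("mean local slope, east edge:", east)
```

The single OLS slope averages over the whole grid, while the local slopes pick up the west-to-east trend, which is the kind of spatial variation GWR is meant to reveal.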

Sunday, November 13, 2016

A(nother) Discussion of Regression Analysis in ArcGIS

Regression analysis in ArcGIS uses spatial analyses and autocorrelation, with the intention of predicting different facets and characteristics of spatial variables.  It is a fairly standard variety of empirical study, wherein one collects data, defines dependent and independent variables, and performs all manner of arcane statistical procedures involving both numbers and Greek letters, all with the intent of demonstrating that one variable has a measurable positive or negative effect on another.

Issues can arise, however, when one needs some kind of certainty that the regression model employed is accurate and/or non-biased, among other things.  Fortunately ArcMap produces a lovely table with all of the calculated numbers required to judge the likelihood of various errors, like the "Jarque-Bera" statistic, which tests whether the model's residuals are normally distributed; a statistically significant value suggests the model is biased.  The R-squared value and the coefficients (including the intercept) are included as well, and are also used to assess the validity/reliability of the model.
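For a sense of what two of those numbers measure, here is a minimal NumPy sketch of the standard textbook formulas for the Jarque-Bera statistic and R-squared.  It is not ArcMap's own code, the function names are made up, and the residuals are simulated rather than taken from a real model.

```python
import numpy as np

def jarque_bera(residuals):
    """Jarque-Bera statistic: n/6 * (skewness^2 + (kurtosis - 3)^2 / 4)."""
    r = np.asarray(residuals, dtype=float)
    centred = r - r.mean()
    m2 = np.mean(centred ** 2)                    # variance (population form)
    skewness = np.mean(centred ** 3) / m2 ** 1.5
    kurtosis = np.mean(centred ** 4) / m2 ** 2
    return r.size / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

def r_squared(observed, predicted):
    """Share of the variance in the observed values explained by the model."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_residual = np.sum((observed - predicted) ** 2)
    ss_total = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_residual / ss_total

# Simulated residuals: one roughly normal set, one strongly skewed set.
rng = np.random.default_rng(1)
normal_resid = rng.normal(size=500)
skewed_resid = rng.exponential(size=500)

# The statistic is compared against a chi-squared distribution with 2 degrees
# of freedom; values above about 5.99 are significant at the 5% level.
print("JB, normal residuals:", jarque_bera(normal_resid))
print("JB, skewed residuals:", jarque_bera(skewed_resid))
```

A large Jarque-Bera value for the skewed residuals is the sort of result that would flag a model as worth a second look, for instance because an important explanatory variable is missing.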