Sunday, September 25, 2016

Vehicle Routing, Round 2

Road networks pose all manner of spatial problems, most notably, for our purposes, those involving vehicle routing.  These problems come in endless variations, and familiarity with their more common forms is invaluable.  When a problem involves ordered pairs of pick-ups and drop-offs, there are many parameters we can specify to shape the intended route: time windows, driver breaks, wheelchair access (if required), distance, etc.  Depending on the specifications we input, the resulting route can vary widely.




Above is one solution (a series of routes) to a routing problem involving pick-ups and drop-offs of people at locations in south Florida.  The parameters used to create these routes include pick-up and drop-off time windows, maximum working hours and breaks for drivers, and route zones, which constrain where certain routes can travel.  The initial solution's constraints produced an output with five "unassigned" locations, which were not placed on any route.  The second solution, shown above, modified the parameters to include two additional vehicles/routes, which eliminated the unassigned locations.  The customer service aspect of this modification is notable: including all of the desired stops, as the second set of routes does, increases the number of customers served.  Additionally, the extra routes may relieve some of the burden on the other vehicles, allowing more flexibility in service times, which also benefits the customer.

Sunday, September 18, 2016

Road Networks and Vehicle Routing

Road networks are a feature of the landscape rife with potential for spatial analysis, and one of the more common problems involving them is creating an appropriate vehicle route.  The time it takes a vehicle to get from point A to point B varies with many factors, such as speed limit, road size, and traffic patterns, and the route chosen for the vehicle may need to take any number of these into consideration.



The two route maps above are similar, but they do differ slightly.  The map on the left matches the one on the right, except that its route was created using rules involving traffic data, which changed the route and added about an hour to its travel time.  This kind of route optimization is invaluable to those who plan such routes: generally, the more factors one can take into account when creating a route, the better.
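To make the effect of traffic rules concrete, here is a minimal sketch of a shortest-path search (Dijkstra's algorithm) run over the same tiny network twice, once with free-flow travel times and once with traffic-adjusted ones.  The network, node names, and travel times are entirely made up for illustration; this is not the tool or data used to produce the maps above.

```python
import heapq

def shortest_route(graph, start, goal):
    """Return (total_minutes, path) for the quickest route using Dijkstra's algorithm.

    graph maps each node to a dict of {neighbor: travel_minutes}.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, minutes in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + minutes, neighbor, path + [neighbor]))
    return float("inf"), []  # goal is unreachable

# Hypothetical network: travel times in minutes without traffic.
free_flow = {
    "A": {"B": 10, "C": 15},
    "B": {"D": 10},
    "C": {"D": 10},
    "D": {},
}

# Same hypothetical network with congestion added on the B-to-D segment.
with_traffic = {
    "A": {"B": 10, "C": 15},
    "B": {"D": 30},
    "C": {"D": 12},
    "D": {},
}

free_flow_result = shortest_route(free_flow, "A", "D")
traffic_result = shortest_route(with_traffic, "A", "D")
```

In this toy network the route switches from A-B-D to A-C-D once traffic is considered, and the trip still takes longer than the free-flow estimate, which mirrors the pattern in the maps: a different route and a longer, but more realistic, travel time.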

Sunday, September 11, 2016

Road Networks and the Spectre of Exhaustive Data Completeness

Spatial data is, ideally, completely representative of the real-world entities it portrays.  In reality, some amount of generalization is inherent to any spatial data, a result of translating and scaling it for practical use.  The perfectly complete data set is a very rare entity, and road networks like TIGER, TeleAtlas, and locally sourced street centerlines more often than not exclude some of the road segments that are actually present.  This exclusion results from things like conscious decisions to leave smaller, privately owned roads out of the data set, and errors of omission in data digitizing/collection procedures.



The above represents a county and two different road network data sets: one from the Census Bureau's TIGER database, and another of locally collected street centerlines.  A 1 km by 1 km grid is overlaid in order to systematically derive a spatially comparable measure of each data set's relative completeness.  The total length of road segments within each grid cell can be compared between the two data sets, and the magnitude of the difference is depicted in the choropleth map above.  From this we can conclude that the TIGER data set is more complete than the street centerlines, as it contains a greater total length of road segments, which is our sole qualifier for data comprehensiveness.  The caveat, of course, is that we may wish to consider other factors in gauging the data's relative "completeness."  If the TIGER data contains more driveways and non-navigable road segments, for example, we may wish to reevaluate our definition of "complete," as the superior accuracy of the local street centerline data could render it somewhat more "complete" than the TIGER data, so to speak.
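The per-cell comparison can be sketched in a few lines.  This is only an illustration under some assumptions: the road segments are taken to be already clipped to the grid, so each record is a hypothetical (cell id, length in km) pair, and the difference is expressed as a percent relative to the local centerline length, which is one possible measure among several.

```python
from collections import defaultdict

def length_by_cell(segments):
    """Sum segment lengths (km) per grid cell from (cell_id, length_km) records."""
    totals = defaultdict(float)
    for cell_id, length_km in segments:
        totals[cell_id] += length_km
    return dict(totals)

def percent_difference(tiger_segments, local_segments):
    """Percent difference of TIGER length relative to local length, per cell.

    Positive values mean TIGER has more road length in that cell.
    Cells with no local roads return None, since the ratio is undefined there.
    """
    tiger = length_by_cell(tiger_segments)
    local = length_by_cell(local_segments)
    result = {}
    for cell_id in sorted(set(tiger) | set(local)):
        tiger_km = tiger.get(cell_id, 0.0)
        local_km = local.get(cell_id, 0.0)
        if local_km == 0:
            result[cell_id] = None
        else:
            result[cell_id] = 100.0 * (tiger_km - local_km) / local_km
    return result

# Made-up sample records, not the county data from the map.
tiger_sample = [("A", 1.5), ("A", 0.5), ("B", 1.0), ("D", 0.7)]
local_sample = [("A", 1.0), ("B", 2.0), ("C", 1.0)]

differences = percent_difference(tiger_sample, local_sample)
```

Each cell's value could then be classed and symbolized to produce a choropleth like the one above.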

Sunday, September 4, 2016

Adventures in Metadata

This week's exercise in Data Quality includes an examination of data accuracy standards, namely the National Standard for Spatial Data Accuracy (NSSDA) and the older National Map Accuracy Standards (NMAS).  These define how a data set's positional accuracy is tested and reported, and the resulting accuracy statement is typically found in the data set's metadata.  The procedure described in the NSSDA involves comparing sample points from the tested data set against a known, highly accurate reference source, and it produces a value that indicates the tested data's horizontal or vertical accuracy at a 95% confidence level.



Above is a map that includes 20 sample points, taken to be compared with a reference data set created using aerial imagery.  The distances between the sample locations and their reference points are used to calculate an error statistic, which is typically included in a data set's metadata.  The statistical and testing method prescribed in the NSSDA allows for a statement on the likely amount of error in the data, like the one for the map above:
Tested 2492.12642 feet horizontal accuracy at 95% confidence level.  This indicates that, 95% of the time, a horizontal position taken from the map will be within about 2,492 feet of its actual location on the ground, to the extent that the reference data represents the true location.
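
For the curious, the horizontal calculation behind a statement like this can be sketched as follows.  The NSSDA statistic is the radial root-mean-square error (RMSE) of the sample points multiplied by 1.7308, a factor that assumes the errors are normally distributed and that the x and y RMSE values are roughly equal.  The function name and coordinates below are made up for illustration; they are not the 20 sample points from the map.

```python
import math

# NSSDA multiplier for horizontal accuracy at the 95% confidence level.
# Assumes normally distributed errors and roughly equal RMSE in x and y.
NSSDA_HORIZONTAL_FACTOR = 1.7308

def nssda_horizontal_accuracy(test_points, reference_points):
    """Return the NSSDA horizontal accuracy for paired (x, y) points.

    The result is in the same units as the input coordinates.
    """
    if not test_points or len(test_points) != len(reference_points):
        raise ValueError("Need equal, non-empty lists of test and reference points")
    squared_errors = [
        (x_test - x_ref) ** 2 + (y_test - y_ref) ** 2
        for (x_test, y_test), (x_ref, y_ref) in zip(test_points, reference_points)
    ]
    rmse_r = math.sqrt(sum(squared_errors) / len(squared_errors))
    return NSSDA_HORIZONTAL_FACTOR * rmse_r

# Made-up coordinates in feet: every test point sits 3 ft east and 4 ft north
# of its reference point, so each radial error is exactly 5 ft.
reference_sample = [(100.0, 200.0), (150.0, 250.0), (300.0, 120.0)]
test_sample = [(x + 3.0, y + 4.0) for x, y in reference_sample]

accuracy_ft = nssda_horizontal_accuracy(test_sample, reference_sample)
```

With a radial RMSE of 5 feet, the reported value works out to 1.7308 × 5 = 8.654 feet.  Vertical accuracy follows the same pattern, using the vertical RMSE and a factor of 1.9600.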