Monday, October 28, 2013

Geocoding



For this assignment we were tasked with downloading data from a county website, connecting to a database server, and geocoding the records in the Excel file found on that website. The data came from the Trempealeau County land records. After downloading the starting geodatabase from the Trempealeau County web servers, it became apparent that the locations on file were recorded in different forms. For example, some records read like "north sector 22" or "about a ¼ mile down County Road M." We could not use descriptions like these to map the frac sand mines, so they had to be normalized into proper address labels such as a street address. About 132 frac sand mines needed to be geocoded, so instead of each of us geocoding every mine, the task was broken up into 22 separate areas and each of us was tasked with mapping only 14 to 18 mines. These 14 to 18 geocoded points were then transferred into a geodatabase and used to show all the frac sand mines operational in Wisconsin.



Normalizing data is the process of taking unusable data and making it usable. We had to do this because the data did not come to us in a single usable format such as a street address; it came as a jumble of coordinates and addresses. For example, PLSS descriptions were mixed with regular street addresses, so the geocoder could only geocode a few of the records. The points that landed in the right spot were the ones with regular street addresses. Knowing this, we could then manually map the remaining records.
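As a rough illustration of the sorting step described above, here is a minimal Python sketch that separates street-style addresses from PLSS-style descriptions before geocoding. The two patterns and the sample strings are my own assumptions for illustration, not rules or records taken from the county data, and real records would likely need broader patterns.

```python
import re

# Hypothetical patterns (assumptions, not taken from the county data).
# A street-style address is assumed to start with a house number,
# optionally prefixed by a single letter (for example "W24320").
STREET_RE = re.compile(r"^\s*[A-Za-z]?\d+\s+\S+")
# A PLSS-style description is assumed to mention a section/sector number,
# a township such as T21N, or a range such as R9W.
PLSS_RE = re.compile(
    r"\b(sec(tion|tor)?\.?\s*\d+|T\d+N|R\d+[EW])\b", re.IGNORECASE
)


def classify_location(text):
    """Return 'plss', 'street', or 'other' for one location description."""
    if PLSS_RE.search(text):
        return "plss"
    if STREET_RE.match(text):
        return "street"
    return "other"


if __name__ == "__main__":
    # Made-up sample descriptions in the style of the records discussed above.
    samples = [
        "W24320 Main Street",
        "Sec. 22, T21N, R9W",
        "north sector 22",
        "about a 1/4 mile down County Road M",
    ]
    for s in samples:
        print(classify_location(s), "-", s)
```

Records that come back as "plss" or "other" are the ones that would still need a street address found by hand.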

 
 
 
This is an example of data that is not normalized
This is an example of data that came with the Trempealeau County database

In order to map the frac sand mines we were required to geocode the addresses acquired from the Trempealeau County data. Geocoding is the process of taking location data from an Excel file or a similar table and converting it into a format that we can map as points, polygons, or other features. This can introduce certain errors, however, such as inherent errors. Inherent errors are errors that are present in the data from the start. An example of an inherent error that I encountered while geocoding was the age of the maps that we used. In Google Earth we could see that a frac sand mine was operational, but in the ArcGIS basemap that I used it had not even been constructed. After I found the areas where I needed to geocode points for the frac sand mines, it was time to normalize the data so we could have a usable address.


This is an example of an inherent error from the ArcGIS basemap.
In this picture, no development has occurred.


But in this newer satellite image we can see that a frac sand mine has been developed.
 
In order to rectify the inherent errors I used the geocoding toolbar in ArcGIS to supply our own addresses. This can be done in a number of ways. One way would be to insert each point yourself, but the way I accomplished this task was to import a base layer that had all the addresses in it from the start and then, using Google Earth, enter the closest address to the location given in the Trempealeau County data. This type of geocoding is called "fuzzy matching." It matches each record to an address in the base layer and rates the match on a scale of accuracy from 0 to 100%, with a minimum score that I could select so that little editing would need to be done in the Excel file. This approach, however, can lead to operational errors, which are errors introduced by the user. For example, in one case I mapped a point just outside of the frac sand mine, but another student in the class mapped the opposite side of the same mine.
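To give a feel for how a 0 to 100 match score with a selectable minimum can work, here is a small Python sketch using the standard library's difflib. This is not how ArcGIS scores its matches; it is a simple string-similarity stand-in, and the function names, the default threshold of 80, and the sample addresses are all made up for illustration.

```python
from difflib import SequenceMatcher


def match_score(a, b):
    """Similarity of two address strings on a 0-100 scale (case-insensitive)."""
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return 100.0 * ratio


def best_match(query, reference_addresses, min_score=80.0):
    """Return (address, score) for the best candidate, or None if the best
    candidate scores below min_score or the reference list is empty."""
    best = None
    for candidate in reference_addresses:
        score = match_score(query, candidate)
        if best is None or score > best[1]:
            best = (candidate, score)
    if best is not None and best[1] >= min_score:
        return best
    return None


if __name__ == "__main__":
    # Made-up reference layer and query, for illustration only.
    reference = ["W24320 Main Street", "N100 Oak Road"]
    print(best_match("W24320 Main St", reference))
    print(best_match("zzz", reference))
```

Raising the minimum score means fewer records get matched automatically but the ones that do are closer to the reference address; lowering it matches more records at the cost of more points to double-check.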

 
This is the Trempealeau County data normalized into a format that we can use for geocoding.
 


This is an example of operational error.
Notice the distance between the other student's point (red dot)
and my point (green triangle).
Because of this error, there could be accuracy issues.
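The gap between two points like these can be put into numbers. Below is a short Python sketch of the haversine formula, which gives the great-circle distance between two latitude/longitude pairs on a spherical Earth. I did not run this as part of the lab; the function name and the coordinates in the usage lines are placeholders, not the actual mine locations.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; a spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Placeholder coordinates, not real points from the class data.
    my_point = (44.300, -91.400)
    other_point = (44.302, -91.395)
    print(round(haversine_m(*my_point, *other_point)), "meters apart")
```

For points this close together the spherical approximation is more than good enough to tell whether two students placed a mine tens or hundreds of meters apart.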
 

When we compared our data with that of the other members of the class, I was happy to see that our output points were fairly close together. This suggests that our points were placed accurately, with little operational error involved. I still came across some problems that were my own doing. When I started the project, the data that we were using did not come with unique ID codes, so the data that I had been working on did not have an ID that I could match to the other students' data. When the geodatabase was released to us, to my dismay my points could not be selected in an attribute search. I rectified this problem by starting up the Editor and manually adding a unique ID field to my attribute table. From there I manually entered the unique ID for each of my records by looking it up in the geodatabase that was provided to us.
The final map of the geocoded frac sand mines.
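I filled in the IDs by hand in the attribute table, but the same lookup could in principle be scripted. Here is a minimal Python sketch, assuming the records have been exported to plain rows and that both tables share a mine name to match on. The field names mine_name and unique_id and the sample rows are hypothetical, not the actual fields in the class geodatabase.

```python
def build_lookup(class_rows, key="mine_name", id_field="unique_id"):
    """Map a normalized mine name to its unique ID from the class table."""
    return {row[key].strip().lower(): row[id_field] for row in class_rows}


def assign_ids(my_points, id_lookup, key="mine_name", id_field="unique_id"):
    """Copy unique IDs onto my points in place.

    Returns the names that had no match, so they can be fixed by hand.
    """
    unmatched = []
    for point in my_points:
        name = point[key].strip().lower()
        if name in id_lookup:
            point[id_field] = id_lookup[name]
        else:
            unmatched.append(point[key])
    return unmatched


if __name__ == "__main__":
    # Made-up rows for illustration only.
    class_rows = [
        {"mine_name": "Example Mine A", "unique_id": 101},
        {"mine_name": "Example Mine B", "unique_id": 102},
    ]
    my_points = [
        {"mine_name": "example mine a "},
        {"mine_name": "Example Mine C"},
    ]
    missing = assign_ids(my_points, build_lookup(class_rows))
    print(my_points)
    print("No ID found for:", missing)
```

Anything left in the unmatched list would still need to be looked up manually, which is what I ended up doing for every point.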




This mapping exercise was very beneficial to me, as it allowed me to work with real-world data and the real-world problems associated with it. It also allowed me to test my mettle against inherent and operational errors, which can be the cause of many problems and headaches.
