Where do you live, where are you going, who will be there? These are important questions and thoughts that may cross the minds of everyone daily.
Locations, people, and their demographics are at the heart of spatial data. Geospatial systems, or geographic information systems (GIS), provide the data that helps us understand the location patterns and trends needed to make decisions. This is what we refer to as spatial analysis.
Increasing data visibility
Data visibility is at the peak of technology news, from AI concerns to the world of data science. Everyone is collecting data and trying to determine what to do with it.
This is happening in every industry, from improving operations and production efficiency in manufacturing and telecommunications, to measuring kilowatt usage in the energy sector, to analyzing medical trends to calculate dosages in healthcare. Data consumption is delivering the key information needed for quick, critical decisions and planning.
There are three main considerations when starting a geospatial project, which I will be discussing in this blog post:
- Collection sources
- Data storage
- Data access
Leveraging data from your company's internal systems is valuable, and pulling together data from user surveys can also be truly helpful. Depending on the use case and line of business, you will find several free sources of data. These can be useful when searching for wildlife populations in rural areas or for future building locations in real estate development.
|Collection source||URL||Information details|
|OpenGrid||opengrid.io||Neighborhood building renovations, city services|
|OpenStreetMap||openstreetmap.org||Free data created by users for map generation|
|Chicago Data Portal||data.cityofchicago.org||Open data portal with maps and facts about the city (another example is data.detroitmi.gov)|
|Open Geography Portal||geoportal.statistics.gov.uk||Provides free and open access to the definitive source of geographic products, web applications, story maps, services, and APIs|
|LIFARS||lifars.com||Top vulnerability tracking databases|
|U.S. Government Open Data||data.gov||Geospatial datasets in U.S. government’s geoportal|
|Feng Chia University||www.gis.tw||Research Center Feng Chia University|
|INRIX||inrix.com||Real-time camera surveillance data|
|topoView||topoview||USGS topographic maps|
|Open Geospatial Consortium||ogc.org||Open Geospatial Consortium podcast/community|
|Geospatial Analysis||spatialanalysisonline.com||Guide for geo analysis|
|Plenario||plenar.io||City permits, crime stats|
Storage in this context does not refer to the location or device type - you can easily house the data on-prem or in a cloud of your choice. It refers to storing the data where it can be easily accessed, and in multiple formats. Given the availability and flexibility required, my suggestion would be a Postgres database, because its relational model lets you store several data sources organized into tables that can be queried, ordered, filtered, and aggregated.
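As a rough illustration of the filtering and aggregate queries mentioned above, here is a minimal Python sketch. It uses an in-memory SQLite database purely as a stand-in so it runs without a server; the table name, columns, and rows are hypothetical, and the SQL is kept plain enough that the same statements should carry over to Postgres.

```python
import sqlite3

# In-memory SQLite stands in for Postgres so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE permits (id INTEGER PRIMARY KEY, neighborhood TEXT, kind TEXT)"
)

# Hypothetical rows, as if loaded from a city open data portal.
conn.executemany(
    "INSERT INTO permits (neighborhood, kind) VALUES (?, ?)",
    [
        ("Northside", "renovation"),
        ("Northside", "renovation"),
        ("Northside", "demolition"),
        ("Lakeview", "renovation"),
    ],
)


def renovation_summary(connection):
    """Filter to renovation permits, aggregate per neighborhood, then order."""
    return connection.execute(
        """
        SELECT neighborhood, COUNT(*) AS renovation_permits
        FROM permits
        WHERE kind = 'renovation'
        GROUP BY neighborhood
        ORDER BY renovation_permits DESC, neighborhood
        """
    ).fetchall()


summary = renovation_summary(conn)
```

Each additional data source would get its own table, and joins on a shared key (a neighborhood name or a census identifier, for example) bring them together.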
PostgreSQL is truly a full-featured spatial database, with special indexing capabilities in the form of BRIN and SP-GiST indexes that speed up searches and allow you to combine datasets, pooling critical information into a single format. SpatiaLite, Oracle Spatial, and MSSQL Spatial are other database options that can help with analysis, but what puts PostgreSQL at the front of the queue is the flexibility it provides through its support for the many programming languages and frameworks that can be used to format your data, including popular ones such as Rust, Spark, Scala, Swift, Kotlin, R, and Python, just to name a few.
The method for accessing the data may be the most important decision when you leverage geospatial data. You should look for a software utility that supports multiple data formats and services such as WMS, WFS, GeoJSON, and TIGER/Line Shapefiles, such as the ones used in U.S. census data.
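To give a feel for one of these formats, here is a minimal Python sketch that reads a small GeoJSON FeatureCollection using only the standard library. The two features and their approximate city coordinates are made up for illustration, and `point_features` is a hypothetical helper name rather than part of any GIS tool.

```python
import json

# Illustrative GeoJSON; coordinates are approximate city centers.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-87.6298, 41.8781]},
     "properties": {"name": "Chicago"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-83.0458, 42.3314]},
     "properties": {"name": "Detroit"}}
  ]
}
"""


def point_features(text):
    """Return (name, longitude, latitude) for every Point feature."""
    collection = json.loads(text)
    points = []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] == "Point":
            # GeoJSON stores positions as [longitude, latitude].
            lon, lat = geometry["coordinates"][:2]
            points.append((feature["properties"]["name"], lon, lat))
    return points


cities = point_features(geojson_text)
```

Note the coordinate order: GeoJSON puts longitude first, which is a common source of flipped maps when mixing it with tools that expect latitude first.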
Having a basic understanding of the type of data processing that will be in use can help you link GEOIDs (the geographic identifiers used in census data) to demographic data.
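Linking by GEOID can be sketched in a few lines of Python. This assumes 11-digit census tract GEOIDs (2-digit state, 3-digit county, 6-digit tract); the population figures and the "polygon" placeholders are hypothetical, and the helper names are my own.

```python
import csv
import io


def split_tract_geoid(geoid):
    """Split an 11-digit tract GEOID into state, county, and tract codes."""
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError(f"expected an 11-digit tract GEOID, got {geoid!r}")
    return {"state": geoid[:2], "county": geoid[2:5], "tract": geoid[5:]}


# Hypothetical demographic rows keyed by GEOID.
demographics_csv = """GEOID,population
17031010100,4500
17031010201,7200
"""

# Placeholder geometries keyed by the same GEOIDs (real data would be polygons).
tract_shapes = {
    "17031010100": "polygon A",
    "17031010201": "polygon B",
    "17031010202": "polygon C",
}


def join_by_geoid(csv_text, shapes):
    """Attach population to each shape whose GEOID appears in the CSV."""
    joined = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        geoid = row["GEOID"]  # keep as text so leading zeros survive
        if geoid in shapes:
            joined[geoid] = {
                "shape": shapes[geoid],
                "population": int(row["population"]),
            }
    return joined


joined = join_by_geoid(demographics_csv, tract_shapes)
```

Keeping the GEOID as a string matters: state codes such as `06` lose their leading zero if a spreadsheet or loader converts the column to a number.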
Types of data processing
Along with the data formats, keep in mind that geospatial data can typically be broken down into two main types.
- Vector – Represents locations using points, lines, and polygons. This data type is used in point-to-point calculations.
- Raster – Represents data as grids of pixels, which can be further broken down into discrete rasters and continuous rasters. Used in digital surface models (DSM) and digital terrain models (DTM).
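For the vector case, a point-to-point calculation can be sketched with the haversine formula, which estimates great-circle distance between two latitude/longitude points. This is a minimal sketch that assumes a spherical Earth with a mean radius of 6371 km; a spatial database would typically do this for you with more accurate models.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)

    a = (
        math.sin(delta_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# One degree of longitude along the equator.
one_degree_km = haversine_km(0.0, 0.0, 0.0, 1.0)
```

Because the formula treats the Earth as a sphere, results can be off by a fraction of a percent compared with ellipsoidal calculations, which is usually fine for exploratory work.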
From cholera to COVID-19 - Geospatial data in medicine
The use of georeferencing and vector data was recently made visible to everyone by the WHO (World Health Organization) and the CDC (U.S. Centers for Disease Control and Prevention). Collection of the new data covered by HIPAA regulations was put to the test, as many users decided not to disclose PHI (Protected Health Information) pertaining to vaccination status and treatments.
Geospatial data has been used in the medical industry since the mid-1800s. To be exact, in the year 1854, London was faced with a major outbreak of cholera. During this time, Dr. John Snow used geodata to track the outbreak trends to a specific area. His use of geodata has been regarded as historic in the medical field ever since. The details of his research led to the Broad Street neighborhood, where people were allowed free refills on beer as long as they didn’t drink the water in the local pubs, since the Broad Street pump was believed to be the source of the contaminated water.
At that time, the miasmatic theory was the most popular theory for the cause of the infections. It attributed the spread of diseases such as cholera, tuberculosis, and malaria to miasmas, or foul-smelling vapors; malaria literally means “bad air” in Italian. The work of Dr. John Snow challenged this prevailing theory, with the 1851 London census data being key to determining the number of people within the possible affected area.
Unlocking geospatial insights
Using today’s technology, Dr. John Snow and the city of London would have been able to gather their results faster using some of the unique software tools available. When accessing the source data, you need tooling that offers you the flexibility to read CSV files from the local census and pull in the street maps from your area.
The most popular tools can be found in use within major government geo departments today. The UK, for example, may be on the leading edge of using geodata to track ill health. They have been able to track health demographics within rural districts and city areas to trace the trends of ill health, which in turn can aid public health.
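To make the idea concrete, here is a minimal Python sketch of the kind of tally Dr. Snow worked out by hand: read case locations from a CSV, assign each case to its nearest pump, and count. The pump names, coordinates, and case rows are invented for illustration and are not historical data; coordinates are treated as metres on a flat local grid, which is an assumption of this sketch.

```python
import csv
import io
import math
from collections import Counter

# Hypothetical pump locations on a flat local grid (metres).
pumps = {
    "Pump A": (0.0, 0.0),
    "Pump B": (300.0, 50.0),
    "Pump C": (-200.0, 250.0),
}

# Hypothetical case locations, as they might arrive in a CSV export.
cases_csv = """case_id,x,y
1,10,20
2,-30,15
3,280,60
4,5,-40
5,-190,240
"""


def nearest_pump(x, y, pump_locations):
    """Return the name of the pump with the smallest straight-line distance."""
    return min(
        pump_locations,
        key=lambda name: math.hypot(
            x - pump_locations[name][0], y - pump_locations[name][1]
        ),
    )


def cases_per_pump(csv_text, pump_locations):
    """Count how many cases fall closest to each pump."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[nearest_pump(float(row["x"]), float(row["y"]), pump_locations)] += 1
    return counts


counts = cases_per_pump(cases_csv, pumps)
```

A pump with a disproportionate share of nearby cases is the kind of signal worth investigating further, ideally normalized against how many people live near each pump, which is where census data comes in.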
Software tool options
- PostGIS: https://postgis.net
- Esri – ArcGIS: https://www.esri.com
- QGIS: https://qgis.org
Tooling will offer you the greatest possibilities to consume the data in many formats. The three main options listed above will provide flexibility to produce maps using georeferencing and historical information even when the point-to-point coordinates are not known.
All three tools have powerful resources to display raster data and shapefiles. PostGIS is not widely used in general, maybe because it is not well-known outside the Postgres open source community, where it is quite popular with users and developers who work with geospatial data.
Esri, with the ability to use ArcGIS Online, has become extremely popular and is notably used within the public sector.
QGIS is unique in that it uses the OGR library to read and write vector data for AutoCAD, MapInfo, and ESRI Shapefiles. QGIS provides multiple plugins for data, along with support for PostgreSQL. It also makes it easy to integrate your PostgreSQL database and perform scripting to enhance the implementation of geoprocessing using PyQGIS. QGIS and PostGIS combined present the most powerful open source pairing for GIS.
Geospatial data in PostgreSQL
Installing the PostGIS extension in your Postgres database is all done within minutes. All it takes is a quick download of the software and creating the extension in a currently installed Postgres environment.
postgres=# SELECT name, default_version, installed_version
postgres-# FROM pg_available_extensions
postgres-# WHERE name LIKE 'postgis%' or name LIKE 'address%';
name | default_version | installed_version
address_standardizer | 3.3.3 |
address_standardizer_data_us | 3.3.3 |
postgis | 3.3.3 |
postgis_raster | 3.3.3 |
postgis_sfcgal | 3.3.3 |
postgis_tiger_geocoder | 3.3.3 |
postgis_topology | 3.3.3 |
postgres=# CREATE EXTENSION postgis;
postgres=# CREATE EXTENSION postgis_raster;
postgres=# CREATE EXTENSION postgis_sfcgal;
postgres=# CREATE EXTENSION fuzzystrmatch;
postgres=# CREATE EXTENSION address_standardizer;
postgres=# CREATE EXTENSION address_standardizer_data_us;
postgres=# CREATE EXTENSION postgis_tiger_geocoder;
postgres=# CREATE EXTENSION postgis_topology;
When it comes to understanding people, population, and their characteristics, geo data can be used to tell a story for any industry where data is consumed or produced - from statistical research and demographics to solving the problems of the world, as history has shown us in recent years.
Using GIS technology, I was able to track the deer population in my area, which was once on the rise but is now diminishing. Local scientists are trying to find out why but have been distracted by the rise of wild turkeys in the inner city. Another popular source of information, OpenStreetMap, has been helpful to scientists and students who may share a common interest in nature and tracking ecosystem development.
As a living and breathing source of data, you too can be a contributor to geo community portals. Sharing your collected data and maps may be more valuable than you realize. And on top of that, the next time the census comes around, remember to complete your updates and be counted so you may become the geo clue that helps save the lives of tomorrow.
Geospatial data plays a crucial role in understanding people, population, and their characteristics across various industries. By leveraging collection sources, data storage, and data access, valuable insights can be gained to make informed decisions and plan for the future. Whether it is tracking wildlife populations, analyzing medical trends, or solving global problems, geospatial data offers endless possibilities. With the use of the right software for you, accessing and analyzing geo data has become easier than ever.
By engaging with the geospatial community and sharing our own data, we can contribute to the advancement of this powerful tool. So, let's continue to explore the possibilities of geospatial data and unlock even more insights to shape a better future.