Chicago Neighborhood Analysis Indicators

Class classification

1. Working class

2.Service Class

3. Creative Class (creative class diagram)

Compared Neighborhood

  • Wicker Park (Creative-hipster subgroup)
  • Pilsen (Working class neighborhood)
  • Englewood (Service Class Neighborhood)
  • Lakeview/Lincoln Park (Established Creative Class Neighborhood)
  • Logan Square (Potential Creative Class Neighborhood)

Chicago Neighborhood Indicators (chicago neighborhood analysis)


  • size
  • ethnic diversity
  • education level (>bachelor)
  • income
  • job

Existing building fabric:

  • typology
  • rent cost
  • house value
  • ownership
  • density

Transport and infrastructure:

  • distance to public transportation
  • traffic flow
  • parking availability
  • bicycle lane

local context:

  • restaurant
  • bar
  • coffee shop
  • grocery shop
  • farmers market


  • residential
  • office
  • commercial

public facility/service:

  • public school
  • safety
  • health

green space:

  • recreational park
  • playground
  • dog park

Comparison Diagram

 chicago neighborhood comparison diagram

calculating bicycle lane percentage over neighborhood


rawdata as
–join the table, #of bike route in a neigh, total length of bike route in a neigh
select n.pri_neigh,
count(s.f_street) as totalroutes,
sum(st_length(s.geom)) as length,
sum(n.shape_area) as neigh_area
from jur_neighborhoods as n
left join cty_bikeroutes as s
on st_within(st_transform(s.geom,3435),n.geom)
group by pri_neigh
limit 100
–divide total lenght of bike route by neigh area
select pri_neigh,
(r.length/r.neigh_area)*100 as percentage
from rawdata as r
group by percentage, pri_neigh
order by percentage


bicycle percentage table

Tutorials and Resources for Spatial Analytics in PostGIS SQL and QGIS

This post is a repository of useful tutorials, workshops, demonstrations, and reference for spatial analytics and modeling in the Postgres, PostGIS, and QGIS. Please add links to this post as you discover useful sites (include a brief description of the site).

  1. w3schools – a very useful site on the fundamentals of (non-spatial) SQL queries, including tutorials on joins and SQL functions such as AVG()
  2. boundlessgeo – an excellent primer on many of the PostGIS spatial functions, great place to get your feet wet with PostGIS spatial analytics/modeling
  3. spatialthoughts – this site provides tutorials in a wide range of techniques in QGIS
  4. linfinity – this site provides a number of tutorials clearly illustrating spatial analytic approaches using QGIS
  5. Geospatial Analysis – 4th Edition” by de Smith, Goodchild, Longley – this textbook provides the conceptual and geometric principles of spatial analytics and modeling – an excellent resource.

Linking Point Data to Polygon Data Using QGIS and PostGIS SQL

This post explains two ways to use spatial joins between tables that have point and polygon geom fields. The first approach uses built in tools in QGIS to quickly make basic joins for initial analysis. The second approach uses PostGIS SQL queries with the potential to do far more sophisticated joins and data analysis.

QGIS Vector/Geometry Tools:

Screen Shot 2014-01-26 at 9.47.30 PM

QGIS provides a number of tools for analyzing and rebuilding vector layers in the “Vector” tab. This includes the “Geometry Tools/Multipart to Singleparts” command, which breaks a multipolygon into individual polygon elements. The “Analysis Tools/Points In Polygons” command is shown above. When a polygon layer and a points layer are selected, it will produce a new shapefile with the polygons + a new field that is a count of the number of points in each polygon. Here is a detailed walkthrough for this tool.

PostgreSQL Queries:

These queries use an INNER JOIN on rows where the result of the “st_within()” postgis function == TRUE. This function returns TRUE if the first geometry (“pts.geom”) is completely within the second geometry (“blocks.geom”).

This code will create a new table with all the fields from the point table for all of the points that fall within one of the polygons, and add an additional field that shows an ID for the polygon in which that point falls:

SELECT pts.*, blocks.gid as blockID
FROM “Detroit_BP1995_point” AS pts
INNER JOIN “Detroit_CityMap3_region_individualPoly” AS blocks
ON st_within(pts.geom, blocks.geom) AS result

This code joins the point table created by the SQL above to the original polygon table, generating a new polygon table with a count of points within each polygon. This allows for more sophisticated calculations, for instance, normalizing counts against the size of each polygon, etc. (This SQL needs some more work):

SELECT blockID, number_of_permits, total_cost, average_cost, geom FROM(
SELECT pts.blockID, COUNT(blockID) AS number_of_permits, SUM(“COSTBP”) AS total_cost, AVG(“COSTBP”) AS average_cost
FROM detroitTest04b as pts
LEFT JOIN “Detroit_CityMap3_region_individualPoly” as blocks
ON blocks.gid = pts.blockID
GROUP BY blockID) AS results
LEFT JOIN “Detroit_CityMap3_region_individualPoly” as blocks
ON results.blockID = blocks.gid

Making Routes and Visualizing in QGIS Using SQL

PgRouting analyzes graphs to find the shortest path between two nodes through edges. In a street network, you can think of nodes as intersections and edges as the streets themselves. I will be using the Chicago street network, which can be found here. I have also uploaded it to the server, named chicagostreets.

1. In order to develop data capable of using PgRouting a source and target geometry must be set up at the start and end of each road. These will serve as the start and end destination points. To perform this follow the “Creating a routable road network” section of this tutorial (blog also has many helpful QGIS tutorials). Another thing to get familiar with in SQL is the WITH Clause. This allows you to reference other SQLs within one query.

2. Once your network is made you can start routing. Follow this tutorial to get multiple options for routing. pgr_dijkstra and pgr_kdijkstraPath are two that work well for visualizing one path or one startpoint to many paths.

pgr_kdijkstraPath3. By joining the geometries in your network to the results of your routing you can import this into QGIS and see your paths.

To reference my SQL query from the tutorial in class, click link.

Creating Convex Hulls with SQL to QGIS: Part 1

In order to create clusters based on foursquare categories, we first made a table in PostgreSQL called upperlevelcat and assigned a gid for each of the nine categories.

Foursquare Categories:

upperlevel cat table

To link these categories to venues, we did a double join : fscategory table, fsvenue, and fsvenue_fscategory tables.

fscategory table


fsvenue table

fsvenue table



SQL statement for double join:

SELECT v.herenow, v.checkinct, v.userct, join1.upperlevelcat, as catname, as venuename,,v.lon,v.geom

FROM (SELECT * from fsvenue_fscategory

LEFT JOIN fscategory as fsc ON fsc.fsid=fsvenue_fscategory.cid) AS join1

LEFT JOIN fsvenue as v ON  v.fsid=join1.vid



SQL and PgRouting Continued

In order to get pg routing to work, first a transportation network must be formed with a source and target column created to specify which start and end points are used. The SQL below creates a network  in a buffer of a 1 mile radius and examines the Damen CTA Stop. To alter the name of the table created, change the buffer distance, or to change the CTA Stop, change the red values in the SQL below.

CREATE TABLE damennetwork AS
WITH ctastop AS
(SELECT * FROM cta_railstations
WHERE gid = 32),
transpo AS
row_number() OVER (ORDER BY transportation.gid)::integer AS gid,
transportation.gid AS id, 
transportation.street_nam AS name,
transportation.length AS cost,
st_startpoint(ST_LineMerge(transportation.geom)) as source,
st_endpoint(ST_LineMerge(transportation.geom)) as target
FROM ctastop
LEFT JOIN transportation
ON st_within(transportation.geom, 
st_setsrid(st_buffer(ctastop.geom, 5280), 3435))),
nodes AS
(SELECT DISTINCT transpo.source AS gid 
FROM transpo
FROM transpo)
SELECT transpo.gid,,, transpo.cost, transpo.geom, as source, as target
FROM transpo
(SELECT row_number() OVER (ORDER BY nodes.gid)::integer AS id, 
nodes.gid AS geom
FROM nodes
GROUP BY nodes.gid)
AS source ON transpo.source = source.geom
(SELECT row_number() OVER (ORDER BY nodes.gid)::integer AS id, 
nodes.gid AS geom
FROM nodes
GROUP BY nodes.gid) 
AS target ON = target.geom

To be able to find a route with this SQL use the below SQL. Input the name of your table used in the previous SQL where there are the “damennetwork” in red. To change to start and end points, change to “502” and “341” respectfully.

SELECT seq,id1 as node, id2 as edge, route.cost, as streetname, damennetwork.geom 
FROM pgr_dijkstra('
 SELECT gid AS id,
 cost::double precision AS cost
 FROM damennetwork', 502, 341, false, false ) 
AS route
LEFT JOIN damennetwork
ON route.id2 = damennetwork.gid;

What we have been working on is using the CTA stop as a start point and all the venues as the end points. The first thing I did was take all the venues in the same buffer as the network and corresponded them to their closest edge (the target). Different buffers and train stops can be chosen by the the red numbers.

WITH ctastop AS
(SELECT * FROM cta_railstations
WHERE gid = 32),
p AS
venues.gid AS gid, 
venues."legal name" AS company,
venues."doing busi" AS name,
venues.address AS address,
venues.latitude AS lat,
venues.longitude AS lon,
venues.geom AS geom
LEFT JOIN venues
ON st_within(venues.geom, 
st_buffer(ctastop.geom, 5280)))
SELECT rcost.*, p_to_l.targetID, p_to_l.geom FROM
(SELECT p.gid as restaurantID,
min(st_distance(p.geom,l.geom)) as distance
FROM p, damennetwork AS l
GROUP BY restaurantid) 
AS rcost
(SELECT p.gid as restaurantID, as targetID, l.geom,
st_distance(p.geom,l.geom) as distance
FROM p, damennetwork AS l) 
AS p_to_l
ON rcost.distance = p_to_l.distance
AND rcost.restaurantID = p_to_l.restaurantID

The same must be done to find the start point (source) which is the CTA stop. Change to red value to select a different trainstop.

(SELECT * FROM cta_railstations
WHERE gid = 32)
SELECT row_number() OVER (ORDER BY sourceid.distance)::integer AS gid,
sourceid.* FROM (
SELECT p.gid as restaurantID, l.source as sourceID, l.geom,
st_distance(p.geom,l.geom) as distance
FROM ctastop AS p, damennetwork AS l) as sourceid) as source
WHERE source.gid = 1

This gave me a result of a 292 start point, which I will be plugging into the next SQL (in red). The other numbers are the results from the venue target table formed two SQLs ago.

CREATE TABLE routetestwm AS
SELECT seq,id1 as path, id2 as node, id3 as edge, route.cost,, damennetwork.geom
FROM pgr_kdijkstraPath('
 SELECT gid AS id,
 cost::double precision AS cost
 FROM damennetwork', 292, array[336,457,227,179,179,291,351,515,424,244,427,250,349,457,138,291,291,263,230,117,115,208,325,594,284,301,368,116,91,471,543,398,251,251,251,230,230,365,138,463,404,301,356,212,342,203,203,342,478,286,292,519,41,242,242,242,242,242,242,356,57,57,289,310,286,452,277,251,212,427,251,182,242,471,445,251,251,93,291,286,305,262,46,262,46,262,288,288,371,117,398,105,105,286], 
 false, false ) 
AS route
LEFT JOIN damennetwork
ON route.id3 = damennetwork.gid;

This table can be brought into qgis as the network of paths from a CTA stop to any venue.

All Routes