In Class SQL Workshop

On Monday we wrote a SQL for determining the most culturally diverse neighborhoods in Chicago as an exercise to demonstrate the potential and functionality of SQL in PGAdmin.


Here are some preliminary SQLs we worked on. (the link is to an audio recording of the workshop for further reference).

This SQL sorts CTA stations based on which neighborhood they are located in. Audio Reference

 SELECT * FROM jur_neighborhoods AS n, trn_ctastations_4326 AS s
 WHERE st_within(s.geom, n.geom) -- -- SELECT n.gid, sum(st_distance(st_centroid(n.geom), st_transform(s.geom,3435))) AS dist, pri_neigh, count(s.gid)
 FROM jur_neighborhoods AS n
 LEFT JOIN trn_ctastations_4326 AS s
 ON st_within(st_transform(s.geom,3435), n.geom)
 GROUP BY n.gid, pri_neigh --
 SELECT n.gid, pri_neigh, st_convexhull(st_collect(st_buffer(st_transform(s.geom,3435),100)))
FROM jur_neighborhoods AS n
 LEFT JOIN trn_ctastations_4326 AS s
 ON st_within(st_transform(s.geom,3435), n.geom)
 GROUP BY n.gid, pri_neigh --
 WHERE st_within(s.geom, n.geom)

This SQL finds the standard deviation of the racial diversity in a neighborhood. The lower the standard deviation, the more diverse the neighborhood.  Audio reference

WITH rawdata AS
 (SELECT n.gid,
      sum(st_area(c.geom)) AS totalblockarea,
      sum("race_householder_one_race(n)") AS total,
      sum("race_householder_one_race_white(n)") AS white,
      sum("race_householder_one_race_black(n)") AS black,
      sum("race_householder_one_race_native_american(n)") AS namer,
      sum("race_householder_one_race_asian(n)") AS asian,
      sum("race_householder_one_race_pacific_islander(n)") as pacif,
      sum("race_householder_one_race_other(n)") AS other
FROM jur_neighborhoods AS n,
      dmo_census2010_01 AS c
WHERE st_within(c.geom, n.geom)
GROUP BY n.gid, n.pri_neigh
ORDER BY pri_neigh),
meandata AS (SELECT gid, pri_neigh, white/(total+.0001) AS whiteper, black/(total+.0001) AS blackper, asian/(total+.00001) AS asianper FROM rawdata),
calcdmean AS (SELECT *, (whiteper+blackper+asianper)/3 AS mean FROM meandata)
      sqrt((power(mean-whiteper, 2) + power(mean-blackper, 2) +power(mean-asianper, 2))/3) AS standard_dev
FROM calcdmean
ORDER BY standard_dev

Cultural Diversity

The result is a set of means for each sub-group and their standard deviation from the mean. The lowest standard deviation is inherently the most diverse neighborhood.  The map shows expected results; A relative lack of diversity around the perimeter and much more diversity in the center of the city.


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s