We created clusters (convex hulls) of venues making distinctions between the venues by categories. We started off by creating a mathematical algorithm that combines multiple equations:
- We needed to determine a method of comparison for all the venues. Our method used the average distance between all the venues. We used the average distance, which we calculated by comparing the distance from one venue to all other venues and doing this for all the venues and then dividing the number of venues. In essence, it is a permutation with repetition calculation.
- After we calculated the average distance we set a threshold divider of the average distance to determine the size of the clusters that we wanted to use.
- Then we had to match the venue points of comparison to cluster only by category whereby venue points would only cluster when one venue category was equal to another venue category.
- The next step was to create the shape of the cluster creating a geometry polygon to import as a PostGIS shapefile.
CREATE TABLE avgdistances2 AS
SELECT * FROM (SELECT f1.gid as v1id, f1.fsid, f2.gid as v2id, st_distance(f1.geom,f2.geom) AS dist, f1.upperlevelcat as f1ul, f2.upperlevelcat as f2ul
FROM fsvenuewithupperlevelcat2 as f1
CROSS JOIN fsvenuewithupperlevelcat2 as f2 WHERE f1.checkinct > 200 and f1.gid<>f2.gid) AS q1
WHERE q1.dist < ( SELECT SUM(avgdist)/count(hack) as avgdisttotal
FROM (SELECT f1.gid, 1 as hack, SUM(st_distance(f1.geom,f2.geom))/COUNT(f1.gid) AS avgdist
FROM fsvenue as f1
CROSS JOIN fsvenue as f2
WHERE f1.gid <> f2.gid GROUP BY f1.gid) as subq GROUP BY subq.hack)/4 AND f1ul=f2ul ORDER BY q1.v1id;
CREATE TABLE fsvenue_cathulls( gid serial NOT NULL, fsid character varying(256), CONSTRAINT fsvenue_hulls_pkey PRIMARY KEY (gid) ) WITH ( OIDS=FALSE );
ALTER TABLE fsvenue_hulls OWNER TO poweruser; GRANT ALL ON TABLE hullmp3 TO poweruser;
INSERT INTO fsvenue_hulls (fsid,geom) SELECT ad.fsid, st_collect(v.geom) FROM avgdistances2 AS ad Left JOIN fsvenue AS v ON ad.v2id = v.gid WHERE v1id <> 42 GROUP BY ad.fsid ORDER BY ad.fsid;
CREATE VIEW fsvenue_cathulls AS SELECT v.gid, v.fsid, v.upperlevelcat, st_convexhull(h.geom)::geometry(‘Polygon’, 3435) as geom FROM fsvenue_hulls as h LEFT JOIN fsvenuewithupperlevelcat2 as v ON h.fsid = v.fsid WHERE st_npoints(h.geom) > 3;
SELECT * FROM fsvenue_cathulls;
Import PostGIS file to QGIS
Result from category 1: Arts & Entertainment
Bucktown & Wicker Park