Naming Conventions

To keep the database organized, we need conventions for naming our files so everyone knows what each file is. A few of us met today to define the major categories, and this is what we came up with. We can discuss it further tomorrow so it suits everyone's projects, but until then, use this naming scheme for the new database. Just a reminder… do not use spaces or uppercase letters in your table names.

City Geom:
– ex. Sidewalks, City Limits, Streets, Building Footprints
– should ideally have just an “id” and a “geom” column, as these layers will be used mainly for visualizing in QGIS
– start layers in this category with cty_

Transportation:
– ex. Bus Routes, CTA Stops
– start layers in this category with trn_

Demographics:
– ex. Census by Block 2010, Income by Block 2000
– start layers in this category with dmo_

Landscape:
– ex. Parks, Trees
– start layers in this category with lnd_

Jurisdiction:
– ex. Police Districts, TIF Districts, Neighborhoods, Wards
– start layers in this category with jur_

Report:
– ex. Crime 2013, 311 Requests, Permits, Business Licenses
– start layers in this category with rpt_

Social:
– ex. Facebook API, Yelp API
– start layers in this category with soc_

Agent:
– ex. Customer, Alderman, Dogs
– start layers in this category with agt_
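As a quick illustration of the convention (the table and column names here are hypothetical), a sidewalks layer in the City Geom category might be created like this, assuming the PostGIS extension is installed for the geometry type:

```sql
-- Hypothetical City Geom layer: "cty_" prefix, all lowercase, no spaces,
-- and only an id plus a geom column for visualization in QGIS.
CREATE TABLE cty_sidewalks (
    id serial PRIMARY KEY,
    geom geometry  -- requires the PostGIS extension
);
```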


Another place to get data: https://opendata.socrata.com/
(worldwide, but has all Chicago Data Portal data and then some)



This PostgreSQL tutorial is designed for intermediate-level users who want to export stored data as a CSV file.  This can be done with a few different COPY TO queries, which let you separate out exactly the information required for each export.  By completing this tutorial, users will also gain the skills to import CSV data directly into a PostgreSQL database using the COPY FROM syntax.

 

The CSV file format can be much more useful than other spreadsheet formats such as Microsoft Excel (XLS) because CSV has no built-in file length limitation.  By exporting your PostgreSQL data to a CSV file you can store hundreds of thousands or even millions of rows, whereas the XLS format can only store up to 65,536 rows.  Therefore, if you wish to export a larger data set (more than ~65,000 records), all of your information can be saved into a single file using a COPY TO query.

 

Before we can start exporting data to a CSV file, we must set up a location on the server with the correct file permissions.  Navigate to the pgsql/data folder on your system, ex. /usr/local/pgsql/data.

 

cd /usr/local/pgsql/data

 

Enter the following commands to set up a new directory called csv, owned by the postgres user.

 

mkdir csv
chown postgres csv
chmod 755 csv
ls -la

 

Now that we have a csv directory owned by the postgres user, it has the necessary file permissions (drwxr-xr-x postgres root) required for exporting CSV files.

 

PostgreSQL COPY TO: Creating a CSV File

 

Export all data in a table

COPY tablename
TO '/usr/local/pgsql/data/csv/alldata_export.csv';

 

Export defined columns in a table

COPY tablename(column1, column2, column3, column4)
TO '/usr/local/pgsql/data/csv/definedcolumns_export.csv';

 

Export with defined delimiter

COPY tablename(column1, column2, column3, column4)
TO '/usr/local/pgsql/data/csv/defineddelimiter_export.csv'
WITH DELIMITER ',';

 

Export with 1st row column name headers

COPY tablename(column1, column2, column3, column4)
TO '/usr/local/pgsql/data/csv/header_export.csv'
WITH DELIMITER ','
CSV HEADER;

 

Export using a SELECT WHERE query

COPY (SELECT column1, column2, column3, column4
FROM tablename
WHERE column1 = 'value')
TO '/usr/local/pgsql/data/csv/selectwhere_export.csv'
WITH DELIMITER ','
CSV HEADER;

 

Export using an ordered query

COPY (SELECT column1, column2, column3, column4
FROM tablename
WHERE column1 = 'value'
ORDER BY column4 DESC)
TO '/usr/local/pgsql/data/csv/ordered_export.csv'
WITH DELIMITER ','
CSV HEADER;

 

Export maximum 100 records using a limit query

COPY (SELECT column1, column2, column3, column4
FROM tablename
WHERE column1 = 'value'
ORDER BY column4 DESC
LIMIT 100)
TO '/usr/local/pgsql/data/csv/limit_export.csv'
WITH DELIMITER ','
CSV HEADER;

 

Note: if your pgsql/data directory is not located at /usr/local/pgsql/data, update the paths above according to your server setup.
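As an aside, if you cannot create a writable directory on the server, psql's client-side \copy meta-command accepts the same options as COPY but writes to a path on your local machine, so no server file permissions are needed.  A sketch (table, columns, and output path are hypothetical):

```
\copy (SELECT column1, column2 FROM tablename) TO 'local_export.csv' WITH DELIMITER ',' CSV HEADER
```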

 

PostgreSQL COPY FROM: Loading a CSV File into a Table

 

When you are trying to import data into PostgreSQL via a CSV file, a little preparation is required before you can execute the COPY FROM query.  For the COPY FROM to succeed, we first have to create an empty table schema capable of storing the data.  Download the sample CSV data here and open it up.  This data lists the 10 largest municipalities in British Columbia by their 1996 population.

It contains 4 columns:

gid, prov, name, pop1996

 

Looking at the data, you can determine the data type required for each column.  These data types are needed when running the CREATE TABLE query.

 

CREATE TABLE bctop10 (
    gid numeric,
    prov character varying(2),
    "name" character varying(68),
    pop1996 numeric
);

 

Now that the empty table schema is ready, you must upload the bctop10.csv file to your server and move the file into the csv directory.

 

mv bctop10.csv /usr/local/pgsql/data/csv

 

Once the file is in your csv directory you can import the data into your empty table schema using the following query.

 

COPY bctop10(gid, prov, "name", pop1996)
FROM '/usr/local/pgsql/data/csv/bctop10.csv'
WITH DELIMITER ','
CSV HEADER;
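To confirm the import worked, a quick query against the table created above (column names taken from that schema) should return the municipalities ordered by population:

```sql
-- Sanity check: should list all 10 rows, largest municipality first.
SELECT "name", pop1996
FROM bctop10
ORDER BY pop1996 DESC
LIMIT 10;
```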

 

Note: if your pgsql/data directory is not located at /usr/local/pgsql/data, update the paths above according to your server setup.