Cluster analysis

Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics.

Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It will often be necessary to modify data preprocessing and model parameters until the result achieves the desired properties.
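As a concrete illustration of one of these notions (groups with small distances among the cluster members), here is a minimal sketch of k-means clustering using only the Python standard library. The function name `kmeans`, the sample points, and the choice of k = 2 are assumptions made for this example; other notions of a cluster (density, distributions) would call for different algorithms.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm on a list of (x, y) tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initialise the centroids with k distinct points drawn from the data.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = kmeans(points, k=2)
```

The number of clusters `k` and the distance function (Euclidean here) are exactly the kind of parameter settings that depend on the data set and the intended use, so in practice they are revisited as part of the iterative process described above.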

Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς “grape”) and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification primarily their discriminative power is of interest. This often leads to misunderstandings between researchers coming from the fields of data mining and machine learning, since they use the same terms and often the same algorithms, but have different goals.

Cluster analysis originated in anthropology with Driver and Kroeber in 1932, was introduced to psychology by Zubin in 1938 and Tryon in 1939 [1], and was famously used by Cattell beginning in 1943 [2] for trait theory classification in personality psychology.


Things that might affect neighborhood definition

Discussions about the traits of strong downtowns and what makes them succeed usually focus on larger cities such as Vancouver, BC, Portland, OR, New York, NY or Charleston, SC. However, a lot can also be learned by looking at smaller places. The authors found this when we recently looked at downtowns in two small Wisconsin communities. What we learned from them is applicable to many other communities of comparable size.

Our experiences in these two communities certainly confirmed that two basic and broadly held revitalization tenets are just as applicable to small communities as they are to large ones: the need for a comprehensive approach to downtown revitalization and the need to focus on leveraging existing assets. The focus here will be on three other topics that evidence these tenets and deserve our attention:

  • The surprisingly complex economic development challenges that many small downtowns typically face.
  • Providing jobs, especially in more rural areas, is a chronic and seemingly intractable problem.
  • These small communities too often lack the resources and full range of professionals to initiate and manage broad economic changes.

We again found an economy with numerous economic components and related markets that would have to be analyzed:

  • Retail and restaurants
  • Personal services
  • Educational facilities
  • A medical clinic
  • A seniors’ home
  • A high tech manufacturer

Complex Land Use and Transportation Issues. Even more surprising than the number of markets we had to investigate in Sherwood and the depth of the analyses they required were the complex land use and transportation issues that were hurting the downtown:

  • A high degree of dispersion that might be more readily expected in a larger, more urban community. Even with its small population, Sherwood has four commercial nodes, including a growing highway node that intercepts a lot of residents before they reach the downtown and where significant new businesses want to locate, e.g. a supermarket, a childcare center, restaurants. Economic agglomeration is very poor, and in a small economy economic assets benefit even more from agglomeration.
  • The downtown is “unfriendly” to pedestrians – it lacks “walkability.” It has significant traffic with lots of trucks. It lacks a solid building wall front and adequate parking spaces. Many of its businesses are closed to shoppers during the day.
  • An inability to benefit from a nearby “captive market.” Access to an abutting popular state park was changed so visitors no longer had to drive through the downtown – or Sherwood.
  • An underdeveloped local roadway system that does not bring residents in newer parts of town naturally to the downtown. Also, the State recently proposed a highway expansion through the heart of downtown, which would have demolished several businesses and undermined what little pedestrian activity currently exists.

Our team found a number of complex land use and transportation issues to address. However, unlike Sherwood, which faces growing pains associated with exurban growth, Village X is facing strong, complex and seemingly intractable challenges, characteristic of other small, often more rural communities and their downtowns:

  • Its region is sparsely populated and has little or no growth.
  • The regional economy has long been problematic.
  • Attracting or creating firms that can provide new jobs is tough.

Characteristics and Guidelines of Great Neighborhoods

A neighborhood can be based on a specific plan or the result of a more organic process.

Neighborhoods of different kinds are eligible — downtown, urban, suburban, exurban, town, small village — but should have a definable sense of boundary.

Neighborhoods selected for a Great Neighborhood designation must be at least 10 years old.

Description of the Neighborhood

It is important to identify the geographic, demographic, and social characteristics of the neighborhood. Tell us about its location (e.g., urban, suburban, rural), density (e.g., dwelling units per acre), and street layout and connectivity; its economic, social, and ethnic diversity; and its functionality (e.g., residential, commercial, retail). We also want to know whether a plan or specific planning efforts contributed to or sustained the character of the neighborhood, or if the neighborhood formed more organically and not through a formal planning process.

Neighborhood Form and Composition

How does the neighborhood …

  • Capitalize on building design, scale, architecture, and proportionality to create interesting visual experiences, vistas, or other qualities?
  • Accommodate multiple users and provide access (via walking, bicycling, or public transit) to multiple destinations that serve its residents?
  • Foster social interaction and create a sense of community and neighborliness?
  • Promote security from crime and make streets safe for children and other users (e.g., traffic calming, other measures)?
  • Use, protect, and enhance the environment and natural features?

Neighborhood Character and Personality

How does the neighborhood …

  • Reflect the community’s local character and set itself apart from other neighborhoods?
  • Retain, interpret, and use local history to help create a sense of place?

Neighborhood Environment and Sustainable Practices

How does the neighborhood …

  • Promote or protect air and water quality, protect groundwater resources, and respond to the growing threat of climate change? What forms of “green infrastructure” are used (e.g., local tree cover mitigating heat gain)?
  • Utilize measures or practices to protect or enhance local biodiversity or the local environment?

Great Neighborhoods – Characteristics and Guidelines for Designation

A neighborhood can be based on a specific plan or the result of a more organic process. Neighborhoods of different kinds are eligible — downtown, urban, suburban, exurban, town, small village — but should have a definable sense of boundary. Neighborhoods selected for a Great Neighborhood designation must be at least 10 years old.

Characteristics of a Great Neighborhood include:

  1. Has a variety of functional attributes that contribute to a resident’s day-to-day living (i.e. residential, commercial, or mixed-uses).
  2. Accommodates multi-modal transportation (i.e. pedestrians, bicyclists, drivers).
  3. Has design and architectural features that are visually interesting.
  4. Encourages human contact and social activities.
  5. Promotes community involvement and maintains a secure environment.
  6. Promotes sustainability and responds to climatic demands.
  7. Has a memorable character.

Description of the Neighborhood

  1. When was the neighborhood first settled?
  2. Where is the neighborhood located: in a downtown, urban area, suburb, exurban area (i.e., on the fringes of a metropolitan area), village, or small town? What is the neighborhood’s approximate density (e.g., in dwelling units per acre, or other)?
  3. What is the neighborhood’s location, its physical extent, and layout? What are the boundaries of the neighborhood? Are these boundaries formal, defined by an institution or jurisdiction (i.e., wards or other political boundaries, neighborhood associations, other entities) or is the neighborhood defined informally?
  4. How large a geographic area does the neighborhood encompass (number of blocks, acres, or other measurement)?
  5. What is the layout (e.g., grid, curvilinear) of the streets? Is there street connectivity; is it easy to get from one place to another by car, foot, or bike within or beyond the neighborhood without going far out of one’s way?
  6. What is the mix of residential, commercial, retail and other uses?
  7. What activities and facilities support everyday life (e.g., housing, schools, stores, parks, green space, businesses, churches, public or private facilities, common streets, transit, etc.)?
  8. Is there diversity amongst the residents, including economic, social, ethnic, and demographic? Describe the neighborhood’s homogeneity or heterogeneity in those terms.
  9. How has a plan or planning contributed to or sustained the character of the neighborhood? Or did the neighborhood form more organically and not through a formal planning process?

Guidelines for Great Neighborhoods

1.0 Neighborhood Form and Composition

1.1 Does the neighborhood have an easily discernable locale? What are its borders?

1.2 How is the neighborhood fitted to its natural setting and the surrounding environs?

1.3 What is the proximity between different places in the neighborhood? Are these places within walking or biking distances? Does walking or bicycling within the neighborhood serve multiple purposes? Describe (access to transit, parks, public spaces, shopping, schools, etc.). How are pedestrians and bicyclists accommodated (sidewalks, paths or trails, designated bike lanes, share-the-road signage, etc.)?

1.4 How does the neighborhood foster social interaction and promote human contact? How is a sense of community and neighborliness created?

1.5 Does the neighborhood promote security from crime, and is it perceived as safe? How are streets made safe for children and other users (e.g., traffic calming, other measures)?

1.6 Is there consistency of scale between buildings (i.e., are buildings proportional to one another)?

2.0 Neighborhood Character and Personality

2.1 What makes the neighborhood stand out? What makes it extraordinary or memorable? What elements, features, and details reflect the community’s local character and set the neighborhood apart from other neighborhoods?

2.2 Does the neighborhood provide interesting visual experiences, vistas, natural features, or other qualities?

2.3 How does the architecture of houses and other buildings create visual interest? Are the houses and buildings designed and scaled for pedestrians?

2.4 How is local history retained, interpreted, and used to help create a sense of place?

2.5 How has the neighborhood adapted to change? Include specific examples.

3.0 Neighborhood Environment and Sustainable Practices

3.1 How does the neighborhood respond to the growing threat of climate change (e.g., local tree cover mitigating heat gain)?

3.2 How does the neighborhood promote or protect air and water quality, protect groundwater resources if present, and minimize or manage stormwater runoff? Is there any form of “green infrastructure”?

3.3 What measures or practices exist to protect or enhance local biodiversity or the local environment?

The Livehoods Project: Utilizing Social Media to Understand the Dynamics of a City

Studying the social dynamics of a city on a large scale has traditionally been a challenging endeavor, requiring long hours of observation and interviews, usually resulting in only a partial depiction of reality. At the same time, the boundaries of municipal organizational units, such as neighborhoods and districts, are largely statically defined by the city government and do not always reflect the character of life in these areas. To address both difficulties, we introduce a clustering model and research methodology for studying the structure and composition of a city based on the social media its residents generate. We use data from approximately 18 million check-ins collected from users of a location-based online social network. The resulting clusters, which we call Livehoods, are representations of the dynamic urban areas that comprise the city. We take an interdisciplinary approach to validating these clusters, interviewing 27 residents of Pittsburgh, PA, to see how their perceptions of the city project onto our findings there. Our results provide strong support for the discovered clusters, showing how Livehoods reveal the distinctly characterized areas of the city and the forces that shape them.
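The abstract does not spell out the clustering model, so the following is not the authors' method. It is only a minimal sketch of one way check-in data could be turned into a venue-to-venue similarity score that a clustering algorithm might consume: each venue is represented by the users who checked in there, and two venues are compared by cosine similarity. The user and venue names, and the helper names `venue_vectors` and `cosine`, are hypothetical.

```python
import math
from collections import Counter, defaultdict

def venue_vectors(checkins):
    """Map each venue to a Counter of how often each user checked in there."""
    vectors = defaultdict(Counter)
    for user, venue in checkins:
        vectors[venue][user] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[u] * b[u] for u in a if u in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical (user, venue) check-ins, invented for illustration only.
checkins = [
    ("u1", "cafe"), ("u2", "cafe"),
    ("u1", "bookshop"), ("u2", "bookshop"),
    ("u3", "stadium"),
]
vectors = venue_vectors(checkins)
sim_cafe_bookshop = cosine(vectors["cafe"], vectors["bookshop"])
sim_cafe_stadium = cosine(vectors["cafe"], vectors["stadium"])
```

Venues visited by the same people score close to 1, and venues with no visitors in common score 0. A similarity of this kind reflects who uses a place rather than where a municipal boundary happens to fall.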

Mapping Foursquare Check-Ins using QGIS

Here are some simple steps I used to map the Foursquare check-ins:

1) Using a simple Foursquare search with the venue explore API, I selected coffee shop check-ins, using coordinates for The Loop within a 5000 meter distance. These are the response results: Apigee Snapshot: Foursquare The Loop Coffee Shop Check-Ins.

2) In order to geolocate these check-ins in QGIS, I had to import the JSON into Excel, then filter and parse the information to show the name of each coffee shop, its latitude, and its longitude.

3) Finally, I imported the CSV file into QGIS.
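Step 2 could also be scripted instead of done by hand in Excel. The sketch below is a hedged illustration, not what was actually used here. It assumes the response nests venues under `response` → `groups` → `items` → `venue`, with `lat` and `lng` under `location`; check that against the actual JSON before relying on it. The sample payload and its coordinates are made up.

```python
import csv
import io
import json

def venues_to_rows(payload):
    """Flatten an explore-style response into (name, latitude, longitude) rows.

    The nesting used here is an assumption about the response layout.
    """
    rows = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item.get("venue", {})
            location = venue.get("location", {})
            # Skip venues that carry no coordinates; they cannot be mapped.
            if "lat" in location and "lng" in location:
                rows.append((venue.get("name", ""), location["lat"], location["lng"]))
    return rows

def rows_to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["name", "latitude", "longitude"])
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical payload for illustration; not real check-in data.
sample = json.loads("""
{"response": {"groups": [{"items": [
    {"venue": {"name": "Example Coffee", "location": {"lat": 41.88, "lng": -87.63}}},
    {"venue": {"name": "No Coordinates Cafe", "location": {}}}
]}]}}
""")
rows = venues_to_rows(sample)
csv_text = rows_to_csv(rows)
```

Writing `csv_text` to a file gives a table with one row per venue, which can then be loaded in QGIS as a delimited text layer using the longitude column as X and the latitude column as Y.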

These maps show coffee shop check-ins in The Loop as purple dots.


[Image: Coffee check-ins, The Loop 09.23.13 830 am]
[Image: Coffee check-ins, The Loop 09.23.13 830 am 2]


What is a Smarter City?

Infrastructure. Operations. People.

What makes a city? The answer, of course, is all three. A city is an interconnected system of systems. A dynamic work in progress, with progress as its watchword. A tripod that relies on strong support for and among each of its pillars, to become a smarter city for all.


Smarter cities of the future will drive sustainable economic growth. Their leaders have the tools to analyze data for better decisions, anticipate problems to resolve them proactively and coordinate resources to operate effectively.

As demands grow and budgets tighten, solutions also have to be smarter, and address the city as a whole. By collecting and analyzing the extensive data generated every second of every day, tools such as the IBM Intelligent Operations Center coordinate and share data in a single view creating the big picture for the decision makers and responders who support the smarter city.