Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters). It is a main task of exploratory data mining, and a common technique for statistical data analysis used in many fields, including machine learning, pattern recognition, image analysis, information retrieval, and bioinformatics.
Cluster analysis itself is not one specific algorithm, but the general task to be solved. It can be achieved by various algorithms that differ significantly in their notion of what constitutes a cluster and how to efficiently find them. Popular notions of clusters include groups with small distances among the cluster members, dense areas of the data space, intervals or particular statistical distributions. Clustering can therefore be formulated as a multi-objective optimization problem. The appropriate clustering algorithm and parameter settings (including values such as the distance function to use, a density threshold or the number of expected clusters) depend on the individual data set and intended use of the results. Cluster analysis as such is not an automatic task, but an iterative process of knowledge discovery or interactive multi-objective optimization that involves trial and failure. It will often be necessary to modify data preprocessing and model parameters until the result achieves the desired properties.
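As an illustration of one of the cluster notions above (groups with small distances among the members), the following is a minimal sketch of k-means using Lloyd's iteration. It is only one of many possible algorithms, not the definitive one; the function name `kmeans`, the parameters `k`, `iterations` and `seed`, and the sample points are assumptions chosen for this example. The number of expected clusters `k` and the Euclidean distance function are exactly the kind of parameter settings that depend on the data set.

```python
import random
from math import dist  # Euclidean distance, Python 3.8+


def kmeans(points, k, iterations=20, seed=0):
    """Sketch of Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points picked at random.
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[i])
        if new_centroids == centroids:
            break  # no centroid moved, so the assignment is stable
        centroids = new_centroids
    labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
    return centroids, labels


# Made-up sample data: two visibly separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
```

Changing `k`, the distance function, or the starting centroids can change the result, which is why the choice of algorithm and parameters usually has to be revisited iteratively.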
Besides the term clustering, there are a number of terms with similar meanings, including automatic classification, numerical taxonomy, botryology (from Greek βότρυς “grape”) and typological analysis. The subtle differences are often in the usage of the results: while in data mining, the resulting groups are the matter of interest, in automatic classification primarily their discriminative power is of interest. This often leads to misunderstandings between researchers coming from the fields of data mining and machine learning, since they use the same terms and often the same algorithms, but have different goals.
Cluster analysis originated in anthropology with Driver and Kroeber in 1932, was introduced to psychology by Zubin in 1938 and Tryon in 1939, and was famously used by Cattell beginning in 1943 for trait theory classification in personality psychology.
Discussions about the traits of strong downtowns and what makes them succeed usually focus on larger cities such as Vancouver, BC, Portland, OR, New York, NY or Charleston, SC. However, a lot can also be learned by looking at things on a smaller scale. That was our experience when we recently looked at downtowns in two small Wisconsin communities. What we learned from them is applicable to many other communities of comparable size.
Our experiences in these two communities certainly confirmed that two basic and broadly held revitalization tenets are just as applicable to small communities as they are to large ones: the need for a comprehensive approach to downtown revitalization and the need to focus on leveraging existing assets. The focus here will be on three other topics that evidence these tenets and deserve our attention:
- The surprisingly complex economic development challenges that many small downtowns typically face
- Providing jobs, especially in more rural areas, is a chronic and seemingly intractable problem
- These small communities too often lack the resources and full range of professionals to initiate and manage broad economic changes.
We again found an economy with numerous economic components and related markets that would have to be analyzed:
- Retail and restaurants
- Personal services
- Educational facilities
- A medical clinic
- A seniors’ home
- A high tech manufacturer
Complex Land Use and Transportation Issues. Even more surprising than the number of markets we had to investigate in Sherwood and the depth of the analyses they required were the complex land use and transportation issues that were hurting the downtown:
- A high degree of dispersion that might be more readily expected in a larger, more urban community. Even with its small population, Sherwood has four commercial nodes, including a growing highway node that intercepts many residents before they reach the downtown and where significant new businesses want to locate, e.g. a supermarket, a childcare center, restaurants. Economic agglomeration is poor, and in a small economy, economic assets benefit even more from agglomeration
- The downtown is “unfriendly” to pedestrians – it lacks “walkability.” It has significant traffic with lots of trucks. It lacks a solid building wall front and adequate parking spaces. Many of its businesses are closed to shoppers during the day
- An inability to benefit from a nearby “captive market.” Access to an abutting popular state park was changed so visitors no longer had to drive through the downtown – or Sherwood
- An underdeveloped local roadway system that does not bring residents in newer parts of town naturally to the downtown. Also, the State recently proposed a highway expansion through the heart of downtown, which would have demolished several businesses and undermined what little pedestrian activity currently exists.
Our team found a number of complex land use and transportation issues to address. However, unlike Sherwood, which faces growing pains associated with exurban growth, Village X is facing strong, complex and seemingly intractable challenges, characteristic of other small, often more rural communities and their downtowns:
- Its region is sparsely populated and has little or no growth
- The regional economy has long been problematic
- Attracting or creating firms that can provide new jobs is tough.
Studying the social dynamics of a city on a large scale has traditionally been a challenging endeavor, requiring long hours of observation and interviews, usually resulting in only a partial depiction of reality. At the same time, the boundaries of municipal organizational units, such as neighborhoods and districts, are largely statically defined by the city government and do not always reflect the character of life in these areas. To address both difficulties, we introduce a clustering model and research methodology for studying the structure and composition of a city based on the social media its residents generate. We use data from approximately 18 million check-ins collected from users of a location-based online social network. The resulting clusters, which we call Livehoods, are representations of the dynamic urban areas that comprise the city. We take an interdisciplinary approach to validating these clusters, interviewing 27 residents of Pittsburgh, PA, to see how their perceptions of the city project onto our findings there. Our results provide strong support for the discovered clusters, showing how Livehoods reveal the distinctly characterized areas of the city and the forces that shape them.
Here are some simple steps I used to map the Foursquare check-ins:
1) Using the Foursquare venue explore API, I ran a simple search for coffee shop check-ins within 5,000 meters of coordinates for The Loop. The response results are in the Apigee Snapshot: Foursquare The Loop Coffee Shop Check-Ins.
2) To geolocate these check-ins in QGIS, I imported the JSON into Excel, then filtered and parsed it to show the name of each coffee shop, its latitude, and its longitude.
3) I imported the resulting CSV file into QGIS.
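The Excel filtering in step 2 could also be scripted. The sketch below is one possible way to do it, not the method used for these maps: it assumes the saved JSON nests venues under `response` → `groups` → `items` → `venue`, with coordinates in `location.lat` and `location.lng`, and that layout should be checked against the actual response. The sample payload, the venue names and the helper names `venues_to_rows` and `rows_to_csv` are made up for illustration.

```python
import csv
import io
import json

# Hypothetical sample payload (invented venues, assumed response layout).
SAMPLE_RESPONSE = """
{"response": {"groups": [{"items": [
  {"venue": {"name": "Example Coffee A",
             "location": {"lat": 41.88, "lng": -87.63}}},
  {"venue": {"name": "Example Coffee B",
             "location": {"lat": 41.8825, "lng": -87.628}}}
]}]}}
"""


def venues_to_rows(payload):
    """Collect (name, latitude, longitude) for every venue that has coordinates."""
    rows = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item.get("venue", {})
            location = venue.get("location", {})
            if "lat" in location and "lng" in location:
                rows.append((venue.get("name", ""), location["lat"], location["lng"]))
    return rows


def rows_to_csv(rows):
    """Render the rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["name", "latitude", "longitude"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = venues_to_rows(json.loads(SAMPLE_RESPONSE))
csv_text = rows_to_csv(rows)
```

Writing `csv_text` to a file gives a delimited text file with one row per venue, which QGIS can load as a point layer using the longitude and latitude columns.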
These maps show coffee shop check-ins in The Loop as purple dots.
What is a Smarter City?
Infrastructure. Operations. People.
What makes a city? The answer, of course, is all three. A city is an interconnected system of systems. A dynamic work in progress, with progress as its watchword. A tripod that relies on strong support for and among each of its pillars, to become a smarter city for all.
Smarter cities of the future will drive sustainable economic growth. Their leaders have the tools to analyze data for better decisions, anticipate problems to resolve them proactively and coordinate resources to operate effectively.
As demands grow and budgets tighten, solutions also have to be smarter and address the city as a whole. By collecting and analyzing the extensive data generated every second of every day, tools such as the IBM Intelligent Operations Center coordinate and share data in a single view, creating the big picture for the decision makers and responders who support the smarter city.