
Mapping Skillshare with Codrina

- October 10, 2014 in Community, Events, Geocoding, HowTo, Mapping, School_Of_Data

Why are maps useful visualization tools? What doesn’t work with maps? Today we hosted a School of Data skillshare with Codrina Ilie, School of Data Fellow.

Codrina Ilie shares perspectives on building a map project

What makes a good map? How can perspective, assumptions and even colour change the quality of the map? This is a one-hour video skillshare to learn all about map making from our School of Data fellow:

Learn some basic mapping skills with slides

Codrina prepared these slides with some extensive notes and resources. We hope that it helps you on your map journey.


Hand drawn map

Resources:

(Note: the hand drawn map was created at School of Data Summer Camp. Photo by Heather Leson CCBY)


Breaking the Knowledge Barrier: The #OpenData Party in Northern Nigeria

- October 1, 2014 in Community, Data Expeditions, Data for CSOs, Events, Follow the Money, Geocoding, Mapping, Spreadsheets, Storytelling, Uncategorized, Visualisation

If the only news you have been watching or listening to about Northern Nigeria is of the Boko Haram violence in that region, then you need to know that other news exists: non-governmental organizations and the media there are interested in using state and federal government budget data to monitor service delivery, and to make sure that funds promised by the government reach the communities they were meant for.

This time around, the #OpenData party moved from Nigeria’s capital, Abuja, to Gusau, Zamfara, and was held at the Zamfara Zakat and Endowment Board Hall on Thursday, September 25 and Friday, September 26, 2014. The 40 participants, all set for this budget data expedition, included the state Budget Monitoring Group (a coalition of NGOs in Zamfara) coordinated by the DFID (Department for International Development) State Accountability and Voice Initiative (SAVI), and other international NGOs such as Society for Family Health (SFH) and Save the Children, amongst others.


Group picture of participants at the #OpenData Party in Zamfara

But how do you teach data and its use in a region that is less technology-savvy? We had to demystify data for this community by engaging in traditional visualization and scraping, which means using paper artworks to visualize the data we had already made available on the Education Budget Tracker. “I never believed we could visualize the education budget data of the federal government as easy as what was on the wall,” exclaimed Ahmed Ibrahim of SAVI.


Visualization of the Education Budget for Federal Schools in Zamfara

As budgets have become a holy grail, especially with state governments in Nigeria, what mattered most to the participants on the first day was how to find budget data and the processes involved in tracking whether services were really delivered as promised in the budget. Finding the state’s budget data has been a little hectic, but after much advocacy the government has released datasets on the education and health sectors. So what challenges have the NGOs faced in tracking or using this data, given that they have been engaged in budget tracking for a while now?

Challenges of Budget Tracking Highlighted by participants


“Well, it is important to note that getting the government to release the data took us some time and rigorous advocacy, added to the fact that we ourselves needed training on analysis, and telling stories out of the budget data,” explained Joels Terks Abaver of the Christian Association of Non Indigenes. During the breakout sessions, access to budget information and training on how to use budget data emerged as prominent challenges in the resolutions of several groups.

The second day took participants through the data pipelines, while running an expedition on the education and health sector budget data presented on the first day. Alas! We found a big problem with this budget data: it was not location-specific. How does one track budget data that does not answer the question of where? In budget tracking, it is important to have descriptive data that states exactly where the funds will go. An example is a line item such as “Construction of Borehole water pump in Kaura Namoda LGA Primary School”, or a budget document that lists the budget of Kaura Namoda LGA Primary School under its own subtitle.

Taking participants through the data pipelines and how it relates to the Monitoring and Evaluation System


In communities like this, it is important to note that soft skills need to be taught: 80% of the participants did not know why Excel spreadsheets are used for budget data, 70% did not know that there is a Google spreadsheet that works like Microsoft Excel, and none of the participants knew where to get the Nigeria budget data or what Open Data means. Taking School of Data to this part of the world through the Open Data Party has changed that. “It was an interesting and educative 2-day event taking us through the budget cycle and how budget data relates to tracking,” said Babangida Ummar, the Chairman of the Budget Working Group.

Going forward, this group of NGOs and journalists has decided to join the trusted sources that will monitor service delivery at four education institutions in the state, using the Education Budget Tracker. It was an exciting two days, and we now hope to have a monthly engagement with this working group as a renewed effort to ensure service delivery in the education sector. Wondering where the next data party will happen? We are going to the South-South of Nigeria in October, Calabar to be precise, and on the last day of the month we will be rocking Abuja!


A Weekend of Data, Hacks and Maps in Nigeria

- September 16, 2014 in charity data, Data Cleaning, Data Expeditions, event, Mapping, maps, School_Of_Data, Spreadsheets, Visualisation

It was another weekend of hacking for good all around the world, and Abuja, Nigeria was not left out: 30 participants gathered at the Indigo Trust-funded space of Connected Development [CODE] on 12 – 14 September, scraping datasets, brainstorming about creating technology for good, and not leaving one thing out: talking soccer (because it was a weekend, and Nigerian “techies” love soccer, especially the English Premiership).

Participants at the Hack4Good 2014 in Nigeria


Leading the team was Dimgba Kalu (Software Architect with Integrated Business Network and founder of TechNigeria), who kick-started the three-day event, which was built around 12 coders and 18 other participants working on the Climate Change adaptation stream of this year’s #Hack4Good. So what data did we explore, and what was hacked over the weekend in Nigeria? Three streams were worked on:

  1. Creating a satellite imagery tagging/tasking system that can help the National Space Research and Development Agency deploy micromappers to tag satellite imagery from NigeriaSat1 and NigeriaSat2
  2. Creating an i-reporting system that allows citizens to report disasters to the National Emergency Management Agency
  3. Creating an application that lets citizens find the next water point and its quality within their community, using the newly released dataset on water points in the country from the Nigeria Millennium Development Goal Information System

Looking at the three systems that the 12 coders proposed to develop, one thing stands out: in Nigeria, application developers still find it difficult to produce apps that engage citizens. One particular reason is that Nigerians communicate most easily through radio, followed by SMS, as was confirmed by a survey I ran during the data exploration session.

Coders Hackspace


Going forward, all participants agreed that incorporating the media mentioned above (radio and SMS) and making games out of these applications could arouse the interest of users in Nigeria. “It doesn’t mean that Nigerian users are not interested in mobile apps, what we as developers need is to make our apps more interesting,” confirmed Jeremiah Ageni, a participant.

The three-day event started with cleaning the water points data while going through the data pipelines, allowing the participants to understand how these pipelines relate to mapping and hacking. The 12 hackers were divided into groups, and the second day saw thorough hacking, into datasets and maps! Some hours into the second day, it became clear that the first task would not be achievable, so the energy was channelled towards the second and third tasks.

SchoolofData Fellow - Oludotun Babayemi taking on the Data Exploration session


Hacking can be fun, especially when side attractions and talk come up: Manchester United winning big (one coder was checking every minute and announcing the scores), old laptops breaking (it seems coders in Abuja have old ones), coffee and tea running out (we ran through the coffee as if it were a sprint), failing operating systems (interestingly, no coder in the house had a Mac), fear of a power outage (all thanks to the power authority, we had 70 hours of uninterrupted power supply), and no encouragement from the opposite sex (only two ladies strolled into the hack space).

Bring on the energy to the hackspace


As the weekend drew to a close, coders were finalizing and preparing to show their work. A demo and prototype were produced for streams 2 and 3. The first team (working on stream 2), which won the hackathon, developed EMERGY, an application that allows citizens to send geo-referenced reports of disasters such as floods, oil spills and deforestation to the National Emergency Management Agency of Nigeria, and that also creates situational awareness of disaster-tagged or disaster-prone communities. The second team, working on stream 3, developed KNOW YOUR WATER POINT, an application that gives the geo-referenced position of water points in the country. It lets communities, emergency managers and international aid organizations know the next community where there is a water source, along with the type and condition of that source.

(The winning team of the Hack4Good Nigeria) From Left -Ben; Manga; SchoolofData Fellow -Oludotun Babayemi; Habib; Chief Executive, CODE - Hamzat


Living with coders all through the weekend was mind-blowing, though scaling these results and outputs will not come without challenges. “Bringing our EMERGY application live as an application that cuts across several platforms such as java that allows it to work on feature phones can be time consuming and needs financial and ideology support,” said Manga, leader of the first team. So, if you want to code, do endeavour to code for good!

 


How to: Choropleth Maps with D3

- June 6, 2014 in Data Journalism, Geocoding, HowTo, Mapping, maps


[Guest Cross-post from Jonathon Morgan of Crisis.net. CrisisNET finds, formats and exposes crisis data in a simple, intuitive structure that’s accessible anywhere. Now developers, journalists and analysts can skip the days of tedious data processing and get to work in minutes with only a few lines of code. See the Original post]

syriamapcut

D3 is quickly becoming the de facto library for browser-based data visualizations. However, while it’s widely used for line graphs and bar charts, its mapping features are still fairly underutilized, particularly in relation to more established tools like CartoDB and, of course, Google Maps. Those tools have their place, but when you need fine-grained control over the presentation and interactivity of your geospatial data, D3 can be a powerful alternative.

Today we’ll walk through how to create a popular visualization: the choropleth map. These are used to show the relative concentration of data points within a given region. For example, this might be the number of people within a particular age range in every county in a state, or the number of reported cases of the flu in each state in a country. The information we’ll be mapping is a little more exotic. I recently collaborated with Eliot Higgins, an arms transfer analyst focused on the ongoing conflict in Syria, to retrieve data from 1,700 Facebook pages and YouTube accounts associated with militant groups and humanitarian organizations working in Syria. We ingested that data into CrisisNET, which then made it possible for us to generate a “heat map” showing which parts of Syria are experiencing the most intense fighting.

In order to do this we’ll need to:

  • Work with projections to transform latitude, longitude pairs to x, y browser coordinates
  • Render city boundaries as SVG paths using D3 drawing tools
  • Shade each city relative to its reported level of violence

Let’s get started.

Before we can do anything we’ll need some data. A geospatial “feature” (like a city, state, etc.) is defined as a polygon, which is represented as a list of longitude/latitude pairs. For example:

[
[ 36.712428478000049, 35.83274311200006 ],
[ 36.704171874000053, 35.830347390000043 ],

]

Each pair is a corner of the polygon, so if you plotted them on a map and connected the dots, you would get the outline of the feature. Awesome!

Geospatial data comes in a variety of formats, like shapefiles and KML. However, the emerging standard, particularly for use in web applications, is GeoJSON. Not surprisingly, this is the format supported by D3 and the one we’ll be using.

Depending on the region you’re trying to map, GeoJSON polygons defining features in that region may be easy to find — like these GeoJSON files for all counties in the United States. On the other hand, particularly if you’re interested in the developing world, you’ll probably need to be more creative. To map cities in Syria, I tracked down a shapefile from an NGO called Humanitarian Response, and then converted that shapefile to GeoJSON using a tool called ogr2ogr. Fortunately for you, I’ve made the GeoJSON file available, so just download that and you’ll be ready to go.

Let’s Talk Projections

With our polygons in hand, we can start mapping.

Remember that latitude and longitude coordinates denote positions on the surface of the Earth, which is not flat (it is an ellipsoid). Your computer screen is a plane (which means it’s flat), so we need some way to translate the position of a point on a curved surface to its corresponding point on a flat surface. The algorithms for doing this are called “projections.” If, like me, you’ve forgotten most of your high school geometry, you’ll be pleased to learn that D3 comes included with a number of popular projections, so we won’t need to write one. Our only job is to choose the correct projection for our visualization.
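To make the idea of a projection concrete, here is a minimal sketch of about the simplest one there is (equirectangular, sometimes called plate carrée), written in plain JavaScript. It is not the projection used in this post and it is not how D3 implements its own projections; the function name and the scale of 100 pixels per degree are assumptions made up for illustration.

```javascript
// Minimal equirectangular projection, for illustration only.
// center: the [longitude, latitude] that should land in the middle of the canvas.
// scale: pixels per degree (a hypothetical, hand-picked value).
function makeEquirectangular(center, scale, width, height) {
  return function project(lonLat) {
    var x = width / 2 + (lonLat[0] - center[0]) * scale;
    // Screen y grows downward while latitude grows northward, hence the minus.
    var y = height / 2 - (lonLat[1] - center[1]) * scale;
    return [x, y];
  };
}

var project = makeEquirectangular([38.996815, 34.802075], 100, 1000, 1100);

// The center point maps to the middle of the 1000 x 1100 canvas.
console.log(project([38.996815, 34.802075]));
// A point to the north-west of the center lands up and to the left of it.
console.log(project([36.712428, 35.832743]));
```

Real projections differ in how they trade off distortions of area, shape and distance, which is why choosing one matters more than writing one.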

The Albers and Azimuthal Equal Area projections are recommended for choropleth maps, but I found both rendered my cities in a way that didn’t connect all the points in the polygons from our shapefile, so some of the city outlines didn’t form an enclosed shape. This made it impossible to shade each city without the color overflowing into other parts of the map. Although this is probably due more to my lack of familiarity with the specifics of the Albers and Azimuthal projections, I found that the Conic Conformal projection worked out of the box, so that’s the one I chose.

Drawing the Map

Now that you understand the background, we can start coding. First, attach an element to the DOM that will serve as our canvas (the code below selects it by the id map, so a div with that id will do).

Next create an SVG element and append it to the map DOM node we just created. We’ll be drawing on this SVG element in just a second.

// Size of the canvas on which the map will be rendered
var width = 1000,
    height = 1100,
    // SVG element as a JavaScript object that we can manipulate later
    svg = d3.select("#map").append("svg")
        .attr("width", width)
        .attr("height", height);

Despite the rather lengthy explanation, defining the projection in our application is actually fairly straightforward.

// Normally you’d look this up. This point is in the middle of Syria,
// given as [longitude, latitude]
var center = [38.996815, 34.802075];

// Instantiate the projection object (d3.geo.* is the D3 version 3 API)
var projection = d3.geo.conicConformal()
    .center(center)
    .clipAngle(180)
    // Size of the map itself, you may want to play around with this in
    // relation to your canvas size
    .scale(10000)
    // Center the map in the middle of the canvas
    .translate([width / 2, height / 2])
    .precision(.1);

With a projection ready to go, we’re ready to instantiate a path. This is the path across your browser window D3 will take as it draws the edges of all our city polygons.

// Assign the projection to a path
var path = d3.geo.path().projection(projection);

Finally, let’s give some geospatial data to our path object. This data will be projected to x, y pairs, representing pixel locations on our SVG element. When D3 connects these dots, we’ll see the outlines of all the cities in Syria.

Let’s use d3’s json method to retrieve the GeoJSON file I referenced earlier.

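d3.json("cities.json", function(err, data) {
    // Stop early if the GeoJSON file could not be loaded
    if (err) { return console.error(err); }
    data.features.forEach(function(feature) {
        svg.append("path")
            .datum(feature.geometry)
            .attr("class", "border")
            // Project the polygon and use the result as the path outline
            .attr("d", path);
    });
});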

That’s it!

Most of the heavy lifting is taken care of by D3, but in case you’re curious about what’s happening, here’s a little more detail. Our GeoJSON file contains an array of features, each of which is a polygon (which is represented as an array of longitude, latitude coordinate pairs). We pass the polygon to our path using the datum method, and the polygon is then converted by our projection to a linestring of pixel positions which is used by the browser to render a path DOM node inside our svg element. Phew.
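If you would like to see roughly what that conversion amounts to, here is a minimal sketch in plain JavaScript. It is a simplification under stated assumptions, not D3’s implementation: D3’s path generator also handles clipping, resampling and multi-part geometries. The function names and the toy projection are hypothetical.

```javascript
// Turn one polygon ring (an array of [longitude, latitude] pairs) into the
// string used for the "d" attribute of an SVG path. `project` is any function
// that maps [longitude, latitude] to [x, y] pixel positions.
function ringToPathData(ring, project) {
  var commands = ring.map(function (lonLat, i) {
    var xy = project(lonLat);
    // "M" moves the pen to the first corner, "L" draws a line to each later one.
    return (i === 0 ? "M" : "L") + xy[0].toFixed(1) + "," + xy[1].toFixed(1);
  });
  // "Z" closes the outline by returning to the first corner.
  return commands.join("") + "Z";
}

// Hypothetical stand-in projection: 10 pixels per degree, with the y axis
// flipped so that north is up on a 100 pixel tall canvas.
function toyProject(lonLat) {
  return [lonLat[0] * 10, 100 - lonLat[1] * 10];
}

var square = [[0, 0], [1, 0], [1, 1], [0, 1]];
console.log(ringToPathData(square, toyProject));
// prints "M0.0,100.0L10.0,100.0L10.0,90.0L0.0,90.0Z"
```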

With a working map of the country, we can now change its appearance and add interactivity just like any other DOM node. Next week we’ll use the CrisisNET API to count reports of violent incidents for each city in Syria, and shade each city on the map with CSS based on those report counts.
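As a rough preview of the shading step, here is one way counts could be turned into CSS class names. This is only a sketch under assumptions: the class names (q0, q1, ...), the number of classes and the example counts are all made up, and the follow-up post may well do it differently.

```javascript
// Map a report count onto one of `classCount` CSS class names ("q0", "q1", ...),
// using equal-width bins between 0 and the largest count.
function shadeClass(count, maxCount, classCount) {
  if (maxCount <= 0) { return "q0"; }
  var index = Math.floor(count * classCount / maxCount);
  // The largest count would otherwise fall just outside the last bin.
  return "q" + Math.min(index, classCount - 1);
}

// Invented report counts, for illustration only.
var exampleCounts = { "City A": 3, "City B": 40, "City C": 100 };
var maxCount = Math.max.apply(null, Object.keys(exampleCounts).map(function (name) {
  return exampleCounts[name];
}));

Object.keys(exampleCounts).forEach(function (name) {
  console.log(name, shadeClass(exampleCounts[name], maxCount, 5));
});
```

Each class would then get its own fill color in the stylesheet, from light for q0 to dark for the highest class.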

In the meantime you can check out the full, working map on our Syria project page.


Putting Points on Maps Using GeoJSON Created by Open Refine

- May 19, 2014 in Data Cleaning, Data for CSOs, HowTo, Mapping

Having access to geo-data is one thing, quickly sketching it on to a map is another. In this post, we look at how you can use OpenRefine to take some tabular data and export it in a format that can be quickly visualised on an interactive map.

At the School of Data, we try to promote an open standards based approach: if you put your data into a standard format, you can plug it directly into an application that someone else has built around that standard, confident in the knowledge that it should “just work”. That’s not always true of course, but we live in hope.

In the world of geo-data – geographical data – the geojson standard defines a format that provides a relatively lightweight way of representing data associated with points (single markers on a map), lines (lines on a map) and polygons (shapes or regions on a map).

Many applications can read and write data in this format. In particular, Github’s gist service allows you to paste a geojson data file into a gist, whereupon it will render it for you (Gist meets GeoJSON).

Gists_and_test

So how can we get from some tabular data that looks something like this:

simple_geo_points-tab_-_OpenRefine

Into the geojson data, which looks something like this?

{"features": [   {"geometry": 
        {   "coordinates": [  0.124862,
                 52.2033051
            ],
            "type": "Point"},
         "id": "Cambridge,UK",
         "properties": {}, "type": "Feature"
    },
   {"geometry": 
        {   "coordinates": [ 151.2164539,
                 -33.8548157
            ],
            "type": "Point"},
         "id": "Sydney, Australia",
         "properties": {}, "type": "Feature"
    }], "type": "FeatureCollection"}

[We’re assuming we have already geocoded the location to get latitude and longitude co-ordinates for it. To learn how to geocode your own data, see the School of Data lessons on geocoding or this tutorial on Geocoding Using the Google Maps Geocoder via OpenRefine].

One approach is to use OpenRefine [openrefine.org]. OpenRefine allows you to create your own custom export formats, so if we know what the geojson is supposed to look like (and the standard tells us that) we can create a template to export the data in that format.

Steps to use Open Refine:

Locate the template export tool in the OpenRefine Export drop-down menu:

export-_OpenRefine

Define the template for our templated export format. The way the template is applied is to create a standard header (the prefix), apply the template to each row, separating the templated output for each row by a specified delimiter, and then adding a standard footer (the suffix).

simple_geo_points_-_OpenRefine

Once one person has worked out the template definition and shared it under an open license, the rest of us can copy it, reuse it, build on it, improve it, and if necessary, correct it…:-) The template definitions I’ve used here are a first attempt and represent a proof-of-concept demonstration: let us know if the approach looks like it could be useful and we can try to work it up some more.

It would be useful if OpenRefine supported the ability to save and import different template export configuration files, perhaps even allowing them to be imported from and saved to a gist. Ideally, a menu selector would allow column names to be selected from the current data file and then used in the template.

Here are the template settings for a template that will take a column labelled “Place”, a column named “Lat” containing a numerical latitude value and a column named “Long” containing a numerical longitude value, and generate a geojson file that allows the points to be rendered on a map.

Prefix:

{"features": [

Row template:

 {"geometry": 
        {   "coordinates": [ {{cells["Long"].value}},
                {{cells["Lat"].value}}
            ],
            "type": "Point"},
         "id": {{jsonize(cells["Place"].value)}},
         "properties": {}, "type": "Feature"
    }

Row separator:

,

Suffix:

], "type": "FeatureCollection"}

This template information is also available as a gist: OpenRefine – geojson points export format template.
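To see how the prefix, row template, row separator and suffix fit together, here is a minimal JavaScript sketch that mimics the same steps outside OpenRefine. It is an illustration rather than how OpenRefine works internally: the function names are made up, and the two rows reuse the Cambridge and Sydney values from the example output shown earlier.

```javascript
// Rows as they might come out of the table: Place, Lat and Long columns.
var rows = [
  { Place: "Cambridge,UK", Lat: 52.2033051, Long: 0.124862 },
  { Place: "Sydney, Australia", Lat: -33.8548157, Long: 151.2164539 }
];

var prefix = '{"features": [';
var separator = ",";
var suffix = '], "type": "FeatureCollection"}';

// Equivalent of the row template: one GeoJSON Point feature per row.
// JSON.stringify plays the role of jsonize() by quoting and escaping the name.
function rowTemplate(row) {
  return '{"geometry": {"coordinates": [' + row.Long + ", " + row.Lat + '], ' +
    '"type": "Point"}, "id": ' + JSON.stringify(row.Place) + ', ' +
    '"properties": {}, "type": "Feature"}';
}

// Prefix, then each templated row joined by the separator, then the suffix.
function templatedExport(tableRows) {
  return prefix + tableRows.map(rowTemplate).join(separator) + suffix;
}

// JSON.parse throws if the templated output is not valid JSON.
var geojson = JSON.parse(templatedExport(rows));
console.log(geojson.features.length, geojson.features[0].geometry.coordinates);
```

Note that the longitude comes first in the coordinates, which is the order the geojson format expects.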

Another type of data that we might want to render onto a map is a set of markers that are connected to each other by lines.

For example, here is some data that could be seen as describing connections between two places that are mentioned on the same data row:

point_to_point_demo_tab_-_OpenRefine

The following template generates a place marker for each place name, and also a line feature that connects the two places.

Prefix:

{"features": [

Row template:

 {"geometry": 
        {   "coordinates": [ {{cells["from_lon"].value}},
                {{cells["from_lat"].value}}
            ],
            "type": "Point"},
         "id": {{jsonize(cells["from"].value)}},
         "properties": {}, "type": "Feature"
    },
{"geometry": 
        {   "coordinates": [ {{cells["to_lon"].value}},
                {{cells["to_lat"].value}}
            ],
            "type": "Point"},
         "id": {{jsonize(cells["to"].value)}},
         "properties": {}, "type": "Feature"
    },
{"geometry": {"coordinates": 
[[{{cells["from_lon"].value}}, {{cells["from_lat"].value}}], 
[{{cells["to_lon"].value}}, {{cells["to_lat"].value}}]], 
"type": "LineString"}, 
"id": null, "properties": {}, "type": "Feature"}

Row separator:

,

Suffix:

], "type": "FeatureCollection"}

If we copy the geojson output from the preview window, we can paste it onto a gist to generate a map preview that way, or test it out in a geojson format checker such as GeoJSONLint:

GeoJSONLint_-_Validate_your_GeoJSON

I have pasted a copy of the OpenRefine template I used to generate the “lines connecting points” geojson here: OpenRefine export template: connected places geojson.
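The same “two markers plus a connecting line” output can also be sketched in JavaScript by building objects and letting JSON.stringify do the formatting. The column names match the template above; the function names are hypothetical, and the example row simply pairs the Cambridge and Sydney coordinates from earlier in this post for illustration.

```javascript
// Build the three features that one table row produces:
// a marker for each place and a line joining them.
function connectedPlaceFeatures(row) {
  var from = [row.from_lon, row.from_lat];
  var to = [row.to_lon, row.to_lat];
  function feature(id, type, coordinates) {
    return {
      type: "Feature",
      id: id,
      properties: {},
      geometry: { type: type, coordinates: coordinates }
    };
  }
  return [
    feature(row.from, "Point", from),
    feature(row.to, "Point", to),
    feature(null, "LineString", [from, to])
  ];
}

// Collect the features for every row into a single FeatureCollection.
function connectedPlacesGeojson(tableRows) {
  var features = [];
  tableRows.forEach(function (row) {
    features = features.concat(connectedPlaceFeatures(row));
  });
  return { type: "FeatureCollection", features: features };
}

// Illustrative row only; the pairing of these two places is made up.
var exampleRows = [{
  from: "Cambridge,UK", from_lat: 52.2033051, from_lon: 0.124862,
  to: "Sydney, Australia", to_lat: -33.8548157, to_lon: 151.2164539
}];

console.log(JSON.stringify(connectedPlacesGeojson(exampleRows), null, 2));
```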

Finally, it’s worth noting that if we can define a standardised way of describing template generated outputs from tabular datasets, libraries can be written for other programming tools or languages, such as R or Python. These libraries could read in a template definition file (such as the gists based on the OpenRefine export template definitions that are linked to above) and then as a direct consequence support “table2format” export data format conversions.

Which makes me wonder: is there perhaps already a standard for defining custom templated export formats from a tabular data set?


The World Tweets Nelson Mandela’s Death

- December 10, 2013 in Data Stories, Mapping, Storytelling, Visualisation

Click here to see the interactive version of the map above

Data visualization is awesome! However, it only achieves its goal when it tells a story. This weekend, Mandela’s death dominated the Twitter world, and hashtags mentioning Mandela were trending worldwide. I decided to design a map that would show how people around the world tweeted about the death of Nelson Mandela. First, I started collecting tweets associated with #RIPNelsonMandela using ScraperWiki. I collected approximately 250,000 tweets on the day Mandela died. You can check this great recipe on the School of Data blog on how to extract and refine tweets.

scraperwiki

After the step above, I refined the collected tweets and uploaded the data into CartoDB. It is one of my favorite open source mapping tools, and I will make sure to write a CartoDB tutorial in a future post. I used a bubble (proportional symbol) map, which is usually better for displaying raw data. Different areas had different tweeting rates, and this reflected how different countries reacted. Countries like South Africa, the UK, Spain, and Indonesia had higher tweeting rates. The diameter of the circles represents the number of retweets. As for the colors, the darker they appear, the higher the intensity of tweets.

That’s not the whole story! It is easy to notice that some areas, such as Indonesia and Spain, have high tweeting rates. After researching the topic, I found it interesting that Mandela had a unique connection with Spain, one forged during two major sporting events. In 2010, for example, Nelson Mandela was present in the stadium when Spain’s international football team won their first ever World Cup trophy. Moreover, for Indonesians, Mandela has always been a source of joy and pride, especially as he was fond of batik and often wore it, even in his international appearances.

Overall, it was evident that interesting insights can be explored, and that such data visualizations can help us see the big picture. They also highlight events and facts that we are not aware of in the traditional context.


So you want to make a map…

- November 9, 2013 in HowTo, Mapping

Antique map of the world

Image credit: Rosario Fiore

School of Data is re-publishing Noah Veltman’s Learning Lunches, a series of tutorials that demystify technical subjects relevant to the data journalism newsroom.

This Learning Lunch is about using geographic data to make maps for the web.

Are you sure?

Just because something can be represented geographically doesn’t mean it should. The relevant story may have nothing to do with geography. Maps have biases. Maps can be misleading. They may emphasize land area in a way that obscures population density, or show “geographic” patterns that merely demonstrate an underlying demographic pattern. Before you proceed, make sure a map is what you actually want.

For a more detailed take on this question, read When Maps Shouldn’t Be Maps.

What maps are made of

Maps generally consist of geographic data (we’ll call this geodata for short) and a system for visually representing that data.

Part 1: Geodata

Latitude and Longitude

Most geodata you encounter is based on latitude/longitude coordinates on Earth’s surface (mapping Mars is beyond the scope of this primer).

Latitude ranges from -90 (the South Pole) to 90 (the North Pole), with 0 being the equator.

Longitude ranges from -180 (halfway around the world going west from the prime meridian) to 180 (halfway around the world going east from the prime meridian), with 0 being the prime meridian. Yes, that means -180 and 180 are the same.
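As a small illustration of those ranges, here is a JavaScript sketch that checks whether a pair of numbers is a plausible lat/lng and wraps an out-of-range longitude back into range. The function names are made up for this example.

```javascript
// True if lat is within [-90, 90] and lng is within [-180, 180].
function isValidLatLng(lat, lng) {
  return typeof lat === "number" && typeof lng === "number" &&
    lat >= -90 && lat <= 90 &&
    lng >= -180 && lng <= 180;
}

// Wrap any longitude into the range [-180, 180). Because -180 and 180 are the
// same meridian, 180 comes back as -180.
function wrapLongitude(lng) {
  return ((lng + 180) % 360 + 360) % 360 - 180;
}

console.log(isValidLatLng(37.77833, -122.38944)); // a valid pair
console.log(isValidLatLng(-122.38944, 37.77833)); // same numbers swapped: -122 is not a latitude
console.log(wrapLongitude(190));                   // 10 degrees past the 180 line
```

A check like this can catch a swapped pair when one of the numbers is outside the latitude range, but not when both happen to fall between -90 and 90.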

Latitude/Longitude

If you are an old-timey sea captain, you may find or write latitude and longitude in degrees + minutes + seconds, like:

37°46'42"N, 122°23'22"W

Computers are not old-timey sea captains, so it’s easier to give them decimals:

37.77833, -122.38944
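Going from the sea captain’s notation to decimals is simple arithmetic: a minute is 1/60 of a degree, a second is 1/3600 of a degree, and south and west are negative. A minimal JavaScript sketch (the function name is an assumption for this example):

```javascript
// Convert degrees + minutes + seconds and a hemisphere letter to decimal degrees.
function dmsToDecimal(degrees, minutes, seconds, hemisphere) {
  var value = degrees + minutes / 60 + seconds / 3600;
  // South latitudes and west longitudes are negative.
  return (hemisphere === "S" || hemisphere === "W") ? -value : value;
}

var lat = dmsToDecimal(37, 46, 42, "N");
var lng = dmsToDecimal(122, 23, 22, "W");
console.log(lat.toFixed(5) + ", " + lng.toFixed(5)); // prints "37.77833, -122.38944"
```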

A latitude/longitude number pair is often called a lat/lng or a lat/lon. We’ll call them lat/lngs.

Want to quickly see where a lat/lng pair is on earth? Enter it into Google Maps, just like an address.

* Sometimes mapping software wants you to give a lat/lng with the latitude first, sometimes it wants you to give it with the longitude first. Check the documentation for whatever you’re using (or, if you’re lazy like me, just try it both ways and then see which one is right).

* Precision matters, so be careful with rounding lat/lngs. At the equator, one degree of longitude is about 69 miles!
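To get a feel for how much rounding costs, here is a rough JavaScript sketch. It assumes a spherical Earth and takes about 69.17 miles per degree of longitude at the equator as a given, so treat the results as ballpark figures; the function names are made up for this example.

```javascript
// Approximate length of one degree of longitude at the equator, in miles.
var MILES_PER_DEGREE_AT_EQUATOR = 69.17;

// Lines of longitude converge toward the poles, so a degree of longitude
// shrinks with the cosine of the latitude (spherical approximation).
function milesPerDegreeOfLongitude(latitude) {
  return MILES_PER_DEGREE_AT_EQUATOR * Math.cos(latitude * Math.PI / 180);
}

// Worst-case east-west error, in miles, from rounding a longitude
// to the given number of decimal places.
function roundingErrorMiles(latitude, decimals) {
  return milesPerDegreeOfLongitude(latitude) * 0.5 * Math.pow(10, -decimals);
}

[0, 1, 2, 3, 4].forEach(function (decimals) {
  console.log(decimals + " decimal places: up to " +
    roundingErrorMiles(0, decimals).toFixed(4) + " miles off at the equator");
});
```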

Map geometry

Almost any geographic feature can be expressed as a sequence of lat/lng points. They are the atomic building blocks of a map.

A location (e.g. a dot on a map) is a single lat/lng point:

37.77833,-122.38944

Point

A straight line (e.g. a street on a map) is a pair of lat/lng points, one for the start and one for the end:

37.77833,-122.38944 to 34.07361,-118.24

Line

A jagged line, sometimes called a polyline, is a list of straight lines in order, a.k.a. a list of pairs of lat/lng points:

37.77833,-122.38944 to 34.07361,-118.24
34.07361,-118.24 to 32.7073,-117.1566
32.7073,-117.1566 to 33.445,-112.067

Polyline

A closed region (e.g. a country on a map) is just a special kind of jagged line that ends where it starts. These are typically called polygons:

37.77833,-122.38944 to 34.07361,-118.24
34.07361,-118.24 to 32.7073,-117.1566
32.7073,-117.1566 to 33.445,-112.067
33.445,-112.067 to 37.77833,-122.38944

Polygon

The bottom line: almost any geodata you find, whether it represents every country in the world, a list of nearby post offices, or a set of driving directions, is ultimately a bunch of lists of lat/lngs.
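In code, all of these shapes can be sketched as plain arrays. The JavaScript below is a minimal illustration using the example coordinates above, with latitude first to match this primer; real formats add more structure, and the helper names are made up.

```javascript
// A location is a single [lat, lng] pair.
var point = [37.77833, -122.38944];

// A jagged line is a list of points in order.
var polyline = [
  [37.77833, -122.38944],
  [34.07361, -118.24],
  [32.7073, -117.1566],
  [33.445, -112.067]
];

// Break a list of points into its straight segments (pairs of points).
function toSegments(points) {
  var segments = [];
  for (var i = 0; i < points.length - 1; i++) {
    segments.push([points[i], points[i + 1]]);
  }
  return segments;
}

// A list of points describes a closed region if it ends where it starts.
function isClosed(points) {
  var first = points[0];
  var last = points[points.length - 1];
  return points.length > 3 && first[0] === last[0] && first[1] === last[1];
}

// Turn a jagged line into a polygon by repeating the first point at the end.
function closeRing(points) {
  return isClosed(points) ? points : points.concat([points[0]]);
}

var polygon = closeRing(polyline);
console.log(toSegments(polyline).length, "segments in the polyline");
console.log(toSegments(polygon).length, "segments in the polygon");
```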

Map features

Most common formats for geodata think in terms of features. A feature can be anything: a country, a city, a street, a traffic light, a house, a lake, or anything else that exists in a fixed physical location. A feature has geometry and properties.

A feature’s geometry consists of any combination of geometric elements like the ones listed above. So geodata for the countries of the world consists of about 200 features.* Each feature consists of a list of points to draw a jagged line step-by-step around the perimeter of the country back to the starting point, also known as a polygon. But wait, not every country is a single shape, you say! What about islands? No problem. Just add additional polygons for every unconnected landmass. By combining relatively simple geometric elements in complex ways, you can represent just about anything.

Let’s say you have the Hawaiian islands, each of which is represented as a polygon. Should that be seven features or one?* It depends on what kind of map we’re making. If we are analyzing something by state, we only care about the islands as a group and they’ll all be styled the same in the end. They should probably be a single feature with seven pieces of geometry. If, on the other hand, we are doing a map of Hawaiian wildlife by island, we need them to be seven separate features. There is also something called a “feature collection,” where you can loosely group multiple features for certain purposes, but let’s not worry about that for now.

A feature’s properties are everything else that matters for your map. For the countries of the world, you probably want their names, but you may also want things like birth rate, population, largest export, or whatever else is going to be involved in your map.

* One of the lessons you will learn when you start making maps is that questions that you thought had simple answers – like “What counts as a country?” and “How many Hawaiian islands are there?” – get a little complicated.

Geodata formats

So we’ve learned that geodata is a list of features, and each feature is a list of geometric pieces, and each geometric piece is a list of lat/lngs, so the whole thing looks something like this:

Feature #1:
    geometry:
        polygon #1: [list of lat/lngs]
        polygon #2: [list of lat/lngs] (for Easter Island)
        ...
    properties:
        name: Chile
        capital: Santiago
        ...
Feature #2:
    geometry:
        polygon #1: [list of lat/lngs]
        polygon #2: [list of lat/lngs]
        ...
    properties:
        name: Argentina
        capital: Buenos Aires
        ...
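
For readers who think in code, the outline above could be written as plain JavaScript objects, roughly like the sketch below. The coordinates are deliberately fake placeholders (not real borders), and the names exampleFeatures, countPoints and describeFeatures are invented for this illustration.

```javascript
// Each polygon is a list of [lat, lng] points. The numbers are placeholders
// chosen to be easy to read; they are NOT real borders.
const exampleFeatures = [
  {
    geometry: [
      [[0, 0], [0, 1], [1, 1], [0, 0]], // polygon #1
      [[5, 5], [5, 6], [6, 6], [5, 5]]  // polygon #2 (an island, say)
    ],
    properties: { name: "Chile", capital: "Santiago" }
  },
  {
    geometry: [
      [[2, 2], [2, 3], [3, 3], [3, 2], [2, 2]] // polygon #1
    ],
    properties: { name: "Argentina", capital: "Buenos Aires" }
  }
];

// Count every point across all of a feature's polygons.
function countPoints(feature) {
  return feature.geometry.reduce((total, polygon) => total + polygon.length, 0);
}

// One summary line per feature: name, number of polygons, number of points.
function describeFeatures(features) {
  return features.map(
    (f) =>
      f.properties.name + ": " + f.geometry.length + " polygon(s), " +
      countPoints(f) + " point(s)"
  );
}

console.log(describeFeatures(exampleFeatures).join("\n"));
```

Real formats add more bookkeeping, but the nesting is the same: features contain geometry and properties, and geometry bottoms out in lists of points.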

So we just need a big list of lat/lng points and then we can all go home, right? Of course not. In the real world, this data needs to come in some sort of consistent format a computer likes. Ideally it will also be a format a human can read, but let’s not get greedy.

Now that you know that geodata is structured like this, you will see that most common formats are very similar under the hood. Four big ones that you will probably come across are:

Shapefiles

This is the most common format for detailed map data. A “shapefile” is actually a set of files:

  • .shp — The geometry for all the features.
  • .shx — An index file that records where each shape sits inside the .shp file, so software can find it quickly.
  • .dbf — Stores the properties of each feature in a spreadsheet-like format.
  • Other optional files storing things like a project description and styling (only the above three files are required).

If you open a shapefile in a text editor, it will look like gibberish, but it will play really nicely with desktop mapping software, also called GIS software or geospatial software. Shapefiles are great for doing lots of detailed manipulation and inspection of geodata. By themselves, they are pretty lousy for making web maps, but fortunately it’s usually easy to convert them into a different format.

GeoJSON

A specific flavor of JSON that is great for web mapping. It’s also fairly human readable if you open it in a text editor. Let’s use the state of Colorado as an example, because it’s nice and rectangular.

{
  "type": "Feature",
  "geometry": {
    "type": "Polygon",
    "coordinates": [
      [
        [-102.04,36.99],
        [-102.04,40.99],
        [-109.05,40.99],
        [-109.05,36.99],
        [-102.04,36.99]
      ]
    ]
  },
  "properties": {
    "name": "Colorado",
    "capital": "Denver"
  }
}

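This means: Draw a polygon by starting from the first point ([-102.04,36.99]), drawing a line to the next point ([-102.04,40.99]), and repeating until the end of the list. Note that GeoJSON writes each point as [longitude, latitude], the reverse of the lat/lng order used so far.

Notice that the last point is the same as the first point, closing the loop. The GeoJSON specification expects this repeated point, although some software is forgiving and will close the loop for you if it is missing.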
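
Two small checks are often handy when handling a GeoJSON polygon by hand: confirming the ring is closed, and flipping each [longitude, latitude] pair into [latitude, longitude] for tools that want it that way round. The sketch below shows one way this might look; isClosedRing and toLatLngs are hypothetical helper names, not part of the GeoJSON specification or of any library.

```javascript
// GeoJSON stores each position as [lng, lat].
const colorado = {
  type: "Feature",
  geometry: {
    type: "Polygon",
    coordinates: [
      [
        [-102.04, 36.99],
        [-102.04, 40.99],
        [-109.05, 40.99],
        [-109.05, 36.99],
        [-102.04, 36.99]
      ]
    ]
  },
  properties: { name: "Colorado", capital: "Denver" }
};

// A closed ring has at least four positions and ends where it starts.
function isClosedRing(ring) {
  if (ring.length < 4) return false;
  const first = ring[0];
  const last = ring[ring.length - 1];
  return first[0] === last[0] && first[1] === last[1];
}

// Swap [lng, lat] into [lat, lng] without modifying the input.
function toLatLngs(ring) {
  return ring.map(([lng, lat]) => [lat, lng]);
}

const outerRing = colorado.geometry.coordinates[0];
console.log(isClosedRing(outerRing)); // true
console.log(toLatLngs(outerRing)[0]); // [ 36.99, -102.04 ]
```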

KML

A specific flavor of XML that is heavily favored by Google Maps, Google Earth, and Google Fusion Tables. The basic components behave very similarly to GeoJSON, but are contained in XML tags instead of curly braces. KML supports lots of extra bells and whistles like camera positioning and altitude for making movies in Google Earth. It plugs really nicely into Google products, but generally needs to be converted to something else in order to make other web maps. So what does Colorado look like in KML?

<Polygon id="Colorado">
    <altitudeMode>clampToGround</altitudeMode>
    <outerBoundaryIs>
        <LinearRing>
            <coordinates>
                -102.04,36.99
                -102.04,40.99
                -109.05,40.99
                -109.05,36.99
                -102.04,36.99
            </coordinates>
        </LinearRing>
    </outerBoundaryIs>
</Polygon>

The XML tags can be very confusing, but note that the meat of this data is quite similar to the GeoJSON example. Both of them are just a list of points in order, with a lot of scary braces and brackets as window dressing.
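
To make the similarity concrete, here is a minimal sketch that turns the text inside a KML coordinates tag into a GeoJSON-style polygon. It assumes the text has already been pulled out of the XML, and it only handles a single outer ring. The function names parseKmlCoordinates and kmlRingToGeoJsonPolygon are invented for this example.

```javascript
// KML writes each point as "lng,lat" or "lng,lat,altitude", separated by whitespace.
// This returns [lng, lat] pairs and drops any altitude value.
function parseKmlCoordinates(text) {
  return text
    .trim()
    .split(/\s+/)
    .filter((tuple) => tuple.length > 0)
    .map((tuple) => tuple.split(",").slice(0, 2).map(Number));
}

// Wrap a single outer ring in a GeoJSON-style Polygon geometry.
function kmlRingToGeoJsonPolygon(text) {
  return { type: "Polygon", coordinates: [parseKmlCoordinates(text)] };
}

// The Colorado coordinates from the KML example above.
const kmlText = `
    -102.04,36.99
    -102.04,40.99
    -109.05,40.99
    -109.05,36.99
    -102.04,36.99
`;

console.log(JSON.stringify(kmlRingToGeoJsonPolygon(kmlText)));
```

For anything beyond a quick experiment, a dedicated converter such as the ogr2ogr tool mentioned below is a safer bet, since real KML files can contain holes, multiple geometries and styling.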

TopoJSON

The new hotness. TopoJSON takes in a basic geodata format, like GeoJSON, and spits out a clever reduction of it by focusing on the part of a map we usually care about: borders and connections (a.k.a. topology). The details are beyond the scope of this primer, but you can read about the TopoJSON magic here: https://github.com/mbostock/topojson/wiki

Remember: different software accepts different file formats for geodata, but at the end of the day everyone speaks lat/lng. Different file formats are just dialects of the mother tongue.

I’m not a cartographer. Where do I get geodata I can actually use?

There are lots of good sources for geodata online. Here are a few helpful sources:

Natural Earth
http://www.naturalearthdata.com/
Offers shapefile downloads of a few different data sets for the entire earth: Cultural (country boundaries, state/province boundaries, roads, railroads, cities, airports, parks, etc.) and Physical (coastline, islands, rivers, lakes, glaciers, etc.).

US Census Bureau
http://www.census.gov/geo/maps-data/data/tiger.html
Detailed shapefiles or KML files for the entire US.

Geocommons
http://geocommons.com/
A wide variety of user-contributed geodata, easy to search, browse, or preview. Data reliability may vary.

Wikimedia Commons
http://commons.wikimedia.org/
Lots of detailed maps in SVG format, which can be easily used and modified for the web (see “SVG/Canvas Maps” below).

OpenStreetMap
http://www.openstreetmap.org/
A well-populated database of land, boundaries, roads, and landmarks for the entire earth. This is available as a special XML format, and has to be converted to be used with most software. Because it covers the whole earth and has good coverage of roads, points of interest, etc., it is often used to generate a whole-earth set of map tiles (see “Slippy Maps” below).

Google
http://www.google.com/
If you don’t have the map data you need, look around! You’d be surprised how much is out there once you start looking.

ogr2ogr Converter
http://ogre.adc4gis.com/
Not a source of data, but a handy converter if you need to convert between a shapefile and GeoJSON.

Desktop GIS Software
http://www.qgis.org/ (free)
http://www.esri.com/software/arcgis (very not free)
You’ll want to start getting the hang of desktop GIS software, especially if you’ll be working with shapefiles. Quantum GIS is free and excellent. ArcGIS is also very powerful but very expensive. These are not a data source, per se, but an important method of whipping imperfect data into shape before mapping it.

A note of caution: be distrustful of any geographic data you find, especially if it’s complex or you’ll be combining data from multiple sources. Geographic data is not immune to the variability of accuracy on the internet. You will find no shortage of misshapen shapefiles, mislabeled locations, and missing puzzle pieces.


Part 2: Turning Geodata Into A Web Map

The point of all this data drudgery is to make a cool map, right? So let’s forget about the curly braces and the geometry lessons and get to it. Broadly speaking, you have three options for mapping your data for display on the web. Before we look at them, make sure you ask yourself the following question:

Does my map need to be interactive?
Just because you can make your map interactive and animated doesn’t mean you should. Some of the best maps in the news are just images. Images are great for the web because virtually everything supports them. This isn’t exactly an alternative to other methods, because in order to make an image, you’ll need to make a map with something else first: desktop GIS software, Adobe Illustrator, or one of the three options below. Once you have the display you want, you can either take a screenshot or export it as an image. Even still, avoiding unnecessary interactivity and complexity in favor of flat images will cut down on a lot of mapping headaches.

Option 1: Slippy Maps

An example of a slippy map

Whether you know it or not, you’ve used a lot of slippy maps. Google Maps is a slippy map. Yahoo! Maps is a slippy map. Does MapQuest still exist? If so, it’s probably a slippy map. I’m using the term “slippy” to refer to a web map with a background layer that “slips” around smoothly, allowing you to pan and zoom to your heart’s content. The underlying magic of a slippy map is a set of tiles in the background that are just flat images, so you could also call them tile-based maps. At each zoom level, the entire globe is divided into a giant grid of squares. Wherever you are on the map, it loads images for the squares you’re looking at, and starts loading nearby tiles in case you move around. When you move the map, it loads the new ones you need. You can add other layers and clickable objects on top of the tiles, but they are the basic guts of a slippy map.

How tiles work
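
The grid arithmetic can be sketched in a few lines. The snippet below uses the tile numbering scheme popularised by OpenStreetMap-style slippy maps (tile 0,0 in the top-left corner, 2^zoom tiles per side) to work out which tile a given lat/lng falls in. Other tile providers may number their tiles differently, so treat this as an illustration rather than a universal rule; the function names are made up for this example.

```javascript
// Number of tiles covering the whole world at a given zoom level.
function tileCount(zoom) {
  return Math.pow(4, zoom);
}

// Which tile contains this lat/lng? Assumes the common Web Mercator tile scheme
// and a latitude within its usual range (roughly -85.05 to 85.05 degrees).
function latLngToTile(lat, lng, zoom) {
  const n = Math.pow(2, zoom); // tiles per side at this zoom level
  const latRad = (lat * Math.PI) / 180;
  const rawX = Math.floor(((lng + 180) / 360) * n);
  const rawY = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  // Keep the result inside the grid for points on the very edge of the map.
  const clamp = (value) => Math.min(n - 1, Math.max(0, value));
  return { x: clamp(rawX), y: clamp(rawY), zoom: zoom };
}

console.log(tileCount(0)); // 1
console.log(tileCount(2)); // 16
console.log(latLngToTile(0, 0, 1)); // { x: 1, y: 1, zoom: 1 }
```

Each extra zoom level splits every tile into four, which is why the number of tiles grows so quickly and why generating your own full set can be a big job.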

Slippy maps are great because:

  1. They’re easy on browsers and bandwidth. They only need to load the part of the world you’re looking at, and they’re image-based. Every major browser and device supports them.
  2. People are used to them. Thanks to the popularity of Google Maps, everyone has lots of practice using slippy maps.
  3. They’re pretty easy to make responsive to screen size. Just shrink the map box; the underlying tiles don’t change.
  4. They gracefully support most key mapping features, like zooming, panning, and adding markers. They keep track of the messy details so you can focus on geography.

Slippy maps are not great because:

  1. Generating your own tiles can be a complicated task, requiring data, styling, and some technical savvy.
  2. Even if you reuse tiles from an old map, or borrow someone else’s tiles, you sacrifice fine visual control.
  3. Image-based tiles are not well-suited for making things dynamic.
  4. You are usually confined to the standard Web Mercator map projection and zooming behavior.

As a loose rule of thumb, slippy maps are a good choice to the extent that:

  • Browser compatibility and performance are paramount.
  • You need to display a large explorable area at different zoom levels.
  • You don’t need precise visual control over everything.
  • Not all of the map components need to be dynamic or interactive.

How do I make one?

In order to use geodata to make a slippy map, you are really making two maps:

  1. Background tiles – You have to feed a set of data about where land is, where roads are, where points of interest are, etc. into a piece of software and have it generate images based on that. You can skip this step if you are content with an existing set of tiles (see below).
  2. Other content – Once you have tiles, you can use geodata to add things on top like markers or highlighted lines.

One of the best resources for making your own slippy map is Leaflet. This library will do most of the dirty work of a slippy map and let you focus on customizing it. You’ll have to write a little bit of JavaScript, but probably a lot less than you think.

Here is how we might draw Colorado in Leaflet:

// Assumes `map` is an existing Leaflet map. Leaflet takes points as [lat, lng].
L.polygon([
    [36.99,-102.04],
    [40.99,-102.04],
    [40.99,-109.05],
    [36.99,-109.05]
]).addTo(map);

Colorado drawn in Leaflet

Leaflet also speaks GeoJSON, so if we had a GeoJSON file with these coordinates we could feed it in directly:

L.geoJson(states).addTo(map);
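
The states variable in that line is not defined anywhere in this primer. As an assumption about what it might hold, the sketch below builds a GeoJSON FeatureCollection containing just the Colorado feature from earlier. The helper name makeFeatureCollection is hypothetical; only the object it returns follows the GeoJSON format.

```javascript
// Wrap a list of GeoJSON features in a FeatureCollection.
function makeFeatureCollection(features) {
  return { type: "FeatureCollection", features: features };
}

// The Colorado feature from the GeoJSON section, in [lng, lat] order.
const coloradoFeature = {
  type: "Feature",
  geometry: {
    type: "Polygon",
    coordinates: [
      [
        [-102.04, 36.99],
        [-102.04, 40.99],
        [-109.05, 40.99],
        [-109.05, 36.99],
        [-102.04, 36.99]
      ]
    ]
  },
  properties: { name: "Colorado", capital: "Denver" }
};

// Assumed contents of `states`: a collection with a single state in it.
const states = makeFeatureCollection([coloradoFeature]);

console.log(states.features.map((f) => f.properties.name)); // [ 'Colorado' ]
```

With Leaflet loaded on the page and a map already created, an object shaped like this is the kind of thing that could be handed to the L.geoJson line above.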

You can use any tiles you want in Leaflet, including Google Maps.

Tiles

You can generate your own background tiles with TileMill.

Here is a very detailed tutorial on making your own tiles with TileMill:
http://dataforradicals.com/the-insanely-illustrated-guide-to-your-first-tile-mill-map/

You can create custom-styled tiles based on OpenStreetMap data with CloudMade:
http://developers.cloudmade.com/projects/show/tiles

You can also borrow beautiful tiles from Stamen Design:
http://maps.stamen.com/

Or use Google Maps for your base tiles:
https://developers.google.com/maps/

MapBox has a detailed guide on the nuts & bolts of slippy maps:
http://mapbox.com/developers/guide/

Option 2: JavaScript + SVG/Canvas

An example of an SVG map

Another option is to draw a map from scratch right in a web page. This is typically done using either SVG or the HTML5 <canvas> element, which are both methods of creating a drawing space in a webpage and then drawing lots of lines and shapes based on a set of instructions. Unlike maps, which speak lat/lng, these methods speak pixels. The point in the upper-left corner is 0,0. Any other pixel is X,Y where X is the number of pixels to the right of that corner, and Y is the number of pixels below it.

SVG diagram

Under the hood, SVG looks like HTML because it basically is.

<svg width="400" height="200">
    [Your drawing instructions go here]
</svg>

It allows you to draw basic shapes like lines, circles and rectangles, and do other advanced things like gradients and animation. When dealing with maps you’ll be dealing with a lot of <path> elements, which are the standard SVG way of drawing lines, and, by extension, complicated polygons. These can be straight lines, jagged lines, curved lines, line gumbo, deep-fried lines, line stew, you name it. A path gets its instructions on what to draw from one big list of instructions called a data string.

<path d="M200,0 L200,200 Z" /> (Start at 200,0, draw a line to 200,200, and then close the path)

The syntax is a little off-putting but it’s not really any different from a polyline generated by lat/lng pairs. It just uses different abbreviations: M for where to start, L for where to draw the next line to, Z to close the path by drawing a line back to the starting point. You can use the exact same system to draw geographic things like countries; you just need a lot more points. You might draw Aruba like this:

<path id="Aruba" d="M493.4952430197009,554.3097009349817 L493.6271899111516,554.5053144486982 L493.7591368026024,554.5053144486982 L493.7591368026024,554.6357015002002 L493.8910836940532,554.7660710469861 L493.8910836940532,554.8964231416435 L493.7591368026024,554.8964231416435 L493.6271899111516,554.8964231416435 L493.6271899111516,554.8312492725454 L493.7591368026024,554.8312492725454 L493.7591368026024,554.7660710469861 L493.6271899111516,554.7008884583952 L493.36329612825017,554.5053144486982 L493.23134923679936,554.5053144486982 L493.23134923679936,554.4401143422353 L493.23134923679936,554.3749098398575 L493.23134923679936,554.3097009349817 L493.23134923679936,554.2444876210226 L493.23134923679936,554.1792698913926 L493.23134923679936,554.1140477395024 L493.36329612825017,554.2444876210226 Z" />
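
Nobody writes data strings like that by hand. As a small sketch of how one could be generated, the function below joins a list of pixel points into a string of M and L commands ending in Z. The name toPathData is made up for this example; libraries such as d3 ship their own, far more capable, path generators.

```javascript
// Turn a list of [x, y] pixel points into an SVG path data string.
// The first point gets an M (move to), the rest get an L (line to),
// and the string ends with Z.
function toPathData(points) {
  if (points.length === 0) return "";
  const commands = points.map(
    ([x, y], i) => (i === 0 ? "M" : "L") + x + "," + y
  );
  return commands.join(" ") + " Z";
}

console.log(toPathData([[200, 0], [200, 200], [0, 200]]));
// M200,0 L200,200 L0,200 Z
```

The resulting string is what would go inside the d attribute of a path element.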

You’ll notice that these numbers are way outside the range of a lat/lng. That’s because they aren’t lat/lngs. They are pixel values for a drawing space of a particular size. To go from lat/lng to pixels, you need to use what’s called a map projection, a method for turning lat/lngs into a 2-D drawing. When you make a slippy map, you will generally be automatically using what’s called the Web Mercator projection, but there are lots of others and you’ll need to pick one when making an SVG/Canvas map. This is a bit beyond the scope of this primer, but you can read about the built-in d3 map projections.
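
To give a feel for what a projection does, here is about the simplest one there is: the equirectangular projection, which just stretches longitude across the width of the drawing and latitude down its height. This is a sketch for illustration only. It is not Web Mercator and it is not d3's API; the function name projectEquirectangular is invented here.

```javascript
// Convert a lat/lng into [x, y] pixels for a drawing of the given size,
// using a plain equirectangular projection of the whole world.
// x grows to the right from lng -180; y grows downward from lat 90.
function projectEquirectangular(lat, lng, width, height) {
  const x = ((lng + 180) / 360) * width;
  const y = ((90 - lat) / 180) * height;
  return [x, y];
}

// On a 400 x 200 drawing, lat 0 / lng 0 lands in the middle.
console.log(projectEquirectangular(0, 0, 400, 200)); // [ 200, 100 ]
// The top-left corner of the world lands at pixel 0,0.
console.log(projectEquirectangular(90, -180, 400, 200)); // [ 0, 0 ]
```

Note how y is flipped: latitude increases as you go north, but pixel rows increase as you go down the page. Every projection has to deal with that flip somewhere.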

SVG/Canvas maps are great because:

  1. You have total visual control. You’re starting with a blank canvas and you can dictate everything about how it looks.
  2. They’re easy to make interactive and dynamic in new and exciting ways. All the pieces of the map are elements on the page just like anything else, so you can style and manipulate them with CSS and JavaScript.
  3. You don’t necessarily need real geodata to make one. If you already have an SVG (like the Wikimedia maps of the world) you can use that instead and skip all the lat/lng business.

SVG/Canvas maps are not great because:

  1. They have browser compatibility issues. IE8 doesn’t support them. (You can add support for IE7 and IE8 with certain libraries.)
  2. Because your data is lat/lngs and the output is pixels, you need to deal with map projections to translate it before you draw.
  3. Performance becomes an issue as they get more complex.
  4. Implementing them often requires a reasonably high level of comfort with JavaScript.
  5. Users won’t necessarily know what to do. You will have to quickly teach impatient users how your special new map works to the extent that it deviates from what they’re used to.

How do I make one?

By far the most popular method for dynamically-drawn maps is d3, a fantastic but sometimes mind-bending JavaScript library that is good for many things, of which maps are just one.

There are lots of d3 mapping examples and tutorials, but they probably won’t make sense without a healthy amount of JavaScript under your belt:

http://bost.ocks.org/mike/map/
http://www.schneidy.com/Tutorials/MapsTutorial.html
http://www.d3noob.org/2013/03/a-simple-d3js-map-explained.html

If you want to go easy on the JavaScript, Kartograph.js is also a good option, and as a bonus, it includes support for IE7 and IE8.

Option 3: Let someone else do most of the work

If you have geographic data ready there are a number of services out there that will handle a lot of the actual mapping for you, with varying levels of control over the output:

Google Maps
https://maps.google.com/

Google Earth
http://earth.google.com/

Google Fusion Tables
http://www.google.com/drive/start/apps.html#fusiontables

CartoDB
http://cartodb.com

BatchGeo
http://batchgeo.com/


Proving the Data – A Quick Guide to Mapping England and Wales Local Elections

- May 2, 2013 in HowTo, Mapping

If the role of news journalists is in part to hold the powers that be to account, whose role is to make sure that claimed releases of public open data are fit for purpose, or that appropriately licensed data is available for civic public use?

With local elections coming round again in the UK (Full Fact provide a good overview of what’s going on: Local elections: the who, what and why), we have an opportunity to put some of the open data released by UK local and county council elections to a practical test. This post will focus in particular on geographical data, which means we can also have some fun learning how to make maps…

Your mission, should you choose to accept it, is to poke around a UK county council website (and maybe a local council website too) to see if they provide the data to make it easy to generate maps of that area. This data could then be used as the basis not just for reports on a local election, but also potentially for other civic use cases. Whilst much of the data may also be available via national datasets, many users of civic data in particular are more likely to be interested in data at a local level. Moreover, these users won’t necessarily have the tools, or the skills, required to download and process sometimes quite large datasets in order to extract just the local data of interest. Which is why we need to prove them at a local level.

So here are a few quick recipes for generating maps around some of the election data using open data and a variety of free tools we can find scattered around the web.

We’ll look at a couple of things in particular:

  • create your own polling station map from a KML source file of location markers;
  • create your own boundary line map of electoral wards or divisions using boundary line data.

The sorts of maps we’ll generate are of two kinds. The first is just to plot markers onto a map. This approach can be ideal for plotting the location of polling stations, for example. Markers also provide the basis for proportional symbol maps, where the area of a symbol (such as a circle) plotted at a particular point (such as the centroid, or “middle point” of the electoral area) is used to signify a numerical quantity, such as some function of the size of the majority. The second is to plot boundary lines, such as the boundaries of local wards for unitary authority councils, or the larger electoral wards used for county council elections, to mark out the different population areas that are used to elect each representative. These boundary lines mark out areas that can be coloured, for example according to the party that won the corresponding seat, or maybe the swing in the vote of the winner compared to the previous election.
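
One detail of proportional symbol maps is worth a quick sketch: because it is the area of the circle that carries the number, the radius has to grow with the square root of the value, not with the value itself. The function below is one way that scaling might be written; symbolRadius and its parameters are names made up for this example, and the figures in the usage lines are invented rather than real election results.

```javascript
// Radius for a circle whose AREA is proportional to value.
// maxValue is the largest value in the data set and maxRadius (in pixels)
// is the radius that largest value should be drawn with.
function symbolRadius(value, maxValue, maxRadius) {
  if (maxValue <= 0 || value < 0) {
    throw new Error("Values must be non-negative and maxValue must be positive");
  }
  return maxRadius * Math.sqrt(value / maxValue);
}

// Invented majorities, purely for illustration.
console.log(symbolRadius(100, 100, 20)); // 20
console.log(symbolRadius(25, 100, 20));  // 10
```

A majority a quarter of the size gets a circle with half the radius, which works out to a quarter of the area.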

Create Your Own Polling Station Map from a Source File
The first example we’ll consider is how easy it is to generate a map of polling station locations. A web search for uk polling station location data or polling station uk council election turns up some candidates, several of which appear to be linked to from data.gov.uk, the UK’s central public open data registry as KML formatted files. When looking for location data, KML is a “Good Thing” to find because KML is a standardised document format for publishing geographical data. So let’s see what we can do with it…

Polling station data

Or not as the case may be. (So how do I flag this download link as broken? On the data.gov.uk site? Or should I try to contact someone at the council directly?!)

Rooting around the corresponding council website, it seems as if they’ve been having a redesign, and decided to publish the locations of the polling stations this time around in a PDF file.

Let’s go back to the web and search a little more; it seems the Lichfield Council website has a KML file of polling stations dating back to 2010 – http://www2.lichfielddc.gov.uk/geo/polling.kml – so let’s see what we can do with it… If you go to Google Maps, and paste the URL of the KML file into the search box, and then hit return, you should find that the file is loaded into Google Maps and the location data it contains plotted on to the map:

KML in Google maps

If you click on the link icon, you can grab a URL that points to that map, with the data plotted onto it, or an embed code that lets you embed that map in your own web page (subject to Google’s terms and conditions!), which could be useful. (I haven’t found a reliable previewer for viewing KML data over OpenStreetMap.)

It seems to be a rare council that actually publishes the location data as such, though. More likely, the data will be locked up (as we have seen) as an address – or even a printed map – in a PDF file. Some local news outlets at least manage to get the data onto a web page, but can we do better?

polling station list

I’ve posted a recipe elsewhere (A Simple OpenRefine Example – Tidying Cut’n’Paste Data from a Web Page) that describes how we can use a tool called OpenRefine to tidy up the address data a bit to get it to look like this (at least, when the data’s viewed in a table layout rather than as raw CSV;-):

data in a fusion table

Having got the data into a nice, tidy form, we can now import it into an application that can geocode the addresses; that is, that can find the latitude and longitude for the locations represented by the addresses. I haven’t used Google’s new Maps Engine Lite yet, but I believe it can accept CSV files as long as one of the columns contains geocodable data, such as an address…

Maps Engine Lite

Let’s create a new map:

new map

Clicking on the upload link means we can upload some data:

import data

The Maps Engine fully expects some location related data that it can set its geocoder on to, so what column contains the address data?

Where's the location data

We should also provide a meaningful label for each address marker (though this data set doesn’t really have that…)

Now pick a marker label (which I don't really have)

Here’s the result:

and there they are...

Any locations the geocoder has problems with are identified – click on the Data link to pop open a data table view that highlights the problematic rows:

something wrong with these?

If you double click in a cell, it becomes editable…

To share your own map, click on the green Share button in the top right hand corner and then change the privacy setting…

to share, you need to make the map viewable by all, or at least by others with the link...

Note: there are other ways we could geocode this data. Simple Map Making With Google Fusion Tables describes how to do it with Google Fusion Tables, and Geocoding Using the Google Maps Geocoder via OpenRefine describes how to do it from within OpenRefine, but again the co-ordinate data will be subject to Google license conditions. Check out the School of Data blog thread on geocoding to find some open alternatives.

Create Your Own Boundary Line Map from a Source File
As well as mapping polling stations, we could also try to map out the different electoral wards that are covered by a particular council area, to give us a map that looks something like this for example:

boundary line map in fusion table

Can you see the boundary lines marking out the wards in there?

Once we have boundary lines associated with areas, it’s possible to colour each area according to some other parameter. In the case of an election, this might be a colour representative of the party that took the seat, for example. (As to how to actually do that, that’ll have to be the subject of another post!)
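
Without pre-empting that other post, the core of the colouring step is just a lookup from the winning party to a colour. The sketch below is tool-agnostic and entirely hypothetical: the party names, ward names, colours and function names are placeholders, not real results or any particular mapping tool's API.

```javascript
// Placeholder palette: swap in real party names and colours as needed.
const partyColours = {
  "Party A": "#1f77b4",
  "Party B": "#d62728"
};

// Look up the colour for a winner, falling back to grey for anyone
// who is not in the palette (independents, say).
function colourFor(winner, palette, fallback = "#cccccc") {
  return Object.prototype.hasOwnProperty.call(palette, winner)
    ? palette[winner]
    : fallback;
}

// Attach a colour to each ward result.
function colourWards(results, palette) {
  return results.map((r) => ({
    ward: r.ward,
    colour: colourFor(r.winner, palette)
  }));
}

// Invented results, purely for illustration.
const wardResults = [
  { ward: "Ward 1", winner: "Party A" },
  { ward: "Ward 2", winner: "Independent" }
];

console.log(colourWards(wardResults, partyColours));
```

The output is a list of ward and colour pairs that a mapping tool could then use to fill in each boundary.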

So where can we find boundary line data? A quick search on the Cambridgeshire County Council website turns up a possible source for that area:

Let’s see what happens if we take the URL for the County Council electoral divisions KML file and paste it into the Google Maps search box:

http://data.cambridgeshire.gov.uk/data/democracy/cambridgeshire-county-ward-boundaries/ElectoralDivisions.kml

Can you see the black boundary lines marked on the map? Unfortunately, the areas don’t appear to be labelled (the listing down the left hand side is blank), but at least there’s something there.

To actually work with the data, we can load it into Google Fusion Tables. Download the KML file, save it with a .kml suffix, and then import it into Google Fusion Tables. From Google Drive, select a new Fusion Table:

fusion table create

and then import the KML file:

Fusion table import

Don’t forget to add provenance information:

Keep track of where data came from - provenance

Once the data is loaded, we should see the “geometry” column has been recognised as KML data. The yellow column also shows that Fusion Tables has recognised that column as a location type too (though we can also change it to just a text column).

the kml data is loaded...

Here’s how the map looks:

and we have a map

To make the map shareable, we need to go to the Share button (top right hand corner of the window):

make it shareable

and select something suitably public:

shareable by link

If you want to see the map, here it is.

If you need a more powerful mapping tool with which to work with the KML file, QGIS is a good place to start…

If you struggle to find shapefiles on your local council website, this recipe might help: Boundary Files for Electoral Wards Covered by a Particular Geography

Summary
This has been a quick tour of how to start proving some of the open public geo-data that councils may be making available. If they aren’t, or if they are and there are problems with it, maybe you should let them know?
