You are browsing the archive for Milena Marin.

News from our School of Data Fellows

- December 2, 2014 in Events, Fellowship

We are back with some news about our amazing fellows from all over the world. One of our ways to keep in touch is having weekly written stand-ups in chat. We ask our fellows 3 questions plus a bonus:

  1. What you have done
  2. What you are doing
  3. Any lessons/obstacles
  4. Bonus music tracks.

##A busy month full of data trainings

  • In the Philippines, Happy and Sam ran a Data Skills Training for the Civil Service Commission. They really enjoyed working with government employees who were so switched on. Keep an eye on this space for a follow-up blog post.

  • In Nigeria, Olu just wrapped up the #OpenDataParty in Abuja on November 28 and 29, where 116 registered participants from the six regions of the country came to teach and learn how to use data for advocacy (NGOs) and storytelling (journalists). For those of you who can’t wait for the blog post, here are some pictures:

15903607136_40b7ea5bb5_z

  • In Peru, our fellow Antonio and Juan Manuel, master of all School of Data things in Latin America, hosted a Meetup with HacksHackers about private data and open data. Antonio is also working on a visualisation of climate emergencies in Peru over the last 10 years.

  • In South Africa, Hannah is working on mapping the Cape Town budget for a beneficiary NGO, Ndifuna Ukwazi. For this she is experimenting with CartoDB, using many of its customisation features.

  • In Romania, Codrina worked on and listed an application for the OGP Romanian awards – Political Colours of Romania – and is preparing an open geodata workshop as part of this project.

  • In India, Nisha just finished a beginners’ workshop on data journalism and an OpenStreetMap mapping party with Mapbox. She is working on an online data journalism module and preparing a data expedition in Hyderabad with Milena.

  • In Tanzania, Joachim led an OpenRefine deep dive last week with the President’s Office, Public Service Management, and is now organizing another OpenRefine and QGIS deep dive session for next week with an educational agency in Dar es Salaam.

  • In Macedonia, Dona and Milena organised a 2-day training covering basic data concepts, data analysis with spreadsheets and data visualisation. Here are some photos:

15904939886_8d12d196ff_z

  • In Hungary, Rita is in full preparation for next week’s two spreadsheet workshops for CSOs. This is the second series of spreadsheet trainings. Last time, the biggest challenge was assessing people’s skills in order to tailor the training to their knowledge. This time, to be more accurate, the team has decided to require a few exercises to be completed, not just a self-assessment survey.

  • In Indonesia, Yuandra talked about the usage of data at an event in Bandung and helped PWYP Indonesia create their first infographic. He is currently preparing for a skill-sharing session this November and for a survey trip with PWYP to Kalimantan.

##Some lessons learnt

  • Never rely on internet at events! If possible, bring a separate internet source to workshops, such as an internet dongle or a BRCK.
  • When organizing events be patient, especially when dealing with public servants!
  • It’s never an easy task to find good datasets for trainings. We always try to use data that is relevant to our participants, that gets them asking interesting questions and that is, of course, appropriate for the training.
  • It’s also quite hard to assess the skills of your participants before the training. If you overrate their skills, you may end up disappointed, or at the very least you’ll have to cut material and adjust your training a lot.

##Bonus music from around the world


Announcing the School of Data Fellows

- July 15, 2014 in Community, News

We are proud to announce the School of Data Fellows 2014. During the next six months 12 amazing individuals will train and collaborate with civil society and journalists to drive accountability, transparency and social change across five continents. The Fellows are joining OKFestival this week in Berlin and will take part in a dedicated School of Data Summer Camp with trainers, partners and staff to share skills and develop action plans.

We are grateful for the interest from partners and members in the School of Data community. A special thank you to the more than 200 applicants who applied to join the programme.

Meet the School of Data Fellows

Antonio Cucho Gamboa, Peru
antoniocucho
Antonio is a specialist in website development, working as a PHP and Python programmer. He is the founder of the Open Data community Peru and co-organizer of Hacks/Hackers Lima, and he takes part in Open Data and Data Journalism projects. In June 2013 he participated in AbreLatam 2013 in Montevideo, Uruguay with his project Lima I/O (DAL Regional Winner 2012). In February 2014 he organized Open Data Day Peru, with workshops, a hackathon and talks about open data. In March 2014 he went to Montevideo, Uruguay to participate in the first Databootcamp for journalists. This year, he is teaching open data tools in workshops for journalists, citizens and NGOs.

Codrina Illie, Romania
codrina photo
Codrina is a PhD Student at the Technical University of Civil Engineering, Bucharest working within the Groundwater Engineering Research Center “CCIAS”. She is actively promoting free and open source software for geospatial and she is a dynamic supporter of the open data movement in Romania through her work within the geo-spatial.org community. Codrina is part of the GEodata Openness Initiative for Development and Economic Advancement in ROmania project team. The main objective of GEOIDEA.ro is to improve the scientific basis for open geodata model adoption in Romania. The project is built on the strong belief that publishing government geodata in Romania over the Internet, under an open license and in a reusable format, can strengthen citizen engagement and yield new innovative businesses, bringing substantial social and economic gains. You can follow her on twitter.

Dona Djambaska, Macedonia
VA6_2297
Dona graduated in the field of Environmental Engineering and has been working with the Metamorphosis foundation in Skopje for the past 6 years in assisting on projects in the field of information society. There she has focused on organising trainings for computer skills, social media, online promotion, photo and video activism. Dona is also an active contributor and member of the Global Voices Online community. She dedicates her spare time to artistic and activism photography.

Hannah Williams, South Africa
hannah
Hannah is a graphic designer working in both web and print. She also does copywriting now and again and has worked on a couple of public art projects. Recently she has been trying to focus more on doing work that has a positive social impact. You can find some of her work here: http://www.hannahwilliams.co.za

Happy Feraren, the Philippines
HeadShotferaren
Happy Feraren is the co-founder and CEO of Bantay.ph – a Manila based civil society organization (CSO) that monitors the quality of service in frontline government offices through volunteer reports. Along with the rest of her team, Bantay.ph has engaged over 100 student volunteers to monitor their local government offices and check for compliance of service standards mandated by the law. Her CSO aims to uplift the standard of public service and create a culture of active citizenship. Happy finished a degree in Literature at the De La Salle University, Manila before pursuing a career in advertising. After 4 years in the industry, she decided to leave advertising to work full time in the development sector. She is also a member of Manila’s premiere improvisational theater group, SPIT (Silly People’s Improv Theater). As a member of the group, she has performed in international improv festivals, conducted training modules for corporations, and developed special immersive theater shows. She also has diverse local and international experience in the fields of education, tourism, broadcasting, and HR training.

Joachim Mangilima, Tanzania
joachim
Joachim Mangilima is a technology and data enthusiast with a passion for using technology and data in addressing the most common problems facing communities around the world. He is active in consulting in the areas of development, deployment and management of mobile and web-based solutions and systems for decision support, data collection, analysis and management. Joachim is also the Co-founder and Co-manager of Google Developer Group (GDG), Dar es Salaam, a group of technology enthusiasts and software developers who are interested in open source technology with a bias in Google’s developer technology; this includes everything from the Android, App Engine, and Google Chrome platforms, to product APIs like the Maps API, YouTube API and Google Calendar API. Joachim holds a Bachelor of Science degree from University of Dar es Salaam majoring in Computer Science and Statistics with a minor in Economics.

Nisha Thompson, India
profilepicnisha
Nisha is currently working as Lead Organizer of a new organization called DataMeet, which is a community of people who are working towards open data by sharing experiences and helping others with data related problems. Datameet is hosting meetups and Open Data Camps around the country to promote dialogue about the use of data for civic purposes. Nisha moved to India in 2010 and worked with the India Water Portal to open up water data and worked with partners on the ground to improve the use and management of data. She also co-wrote a report on Open Government Data in India with the Centre for Internet and Society located in Bangalore. Previously she has worked with the Sunlight Foundation, in the United States, as social media and community organizer.

Oludotun Babayemi, Nigeria
Oludotun
Oludotun Babayemi has 5 years of experience in the nonprofit sector and a Master’s degree in Information Management. He is a Monitoring and Evaluation Expert with Connected Development [CODE], and the Lead Development Consultant with Cloneshouse Nigeria. He is a Microsoft Certified Information Technology Professional and presently a USAID and Google sponsored CrisisMapper Fellow. Oludotun Babayemi is working on monitoring and evaluation systems [such as the Follow The Money and the Education Budget Tracker] that could be used in putting pressure on governments and organizations in developing countries to be more responsive to demands from internal and external stakeholders for good governance, accountability and transparency, greater development effectiveness and delivery of tangible results. He has worked in participatory mapping projects with UNOCHA during the Libya Crisis, UNOSAT in the Post Libya Crisis Geotagging, WHO in the health facility registry post-Libya Crisis, Amnesty International-US during the Syria Uprising, UNSPIDER in the Samoa Simulation Exercise, Harvard Humanitarian Initiative Simulation Exercise and also with USAID on the mapping of poverty alleviation projects around the world. He was the Geo-Team Lead with Humanity Road using his expertise in information communications in disasters and humanitarian relief support.

Rita Zágoni, Hungary
profilRita
Rita is a programmer with a social science background. She worked in IT management and web development before joining the Economics department of Central European University, where she is in charge of parsing unstructured, free text data into an analyzable format. Wandering across these fields she has gained some experience in website development, text processing and statistics using mainly Python, Java and MySQL.

Ruben Moya, Mexico
ruben_moya_1
Ruben studied computer science at the Autonomous University of Guadalajara (UDG). He is currently freelancing, developing web applications. He is a follower of technology and loves to see new places. In the past months he has given lectures on code optimization and has been teaching basic and advanced programming and development. He also manages the programming of online conferences (hangouts) and online courses on various topics of technology, development and design.

Siyabonga Africa, South Africa
Siyabonga Africa
Siyabonga is from the east coast of South Africa but is currently living in Gauteng and working as a data visualization lead developer at Apehllion. His career has its roots in public administration and journalism from the University of Pretoria and Stellenbosch University respectively. He completed his masters in new media design at Indiana University before returning to South Africa in 2012.

Yuandra Ismiraldi, Indonesia
photo_yuandraismiraldi
Yuandra is a full stack mobile engineer and game developer from Indonesia. He holds bachelor’s and master’s degrees in software engineering, and started his career working with several startups in the mobile and gaming space. He became interested in open data after participating in a hackathon about open data. Thinking that open data is a very interesting field, he is currently expanding his skill set to the world of open data and feels that information technology can become a great tool for open data.

Delivery partners
The Fellowship Programme is developed and delivered with Code for Africa, Social-Tic (Mexico) and Publish What You Pay Indonesia.

Funding partners
The School of Data Fellowship is made possible thanks to the generous support from the World Bank through the Partnership for Open Data, the Foreign and Commonwealth Office (FCO), Hivos, Indigo Trust, Southeast Asia Technology and Transparency Initiative (SEATTI), The William and Flora Hewlett Foundation and Open Society Foundations.


Fellowship Deadline Extended & We need your help

- May 30, 2014 in Community

Supporting a global community of data teachers and learners is one of the core goals of School of Data. We know from running events with civil society and journalist organizations that there is a data literacy gap. While School of Data is one of a number of international groups aiming to make a difference, we aspire to mentor and support data leaders everywhere. To do this, we’ve created programmes, training modules and are working hard to find ways to support the community. The School of Data Fellowship is one of our projects to make this possible.

We are extending the School of Data Fellowship Programme Deadline to Tuesday, June 10, 2014

SCODAwall2

Data expedition during Mozilla Festival, 2013.

Our team is currently reviewing all the applications we’ve received. Thank you, we are impressed with your applications!

As mentioned in our FAQs and in our recent video hangout, applications are open globally, but we are also on the lookout for fellows in specific areas of the world. We are seeking fellow applications from Macedonia, Tanzania, South Africa, Indonesia and Latin America. We also want to encourage women to apply! If it is any indication of how much we think about data for all: our team is half men and half women.

How to apply

To apply, please fill in this application form and attach your CV. The deadline is June 10.

Optionally, you can also send us a 30-second video expressing your interest in the role or explaining something “techie” in a few (jargon-free) words. This is not a requirement for the application, but we will appreciate the extra effort! The video need not be professionally edited and can be filmed using any available recording equipment.

Help with outreach

Can you share a tweet or a Facebook post? Do you know someone in our target countries? Can you share this with them?

Be a School of Data Fellow. Applications Due June 10, 2014 https://schoolofdata.org/fellowship-programme/

Thanks!


Slides, Tools and other Resources from the School of Data Journalism 2014

- May 23, 2014 in Data Journalism

The School of Data Journalism is a joint initiative of the European Journalism Centre, the Open Knowledge Foundation, and the International Journalism Festival of Perugia. The third edition consisted of four days of workshops and panels, covering everything from crime investigations to data journalism using spreadsheets, social media data, data visualisation and mapping for journalism.

In this post you will find all the links shared during this training event, the video replays of the panel sessions and workshops, and links to the slides of the panelists and instructors. If you have links shared during the sessions that we missed, post them in the comments section and we will update the list.

Video recordings and notes

Panel discussions

Workshops

Pictures

You can find some pictures here. Please do feel free to share your pictures with us!

Slides, tutorials, articles

Tools and other resources

Organisations and initiatives

Data journalism projects


School of Data Journalism 2014: the Storify summary

- May 13, 2014 in Data Journalism


We need you! Become a School of Data Fellow

- May 9, 2014 in Community

STOP PRESS!

  • Update 2014-06-11: The application is now closed! We received 200 applications, thank you! We are now reviewing them, stay tuned!
  • Update 2014-05-30: The deadline for applications has been extended to the 10th of June. See more details here.
  • Update 2014-05-20: Macedonia confirmed as an additional country seeking fellows

IMG_6400

Got data skills to share? Member of a community that wants to turn data into information? Know about a data journalism or civic activism project or organisation that needs a push to use data more effectively? The School of Data needs you! We are currently broadening our efforts to spread data skills around the world, and people like you are crucial in this effort: new learners need guidance and people to help them along the way. Stand out and become a **School of Data Fellow**.

We are looking for people fitting the following profile:

  • Data savvy: has experience working with data and a passion for teaching data skills.

  • Understands the role of Non-Governmental Organizations (NGOs) and media in bringing positive change through advocacy, campaigns, and storytelling. Fellows are passionate about enabling partners to use data effectively through training and ongoing support.

  • Interested or experienced in working with journalism and/or civil society.

  • Has some facilitation skills and enjoys community-building (both online and offline).

  • Eager to learn from and be connected with an international community of data enthusiasts

As a School of Data fellow, you will receive data and leadership training, as well as coaching to organise events and build your community. You will also be part of a growing global network of School of Data practitioners, benefiting from the network effects of sharing resources and knowledge and contributing to our understanding about how best to localise our training efforts.

You will be part of a six-month training programme where we expect you to work with us for an average of five days a month, including attending online and offline trainings, organising events, and being an active member of the School of Data community.

There are up to 10 fellowship positions open for the July to December 2014 School of Data training programme.

We have current collaborations and resourcing confirmed to support fellows from the following countries: Romania, Hungary, South Africa, Indonesia, Macedonia and Tanzania. We are also able to consider applicants for the remaining 4 places in this round from countries meeting these criteria:

  • The country falls under lower income, lower-middle income or upper-middle income categories as classified here.

  • There is demand from civil society organisations and/or journalists who wish to benefit from such a scheme.

  • There are some interesting datasets available in the country which would be worth exploring further. These could either be data published by a government or organisation or data collected by an organisation for their own internal use. Digitised or non-digitised—anything goes! We’re keen for a variety of challenges and want the fellows’ help to adapt teaching techniques to a variety of situations.

Our goal is to have global fellows from a wide mix of these countries. Don’t see your country listed? Keep reading to learn how you can get involved!

Got questions? See more about the Fellowship Programme here and have a look at this Frequently Asked Questions (FAQ) page. If this doesn’t answer your question, email us on [email protected]

Not sure if you fit the profile? Have a look at who is a fellow now!

Convinced? Apply now to become a School of Data Fellow. The application will be open until the 1st of June 2014 and the programme will start in July 2014.


DIY Aerial Mapping

- May 6, 2014 in HowTo

This is a report from the School of Data Journalism organised by Open Knowledge, the European Journalism Centre, and the International Journalism Festival. The session was led by Cindy Regalado, founder of CitizenswithoutBorders.com, a London-based group engaging citizens in diverse ways that expand our horizons practically, experientially, and philosophically, and community organiser for the Public Laboratory for Open Technology and Science. Public Lab publishes a collection of resources for DIY aerial photography and mapping, as well as instructions on how to create other low-cost tools for environmental science. The organization sells pre-packaged kits through its web store to help would-be aerial cartographers get their balloon- or kite-based sensor platforms into the skies.

Why kites?

Running across a field and sending a colorful kite up into the air brings instant joy and happiness.

IMG_0559

We can do aerial photography in numerous ways: with balloons, kites, drones, helicopters and so on. We chose kites because they are by far the most accessible and affordable. You don’t need expensive and rare helium as you do for balloon mapping, and you don’t need to spend loads as you would on a drone. Kites are also silent (meaning you can cover a protest without attracting attention) and, most of all, they are fun!

Getting ready

To fly a kite and capture pictures from high above you will need:

  • Kite – we used a 9′ (274 cm) Dazzle Delta Kite we ordered from the Public Lab store. One can also make a DIY kite with widely available materials for less than 20 USD.

IMG_0570

  • Reel – this was also bought from the Public Lab store but it can be found in any kite flying store

IMG_0713

  • Camera – we used a Canon PowerShot A1300
  • SD card & batteries
  • Camera rig – we used a DIY light wood picavet
  • Gloves

IMG_3260

  • Map
  • Sunglasses
  • Check the weather conditions beforehand.

Fly the kite

Choose a location with lots of open space. We went to the stadium near our main workshop venue.

IMG_0584

To launch in good winds, stand with your back to the wind and hold your kite up to catch the wind. Let line out only as fast as the wind lifts the kite. If the wind lulls, pull in line to make your kite gain altitude.

In light or gusty winds, a high-start launch can get your kite up to steadier winds higher up. Have a friend hold your kite 50 meters or more downwind from you with the line stretched tight. When your assistant releases the kite, reel in line to make it climb.

Running is the hardest way to launch a kite. The uncontrolled tugging on the line makes kites dive and crash. Let the wind and your reel do the work instead.

IMG_0850 IMG_0851 IMG_0852 IMG_0861

When the wind catches your kite, you’ll feel a small tug. Release a tiny bit of line and slowly move backwards. Running actually makes things harder, unless there is no wind, in which case you might need to run. As the kite ascends, keep your line taut, so you always remain in control.

Attach the camera

Wait until your kite is at least 50-60 meters up in the air, then attach the camera rig.

We used a homemade, light wood picavet and attached the camera with a rubber band. We made sure the camera was stable by wrapping the band around it 3-4 times.

IMG_3345

Before attaching the camera, make sure it’s in “continuous” mode. This ensures that your camera takes pictures continuously once it’s up in the air. To keep the shutter button pressed, hold it down with another rubber band.

IMG_3363

Before pressing the shutter button, point the camera at the horizon. This helps your camera’s automatic settings adjust to the light conditions. Otherwise, if your camera points down when you first press the button, your pictures are likely to be underexposed.

IMG_3373

Now that everything is ready, attach the rig to the reel line, making 3-4 turns on each hook.

IMG_3390

IMG_0804

IMG_3410

Create your map

Depending on your camera, battery life and SD card capacity, you will get something like 800 – 1200 pictures like this one. The first thing you want to do is select the 20-30 pictures that best illustrate the area you want to map.

IMG_0344

You can use MapKnitter, software designed and built for kite and balloon mapping. The interface is simple and the controls intuitive – here is the introduction video:

Believe us, this software is easy and intuitive!

IMG_3468

Once you’ve pieced everything together, MapKnitter can export the map in five different geographic information system formats (the KML format used by Google Earth, Google Maps Viewer format, OpenLayers, GeoTIFF, and Tile Map Service format) or as a JPG for printing. You can also share the map through the MapKnitter site. If it’s marked as public domain and has better resolution than existing imagery, Google may pull it into Google Maps to replace what it has.

And if technology is not for you, you can go ahead and print the images and work the old-fashioned way, with scissors and glue, to stitch together your aerial map.

IMG_3476


Using Excel to do data journalism

- May 5, 2014 in Data Journalism

This is a report from the School of Data Journalism organised by Open Knowledge, the European Journalism Centre, and the International Journalism Festival. The session was led by Steve Doig, Knight Chair in Journalism, who specializes in computer-assisted reporting: the use of computers and social science techniques to help journalists do their jobs better.

You can download Steve’s presentation and the data used in this tutorial here.

Microsoft Excel is a powerful tool that will handle most tasks that are useful for a journalist who needs to analyze data to discover interesting patterns. These tasks include:

  • Sorting
  • Filtering
  • Using math and text functions
  • Pivot tables

#Introduction to Excel

Excel will handle large amounts of data that is organized in table form, with rows and columns. The columns (which are labeled A, B, C…) list the variables (like Name, Age, Number of Crimes, etc.) Typically, the first row holds the names of the variables. The rest of the rows are for the individual records or cases being analyzed. Each cell (like A1) holds a piece of data.

1

Modern versions of Excel will hold as many as 1,048,576 records with as many as 16,384 variables! An Excel spreadsheet also will hold multiple tables on separate sheets, which are tabbed on the bottom of the page.

2

#Sorting

One of the most useful abilities of Excel is to sort the data into a more revealing order. Too often, we are given lists that are in alphabetical order, which is useful only for finding a particular record in a long list. In journalism, we usually are more interested in extremes: The most, the least, the biggest, the smallest, the best, the worst. Consider the data used in this workshop, a list of the provinces of Italy showing the number of various kinds of crimes reported during a recent year.  Here is how it looks sorted in alphabetical order of province name:

3

Far more interesting would be to sort it in descending order of the total number of murders, with the most violent city at the top of the list:

4

There are two methods of sorting. The first method is quick and can be used for sorting by a single variable. Put the cursor in the column you wish to sort by (“Murders” in this case) and then click the A-Z icon:

5

You’ll get a window that looks like this:

7

Sort in whichever order you want. But beware! Put the cursor in the column, but DO NOT select the column letter (D, in this case) and then sort. Consider the example below:

8

You will get that warning message, but don’t choose “Continue with the current selection”. That will sort ONLY the data in that column, thereby disordering your data!

The other method of sorting is for when you want to sort by more than one variable. For instance, suppose we wish to sort the crime data first by Territory in alphabetical order, but then by “Murders” in descending order within each Territory. To do that, go to the toolbar, click on the “Data” tab and then the “Sort” icon, and then choose the variables by which you wish to sort. (Click “Add level” to add new levels.) Then click “OK”.

9

The result will be this:

10
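For readers who also script their analysis, the two sorting methods above can be sketched in Python. This is only an illustration under stated assumptions: the rows, province names and numbers below are made up and are not the workshop's crime data.

```python
# Made-up illustrative rows (hypothetical names and numbers, not the workshop data).
rows = [
    {"province": "Alpha", "territory": "North", "murders": 12},
    {"province": "Bravo", "territory": "South", "murders": 30},
    {"province": "Charlie", "territory": "North", "murders": 25},
    {"province": "Delta", "territory": "South", "murders": 7},
]

# Method 1: sort by a single variable, most murders first.
# Whole records move together, so each province keeps its own numbers
# (the scripted equivalent of NOT sorting one column on its own).
by_murders = sorted(rows, key=lambda r: r["murders"], reverse=True)

# Method 2: sort by Territory A-Z, then by murders descending within each Territory.
# Negating the number reverses the order for that level only.
by_territory_then_murders = sorted(rows, key=lambda r: (r["territory"], -r["murders"]))

print([r["province"] for r in by_murders])
print([r["province"] for r in by_territory_then_murders])
```

Because `sorted` returns a new list, the original order of `rows` is left untouched, which makes it easy to try several orderings of the same table.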

#Filtering

Sometimes you want to examine only particular records from a large collection of data. For that, you can use Excel’s Filter tool. On the toolbar, go to the “Data” tab, then click “Filter”. Small buttons will appear at the top of each column:

11

Suppose we wish to see only the records from the Territory of Lazio. Click on the button on the Territory column, uncheck the “Select All” box, and then choose Lazio from the list, like this:

12

This is the result:

13

Notice that you now are seeing only rows 36, 44, 78, 80 and 104. The rest are still there, but hidden.

More complicated filters are possible. For instance, suppose you wish to see only records in which “Burglaries” is greater than or equal to 5,000 AND “Car Thefts” is less than 2,000. You start by filtering Burglaries like this:

14

then this…

15

Do the same for Car Thefts, and you get this:

16
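The combined filter above (Burglaries at least 5,000 AND Car Thefts under 2,000) can also be expressed in a few lines of Python. This is a minimal sketch, assuming hypothetical rows with made-up numbers rather than the real workshop table.

```python
# Made-up illustrative rows (hypothetical names and numbers).
rows = [
    {"province": "Alpha", "burglaries": 6200, "car_thefts": 1500},
    {"province": "Bravo", "burglaries": 4800, "car_thefts": 900},
    {"province": "Charlie", "burglaries": 5000, "car_thefts": 2000},
    {"province": "Delta", "burglaries": 9100, "car_thefts": 1999},
]

# Keep only the records that satisfy BOTH conditions.
matches = [
    r for r in rows
    if r["burglaries"] >= 5000 and r["car_thefts"] < 2000
]

print([r["province"] for r in matches])
# As with Excel's filter, the other records are not deleted: rows still holds all of them.
print(len(rows))
```

Note how "Charlie" is excluded: it meets the burglary threshold exactly, but its car thefts are not strictly below 2,000.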

#Functions

Excel has many built-in functions useful for performing math calculations and working with dates and text. For instance, assume that we wish to calculate the total number of murders in all the provinces. To do this, we would go to the bottom of Column D, skip a row, and then enter this formula in Cell D106: =SUM(D2:D104). The equals sign (=) is necessary at the start of every formula. The colon (:) means “all the numbers from Cell D2 to Cell D104”. After you hit Enter, the result is this:

1

(The reason for skipping a row is to separate the sum from the main table so that the table can be sorted without pulling the sum into the table during the sorting operation. This way the sum will stay at the bottom of the column.)

Often you will want to do a calculation on each row of your data table. For instance, you might want to calculate the auto theft rate (the number of cars stolen per 100,000 population), which would let you compare the auto theft problem in cities of different sizes. To do this, we would create a new variable called “Car Theft Rate per 100k” in Column J, the first empty column. Then, in Cell J2, we would enter this formula: =(G2/C2)*100000. This divides the stolen cars by the population, then multiplies the result by 100,000. (Notice that there are no spaces and no thousands separators used in the formula.) Here is the result:

2

You can format your numbers using various choices in this box under the “Home” tab:

3

It would be very tedious to repeat writing that calculation in each of 103 rows of data. Happily, Excel has a way to rapidly copy a formula down a column of cells. To do that, you carefully move the cursor (normally a big fat white cross) to the dot on the bottom right corner of the cell containing the formula. When it is in the right spot, the cursor will change to a small black cross. At that point, you can double-click and the formula will copy down the column until it reaches a blank cell in the column to the left. This would be the result:

1

Notice that the formula changes for each row, so that the Row 5 formula is =(G5/C5)*100000, and so on. That’s what makes Excel so powerful: the ability to change formulas as you copy down or across. Now, if we sort by Car Theft Rate in descending order, we see the cities with the worst auto theft problems:

1

and sorting in ascending order, the least crime:

2
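If you ever need to repeat the same steps outside Excel, here is a minimal Python sketch of the rate calculation and the descending sort. The city names and figures are invented for illustration, and `rate_per_100k` is just a hypothetical helper name that mirrors the spreadsheet formula.

```python
# Made-up rows: (city, population, cars stolen). These are NOT real statistics.
rows = [
    ("City A", 500_000, 1_250),
    ("City B", 120_000, 600),
    ("City C", 2_000_000, 3_000),
]


def rate_per_100k(count, population):
    """Same arithmetic as =(G2/C2)*100000 in the spreadsheet."""
    return count / population * 100_000


# Add the new "column" to every row, like copying the formula down.
with_rates = [
    (city, population, stolen, rate_per_100k(stolen, population))
    for city, population, stolen in rows
]

# Sort in descending order of rate: worst auto theft problem first.
worst_first = sorted(with_rates, key=lambda row: row[3], reverse=True)

for city, population, stolen, rate in worst_first:
    print(f"{city}: {rate:.1f} thefts per 100k")
```

Note that the biggest city has the most thefts but the lowest rate, which is exactly why the rate is the fairer comparison.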

Here are some other useful Excel functions that can be used in similar ways:

(You can add, subtract, multiply or divide by using the symbols + - * and /)

  • =AVERAGE – calculates the arithmetic mean of a column or row of numbers
  • =MEDIAN – finds the middle value of a column or row of numbers
  • =COUNT – tells you how many items there are in a column or row
  • =MAX – tells you the largest value in a column or row
  • =MIN – tells you the smallest value in a column or row
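For comparison, the same summaries can be sketched with Python's standard library. The numbers are invented, and one difference is worth knowing: Excel's =COUNT only counts numeric cells, while `len` counts every item in the list.

```python
import statistics

# Made-up column of values; NOT real data.
murders = [12, 7, 3, 25, 7]

total = sum(murders)                  # like =SUM
mean = statistics.mean(murders)       # like =AVERAGE
median = statistics.median(murders)   # like =MEDIAN
count = len(murders)                  # like =COUNT (all items here are numbers)
largest = max(murders)                # like =MAX
smallest = min(murders)               # like =MIN

print(total, mean, median, count, largest, smallest)
```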

There are also a variety of text functions that can join and cut apart text strings. For instance:

If “Steve” is in Cell B2 and “Doig” is in Cell C2, then =B2&" "&C2 will produce “Steve Doig”. And =C2&", "&B2 will produce “Doig, Steve”. (Inside a formula, type plain straight quotation marks.) Other text functions include:

  • =SEARCH – this will find the start of a desired string of text in a larger string.
  • =LEN – this will tell you how many characters are in a text string.
  • =LEFT – this will extract however many characters you specify starting from the left.
  • =RIGHT – this will extract characters starting from the right.
  • =MID – this will start extracting where you tell it to start, and get as many characters as you specify.
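To see how these text functions behave, here is a rough Python equivalent using the same “Steve” and “Doig” example. The variable names are only illustrative. Keep in mind that Excel counts characters starting from 1 while Python starts from 0, and that =SEARCH ignores upper and lower case.

```python
first, last = "Steve", "Doig"

full = first + " " + last        # like =B2&" "&C2
reverse = last + ", " + first    # like =C2&", "&B2

length = len(full)               # like =LEN
left = full[:5]                  # like =LEFT(cell, 5)
right = full[-4:]                # like =RIGHT(cell, 4)
mid = full[6:10]                 # like =MID(cell, 7, 4): start at character 7, take 4

# =SEARCH is case-insensitive and 1-based, so lowercase first and add 1.
position = full.lower().find("doig") + 1

print(full, reverse, length, left, right, mid, position)
```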

You can also do date arithmetic, such as calculating the number of days or years between two dates, or hours, minutes and/or seconds between two times. For instance, to calculate on April 24, 2014, the age in years of someone whose birth date is in cell B2, you could use this formula: =(DATE(2014,4,24)-B2)/365.25. The first part of the formula calculates the number of days between the two dates, then that is divided by 365.25 (the .25 accounts for leap years) to produce the years. Another useful date function is =WEEKDAY, which will tell you on which day of the week a chosen date falls. For instance =WEEKDAY(DATE(1948,4,21)) returns a 4, which means I was born on a Wednesday.
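The same date arithmetic can be sketched in Python with the two dates used above. One assumption to flag: Python numbers weekdays differently from Excel, so the last line converts Python's Monday=1 numbering into Excel's default Sunday=1 numbering.

```python
from datetime import date

birth = date(1948, 4, 21)
on_day = date(2014, 4, 24)

# Days between the two dates, divided by 365.25 to approximate years.
age_years = (on_day - birth).days / 365.25

# isoweekday() gives Monday=1 ... Sunday=7; Excel's default is Sunday=1 ... Saturday=7.
excel_weekday = birth.isoweekday() % 7 + 1

print(int(age_years), excel_weekday)
```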

Excel offers well over 200 functions in a variety of categories beyond just math, dates and text: Financial, engineering, database, logical, statistical, etc. But it is unlikely that you will need to be familiar with more than a dozen or so functions, unless you are a journalist with a very specialized beat such as economics.

Pivot Tables

One of Excel’s best tricks is the ability to summarize data that is in categories. The tool that does this is called a pivot table, which creates an interactive cross-tabulation of the data by category. To create a pivot table, every column of your data must have a variable label; in fact, it is always good practice to put in a variable label as soon as you insert or add a new column.

First, make sure your cursor is on some cell in the table. Then go to the tool bar, click on the “Insert” tab, and then click on the “Pivot Table” icon. A window will pop up that looks like this:

1

Normally, all you need to do is hit “OK”. This will open a new sheet that looks like this:

2

To build a pivot table, you should visualize the piece of paper that would answer your question. Our example data shows 103 provinces in the 20 Territories of Italy. Imagine that you wanted to know the total number of murders in each Territory. The piece of paper that would answer that question would list each Territory, with the total number of murders next to each name.

To build this pivot table, we would use the mouse to pick up “Territory” from the list of variables in the floating box to the right, and place it in the “Row Labels” box below. We would then take the “Murders” variable and put it in the “Values” box. This would be the result:

3

If you click the cursor into the “Total” Column and hit the Z-A button to sort, you will get this:

4
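Under the hood, this pivot table is just "group by category, add up, then sort". A minimal Python sketch of that idea follows; the provinces, territories and counts are placeholders, not the real Italian data.

```python
from collections import defaultdict

# Made-up rows: (province, territory, murders). NOT the real data set.
rows = [
    ("Province 1", "Territory X", 4),
    ("Province 2", "Territory X", 6),
    ("Province 3", "Territory Y", 15),
    ("Province 4", "Territory Z", 2),
    ("Province 5", "Territory Z", 1),
]

# "Row Labels" = territory, "Values" = sum of murders.
totals = defaultdict(int)
for _province, territory, murders in rows:
    totals[territory] += murders

# Same as clicking the Z-A button on the "Total" column.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

for territory, total in ranked:
    print(territory, total)
```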

It is possible to make very complicated pivot tables, with multiple subtotals. But I recommend making a new pivot table for each question you want to answer; several simple tables are easier to understand than one very complicated table that tries to answer many questions at once.

The little black down-arrow button on the “Values” variable opens up a box that will let you make a variety of other choices about how to summarize and display the result. Click on “Value Field Settings” and you get this:

5

Other Excel Tips

Excel will import data that comes in a variety of formats other than the native *.xls or *.xlsx that Excel uses. For instance, Excel can readily import text files in which the data columns are separated by commas, tabs, or other characters, like this:

6
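If you are curious what such delimited text files look like to a program, here is a small Python sketch that reads the same made-up table once with commas and once with tabs. The column names and values are invented, and `read_delimited` is a hypothetical helper name.

```python
import csv
import io

# The same tiny, made-up table in two delimited formats.
comma_text = "city,stolen_cars\nCity A,1250\nCity B,600\n"
tab_text = "city\tstolen_cars\nCity A\t1250\nCity B\t600\n"


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dictionaries, one per row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


from_commas = read_delimited(comma_text, ",")
from_tabs = read_delimited(tab_text, "\t")

print(from_commas)
print(from_commas == from_tabs)
```

Everything arrives as text, so numbers need converting (for example with `int`) before you do arithmetic on them.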

If you find a web page with data in table format (rows and columns), Excel can open it as a spreadsheet. Copy the table and then paste it into Excel; very often it will flow properly into the correct columns.

Finding Data

Government agencies are starting to make some of their data available in Excel or other formats. For example, ISTAT.IT has very comprehensive data about Italian demographics, economy, crime, etc. Many of their tables can be downloaded directly as Excel files.

One trick to find interesting data would be to use Google and add these search terms: site:.gov  filetype:xls.


Investigating crime and corruption with maritime databases

- May 5, 2014 in Data Journalism, HowTo

This is a report from the School of Data Journalism organised by Open Knowledge, European Journalism Centre, and International Journalism Festival. The session was led by Giannina Segnini who headed a team of journalists and computer engineers at La Nacion, Costa Rica’s newspaper of reference, until early February 2014. She is fully dedicated to uncovering investigative stories by gathering, analyzing and visualizing public databases.

You can download the presentation and all other data sets used in this workshop here. You can also find here the video recording of this workshop.

One of the best workshops at the School of Data Journalism in Perugia gave participants the tools, instruments and techniques to investigate crime and corruption by tracking maritime databases.

Understand how shipments work

4755758395_403170af4c_z

##Codes

We need to change our mind-set! Everything in this world, from shoes to military weapons, ivory and chemical products, has a Harmonized System code, known as an HS code. If you want to follow and understand international trade, you have to get familiar with these codes.

The United Nations Commodity Trade Statistics Database (UN Comtrade) contains detailed import and export statistics reported by statistical authorities from around the world.

Go to “Annual Data”, select “Fast Track” from the menu and then select “Country List” to get all data by country or “Commodity List” to look at specific goods.

2014-05-03_17h02_32

Here you can observe that both countries and goods have their specific codes. As you can see in this screen grab, there are categories of codes, like 89 for ships, and more descriptive codes like 8901, 8902, etc. describing specific types of boats. Codes can go up to 6 digits, with the extra digits adding more detail to the code.
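To make the nesting concrete, here is a small Python sketch that splits a code into its broader parent levels. It assumes levels of 2, 4 and 6 digits, as in the examples above (89, then 8901); `hs_levels` is a hypothetical helper name, not part of any official tool.

```python
def hs_levels(code):
    """Return the 2-, 4- and 6-digit levels contained in an HS code."""
    digits = code.strip()
    if not digits.isdigit() or len(digits) not in (2, 4, 6):
        raise ValueError("expected a code of 2, 4 or 6 digits")
    # Each level is a prefix of the next, more specific one.
    return [digits[:n] for n in (2, 4, 6) if n <= len(digits)]


print(hs_levels("8901"))    # category 89, then the more specific 8901
print(hs_levels("284161"))  # the potassium permanganate code used in this post
```

Reading a code from left to right therefore takes you from the general category to the specific product.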

Let’s explore this data further. Go back to http://comtrade.un.org/ and select “Monthly Data”, which gives you a friendlier interface. We are looking for all Italian imports, from all over the world, of “potassium permanganate” (code: 284161), a chemical substance used for making cocaine.

2014-05-03_18h22_46

We can download the data in comma separated value (CSV) format and import it in any spreadsheet programme.

To investigate this further, we can compare what Italy reports as imports (the query we just made) and what the rest of the world reports as exports towards Italy. We just need to download the same data with “World” as reporter, Italy as partner and “exports” as trade flows.

2014-05-03_18h25_54

In the real world, what you import should match what the other countries report as exports. However, a simple analysis shows that Italy has reported imports of potassium permanganate worth about 11 million dollars since 2010, while the rest of the world reports only about 6 million in exports to Italy. If we look more carefully, for example with a simple pivot table, we notice that France and Spain are not recording everything.
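The comparison itself is simple enough to sketch in a few lines of Python. The figures below are invented placeholders, not the Comtrade numbers, and the row layout (partner, year, value) is an assumption about how you might arrange the two downloads.

```python
from collections import defaultdict

# Made-up values in USD; NOT the real Comtrade figures.
# Each row: (partner country, year, trade value)
italy_reported_imports = [
    ("France", 2010, 300),
    ("France", 2011, 200),
    ("Spain", 2010, 400),
    ("Germany", 2010, 100),
]
world_reported_exports_to_italy = [
    ("France", 2010, 100),
    ("Spain", 2010, 150),
    ("Germany", 2010, 100),
]


def total_by_partner(rows):
    """Sum trade value per partner country, like a one-variable pivot table."""
    totals = defaultdict(int)
    for partner, _year, value in rows:
        totals[partner] += value
    return dict(totals)


imports = total_by_partner(italy_reported_imports)
exports = total_by_partner(world_reported_exports_to_italy)

# Positive gap: Italy reports more imports than the partner reports as exports.
gaps = {
    partner: imports.get(partner, 0) - exports.get(partner, 0)
    for partner in sorted(set(imports) | set(exports))
}
flagged = [partner for partner, gap in gaps.items() if gap > 0]

print(gaps)
print(flagged)
```

A gap is only a lead, not proof: it tells you where to start asking questions.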

##Containers

2014-05-03_18h56_15

Like countries and commodities, containers also have a code, and you need to learn how to decode it. According to ISO standard 6346 (Freight Containers-coding, identification and marking), the BIC allocates an owner code to every container owner or operating company. These codes are listed in the official register “containers bic-code”. This task is completed with the assistance of a worldwide network of National Registration Organisations.

2014-05-03_18h56_39

Codes include information about the company owning the container, a product group code, a serial number and a check digit.

Let’s practice this a bit with our container SUDU5004991.
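Before searching online, you can take the code apart yourself. The sketch below splits the number into the four parts listed above and recomputes the check digit with the ISO 6346 rule as I understand it (letters get values from 10 upward skipping multiples of 11, each position is weighted by a power of 2, and the sum is taken modulo 11). The function names are my own; treat this as an illustration rather than an official validator.

```python
import string


def letter_values():
    """Letter values for the check digit: A=10 upward, skipping multiples of 11."""
    values, n = {}, 10
    for letter in string.ascii_uppercase:
        if n % 11 == 0:
            n += 1
        values[letter] = n
        n += 1
    return values


LETTER_VALUES = letter_values()


def check_digit(first_ten):
    """Compute the check digit from the first ten characters of the code."""
    total = 0
    for position, char in enumerate(first_ten):
        value = LETTER_VALUES[char] if char.isalpha() else int(char)
        total += value * 2 ** position
    return total % 11 % 10


def decode(container_number):
    """Split a container number into its parts and verify the check digit."""
    number = container_number.strip().upper()
    if len(number) != 11 or not number[:4].isalpha() or not number[4:].isdigit():
        raise ValueError("expected 4 letters followed by 7 digits")
    return {
        "owner_code": number[:3],
        "product_group_code": number[3],
        "serial_number": number[4:10],
        "check_digit": int(number[10]),
        "check_digit_ok": check_digit(number[:10]) == int(number[10]),
    }


print(decode("SUDU5004991"))
```

A failed check usually means the number was mistyped somewhere along the way, which is worth knowing before you spend time searching for it.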

######Step 1: Search who owns the container.

On the Bic-code.org home page, go to “Bic Codes” from the menu and select “Registered Codes”.

2014-05-03_19h04_00

The result will contain the company name and the country where it’s registered. In our case it is HAMBURG SUED.

######Step 2: Search for the itinerary

Containers are like DHL registered mail: you can trace them, typically directly from their handling company. Go to the company’s site and search for the container number: SUDU5004991. You can typically find a “track” search box.

2014-05-03_19h07_57

You can find out where the container is, where it’s coming from and where it’s heading to. You will also get the voyage number which will help us track the container around the world using customs and other resources.

2014-05-03_19h08_41

Sometimes you will find that the owner of the container is not a shipping line but a company that leases containers to other companies. Don’t worry: the real owner’s site should have a search box where you can make a unit inquiry and find out who is operating the container. From there, keep searching until you find what you are looking for.

######Step 3: Search on customs by container

Once you have the owner and the container number, you can search for more information about both the vessel and the container. You can get the name of the exporter and, most importantly, the bill of lading.

If you don’t find the information in Italy, you might find it in another country, because shipments are registered at every step. For example, most ships coming to Italy with drugs come from Peru. If you can’t find the information from Italian customs, look on the Peruvian side.

The best resource for finding customs organisations is the World Customs Organization.

##Bill of Lading

2014-05-05_17h36_54

A bill of lading is a document issued by a carrier to a shipper of goods. It is a standard form which serves as a receipt for the goods shipped, as evidence of the contract of carriage and as a document of ownership.

The bill of lading is the master document. From here you can find out who is the exporter, the shipper, the consignee, who has to be notified when the cargo arrives, the port of loading, the port of discharge, the vessel and so on.

Investigation techniques

There are many online services providing data about maritime and air shipping. Some need a login and some have a pricing model but in all you can get some information for free. Unfortunately, they have different database structures but the magic happens when we combine all sources!

The logic behind this kind of investigation is that of chained searches. You can find anything you want if you know how to combine sources – go from general to specific until you get what you need!

######Technique 1

  • Start with the container number
  • Container leads to bill of lading
  • Bill of lading leads to shipper and consignee
  • Shipper leads to owner or representative

######Technique 2

  • Sometimes all you will have is the bill of lading
  • Containers
  • Vessels
  • Route

######Technique 3

  • Sometimes you are only investigating people or companies like shipper or exporter
  • From the company you get a list of bills of lading
  • Shipment
  • Consignee
  • Importer’s name, owner or representative
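To show the logic of a chained search, here is a toy Python sketch of Technique 1. Every record in it is invented: the container number, bill of lading number and company names are placeholders, and in a real investigation each lookup would be a search in a different database rather than a dictionary.

```python
# Invented stand-ins for three separate sources. Nothing here is real data.
container_to_bill = {"ABCU0000001": "BL-001"}
bill_details = {
    "BL-001": {"shipper": "Example Exports Ltd", "consignee": "Example Imports Srl"},
}
company_owner = {"Example Exports Ltd": "J. Doe (hypothetical)"}


def trace(container_number):
    """Follow the chain: container -> bill of lading -> shipper -> owner."""
    trail = {"container": container_number}

    bill = container_to_bill.get(container_number)
    if bill is None:
        return trail  # chain breaks here: try another source
    trail["bill_of_lading"] = bill

    details = bill_details.get(bill, {})
    trail.update(details)

    shipper = details.get("shipper")
    if shipper in company_owner:
        trail["owner"] = company_owner[shipper]
    return trail


print(trace("ABCU0000001"))
print(trace("UNKNOWN"))
```

The result of each step becomes the search term for the next one, and when a chain breaks you switch to another source and pick it up again.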

#Other useful maritime databases

Here you can see consolidated data from the United States, the United Nations and all other international organisations. The information is up to date and you can find all the documentation for cargo shipping, air cargo and land transport, including shipper name, container number, bill of lading number, ports, etc.

More than 20 databases in one, where you can search by ship particulars such as the IMO number (the code for ships).

This is a great database for Europe containing detentions and banning. Search here all info about ships detained, including a historic database. See “details” and get the complete report plus all information of why it was detained!

Ships: real-time vessel location, itinerary history, pictures. Every single ship with an AIS (like a GPS) transponder appears here. You can also see all the ships docked in every single port around the world, filter by type of vessel (e.g. cargo vessel), and open the vessel details to get all the information, including history.

You have the same for planes, all air traffic in real time.

#Practice, practice, practice

Here are some exercises you can use to practice:

  • Track these containers and the ships that transported them: CXRU1455801, CXRU1454127

  • Provide the owner’s name and phone number (headquarters) of the ship transporting the Syrian chemical weapons that will soon arrive at Gioia Tauro Port in Calabria, Italy. Provide the last position (coordinates) received from this vessel. Find the owner’s name and the last position of the US ship to which the cargo will be transferred in Italy.

  • On May 5, 2013, the Police and the Guardia di Finanza in Calabria seized another container carrying 190 kg of cocaine at port Gioia Tauro. Here is one of the videos about that operation. Track the LAST cargo transported by this container this year, describe the cargo, and list the importer and the exporter and their addresses.

  • Try to track the last drug seizures (listed in a separate file, Gioia Tauro) in port Gioia Tauro. You need to get more information than what is available on the web.


From idea to story: planning the data journalism story

- May 2, 2014 in Data Journalism

This is a report from the first workshop during the School of Data Journalism organised by Open Knowledge, European Journalism Centre and International Journalism Festival. The session was led by Steve Doig, Knight Chair in Journalism, specializing in computer-assisted reporting — the use of computers and social science techniques to help journalists do their jobs better.

You can download Steve’s presentation here.

Why do data journalism at all?

Steve’s take on this is that data journalism allows journalists to go beyond anecdotes and base their stories on facts and evidence. You can keep using anecdotes, but with data you can find the best ones: those that are most illustrative for that particular story.

So, how can journalists find data story ideas? Before anything, try to look at topics you already report on like sports, elections, disasters, crime investigations, money flows, etc. Almost anything journalists typically cover produces data which can be analysed. Other places to get ideas for data journalism stories:

  • See what other journalists are doing – If something is going on in one city, chances are it’s happening in your city too
  • See featured projects in datadrivenjournalism.net
  • IRE’s Extra Extra feed
  • Have a look at the Guardian data blog – not always investigative stories, not big heavy crime or social justice
  • Reading documents produced by government agencies and academics who collect large amounts of data. Pay attention to footnotes and bibliography which can lead to interesting data sources!

How do you get from an idea to a story?

Work backwards from your idea:

1. Think of the statements you want to make

Start with a hypothesis like “crime is getting worse in my area”. For this hypothesis, you might want to make statements like: crime has increased by x amount, the amount of crime per 1000 people in such and such city is the greatest in our area, etc.

2. Think of what variables you need to make the statements

Now ask which variables you need. Think in terms of a table of information: columns are variables and rows are the individual data points.

There are two different kinds of variables:

  • Categorical variables, which carry labels (gender, type of crime, zip code)
  • Numerical variables, which are counts (number of crimes, number of accidents, number of arrests)

Examples of variables: type of crime, population of the places where crime is happening, date of crime, time, location, number of victims, was an arrest made (y/n?)

3. Think who collects the data

Once we know our variables, check who collects them. All the organisations we cover (every type of government agency, corporations, etc.) collect lots of information, so most of the time we don’t have to collect data ourselves.

4. Get the data from there

Then you face the problem of getting the data. In the US there are pretty strong public records laws. In Europe as well, most countries have Freedom of Information laws or an official way to request data from public agencies.

Data formats

Don’t be intimidated by different formats. Know how you want to work with data, for example in Excel. You don’t need to get the data in .xls, because you can use programmes to translate data from one format to another. Find a data nerd who can help you! One place to find good nerds is on forums or email lists.

One format you should try to avoid getting is PDF – it doesn’t import well into other formats. If an agency will only give you a PDF, there are tools such as Tabula that can export it to other formats.

5. Clean the data

Data is sometimes messy. A classic example is campaign finance information which has all been typed in by volunteers – names of cities are always misspelled! In this case you need to find all the cities which were misspelled and correct them so you can say, for example, how much was collected from a single city. People who collect data are doing it for bureaucratic purposes, so it doesn’t really matter to them how clean it is. People who use data for analysis need more precision and thus need to clean the data.
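As a rough illustration of the city-name problem, here is a Python sketch that snaps misspelled names to a list of correct ones before adding up the totals. It uses fuzzy matching from Python's standard `difflib` module, which is my choice for the example rather than anything Steve recommended; the donations are invented and the 0.8 similarity cutoff is an arbitrary assumption you would need to tune and check by hand.

```python
import difflib
from collections import defaultdict

# Correct spellings, plus made-up donation records typed with errors.
known_cities = ["Phoenix", "Tucson"]
donations = [
    ("Phoenix", 100),
    ("Pheonix", 50),
    ("Tuscon", 25),
    ("Tucson", 75),
]

lookup = {city.lower(): city for city in known_cities}


def clean_city(name):
    """Return the closest known spelling, or the original if nothing is close."""
    matches = difflib.get_close_matches(name.lower(), list(lookup), n=1, cutoff=0.8)
    return lookup[matches[0]] if matches else name


totals = defaultdict(int)
for city, amount in donations:
    totals[clean_city(city)] += amount

print(dict(totals))
```

Names that match nothing are left untouched, so you can review them yourself instead of trusting a wrong automatic guess.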

6. Once you have clean data – what do you do with it?

Look for patterns! Highs, lows, maximums, minimums, averages, etc. Get the shape of the data in your mind and look for outliers: anything in your data which is weird and stands out. Remember that many stories have been discovered through simple steps like sorting. Tools:

  • Use simple spreadsheets functions like sort, filter, functions and pivot tables
  • Another tool is your brain: you need some math and statistics, but it’s pretty much like 1+1=2!
  • Resource for math http://t.co/CaZg5qS0jM

Last, it’s important to remember that data journalism stories are best done in teams. There are many roles to cover in such a team, including: other reporters, editors, graphic artists, photographers, videographers, page designers, web designers, app developers, etc.
