The Webpage is the API – Scraping Resources
Web scraping is one of the most powerful techniques for obtaining data: it lets you treat nearly any website as a data source, so you can build your own data collections and work freely with the data. The basics are within easy reach of most people with some programming knowledge. Here we list some resources for learning web scraping.
ProPublica has a basic overview of tools and programs.
Scraping without Programming:
- Michelle Minkoff: Web scraping without Programming
- The Dataist: Using Google Chrome's Scraper Extension
ScraperWiki comes with tutorials for Python, Ruby and PHP
Scraping for Journalists – A good ebook on scraping
Python:
- A PyCon talk on web scraping offers a good introduction
- Will Larson's An Introduction to Compassionate Screen Scraping
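To give a flavour of what the Python resources above cover, here is a minimal sketch that turns an HTML table into a list of records using only the standard library. The sample page, the `TableParser` class and the `scrape_table` helper are made up for illustration; a real scraper would first download the page, and the tutorials above typically use more convenient third-party parsers.

```python
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would download the HTML first.
SAMPLE_HTML = """
<html><body>
<table id="population">
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield</td><td>30000</td></tr>
  <tr><td>Shelbyville</td><td>25000</td></tr>
</table>
</body></html>
"""


class TableParser(HTMLParser):
    """Collects the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, each a list of cell strings
        self._row = None       # row currently being read, or None
        self._in_cell = False  # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self._row[-1] += data.strip()


def scrape_table(html):
    """Return one dict per table row, keyed by the header cells."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


records = scrape_table(SAMPLE_HTML)
print(records)
```

Once the page is reduced to a list of dictionaries like this, it can be filtered, counted or saved like any other dataset, which is the sense in which the webpage becomes the API.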
JavaScript (Node.js):
- Tilo Mitra: Web scraping with Node.js
If you have more resources to share, let us know in the comments.