January 17, 2011: ProPublica's guide to scraping data: using free tools to get structured data out of messy HTML, PDFs and Flash.