A Google Chrome extension for getting data out of web pages and into spreadsheets.
Highlight a part of the page that is similar to what you want to scrape. Right-click and select the "Scrape selected..." item. The scraper window will appear, showing you the initial results. You can export the table to Google Docs by pressing the "Export to Google Docs..." button, or use the left-hand pane to further refine or customize your scraping.
The "Selector" section lets you change which page elements are scraped. You can specify the query as either a jQuery selector or an XPath expression.
You may also customize the columns of the table in the "Columns" section. Column expressions must be specified in XPath. Naming the columns is optional.
Selecting the "Exclude empty results" filter will prevent any matches that contain no column values from appearing in the table.
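To make the relationship between the selector, the columns, and the "Exclude empty results" filter concrete, here is a minimal sketch in Python. It is not the extension's own code: the `scrape` function, the `PAGE` fragment, and the XPath expressions are all hypothetical, and it relies on the limited XPath subset in Python's standard `xml.etree.ElementTree` rather than a browser's full XPath engine. The idea it illustrates is that the selector picks the rows, and each column XPath is evaluated relative to a row.

```python
import xml.etree.ElementTree as ET

# Hypothetical page fragment (assumption for illustration only).
# It is well-formed so the standard-library XML parser can read it.
PAGE = """
<html><body>
  <table>
    <tr><td>Alice</td><td><a href="/alice">profile</a></td></tr>
    <tr><td>Bob</td><td><a href="/bob">profile</a></td></tr>
    <tr><td></td><td></td></tr>
  </table>
</body></html>
""".strip()


def scrape(root, selector, columns, exclude_empty=True):
    """Return one row per element matching `selector`.

    Each cell is the text of the first node matching a column XPath,
    evaluated relative to the row element.
    """
    rows = []
    for element in root.findall(selector):
        row = []
        for xpath in columns:
            match = element.find(xpath)
            # A column with no match yields an empty cell.
            text = "".join(match.itertext()).strip() if match is not None else ""
            row.append(text)
        # Mirror the "Exclude empty results" filter: drop rows
        # in which every column came back empty.
        if exclude_empty and not any(row):
            continue
        rows.append(row)
    return rows


root = ET.fromstring(PAGE)
table = scrape(root, ".//tr", ["td[1]", "td[2]/a"])
for row in table:
    print(row)
```

With `exclude_empty=False`, the third table row would be kept as a row of empty cells, which is the case the filter is meant to hide.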
After making any customizations, you must press the "Scrape" button to update the table of results.
Download the extension from http://chrome.google.com/extensions/detail/mbigbapnjcgaffohmbkdlecaccepngjd.
Get the sources from https://github.com/mnmldave/scraper.
You don't need to 'build' this extension per se. To test it out, you first need to navigate to chrome://extensions from Google Chrome, then expand "Developer Mode". Click the "Load unpacked extension..." button and point it to the src directory.
Learn more about extension development from the Google Chrome Extensions page.
A Rakefile is included for compiling the Google Chrome extension into a zip file. It also does JavaScript and CSS minification.
Scraper is open-sourced under a BSD license, which you can find in LICENSE.txt.
Many of the icons used in this extension are from the generous Yusuke Kamiyamane.
Copyright (c) 2010 David Heaton