What is Web Data Extraction?

A user can easily copy and paste data from a web page. They can even download images and store the file names alongside the associated product text. For small amounts of data this is fine. But with many rows of data spread across tens, hundreds or even thousands of pages, doing this by hand is very time-consuming and prone to human error.

Screen Scraping, Web Data Mining and Data Extraction are automated techniques for navigating a web site and extracting data and images from it. Generally, the Screen Scraping Software works with a Web Crawler, which is responsible for automatically navigating the web site.

The crawler (sometimes known as a Web Spider or Web Bot) follows every link in a methodical way, hunting for data to be scraped. When it finds a page that the screen scraper is interested in, such as a product page, it calls upon the Screen Scraper to extract the required information.
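The division of labour between crawler and scraper can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how iHarvest's software works: the `SITE` dictionary stands in for a real web site so the example runs offline, and the `shop.example` URLs, the `product` class name and the `PageParser` and `crawl` names are all hypothetical.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web site" so the sketch runs without network access.
SITE = {
    "http://shop.example/": '<a href="/p/1">Kettle</a> <a href="/about">About</a>',
    "http://shop.example/about": '<a href="/">Home</a>',
    "http://shop.example/p/1": '<h1 class="product">Kettle</h1> <a href="/p/2">Next</a>',
    "http://shop.example/p/2": '<h1 class="product">Toaster</h1> <a href="/">Home</a>',
}


class PageParser(HTMLParser):
    """Collects every link on a page and the text of any product heading."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.product_names = []
        self._in_product = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "h1" and attrs.get("class") == "product":
            self._in_product = True

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_product = False

    def handle_data(self, data):
        if self._in_product and data.strip():
            self.product_names.append(data.strip())


def crawl(start_url, fetch):
    """Breadth-first crawl: visit each page once, scrape pages of interest."""
    seen = {start_url}
    queue = deque([start_url])
    scraped = []
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # page missing or unreachable
            continue
        parser = PageParser()
        parser.feed(html)
        # The "screen scraper" step: record data found on product pages.
        for name in parser.product_names:
            scraped.append({"url": url, "name": name})
        # The "crawler" step: queue every link that has not been seen before.
        for href in parser.links:
            link = urljoin(url, href)
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return scraped


# SITE.get plays the role of an HTTP fetch in this offline sketch.
results = crawl("http://shop.example/", SITE.get)
```

Against a live site, `fetch` would be replaced by a real HTTP request, and a polite crawler would also respect the site's robots.txt and rate limits.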

The data the screen scraper extracts is then cleansed, processed, transformed and translated as required, and stored in another place such as a spreadsheet or database. Once the data is in Excel, CSV (Comma Separated Values) or a database, it is much easier to work with.
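As a rough illustration of the cleanse-and-store step, the sketch below tidies a couple of scraped rows and writes them out as CSV. The sample rows, the field names and the `clean` and `to_csv` helpers are assumptions made for this example; real cleansing rules depend entirely on the site being scraped.

```python
import csv
import io

# Hypothetical raw values as they might come off a product page.
raw_rows = [
    {"name": "  Kettle \n", "price": "£24.99", "stock": "In Stock"},
    {"name": "Toaster", "price": "£1,049.00", "stock": "out of stock "},
]


def clean(row):
    """Trim whitespace, turn the price into a number, normalise the stock flag."""
    price = row["price"].replace("£", "").replace(",", "").strip()
    return {
        "name": " ".join(row["name"].split()),
        "price": float(price),
        "in_stock": row["stock"].strip().lower() == "in stock",
    }


def to_csv(rows):
    """Write cleaned rows as CSV text; a real run would write to a file instead."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["name", "price", "in_stock"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


cleaned = [clean(r) for r in raw_rows]
csv_text = to_csv(cleaned)
```

The resulting text opens directly in Excel or can be loaded into a database table with matching columns.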

Extract Data Automatically: No More Copy and Paste

iHarvest can save you hours and hours of manual effort.

Do you have a Screen Scraping project or idea? Contact iHarvest today. We’ll happily discuss your idea and take a look at the web site you want to extract data from. Initially we’ll help you establish how scrape-able the data is, and it’s 100% no obligation.


Why use Web Data Extraction?

  • Extract data and images from a web site very quickly.
  • Analyse a competitor’s site and “measure” their product range.
  • Identify which of a competitor’s items are In Stock and Out of Stock.
  • Identify a competitor’s brand proportions: how much of each brand they sell, and in which product types.
  • Identify whether a competitor is selling products you are not, and vice versa.
  • Extract data and images from a web site very accurately.
  • Compile the extracted information into a database or spreadsheet for further analysis.
  • Screen Scraping data helps with data mining and business intelligence.
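
Once the data is in a structured form, the competitor checks listed above reduce to a few lines of analysis. The sketch below is illustrative only: the catalogue, SKUs and brand names are invented, and it assumes your own products and the competitor's can be matched on a shared `sku` value, which is often not the case in practice.

```python
from collections import Counter

# Hypothetical scraped catalogue; every value here is made up for illustration.
competitor = [
    {"sku": "K1", "brand": "Acme", "in_stock": True},
    {"sku": "K2", "brand": "Acme", "in_stock": False},
    {"sku": "T1", "brand": "Bolt", "in_stock": True},
    {"sku": "T2", "brand": "Acme", "in_stock": True},
]
our_skus = {"K1", "T1", "M9"}  # assumed list of the products you sell

# Brand proportion: the share of the competitor's range held by each brand.
brand_counts = Counter(item["brand"] for item in competitor)
brand_share = {brand: n / len(competitor) for brand, n in brand_counts.items()}

# Items the competitor currently lists as out of stock.
out_of_stock = [item["sku"] for item in competitor if not item["in_stock"]]

# Products only one side sells, found with set differences.
their_skus = {item["sku"] for item in competitor}
only_them = sorted(their_skus - our_skus)
only_us = sorted(our_skus - their_skus)
```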

More References

http://en.wikipedia.org/wiki/Data_extraction

 
