We’ve hit 10K+ students on the Scraping and Data Mining for Beginners and Pros Udemy course. The reviews have been fantastic:

[Image: course reviews]

Since the course was created, a few new technologies have come up that are worth mentioning:

Portia:

A Python-based visual scraper, similar to how import.io works, but running locally on your machine with no fees attached. It has a few dependencies, but once you get past the installation hoopla, using it is quite straightforward.

artoo:

A client-side scraping companion. This is great for quick scrapes from sites you’re visiting in the browser. It’s a utility belt for the browser console: you visit a page, pop open the JavaScript console, and with a few lines of code you can save the data as JSON. Great for quick jobs.

[Image: artoo]

[Image: artoo code]
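
If you don’t have a browser console handy, here’s a rough Python sketch of the same scrape-a-page-then-save-as-JSON idea. To be clear, this is not artoo’s API (artoo is JavaScript and runs in the console). The `LinkScraper` class, the `scrape_links` function and the `SAMPLE_HTML` snippet are all made up for illustration, and the sketch only uses the standard library.

```python
import json
from html.parser import HTMLParser


class LinkScraper(HTMLParser):
    """Hypothetical helper: collect the text and href of every <a href=...> tag."""

    def __init__(self):
        super().__init__()
        self.links = []     # scraped records, one dict per link
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href leave _href as None and are skipped.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.links.append({"title": title, "url": self._href})
            self._href = None


def scrape_links(html):
    """Return a list of {"title", "url"} dicts for the links in an HTML string."""
    parser = LinkScraper()
    parser.feed(html)
    return parser.links


# Made-up sample page, standing in for whatever site you are looking at.
SAMPLE_HTML = """
<ul>
  <li><a href="/post/1">First post</a></li>
  <li><a href="/post/2">Second <b>post</b></a></li>
</ul>
"""

if __name__ == "__main__":
    # Dump the scraped records as pretty-printed JSON.
    print(json.dumps(scrape_links(SAMPLE_HTML), indent=2))
```

Swap `SAMPLE_HTML` for the HTML of a real page and you get the same kind of quick, throwaway JSON dump, just without the convenience of doing it right in the browser.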

ScrapingHub.com:

These are the guys who open-sourced Portia, and they deserve a mention for that alone. They also have a hosted service, similar to import.io, that’s worth checking out. The way they describe it is:

“Scrapy Cloud bridges the highly efficient Scrapy development environment with a robust, fully-featured production environment to deploy and run your crawls. It’s like a Heroku for Scrapy, although other technologies will be supported in the near future. It runs on top of the Scrapinghub platform, which means your project can scale on demand, as needed.”

Summary

I’m glad you guys are enjoying the course – I’ll keep updating you with interesting tidbits from the Data Mining world!