Crawling web pages using Python and Scrapy - Tutorial
In this post, let us walk through how we can crawl web pages using Scrapy.
For this tutorial, we will download all the excerpts and ebooks available at https://www.goodreads.com/ebooks?sort=popular_books. This page is paginated, so let’s download books from the first page only. By the end of this post, you will know how to follow and crawl the other pages too.
First, let's create a Python virtual environment called goodreads.
    mkvirtualenv goodreads
    workon goodreads

To learn more about how the mkvirtualenv and workon commands work, and to install them, see virtualenvwrapper.
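If you would rather not install virtualenvwrapper, a roughly equivalent environment can be created with Python's built-in venv module. The snippet below is only a minimal sketch of that alternative, not the setup this tutorial uses: the helper name create_env and the temporary parent directory are assumptions made for illustration.

```python
import tempfile
import venv
from pathlib import Path


def create_env(name: str, parent: Path) -> Path:
    """Create a virtual environment called `name` under `parent` and return its path.

    `create_env` is a hypothetical helper written for this sketch.
    """
    env_dir = parent / name
    # with_pip=False keeps the sketch fast and offline; pass with_pip=True to
    # bootstrap pip so that Scrapy can be installed into the environment.
    venv.create(env_dir, with_pip=False)
    return env_dir


# A temporary directory stands in for wherever you keep your environments.
demo_parent = Path(tempfile.mkdtemp())
env_path = create_env("goodreads", demo_parent)
print(env_path / "pyvenv.cfg")
```

Unlike workon, this does not activate the environment for you. On Unix-like shells you would activate it by sourcing the bin/activate script inside the environment directory.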