Web scrapers typically run on a schedule to extract information. Today, applications of every size are commonly deployed on cloud providers such as AWS, Google Cloud, and Azure. In this article we will see how to deploy our eBay product scraper as a Google Cloud Function.
Cloud providers now offer far more than machines and infrastructure. The main reasons for moving to the cloud are auto scaling, monitoring, ease of deployment, a faster delivery process, a wide range of infrastructure options suited to different types of applications and, of course, cost. Cloud deployments can show considerable cost savings compared to managing your own infrastructure. Further details about the cloud are beyond the scope of this article.
What are Google Cloud Functions?
Cloud Functions allows you to trigger your code from Google Cloud, Firebase, and Google Assistant, or call it directly from any web, mobile, or backend application via HTTP. You are only billed for your function’s execution time, metered to the nearest 100 milliseconds.
In other words, you pay only for what you use. In this article we will deploy a simple eBay product scraper as a Google Cloud Function.
Our eBay scraper is built using the BeautifulSoup library. It extracts the relevant details of a product given its URL. For example, given the URL https://www.ebay.com/itm/203449771112?hash=item2f5e8d2468:g:0zcAAOSwssFgnD-K&var=503791306328, the scraper returns the following result:
Refer to my ebay.py file for the scraper.
Deploy scraper as Google Cloud Function
To deploy our scraper, we need to create a file called main.py in the same folder as the scraper. Add the following code to main.py:
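A minimal sketch of what such a main.py could look like. It assumes the scraper in ebay.py exposes a function that takes a product URL and returns a dictionary; the name get_product_details and its stand-in body are hypothetical, so swap in an import of the real function from ebay.py. The query parameter name url is also an assumption. An HTTP Cloud Function in Python receives a Flask request object and may return a (body, status, headers) tuple, which is what this sketch relies on.

```python
import json


def get_product_details(url):
    """Hypothetical stand-in for the scraper in ebay.py.

    In a real main.py this would be replaced by an import from ebay.py
    (the actual function name there may differ).
    """
    return {"url": url}


def scrape_ebay(request):
    """HTTP Cloud Function entry point.

    `request` is the Flask request object that Cloud Functions passes in.
    The eBay product URL is read from the `url` query parameter
    (the parameter name is an assumption).
    """
    headers = {"Content-Type": "application/json"}

    product_url = request.args.get("url")
    if not product_url:
        # No URL supplied: answer with 400 Bad Request and a JSON error body.
        return json.dumps({"error": "missing 'url' query parameter"}), 400, headers

    # Run the scraper and send its result back as JSON.
    return json.dumps(get_product_details(product_url)), 200, headers
```

Returning an explicit status code and Content-Type header keeps the function usable from any HTTP client. The scraper's dependencies, for example beautifulsoup4, would be listed in a requirements.txt file next to main.py.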
scrape_ebay is the Cloud Function that we call with an HTTP request. We pass the URL, in our case an eBay product URL, as a query parameter, and the response comes back as JSON.
To deploy the Cloud Function you need the gcloud command-line tool installed on your system. You also have to create a project on Google Cloud and initialize gcloud to use that project. Use the command below to deploy:
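As a hedged illustration, the deployment can also be scripted from Python by assembling a gcloud functions deploy command. The flags used here (--runtime, --trigger-http, --allow-unauthenticated, --memory) are standard gcloud options, but the runtime version and memory size are assumptions, and this is not necessarily the exact command used for this project.

```python
import shlex
import subprocess


def build_deploy_command(function_name="scrape_ebay", runtime="python311", memory=None):
    """Build the argument list for `gcloud functions deploy`.

    The default runtime and the optional memory size are assumptions;
    pick values that match your project.
    """
    command = [
        "gcloud", "functions", "deploy", function_name,
        "--runtime", runtime,
        "--trigger-http",           # expose the function over HTTP
        "--allow-unauthenticated",  # public URL; drop this flag for private functions
    ]
    if memory:
        command += ["--memory", memory]  # for example "256MB"
    return command


def deploy(**options):
    """Run the deployment; call this from the folder that contains main.py."""
    subprocess.run(build_deploy_command(**options), check=True)
```

Printing shlex.join(build_deploy_command()) shows the equivalent shell command: gcloud functions deploy scrape_ebay --runtime python311 --trigger-http --allow-unauthenticated.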
If the above command succeeds, it prints the URL of your Cloud Function, something like this:
You can then pass the eBay URL as a query parameter:
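The product URL itself contains ? and & characters, so it should be percent-encoded before being appended as a query parameter. Below is a small sketch of building and calling the function URL with the Python standard library. The REGION and PROJECT_ID parts of the URL and the parameter name url are placeholders, not values from this project; use the URL that the deploy command prints.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder: the real value is printed by the deploy command.
FUNCTION_URL = "https://REGION-PROJECT_ID.cloudfunctions.net/scrape_ebay"


def build_request_url(product_url, function_url=FUNCTION_URL):
    """Append the eBay product URL as a percent-encoded `url` query parameter."""
    return function_url + "?" + urlencode({"url": product_url})


def fetch_product(product_url, function_url=FUNCTION_URL):
    """Call the deployed function and decode its JSON response."""
    with urlopen(build_request_url(product_url, function_url)) as response:
        return json.load(response)
```

Without the encoding step, the &var=... part of the eBay URL would be read as a separate query parameter of the Cloud Function rather than as part of the product URL.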
If you see any errors, fix them and try again. To run Cloud Functions, two basic things are required:
- Enable billing on your project
- Enable Cloud Build API
You have more options at deployment time. For example, you can set the memory available to your function. The complete source code of the project is available here. A major advantage of Google Cloud Functions over AWS Lambda is that Cloud Functions reads the requirements.txt file and installs the dependencies for you, whereas with AWS Lambda you need to bundle all your dependencies yourself. Thanks for reading.