Using MongoDB to save data after scraping with Scrapy


Hey yo, hi everyone! Today I will introduce how to save data into MongoDB after scraping with Scrapy. The details of the project were introduced in the previous lesson: Link

Introducing MongoDB

  1. MongoDB is an open-source, NoSQL database.
  2. It is a document-oriented database: data is stored in collections, which play a role similar to tables in relational database systems such as MySQL and PostgreSQL.
  3. Compared to an RDBMS, a collection in MongoDB corresponds to a table, and a document corresponds to a row; MongoDB stores documents instead of rows. A sample document is sketched below.
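To make that mapping concrete, a single document in a collection could look roughly like this (a purely hypothetical record; the field names are only for illustration and are not from the real project):

    {
        "_id" : ObjectId("..."),   // generated automatically by MongoDB
        "name" : "MacBook Pro",    // hypothetical example fields
        "price" : "..."
    }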

Write code for the pipeline

We rewrite the pipeline using pymongo (a library that lets us interact with MongoDB from Python).

The MongoDBPipeline class has an __init__ constructor that creates a MongoClient object connecting to “localhost” on port 27017; we name the database “Mac” and the collection “Item”.
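A minimal sketch of such a pipeline is shown below. The connection details (“localhost”, port 27017, database “Mac”, collection “Item”) come from the description above; the rest is a plain pymongo pipeline and is not necessarily identical to the original code:

    import pymongo


    class MongoDBPipeline(object):
        def __init__(self):
            # Connect to the local MongoDB server on the default port
            client = pymongo.MongoClient("localhost", 27017)
            # Database "Mac" and collection "Item", as named above
            db = client["Mac"]
            self.collection = db["Item"]

        def process_item(self, item, spider):
            # Save every scraped item as a document in the collection
            self.collection.insert_one(dict(item))
            return item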

After adding the code above, change the pipeline registered in the settings file to the new class:
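Assuming the Scrapy project (and therefore its pipelines module) is called, say, mac_crawler — replace this with the real project name — the ITEM_PIPELINES setting would look roughly like this:

    ITEM_PIPELINES = {
        "mac_crawler.pipelines.MongoDBPipeline": 300,
    }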

Rerun the project, and then check in MongoDB. We run the command:
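Assuming the legacy mongo shell is installed (newer MongoDB versions ship mongosh instead), this simply opens the shell:

    mongo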

and then:
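Presumably the standard command for listing databases:

    // the Mac database should appear in the output
    show dbs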

We can see that the Mac database has been created.

Okay, let’s query. The first is:
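Presumably the command to switch to the Mac database:

    use Mac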

And after that:
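Presumably a find on the Item collection, for example:

    db.Item.find().pretty()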

We can see that the data has been saved.

Conclusion

In this article I introduced how to import data into MongoDB from Scrapy. In the next article I will introduce Scrapy Cluster. Thank you for your interest.


Source: Viblo