[Database] Lesson 2 – Data & Metadata
- Tram Ho
In this article, we will meet the first two conventional concepts in storing data in a database: Data & Data Description, also known as Data & Metadata.

## The prefix meta
The word `data` we already know, so there is nothing to worry about with this first building block. As for the prefix `meta`, Google and Wikipedia define it roughly as a more concise description of the very thing it stands in front of. Its meaning is therefore relative to the observer's position, and here we have `metadata`, used to briefly describe `data`.

But how should we describe it? Our view now is that the user of the data is a computer. So we can understand `metadata` as a brief description of `data` that lets the `computer` understand some part of that `data`.

But what do computers understand? Hmm… we're building a simple blog, not a smart AI assistant. So obviously the `database` management code we write won't understand the context of the stories inside a file. When it comes to how the stored data is organized, however, there is quite a lot of information that we can give the `database` management code about the stored files.

For example… what kind of data is it? Or more precisely, what does the file contain? Is it an `article`, or a record about the `admin` who manages the blog? Or another question… was this `article` the first one created, or the 1001st? In other words, what is its `id` number? And if your blog groups articles into categories, we have one more question… which category does this `article` fall into?

And that's it, we've got the first few `metadata` attributes to describe the stored article files, so that the computer can tell at a glance which file is the one we want to retrieve information from. Now we will create a `database`
directory in the `express-blog` project under construction, and in it write code that provides basic data access features for the request-handling code at the `routes` to use.

## Data storage directory structure

Here we are concerned only with the `database` directory. The code that provides the data access operations is the file `manager.js`, located at the first level of the `database` directory. We will create 2 `article` files to store the content of 2 articles, and 1 `admin` file to store the content of 1 author. Each `article` or `admin` file that stores the complete content of one data object (an article or an author) like this is also known as a `record`.

structure.txt

```
[express-blog]
|
+---[database]
|   |
|   +---[data]
|   |   |
|   |   +---article--id-0000--category-html.md
|   |   +---article--id-0001--category-html.md
|   |   +---user--id-00.json
|   |
|   +---manager.js
|
+---test.js
```
These are the file names we start with, as a basis for a more detailed discussion:

- With this naming, the files that store articles are distinguished from author files by the first keyword in the file name, `article` or `admin`.
- Each article file then has an identifier `id` that distinguishes it from the others and also reflects the order in which the files were created.
- Finally, the category under which we group files with related content is represented by the third keyword in the file name.

At this point we can assume there is a complete web interface and the user clicks on a link requesting an article at `https://your-name.com/article/0001`. We should certainly be able to extract the `id` from a `path` that matches `/article/:id` and call the code in `manager.js` to access the file `article--id-0001`, then send the content back to the web browser. However, let's start by retrieving the names of all `article` records and printing them to the `console`.

database/manager.js

```javascript
const path = require("path");
const fsPromises = require("fs/promises");

const dataFolder = path.join(__dirname, "data");

// Print the name of every entry found in the data folder.
const queryAllArticles = function() {
  fsPromises
    .opendir(dataFolder)
    .then(async function(fsDir) {
      for await (const dirEnt of fsDir)
        console.log(dirEnt.name);
    })
    .catch(function(error) {
      console.error(error);
    });
}; // queryAllArticles

module.exports = { queryAllArticles };
```
- [`fsPromises.opendir`](https://nodejs.org/dist/latest-v16.x/docs/api/fs.html#fspromisesopendirpath-options) – opens a directory.
- [`fs.Dir`](https://nodejs.org/dist/latest-v16.x/docs/api/fs.html#class-fsdir) – an `object` that describes the directory.
- [`fs.Dirent`](https://nodejs.org/dist/latest-v16.x/docs/api/fs.html#class-fsdirent) – an `object` that describes an entry inside a directory.

test.js

```javascript
const database = require("./database/manager");

database.queryAllArticles();
```
CMD | Terminal

```
cd Documents/express-blog
npm test

article--id-0000--category-html.md
article--id-0001--category-html.md
user--id-00.json
```
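The names printed above already carry the metadata discussed earlier as `--`-separated segments. As a rough sketch only, a hypothetical helper (`parseRecordName` is an illustrative name, not part of the article's `manager.js`) could turn each name back into a small metadata object, assuming every segment after the first follows the `key-value` shape:

```javascript
const path = require("path");

// Hypothetical helper: split a record file name such as
// "article--id-0001--category-html.md" into its metadata parts.
const parseRecordName = function(fileName) {
  // Drop the extension (".md", ".json"), then split on the "--" separator.
  const baseName = path.basename(fileName, path.extname(fileName));
  const [type, ...pairs] = baseName.split("--");

  const metadata = { type };
  for (const pair of pairs) {
    // Each remaining segment is assumed to look like "key-value", e.g. "id-0001".
    const dashIndex = pair.indexOf("-");
    if (dashIndex === -1) continue; // skip segments that do not follow the convention
    metadata[pair.slice(0, dashIndex)] = pair.slice(dashIndex + 1);
  }
  return metadata;
};

console.log(parseRecordName("article--id-0001--category-html.md"));
// { type: 'article', id: '0001', category: 'html' }
```

The computer still does not understand the article itself; it only reads the description we agreed to put in the name.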
Hmm… So here we would have to add filtering to separate the `article` files into their own group before printing the list. Clearly, the `convention` of our directory structure can be improved at this point to reduce the work the code has to do on the result set. We will now put the `article` files in one subdirectory of `data` and the `admin` files in another.

structure.txt

```
[database]
|
+---[data]
|   |
|   +---[article]
|   |   |
|   |   +---id-0000--category-html.md
|   |   +---id-0001--category-html.md
|   |
|   +---[admin]
|       |
|       +---id-00.json
|
+---manager.js
```
Great… now everything is better separated, and the code can work with the `article` or `admin` records on their own without unnecessary extra processing. Since the data type name is now the directory name that classifies the records, we can omit the first element from the data file names and let the file names start with the `id`.

database/manager.js

```javascript
const path = require("path");
const fsPromises = require("fs/promises");

const dataFolder = path.join(__dirname, "data");
const articleFolder = path.join(dataFolder, "article");

// Print the name of every entry found in the article folder.
const queryAllArticles = function() {
  fsPromises
    .opendir(articleFolder)
    .then(async function(fsDir) {
      for await (const dirEnt of fsDir)
        console.log(dirEnt.name);
    })
    .catch(function(error) {
      console.error(error);
    });
}; // queryAllArticles

module.exports = { queryAllArticles };
```
```
npm test

id-0000--category-html.md
id-0001--category-html.md
```
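Every name in this listing begins with the record's `id`, so choosing a record comes down to matching a name prefix. A minimal sketch, assuming the names have already been collected into an array (`findRecordNameById` is a hypothetical helper, not code from the article):

```javascript
// Hypothetical helper: pick the record whose name starts with the requested id.
// `names` is a list of entry names such as those printed above.
const findRecordNameById = function(names, id) {
  const prefix = "id-" + id + "--";
  const match = names.find(function(name) {
    return name.startsWith(prefix);
  });
  return match === undefined ? null : match;
};

const names = [
  "id-0000--category-html.md",
  "id-0001--category-html.md"
];

console.log(findRecordNameById(names, "0001")); // id-0001--category-html.md
console.log(findRecordNameById(names, "0042")); // null
```

Including the trailing `--` in the prefix keeps an id such as `000` from accidentally matching `id-0001`.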
So at this point we can proceed by checking the requested `id` against the list of file names, to select the correct file to read data from and send a response to the user. However, we now need to pay more attention to what is displayed. In particular, a single web page displaying an `article` might need a little more information about the article. Examples include related `keywords`, the last modified time `edited-datetime`, and a descriptive name shorter than the original title, `short-title`, to use in the navigation bar beside the main content frame of the web page.

## Adding metadata for users

All of the extra items we just talked about, `keywords`, `short-title`, and `edited-datetime`, are `metadata` from the user's point of view. For our `manager` code, however, they are all just `data`. To store this additional `data` in a `record`, we can adopt a convention about the internal content of the data files. For example, we can save a JSON string at the beginning of each `article` file to describe the additional data above, and then split this JSON string off when reading the content of the file. This way of storing additional data is called prefix data storage. A good example is [**Jekyll by GitHub**](https://jekyllrb.com/docs/front-matter/), which calls this prefix data `Front Matter` and stores it in YAML, a format that describes data as an object with `key/value` pairs, similar to JSON.
```md
---
title: How to create a website?
short: Getting Started
datetime: 2017-07-27 05:00:00
---

Forget about technical and academic views.
We go online everyday. We can start at homepage of [a website](https://medium.com/ "ext") and explore thousands of its pages. That's it.

> A website is a collection of many webpages.
> __A simple & happy Mind

So if you want to create a website, just start it simply by learning how to create a single webpage.
```
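A file in this shape can be split by looking for the two `---` delimiter lines. The following is only a sketch under stated assumptions: `splitFrontMatter` is a hypothetical helper, it handles flat `key: value` lines only, and it is not a YAML parser (a real project would more likely use a YAML library).

```javascript
// Hypothetical helper: separate Front Matter from the article body.
// Assumes the file starts with a "---" line and that the Front Matter
// contains only flat "key: value" lines (this is not a YAML parser).
const splitFrontMatter = function(text) {
  const lines = text.split(/\r?\n/);
  if (lines[0].trim() !== "---")
    return { header: {}, content: text };

  // Find the closing "---" line after the opening one.
  const end = lines.findIndex(function(line, index) {
    return index > 0 && line.trim() === "---";
  });
  if (end === -1)
    return { header: {}, content: text };

  const header = {};
  for (const line of lines.slice(1, end)) {
    // Split on the first colon only, so values like "05:00:00" stay intact.
    const colon = line.indexOf(":");
    if (colon === -1) continue;
    header[line.slice(0, colon).trim()] = line.slice(colon + 1).trim();
  }

  const content = lines.slice(end + 1).join("\n").trim();
  return { header, content };
};

const sample = [
  "---",
  "title: How to create a website?",
  "short: Getting Started",
  "datetime: 2017-07-27 05:00:00",
  "---",
  "",
  "Forget about technical and academic views."
].join("\n");

const result = splitFrontMatter(sample);
console.log(result.header.short); // Getting Started
console.log(result.content);      // Forget about technical and academic views.
```

Note that the whole file has to be in memory before the header can be separated from the body.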
To extract the prefix data, we can follow Jekyll's `Front Matter` and use `---` separator lines to delimit the additional data from the main body of the article. However, this is a bit inconvenient when we only want to access the `Front Matter` for a quick browse through the records: the processing code has to read the entire content of a long article file and then split off the `Front Matter`. Here we can continue to improve the `convention` for the data directory structure as follows:

structure.txt

```
[database]
|
+---[data]
|   |
|   +---[article]
|   |   |
|   |   +---[id-0000--category-html]
|   |   |   |
|   |   |   +---header.json
|   |   |   +---content.md
|   |   |
|   |   +---[id-0001--category-html]
|   |       |
|   |       +---header.json
|   |       +---content.md
|   |
|   +---[user]
|       |
|       +---id-00.json
|
+---manager.js
```
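With one directory per record, reading a record could be sketched roughly as below. Both helpers are hypothetical names rather than code from the article, and the sketch assumes that `header.json` holds valid JSON and that both files exist in the record directory.

```javascript
const path = require("path");
const fsPromises = require("fs/promises");

// Hypothetical helper: read one article record from its directory.
// `recordFolder` is the path to a directory such as
// database/data/article/id-0001--category-html.
const readArticleRecord = async function(recordFolder) {
  // Read both files in parallel; each call resolves to a string.
  const [headerText, content] = await Promise.all([
    fsPromises.readFile(path.join(recordFolder, "header.json"), "utf8"),
    fsPromises.readFile(path.join(recordFolder, "content.md"), "utf8")
  ]);
  return { header: JSON.parse(headerText), content };
};

// Hypothetical helper for quick browsing: read only the small header file.
const readArticleHeader = async function(recordFolder) {
  const headerText = await fsPromises.readFile(
    path.join(recordFolder, "header.json"), "utf8"
  );
  return JSON.parse(headerText);
};
```

Both functions return a promise, so a caller would use `await` or `.then(...)`; a failed read or invalid JSON surfaces as a rejected promise that the caller has to handle.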
Now each `article` record is represented by a directory named like `id-0000--category-html`. The main content of the article is still kept in a `.md` file, `content.md`, while the additional data `keywords`, `short-title`, and `edited-datetime` is stored in a separate file, `header.json`. With this directory organization, querying for record names by `id` is still performed as before, without reading the detailed data files. And once we have identified the record to send to the user, the code to read that record's `header.json` and `content.md` files is very simple to write.

## End of the article

Thus, we have gone through some basic analysis of designing a directory structure to store data, and have a rough understanding of the relative concepts `Data` & `Metadata`. At present, we can store `article` records and retrieve one when the user clicks on a link, by extracting the `id` from the URL path parameter. And that's it, we can basically write the code that completes a simple blog.

However, with the above `database` design we still run into some limitations when expanding the blog's features a little. For example, suppose the blog's user interface gains single pages that each describe a category of articles. Each category is now a `category` record and should be stored in the `database` just like `article` and `admin`. We can also see a slight correlation between the `article` records and the `category` records: if we edit a `category` record to change the display name of a category, then to keep the data consistent our code also has to update the directory names of the 1001 `article` records in that category. Or… we have a new topic to explore. 😀

[**[Database] Lesson 3 – Relational Database**](#)
Source : Viblo