[Database] Lesson 2 – Data & Metadata

Tram Ho

In this article we meet the first two conventional concepts in storing data in a database: data and the description of data, or Data & Metadata.

## Prefix meta

The word `data` we already know, so there is nothing to worry about with this first brick. As for the prefix `meta`, Google and Wikipedia define it, roughly, as a more concise description of the very thing it stands in front of. Its meaning therefore depends on the position of the observer, and here `metadata` is used to describe `data` briefly.

But how to describe? From our point of view the user of the data is a 'computer', so we can understand `metadata` as a brief description that lets the computer understand some part of the data.

But what can a computer understand? Hmm… we are building a simple blog, not a smart AI assistant. So obviously the database management code we write will not understand the context of the stories inside a file. About the way the storage is organized, however, there is quite a lot of information that we can give the database management code about the stored files:

- What kind of data is it? Or, more precisely, what does the file contain: an article, or information about an admin who manages the blog?
- Was this article the first one created, or the 1001st? In other words, what is its `id`?
- If the blog arranges articles into categories, which category does this article fall into?

And that's it, we have the first few `metadata` attributes describing the stored article files, enough for the computer to know 'at a glance' which file is the one we want to retrieve information from.

Now we will create a `database` directory in the `express-blog` project under construction and write code that provides basic data access features for the code handling the routes to use.
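Before moving on to the directory, the attributes discussed above can be pictured as a small description object that code can check without opening any content. This is only a minimal sketch: the field names `type`, `id` and `category` and the helper `matchesQuery` are assumptions made for illustration, not part of the blog's code.

```javascript
// Hypothetical metadata describing three stored records.
// The field names are illustrative assumptions.
const records = [
   { type: "article", id: "0000", category: "html" },
   { type: "article", id: "0001", category: "html" },
   { type: "admin", id: "00" }
];

// Returns true when every field named in `query` has the same value in `meta`.
const matchesQuery = function(meta, query) {
   return Object.keys(query).every(function(key) {
      return meta[key] === query[key];
   });
};

// Select the record we are looking for using only its description.
const found = records.filter(function(meta) {
   return matchesQuery(meta, { type: "article", id: "0001" });
});

console.log(found);
```

The point of the sketch is that the selection touches only the descriptions; the content of the articles is never read.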
## Data storage directory structure

Here we are concerned only with the `database` directory; the code that provides the data access operations is a file `manager.js` located at the first level of the `database` directory. We will create 2 `article` files to store the content of 2 articles and 1 `admin` file to store the content of 1 author. Each `article` or `admin` file that stores the complete content of a data object (an article or an author) like this is also known as a `record`.

structure.txt

```
[express-blog]
  |
  +------------[database]
  |               |
  |               +---------[data]
  |               |            |
  |               |            +-----article--id-0000--category-html.md
  |               |            +-----article--id-0001--category-html.md
  |               |            +-----admin--id-00.json
  |               |
  |               +---------manager.js
  |
  +---test.js
```

This is the naming structure of the files that we start with, to open a more detailed discussion:

- With this naming, the files that store article content are distinguished from the author files by the first keyword in the file names: `article` and `admin`.
- Each `article` file then has an identifier `id` that distinguishes it from the others and also records the order in which the files were created.
- Finally, the category under which we group files with related content is represented by the third keyword in the file name.

At this point we can assume there is a complete web interface and the user clicks a link requesting to see an article at `https://your-name.com/article/0001`. We should certainly be able to extract the `id` from the requested path `/article/:id` and call the code in `manager.js` to access the file `article--id-0001`, then send the content back to the web browser.

However, let's start with retrieving the names of all `article` records and printing them to the console.
database/manager.js

```javascript
const path = require("path");
const fsPromises = require("fs/promises");

const dataFolder = path.join(__dirname, "data");

// Print the name of every entry found in the data folder.
const queryAllArticles = function() {
   fsPromises
      .opendir(dataFolder)
      .then(async function(fsDir) {
         for await (const dirEnt of fsDir)
            console.log(dirEnt.name);
      })
      .catch(function(error) {
         console.error(error);
      });
}; // queryAllArticles

module.exports = { queryAllArticles };
```

- [fsPromises.opendir](https://nodejs.org/dist/latest-v16.x/docs/api/fs.html#fspromisesopendirpath-options): opens a directory.
- [fs.Dir](https://nodejs.org/dist/latest-v16.x/docs/api/fs.html#class-fsdir): an object describing the directory.
- [fs.Dirent](https://nodejs.org/dist/latest-v16.x/docs/api/fs.html#class-fsdirent): an object describing an entry inside a directory.

test.js

```javascript
const database = require("./database/manager");
database.queryAllArticles();
```

CMD | Terminal

```
cd Documents/express-blog
npm test

article--id-0000--category-html.md
article--id-0001--category-html.md
admin--id-00.json
```

Hmm… So here we would have to add filtering to separate the `article` files from the others before we can print the list of articles. Clearly the convention of our directory structure can be improved at this point, to reduce the work the code has to do on the result set. We will now put the `article` files in one subdirectory of `data` and the `admin` files in another.

structure.txt

```
[database]
  |
  +---------[data]
  |            |
  |            +-----[article]
  |            |        |
  |            |        +--------id-0000--category-html.md
  |            |        +--------id-0001--category-html.md
  |            |
  |            +-----[admin]
  |                     |
  |                     +---id-00.json
  |
  +---------manager.js
```

Great… now everything seems better separated, and we can work with the `article` or `admin` records in the code without unnecessary extra processing. Since we use the name of the data type as the directory name to classify the records, we can omit the first element from the names of the data files and use file names that begin with the `id`.
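With the type now given by the directory, a file name carries only the `id` and the category. A rough sketch of reading those two attributes back from a name, and of picking a file by `id`, could look like the following. It assumes the `id-<digits>--category-<name>.<extension>` pattern seen in the listing above; `parseRecordName` and `findById` are hypothetical helpers, not functions from `manager.js`.

```javascript
// Hypothetical helper: turn a record file name such as
// "id-0001--category-html.md" into its metadata fields.
// Returns null when the name does not follow the naming convention.
const parseRecordName = function(fileName) {
   const match = /^id-(\d+)--category-([a-z0-9-]+)\.(\w+)$/.exec(fileName);
   if (match === null) return null;
   return { id: match[1], category: match[2], extension: match[3] };
};

// Pick the file name whose id equals the one requested in /article/:id.
// Returns undefined when no file name carries that id.
const findById = function(fileNames, id) {
   return fileNames.find(function(fileName) {
      const meta = parseRecordName(fileName);
      return meta !== null && meta.id === id;
   });
};

const names = ["id-0000--category-html.md", "id-0001--category-html.md"];
console.log(parseRecordName(names[1]));
console.log(findById(names, "0001"));
```

The `id` is kept as a string on purpose, so that the leading zeros in `0001` survive the comparison with the value taken from the URL.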
database/manager.js

```javascript
const path = require("path");
const fsPromises = require("fs/promises");

const dataFolder = path.join(__dirname, "data");
const articleFolder = path.join(dataFolder, "article");

// Print the name of every entry found in the article folder.
const queryAllArticles = function() {
   fsPromises
      .opendir(articleFolder)
      .then(async function(fsDir) {
         for await (const dirEnt of fsDir)
            console.log(dirEnt.name);
      })
      .catch(function(error) {
         console.error(error);
      });
}; // queryAllArticles

module.exports = { queryAllArticles };
```

CMD | Terminal

```
npm test

id-0000--category-html.md
id-0001--category-html.md
```

So at this point we can proceed by checking the requested `id` against the list of printed file names, to select the correct file to read data from and send a response to the user. However, we now need to pay more attention to what is displayed: a single web page displaying an article may need a little more information about the articles. Examples include the related `keywords`, the last modified time `edited-datetime`, and a descriptive name shorter than the original title, `short-title`, to use for the navigation bar beside the main content frame of the web page.

## Adding metadata for users

All of the extra items we just talked about (`keywords`, `short-title`, `edited-datetime`) are `metadata` from the users' point of view. For our `manager` software, they are all `data`.

To store this additional `data` in a record, we can use a convention about the internal content of the data files. For example, we can save a JSON string at the beginning of each article file to describe the additional data, and then split this JSON string off when reading the content of the file. This way of storing additional data is called prefix data storage. A good example is [**Jekyll**](https://jekyllrb.com/docs/front-matter/), the site generator used by GitHub Pages, which calls this Front Matter and stores it in YAML, a format that describes data as an object with `key/value` pairs, similar to JSON.

```markdown
---
title: How to create a website?
short: Getting Started
datetime: 2017-07-27 05:00:00
---

Forget about technical and academic views. We go online everyday. We can start at homepage of [a website](https://medium.com/ "ext") and explore thousands of its pages. That's it.

> A website is a collection of many webpages.
> __A simple & happy Mind

So if you want to create a website, just start it simply by learning how to create a single webpage.
```

To extract the prefix data, we can follow Jekyll's Front Matter and use separator symbols to delimit the additional data from the main body of the article. However, it is a bit inconvenient when we only want the Front Matter for a quick browse of the records, but the processing code has to read the entire content of a long article file and then split off the Front Matter. Here we can continue to improve the convention about the `data` directory structure as follows:

structure.txt

```
[database]
  |
  +---------[data]
  |            |
  |            +-----[article]
  |            |        |
  |            |        +--------[id-0000--category-html]
  |            |        |           |
  |            |        |           +---header.json
  |            |        |           +---content.md
  |            |        |
  |            |        +--------[id-0001--category-html]
  |            |                    |
  |            |                    +---header.json
  |            |                    +---content.md
  |            |
  |            +-----[admin]
  |                     |
  |                     +---id-00.json
  |
  +---------manager.js
```

Now each `article` record is represented by a directory with a name like `id-0000--category-html`. The main content of the article is still placed in a `.md` file, `content.md`, and the additional data `keywords`, `short-title`, `edited-datetime` is stored in a separate file, `header.json`. With this directory organization, querying for record names by `id` still works as before, without reading the detailed data files. And once we have identified the record to send to the user, we can write code to read the `header.json` and `content.md` files of that record very simply.

## End of the article

Thus, we have gone through some basic analysis in designing a directory structure to store data, and we roughly understand the relative concepts of Data & Metadata.
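As a concrete look back at the final layout, reading one record by `id` (its `header.json` and `content.md`) might be sketched like this. It is an illustration only: `readArticle` and its `articleFolder` parameter are hypothetical names, the sketch uses `fsPromises.readdir` rather than `opendir` so that the entries arrive as an array, and the fields inside `header.json` are whatever was stored there.

```javascript
const path = require("path");
const fsPromises = require("fs/promises");

// Hypothetical helper, not part of the article's manager.js.
// Finds the record directory whose name starts with "id-<id>--" inside
// articleFolder, then reads its header.json and content.md.
const readArticle = async function(articleFolder, id) {
   // Only names are inspected here; no data file is opened yet.
   const entries = await fsPromises.readdir(articleFolder, { withFileTypes: true });
   const record = entries.find(function(dirEnt) {
      return dirEnt.isDirectory() && dirEnt.name.startsWith("id-" + id + "--");
   });

   if (record === undefined)
      throw new Error("No article record with id " + id);

   // Read both files of the selected record in parallel.
   const recordFolder = path.join(articleFolder, record.name);
   const [headerText, content] = await Promise.all([
      fsPromises.readFile(path.join(recordFolder, "header.json"), "utf8"),
      fsPromises.readFile(path.join(recordFolder, "content.md"), "utf8")
   ]);

   return { name: record.name, header: JSON.parse(headerText), content: content };
};
```

A call such as `readArticle(path.join(__dirname, "data", "article"), "0001")` would resolve to an object holding the parsed header and the Markdown text, and reject when no directory matches the `id`.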
At present, we can store `article` records and retrieve one when the user clicks on a link, by extracting the `id` from the URL path parameter. And that's it, we can basically write the code that completes a simple blog.

However, with the `database` design above we still face some limitations when extending the blog's features a little. For example, suppose we build a blog interface with additional single pages that each describe a category of articles; each category is now a `category` record and should be stored in the `database` just like `article` and `admin`. And we can see a slight correlation between the `article` records and the `category` records: if we edit a `category` record to change the display name of a category, then to keep the data consistent our code will also need to update the directory names of the 1001 `article` records in that category; or… we have a new topic to explore. 😀

[**[Database] Lesson 3 – Relational Database**](#)


Source : Viblo