Problems related to image data in Computer Vision

Tram Ho

As everyone knows, data is extremely important for machine learning, so today I will share some notes on problems related to image data. The article includes the following sections:

  • Image storage
  • Data annotation tools

Image storage

When we talk about deep learning, usually the first thing that comes to mind is a huge amount of data or a large number of images. However, the more images you have, the more storage space they take up on your computer. ImageNet is a well-known image database gathered to train models for tasks like classification, detection, and segmentation, and it includes over 14 million images.

Here I will share with you three ways to store images.

Figure 1: Image storage

Save as .png files on disk

To store images on disk this way, you should install Pillow, which makes the work simpler and more efficient.

How to save

When processing data stored on disk, we should store the labels separately in a .csv file. That way we can read the labels, or just a few pictures, without having to open every image file each time.
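
Here is a minimal sketch of this approach, assuming the images are already NumPy uint8 arrays. The function names, the numbered file names, and the labels.csv file name are my own choices for illustration, not something fixed by Pillow.

```python
import csv
from pathlib import Path

import numpy as np
from PIL import Image


def store_many_disk(images, labels, out_dir):
    """Save each image as <index>.png and all labels in a single labels.csv."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(out_dir / "labels.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for i, (image, label) in enumerate(zip(images, labels)):
            file_name = f"{i}.png"
            Image.fromarray(image).save(out_dir / file_name)  # PNG is lossless
            writer.writerow([file_name, label])


def read_labels(out_dir):
    """Read only the label file; no image is opened here."""
    with open(Path(out_dir) / "labels.csv", newline="") as f:
        return {file_name: int(label) for file_name, label in csv.reader(f)}


def read_image(file_name, out_dir):
    """Load a single picture back into a NumPy array."""
    with Image.open(Path(out_dir) / file_name) as img:
        return np.array(img)
```

With this layout you can scan labels.csv to pick the pictures you need and then call read_image only for those.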

Store in a Lightning Memory-Mapped Database (LMDB)

LMDB is a key-value storage system in which each item is stored as a byte array. The key is a unique identifier for each image, and the value is the image itself. LMDB is memory-mapped: it returns a direct pointer to the memory address of both the key and the value, without copying anything in memory as most other databases do. Let’s install LMDB and try it.

Here we will try it with CIFAR files downloaded for offline use.


Save as HDF5

With HDF5 you can store more than one dataset in a single file, so you can split the data and store the parts separately. Install it with pip first:

Create the HDF5 file:

How to load and save
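
As a small sketch, assuming the h5py package (pip install h5py) is the HDF5 binding being used, the code below writes two datasets into one file and reads them back. The dataset names "images" and "labels" and the function names are my own choices.

```python
import h5py
import numpy as np


def store_many_hdf5(images, labels, hdf5_path):
    """Create an HDF5 file holding two datasets: the images and their labels."""
    with h5py.File(hdf5_path, "w") as f:
        f.create_dataset("images", data=np.asarray(images, dtype=np.uint8))
        f.create_dataset("labels", data=np.asarray(labels, dtype=np.int64))


def read_many_hdf5(hdf5_path):
    """Load both datasets back into NumPy arrays."""
    with h5py.File(hdf5_path, "r") as f:
        images = f["images"][:]  # [:] copies the data out of the file
        labels = f["labels"][:]
    return images, labels
```

Because each dataset has its own name, you could also keep, for example, training and test splits side by side in the same file.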

Above I have outlined three ways to store image data; next we move on to the second section.

Data annotation tools

In machine learning problems, data processing and analysis are extremely important, so I will introduce some data annotation tools that make preparing data simpler.


This tool (referred to later in this article as Pixeltool) is suitable for segmentation problems such as finding cars, roads, or cells in medical images to support diagnosis.

Figure 2: Two examples of segmented images (source: internet).

This tool uses OpenCV's marker-based watershed algorithm. You can follow the binary link to download the tool and use it.

Figure 3: Tool interface

Usage: You can change the colors in the config file in the source code, keeping as many colors as there are distinct regions you want to segment. Then just use your mouse to “dot” each color onto the image and press the “Enter” key for each of the color zones you want.

Data generation tool

Text Recognition Data Generator is a tool used to generate synthetic text images.

With this tool you can generate text in different styles and colors for your text detection problem. You just need to save the cn.txt file in the dicts folder, save the fonts in the fonts folder as well, and then run the tool's generation script.

To generate data that matches the requirements of your problem, you should study the documentation thoroughly.
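
To illustrate the idea behind generated text data, here is a tiny sketch that renders words into images with Pillow and keeps each word as the label. It is not Text Recognition Data Generator's API; the function names, image size, and colors are my own assumptions, and the real tool offers far more options (fonts, distortion, backgrounds).

```python
from PIL import Image, ImageDraw, ImageFont


def render_text_sample(text, size=(200, 32), text_color=(0, 0, 0),
                       background=(255, 255, 255)):
    """Draw `text` on a plain background and return (image, label)."""
    image = Image.new("RGB", size, background)
    draw = ImageDraw.Draw(image)
    draw.text((4, 8), text, fill=text_color, font=ImageFont.load_default())
    return image, text


def generate_samples(words, colors):
    """Render every word, cycling through the given text colors."""
    return [
        render_text_sample(word, text_color=colors[i % len(colors)])
        for i, word in enumerate(words)
    ]


samples = generate_samples(["hello", "world"], [(0, 0, 0), (200, 0, 0)])
```

Each sample pairs an image with the exact string drawn on it, which is the kind of labeled data a text model is trained on.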

LabelImg tool

LabelImg is also a data annotation tool, but unlike Pixeltool, LabelImg is used to draw bounding boxes defined by their four corners. To install it, you can either clone the GitHub repository or use pip:

pip3 install pyqt5 lxml # Install qt and lxml by pip
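
LabelImg can save its annotations as XML files in the PASCAL VOC format. As a sketch of how to read the boxes back, the code below parses a hand-written sample with Python's standard library; the sample content and the function name are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written example in PASCAL VOC style (not output from a real session).
SAMPLE_XML = """<annotation>
  <filename>street.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_boxes(xml_text):
    """Return a list of (name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        name = obj.findtext("name")
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((name, *coords))
    return boxes
```

The four numbers are the left, top, right, and bottom edges of each box in pixels.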

My article is somewhat dull; I hope everyone will read it and leave comments so that I can write better in future articles. Thanks!



Source : Viblo