Database optimization and query tree refinement in MySQL

Tram Ho

When properly adjusted, the database will deliver excellent performance. It not only reduces unnecessary workload, but also optimizes the database for faster access, avoids unexpected problems such as deadlock, resource shortages, and more. serious consequences.

Optimize the query

1. Avoid using functions, starting with a wildcard in predicate


When using UPPER (), the database will not be able to use the COL1 column index, resulting in slower execution. If there is no other way and a function is required in a comparison, either create a function-based index or create a custom column in the database (perhaps a new column) to improve performance. export.


Using a wildcard (%) such as ‘% ABC’ causes a full table scan. In most cases using wildcards causes a performance limitation.

2. Avoid querying unnecessary columns in the SELECT statement

Instead of using ‘SELECT *’ to get all the row data, just get the columns you use and find necessary in the SELECT statement to improve MySQL performance. Because unnecessary columns will be added when loading the data, this leads to slowdowns and reduced performance.

3. Use Inner join instead of outer join if possible

Use outer join only when it is the only option. It not only limits the performance of the database, but also restricts optimal options in the MySQL query.

4. Use ORDER BY, DISTINCT and UNION only when absolutely necessary

  • Using without any purpose slows down the query because of having to sort the data. Instead of using UNION if possible, using UNION ALL (combining results without deleting duplicate columns) will yield better performance.

5. Subqueries and Join

In theory, all join statements can be converted to subqueris, but you should also consider using subqueries or joining in each case.

  • Understand the simple way that when we execute the JOIN statement, we will create a temporary table that is the pooled data of the tables satisfying the ON clause and SELECT data in that clipboard.
  • Subqueries are the execution of the IN statement in the data of each table. So for data <20K records, the job of JOIN is more effective, when the data is> 100k +, the IN statement gives better results.

Database optimization

6. Understand the four basic resources

It takes 4 sources, namely CPU, disk, memory, and network to create the function of the database. If any of the resources are not working properly, it will affect the database server and lead to poor performance. In most cases, upgrading hardware can improve system performance.

For example: Select ENGINE suitable for the purpose of use

MYSQL supports many types of ENGINE. We can use below commands to display information of each ENGINE type

The ENGINE types are widely used:

  • InnoDB:Transaction support (ACID compliant) commit, rollback.

    Supports row-level locking (row-level locking), increasing performance and multi-user

    Support for full foreign key constraints missing FOREIGN KEY. InnoDB is the default storage type since MySQL 5.5.5

  • MyISAM:Popularly used for the Web, data warehousing, was the default type of storage in earlier versions MySQL 5.5.5
  • Memory:Save all data in RAM, for super-fast access (Prefixed to HEAP).

Example: Select the data type for the field, the size of the cache, … calculate, because when we allocate how much memory, the database will use as much resources that we give even if it is not. really need that much. When we set the type for the name field, it is usually only VARCHAR (255), …

For example: For numeric fields such as ID, we usually set the type to INT ( id INT AUTO_INCREMENT PRIMARY KEY ) and consider it as obvious, but if you think about it a bit, you can realize that the field Our ID is never such a negative number if we leave it to INT (4 bytes) in the range -2147483648 to 2147483647, of course, there will be a waste of negative storage to take advantage of, then we can use it ( id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY ) with INT UNSIGNED between 0 and 4294967295.

7. Use caching

When receiving a query from the user, SQL checks in the cache for the results of the query or not, if so, it will return to the user without re-executing the statement, if not in the In the cache, MySQL will execute the query into the database, then store the results in the cache and return the results to the user.

  • Use the EXPLAIN statement to get information about MySQL’s execution of a query for the best performance strategy.

  • Save your queries as well as their results. Query cache is suitable for tables that don’t change very often. If a table changes then all cached queries that use the table are no longer usable and are removed from the cache.

Although the values ​​are likely to be retrieved from memory (from the InnoDB buffer pool or the MyISAM key cache), using the query cache avoids query processing overload, deciding whether to use table scans, and determining whether blocks of data per row.

Displays the values ​​of the cache

Check the cache values ​​to adjust accordingly for resources.

  • query_cache_type: Determines the operation mode of the query cache
    • 0 or OFF: Does not store results or retrieve stored results
    • 1 or ON: Allow storage except for statements that start with SELECT SQL_NO_CACHE.
    • 2 or DEMAND: Store only the results that started SELECT SQL_CACHE.
  • query_cache_limit: Determines the maximum size for a result that can be stored in the cache
  • query_cache_size: Determine the amount of memory to distribute to the Query cache. The default value of the variable is 0, which means query caching is not enabled

Example: Reset value for variable

8. View running stream information

You can see MySQL servert’s work in progress to determine the bottlenecks. Check the list of threads executing in the server:

To cancel an inefficient process we can use the command

9. Transactions

  • Using short, transactional transactions increases performance, reduces deadlock risk, and delivers accuracy. Performance with precision, however, doesn’t go the same way.
  • You can consider using READ UNICOMMITED if you don’t need too precise data (the higher the isolation level between transactions, the lower the performance will be reduced) to increase the performance of the database query.
  • With the MyISAM table that does not support transactions, we use the table lock mechanismLOCK TABLEs <table name> READ/WRITE
    • READ: Other connections have read but cannot write to the table
    • WRITE: The current connection can write + read the table, other connections cannot access the table

10. Table optimization

The OPTIMIZE TABLE statement performs a number of functions, such as defragmenting, and sorting the indexes of the tables. OPTIMIZE TABLE does the following:

  • Create a clipboard
  • Remove the original after optimizing it (Minimize data pages, Narrow index pages, Calculate new index statistics)
  • Finally, rename the clipboard to its original name.

11. Save queries slowly

For queries that have execution time but we wait for it to finish before executing other tasks, it is indeed ‘broken’ the best way is to save these queries so we can continue. other actions.

For example, for statistical tasks that usually take a lot of time, the statistical query can be saved with a processing message and the user can perform other tasks when the query is complete. then notify the user)

Verify that the values ​​are properly configured.

  • long_query_time = [value]:This value specifies that if a query takes longer to execute than we specified in this configuration, MySQL will log the information related to that query. Here I leave it the number “1”, so it’s almost like recording all slow queries that take more than 1 second to execute. The default value is “10” ie 10 seconds.
  • slow_query_log = [0/1]If the value is “0” then the feature of log slow query is disabled, while the “1” feature enables this function. By default, MySQL disables the feature.

12. Indexing

Should be indexed when:

  • Columns that appear much in the WHERE / JOIN clause
  • MySQL automatically indexes primary and foreign keys

Should not

  • Do not type for columns that only appear in select, do not create index dumps
  • Columns with a small range of values ​​(MySQL evaluates if a value occurs> = 30% in the table will not automatically be indexed)

Index type

  • Hash index: Very fast compared to exact comparison, but slow to compare values ​​in range of values.
  • B-tree index: used effectively with both exact comparison, approx. Can be used for LIKE search (if it doesn’t start with wildcard)
Share the news now

Source : Viblo