Some SQL Interview Questions You Should Know!

Tram Ho

In data field interviews, technical questions are definitely indispensable, and a data person needs at least a solid understanding of SQL. Today, I want to share with you some of the questions that we might come across in those interviews. This is also a summary of a bit of SQL knowledge, sometimes we can forget, so read it again to remember.

1. What is JOIN? List JOIN types.

JOIN is an important knowledge in SQL, it will be used a lot for the purpose of getting information from many tables combined together to create a quantity report and we should understand it.

Definition : A JOIN in a database is a union of rows from two or more tables based on a related column between them (Join Conditional Column). It allows to query data from multiple tables in one statement by establishing relationships between them. Here are the different types of joins:

  1. INNER JOIN: Returns only matching rows between two tables based on the specified join condition. It removes rows that do not have a match in another table.
  2. LEFT JOIN: Returns all rows from the left table and matching rows from the right table based on the join condition. If there is no match, it returns NULL for the columns of the right table.
  3. RIGHT JOIN: Returns all rows from the right table and matching rows from the left table based on the join condition. If there is no match, it returns NULL for the columns of the left table.
  4. FULL OUTER JOIN: Returns all rows from both tables, including matching rows when there is and NULL when there is no match. It combines the results of both left and right outer joins.
  5. CROSS JOIN: Returns the Cartesian product of the two tables, which means it joins each row from the first table with each row from the second table. The result will contain all possible combinations of rows from both tables. An example: 1 record has 10 records and 1 record has 5 records, the Cross Join result will return 5×10 = 50 records. center
  6. SELF JOIN: Self join is a type of join in which a table is joined to itself using relational columns. In a self join, the table is treated as two different versions of itself and named mansion to distinguish between them. Self join is often used when you want to compare rows in the same table to find relationships or information related to the table itself.

2. What is Index? Explain the types of indexes.

In a database, an Index is a sub-data structure created to speed up querying and searching data in a table. It acts as an additional table, holds a small copy of the data from the important columns in the original table and arranges it in an optimal way for faster searching.

When a query is executed, the DBMS uses the index to efficiently find information. Instead of scanning the entire data in the table, the index helps to limit the number of records to be scanned by pinpointing the exact location of the data to be searched. Thereby optimizing the way to search data in the table in terms of time and cost.

Types of Indexes:

B-tree (Balanced Tree) is a popular and widely used type of index in databases. A B-tree is a balanced tree with the nodes containing the keys arranged in ascending or descending order. B-trees are designed to provide fast data search in O(log n) time, where n is the number of records in the database since the records are sorted according to the B algorithm. -tree. B-Tree is suitable for tables with a large amount of data and the index field has a unique value.

Bitmap index is a type of database index used to enhance query performance for fields with a small number of discrete values. It will be suitable for attribute columns of gender (male, female) or other columns that have discrete values ​​with small numbers. Bitmap index allows quick query using logical operators AND, OR, XOR on bitmaps.

Clustered Index is an index type that sorts the data in the table in ascending or descending order of the key. Each table can have only one clustered index. The data in the table is sorted in the same order as the key in the clustered index.

Non-Clustered Index is an index type that creates an additional data structure and holds a copy of the key values ​​along with a pointer to the original. Non-clustered index does not sort the data in the table but only creates a quick search structure => this is the difference between Clustered Index and Non-Clustered Index.

Unique Index Same as a non-clustered index, but ensures that the values ​​in the index are unique.

3. When do we need to index the table? And is there ever an indexing that doesn’t improve query performance?

When should we use the index to be effective, because if the index is not created properly, it will also be zero.

Purpose of creating index to improve efficiency in SELECT statement:

  • Frequent SELECT queries: If the SELECT statement is executed frequently on a particular field, creating an index on that field can improve query performance.
  • JOIN Query: When using JOIN statements to join two or more tables, creating indexes on the fields used in the JOIN condition will help speed up the query.
  • WHERE Query: If a WHERE conditional SELECT statement is based on a specific field, creating an index on that field makes searching and filtering data faster.

It should be noted that creating an index also has its downsides. Index takes up storage space and has an impact on the performance of data add, edit, and delete operations (INSERT, UPDATE, DELETE). When we need to perform add, edit and delete operations, we can disable the entire index for the table in the process for the fastest time, and then Rebuild again. Therefore, careful consideration and consideration is required when creating an index, ensuring that the benefits of query speed outweigh the limitations of using index.

If we’ve indexed and we don’t see an improvement in query performance, it could be some of the following problems:

  • Choosing the wrong field to create an index: One of the important rules when creating an index is choosing the right field. If the index is typed on a field that is not used in the query statement or is not used correctly, the index may not provide any benefit and will not improve query performance.
  • Since the indexed column is used by the query in an expression, the index will not be performed in this situation.
  • Change in environment or requirements: If the database environment or query requirements change after the index is created, the index may no longer be as relevant or as efficient as before. In this case, consideration should be given to recreating or adjusting the index to accommodate the new requirements.
  • Statistics and data analysis: When the index is not updated appropriately or the statistical information about the data is incorrect, it can lead to suboptimal query plan selection and thus no performance improvement. .
  • Query statement structure: Sometimes, creating an index does not improve query performance because the query statement structure is not optimal. The query statement may need to be reviewed and re-optimized to get the most out of the created index.

4. Distinguish UNION, MINUS and INTERSECT?

UNION : The UNION operator is used to combine the results of two or more SELECT statements into a single result. It removes duplicate rows in the final result. The columns in each SELECT statement must have the same data type and in the same order. The number of columns must also match.

MINUS : The MINUS operator is used to get rows that appear in the first SELECT statement but not in the second SELECT statement. It returns the unique rows from the first SELECT statement by comparing them with the rows in the second SELECT statement.

INTERSECT : The INTERSECT operator is used to get rows that appear in both the first and second SELECT statements. It returns the rows common between two result sets.

UNION, MINUS, and INTERSECT require SELECT statements with the same number of columns and compatible data types. Additionally, they operate on unique rows, and the order of columns in the result is determined by the first SELECT statement.

5. There are 2 data tables, need to know if the 2 tables are the same? Let’s say 2 bulk tables.

We have many ways to handle this problem, we can use COUNT(*) or MINUS. And when 2 tables have a large number of records, using MINUS can be more efficient because it only returns the rows that differ between the two tables instead of counting all the rows.

Method 1: Use COUNT(*)

  1. Count the number of records in the first table.
  2. Count the number of records in the second table.
  3. Compare results. If the number of records in the two tables is equal, it can be concluded that the two tables are the same.

Method 2: Use MINUS

  1. Execute a MINUS statement with two SELECT statements on two tables.
  2. If the returned result is null (no rows), it can be concluded that the two tables are the same.
  3. If the result is not null, then the two tables are different.

6. State the difference between using DROP and TRUNCATE.

The DROP and TRUNCATE statements are used to remove objects in the database, but they have some functional and behavioral differences:

DROP statement:

  • The DROP statement is a DDL (Data Definition Language) statement.
  • It is used to permanently delete objects in the database, such as a table, view, index, or sequence.
  • When you use the DROP statement, all data, metadata, and associated objects of the deleted object are deleted.
  • The DROP statement cannot be undone, as it is an implicit commit that cannot be undone.
  • It is important to use the DROP statement carefully, as it permanently deletes the object and its data.

TRUNCATE statement:

  • The TRUNCATE statement is a DDL (Data Definition Language) statement.
  • It is used to delete all rows from a table, effectively resetting the table to its original state.
  • When you use the TRUNCATE statement, the table structure, constraints, and indexes are preserved, but all data in the table is deleted.

We can see some obvious differences above, the DROP statement is used to permanently delete database objects, while the TRUNCATE statement is used to delete all rows from a table. while preserving the structure of the table. So using TRUNCATE may be safer when you want to delete data, but it still depends on specific requirements.

7. State the difference between using DELETE and TRUNCATE.

For the DELETE statement:

  • It is used to delete specific rows of data from a table based on a specified condition or criteria.
  • The DELETE statement is slower than the TRUNCATE statement, especially when deleting a large number of lines, because it generates transaction logs and can cause additional overhead.
  • It can be rolledback using the ROLLBACK statement if executed within a transaction.

For the TRUNCATE statement:

  • It is used to delete all rows from a table, effectively resetting the table to its original state.
  • The TRUNCATE statement is faster than the DELETE, especially when deleting a large number of lines, because it does not create a transaction log and does not have to log each line individually.
  • It cannot be rolledback, as this is an implicit commit that cannot be undone.
  • The TRUNCATE statement does not use a WHERE clause, because it deletes all rows from the table.

End

Above are a few questions about SQL knowledge that you can refer to to supplement your knowledge as well as prepare for interviews about data-related careers in databases using SQL. If you want to know more questions, you can refer to the link of my questions below.

Thank you for reading this article.

Reference

  1. SQL Interview Questions ( https://www.interviewbit.com/sql-interview-questions/#sql )
  2. Top 115 SQL Interview Questions You Must Prepare In 2023 ( https://www.edureka.co/blog/interview-questions/sql-interview-questions )
Share the news now

Source : Viblo