Learn about the data structure of an array

Tram Ho

Foreword

Array (array data structure or simply array) – the most basic data type of any language. Although it is very basic, how much do you understand about arrays? How do processes like accessing the i-element or adding one work and how much are complex?

Let me find out.

What is an array?

Very simple, anyone can answer like this

Arrays are collections of elements, defined by an index.

But is that enough? What else needs to be mentioned? Do you know all of the following when defining arrays?

  • Elements have the same data type. The data type of the elements can be of type array so we can have more concepts about one-dimensional and multi-dimensional arrays.
  • Arrays have a fixed size
  • The elements are stored in consecutive memory cells or in a way that can calculate the position of the elements with a mathematical expression.

OK, that is enough for the concept of arrays. If there’s anything else you can comment to let me know.

And when it comes to arrays, it is impossible not to mention dynamic arrays. Most of us prefer to use dynamic arrays in our code and arrays with a defined number of elements. So what is a dynamic array? And is it any different from the above concept?

Dynamic arrays are arrays of variable size by allowing the addition or deletion of an element

A brief but complete concept when it comes to dynamic arrays. The obvious difference from the array is that its size can be changed. When working with arrays, you cannot add or remove elements in the array (only allowed to change values), while with dynamic arrays it is possible. When it comes to working with arrays, there are two main problems in the data structure that need to be grasped: the size of the array and the index of the elements. (This article focuses on the data structure, not the he algorithm.)

Array size

The size of an array of fixed elements is simple, as shown in the definition, which is a specific number. But what about the size of the dynamic array? Will you be able to answer the following:

Dynamic arrays are also arrays. Which array has a fixed size. So dynamic arrays have a fixed size?

  • If the size is fixed, how does it resize?
  • If the size is not fixed, what is the size?

It is not easy to answer right. I also fell to this question and realized I didn’t know anything about it. Just use it and don’t think, keep being a newbie. This is actually a trap in our logical way. You should understand these are two different concepts:

  • Fixed size means that at a time the size of an array is always a number that can be determined.
  • Being resizable means it does not have a certain size at all times. This is what makes the dynamic array difference.

And to learn more about dynamic arrays, we need to answer two questions

  • How do arrays change size?
  • How does a dynamic array determine its size?

Here are two basic questions to understand the structure of a dynamic array. Although there are differences between some programming languages, in general the mechanism of resizing a dynamic array is as follows:

  • Dynamic arrays will first be allocated based on a fixed size array. This explains why you encountered an index error when working on dynamic arrays.
  • When a new element is needed. Then the system will allocate a new fixed array with a larger size, copy existing elements into the new array and then add new elements.

This is common sense for increasing the size of the array. However, its disadvantage is that it is expensive to allocate a new array and copy the original elements. One main solution is to create arrays larger than the current number of elements. Adding or deleting elements will not affect the size of the current array. Only when all memory cells are allocated, does the system create a new array and copy the elements. This helps increase system performance and reduce complexity when adding or removing elements. However, in order to determine the size of an array for this case, there are two concepts to grasp:

  • logical size or array size: represents the actual number of elements in the array
  • physical size or capacity: this is the physical size of the array to be granted in the system.

An image that explains the structure of dynamic arrays. The dashed blank cells represent the device used to add the element. If only the memory cells were used, adding the element would happen very quickly and the complexity was only O (1). However, if the memory needs to be re-allocated (in the case of a green turtle is marked), the complexity is now O (n) because it needs to manipulate the copying of n elements. If it is necessary to give an exact number of complexity when adding or removing elements in the dynamic array in this case, O (1) is still a reasonable choice.

Besides, by remembering the logical size, we can determine the size of the array at all times. And of course the complexity when determining its size is O (1), just take the value of logic size.

Note:

  • Some other dynamic array structures do not follow this mechanism. Such as linked lists. The linked list is a special dynamic array structure, the elements are strongly linked, the order is not determined by their physical location in memory. For linked lists, adding words or removing elements is quite simple if the last element is defined (complexity O (1)). However, if the last element is not specified, it is O (n). On the other hand, linked list is still a fixed element array, the number of elements in a given time is a specific number. However, in order to determine the number of elements we have to go through all the elements and its complexity is definitely O (n).
  • The complexity when adding between dynamic arrays is different from the complexity at the end of the array

Indexing – indexes in the array

As you know, when working with arrays, we can access the element directly through its index (for example, arr [5] used to access the sixth element). Returning to the concept of an array, the elements are stored in successive cells or stored in a way that can calculate the position of elements with a mathematical expression. And the index is the main parameter to determine the position of the element in memory.

For example, an int array of 10 elements is stored in memory, with each int element requiring 4 bytes to store then the system will spend a 40-byte memory area to create the array. Assuming this area is from memory cell 1000 ~ 1039, the first element will be stored at 1000, the second element will be 1004. If it is zero-base indexing, the index of the elements will be from 0 – 9 and the expression for determining the position of the ith element will be 1000 + (ix 4)

Through this expression, accessing the indexed element only has a complexity of O (1), which is very fast.

Epilogue

Hopefully, through this article, you have a closer look at the array data structure: how to organize storage in memory as well as how it works when adding / removing words or accessing elements by index. In addition to the execution command, we have also grasped the behind-the-scenes handling of it. That’s amazing right.

Thank you for watching. Have a nice day.

Refer

https://en.wikipedia.org/wiki/Array_data_structure https://en.wikipedia.org/wiki/Dynamic_array

Share the news now

Source : Viblo