When coding with Java/Kotlin, we often have to manipulate the subclasses of List, typically ArrayList. It’s so common that we sometimes forget about the array data type (Array) – A data structure that dictates that elements must have the same type and fixed size and cannot be changed.
So why can ArrayList add/remove elements so conveniently? Let’s find out together.
Jumping into the Java source code, it’s easy to see that ArrayList is just a ‘sticky wrapper’ =)) Inside it uses an object array to store data: transient Object[] elementData; // non-private to simplify nested class access
And also because Object is the largest data type in Java (People often call it Father Of Big – Father of to), List allows us to have generics to hold the data type whether any.
There are a few concepts that need clarification before getting started:
- Capacity: The capacity of the array is the maximum number of elements that the
elementData
array can hold. - Length: The actual number of elements added to the array. And so length <= capacity.
Now we will go through the operation steps of add() function:
1 2 3 4 5 6 | <span class="token keyword">public</span> <span class="token keyword">boolean</span> <span class="token function">add</span> <span class="token punctuation">(</span> <span class="token class-name">E</span> e <span class="token punctuation">)</span> <span class="token punctuation">{</span> <span class="token function">ensureCapacityInternal</span> <span class="token punctuation">(</span> size <span class="token operator">+</span> <span class="token number">1</span> <span class="token punctuation">)</span> <span class="token punctuation">;</span> <span class="token comment">// Increments modCount!!</span> elementData <span class="token punctuation">[</span> size <span class="token operator">++</span> <span class="token punctuation">]</span> <span class="token operator">=</span> e <span class="token punctuation">;</span> <span class="token keyword">return</span> <span class="token boolean">true</span> <span class="token punctuation">;</span> <span class="token punctuation">}</span> |
It’s really simple, just make sure the elementData
array has enough space and then insert the new element at the end of the array.
However, we will learn the function ensureCapacityInternal(), this is the function that ensures the request for more space to be able to accommodate the newly added element. By default, the array will be initialized with enough space to hold 10 elements , if the space is exhausted, the grow() function will be called.
The grow() function is in charge of 2 things:
- Increase the size of the elementData array.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 | <span class="token keyword">private</span> <span class="token keyword">void</span> <span class="token function">grow</span> <span class="token punctuation">(</span> <span class="token keyword">int</span> minCapacity <span class="token punctuation">)</span> <span class="token punctuation">{</span> <span class="token comment">// overflow-conscious code</span> <span class="token keyword">int</span> oldCapacity <span class="token operator">=</span> elementData <span class="token punctuation">.</span> length <span class="token punctuation">;</span> <span class="token keyword">int</span> newCapacity <span class="token operator">=</span> oldCapacity <span class="token operator">+</span> <span class="token punctuation">(</span> oldCapacity <span class="token operator">>></span> <span class="token number">1</span> <span class="token punctuation">)</span> <span class="token punctuation">;</span> <span class="token keyword">if</span> <span class="token punctuation">(</span> newCapacity <span class="token operator">-</span> minCapacity <span class="token operator"><</span> <span class="token number">0</span> <span class="token punctuation">)</span> newCapacity <span class="token operator">=</span> minCapacity <span class="token punctuation">;</span> <span class="token keyword">if</span> <span class="token punctuation">(</span> newCapacity <span class="token operator">-</span> MAX_ARRAY_SIZE <span class="token operator">></span> <span class="token number">0</span> <span class="token punctuation">)</span> newCapacity <span class="token operator">=</span> <span class="token function">hugeCapacity</span> <span class="token punctuation">(</span> minCapacity <span class="token punctuation">)</span> <span class="token punctuation">;</span> <span class="token comment">// minCapacity is usually close to size, so this is a win:</span> elementData <span class="token operator">=</span> <span class="token class-name">Arrays</span> <span class="token punctuation">.</span> <span class="token function">copyOf</span> <span class="token punctuation">(</span> elementData <span class="token punctuation">,</span> newCapacity <span class="token punctuation">)</span> <span class="token punctuation">;</span> <span class="token punctuation">}</span> ``` <span class="token class-name">Logic</span> tại đây khá rõ ràng <span class="token punctuation">,</span> số lượng phần tử của mảng mới <span class="token operator">:</span> <span class="token operator">-</span> <span class="token class-name">M</span> ặc định gán <span class="token operator">:</span> newCapacity <span class="token operator">=</span> <span class="token number">1.5</span> <span class="token operator">*</span> length hiện tại của mảng <span class="token punctuation">.</span> ``` <span class="token keyword">int</span> newCapacity <span class="token operator">=</span> oldCapacity <span class="token operator">+</span> <span class="token punctuation">(</span> oldCapacity <span class="token operator">>></span> <span class="token number">1</span> <span class="token punctuation">)</span> <span class="token punctuation">;</span> ``` <span class="token class-name">Ph</span> ép dịch bit sang phải <span class="token number">1</span> bit là phép chia cho <span class="token number">2.</span> <span class="token operator">=</span> <span class="token operator">></span> oldCapacity <span class="token operator">+</span> <span class="token punctuation">(</span> oldCapacity <span class="token operator">>></span> <span class="token number">1</span> <span class="token punctuation">)</span> <span class="token operator">=</span> <span class="token number">1.5</span> <span class="token operator">*</span> oldCapacity <span class="token punctuation">;</span> <span class="token operator">-</span> <span class="token class-name">N</span> ếu newCapacity vẫn nhỏ hơn số lượng mong muốn minCapacity thì sẽ gán luôn newCapacity <span class="token operator">=</span> minCapacity <span class="token operator">-</span> <span class="token class-name">N</span> ếu như minCapacity <span class="token operator"><=</span> <span class="token class-name">Integer</span> <span class="token punctuation">.</span> MAX_VALUE <span class="token operator">-</span> <span class="token number">8</span> thì sẽ gán minCapacity <span class="token operator">=</span> <span class="token class-name">Integer</span> <span class="token punctuation">.</span> MAX_VALUE <span class="token operator">-</span> <span class="token number">8</span> <span class="token operator">-</span> <span class="token class-name">N</span> ếu như minCapacity <span class="token operator">></span> <span class="token class-name">Integer</span> <span class="token punctuation">.</span> MAX_VALUE <span class="token operator">-</span> <span class="token number">8</span> thì sẽ gán minCapacity <span class="token operator">=</span> <span class="token class-name">Integer</span> <span class="token punctuation">.</span> MAX_VALUE luôn cho vừa lòng hả dạ <span class="token operator">=</span> <span class="token punctuation">)</span> <span class="token punctuation">)</span> <span class="token class-name">Nh</span> ư vậy thì số lượng phần tử tối đa mà một <span class="token class-name">ArrayList</span> có thể chứa chính là <span class="token class-name">Integer</span> <span class="token punctuation">.</span> MAX_VALUE <span class="token operator">=</span> <span class="token number">2147483647</span> <span class="token punctuation">;</span> <span class="token number">2.</span> <span class="token class-name">Copy</span> các phần tử của mảng hiện tại sang mảng mớ <span class="token class-name"><span class="token namespace">i <span class="token punctuation">.</span></span> V</span> ì mảng có tính chất là không thể thay đổi số lượng phần tử <span class="token punctuation">,</span> vậy nên việc tăng kích cỡ cho mảng đồng nghĩa với việc tạo một mảng mới có nhiều phần tử hơn sau đó mang các giá trị của mảng hiện tại <span class="token class-name"><span class="token namespace">sang <span class="token punctuation">.</span></span> V</span> ậy <span class="token class-name">Oracle</span> sử dụng vòng lặp trên <span class="token class-name">Java</span> để thực hiện việc này <span class="token operator">?</span> <span class="token class-name">T</span> ất nhiên là không được <span class="token punctuation">,</span> không ty đầu hàng công nghệ không thể làm theo cách dễ dàng như vậy đượ <span class="token class-name"><span class="token namespace">c <span class="token punctuation">.</span></span> Vi</span> ệc copy element của mảng cũ sang mảng mới được thực hiện thông qua hàm <span class="token class-name">Arrays</span> <span class="token punctuation">.</span> <span class="token function">copyOf</span> <span class="token punctuation">(</span> <span class="token punctuation">)</span> ```java <span class="token keyword">public</span> <span class="token keyword">static</span> <span class="token generics"><span class="token punctuation"><</span> <span class="token class-name">T</span> <span class="token punctuation">,</span> <span class="token class-name">U</span> <span class="token punctuation">></span></span> <span class="token class-name">T</span> <span class="token punctuation">[</span> <span class="token punctuation">]</span> <span class="token function">copyOf</span> <span class="token punctuation">(</span> <span class="token class-name">U</span> <span class="token punctuation">[</span> <span class="token punctuation">]</span> original <span class="token punctuation">,</span> <span class="token keyword">int</span> newLength <span class="token punctuation">,</span> <span class="token class-name">Class</span> <span class="token operator"><</span> <span class="token operator">?</span> <span class="token keyword">extends</span> <span class="token class-name">T</span> <span class="token punctuation">[</span> <span class="token punctuation">]</span> <span class="token operator">></span> newType <span class="token punctuation">)</span> <span class="token punctuation">{</span> <span class="token annotation punctuation">@SuppressWarnings</span> <span class="token punctuation">(</span> <span class="token string">"unchecked"</span> <span class="token punctuation">)</span> <span class="token class-name">T</span> <span class="token punctuation">[</span> <span class="token punctuation">]</span> copy <span class="token operator">=</span> <span class="token punctuation">(</span> <span class="token punctuation">(</span> <span class="token class-name">Object</span> <span class="token punctuation">)</span> newType <span class="token operator">==</span> <span class="token punctuation">(</span> <span class="token class-name">Object</span> <span class="token punctuation">)</span> <span class="token class-name">Object</span> <span class="token punctuation">[</span> <span class="token punctuation">]</span> <span class="token punctuation">.</span> <span class="token keyword">class</span> <span class="token punctuation">)</span> <span class="token operator">?</span> <span class="token punctuation">(</span> <span class="token class-name">T</span> <span class="token punctuation">[</span> <span class="token punctuation">]</span> <span class="token punctuation">)</span> <span class="token keyword">new</span> <span class="token class-name">Object</span> <span class="token punctuation">[</span> newLength <span class="token punctuation">]</span> <span class="token operator">:</span> <span class="token punctuation">(</span> <span class="token class-name">T</span> <span class="token punctuation">[</span> <span class="token punctuation">]</span> <span class="token punctuation">)</span> <span class="token class-name">Array</span> <span class="token punctuation">.</span> <span class="token function">newInstance</span> <span class="token punctuation">(</span> newType <span class="token punctuation">.</span> <span class="token function">getComponentType</span> <span class="token punctuation">(</span> <span class="token punctuation">)</span> <span class="token punctuation">,</span> newLength <span class="token punctuation">)</span> <span class="token punctuation">;</span> <span class="token class-name">System</span> <span class="token punctuation">.</span> <span class="token function">arraycopy</span> <span class="token punctuation">(</span> original <span class="token punctuation">,</span> <span class="token number">0</span> <span class="token punctuation">,</span> copy <span class="token punctuation">,</span> <span class="token number">0</span> <span class="token punctuation">,</span> <span class="token class-name">Math</span> <span class="token punctuation">.</span> <span class="token function">min</span> <span class="token punctuation">(</span> original <span class="token punctuation">.</span> length <span class="token punctuation">,</span> newLength <span class="token punctuation">)</span> <span class="token punctuation">)</span> <span class="token punctuation">;</span> <span class="token keyword">return</span> copy <span class="token punctuation">;</span> <span class="token punctuation">}</span> |
Next, copyOf() will call System.arraycopy() to copy the elements from the original
array from the starting position of 0 to the copy
array with the number of newLength
= newCapacity
calculated in the previous step.
1 2 3 4 | <span class="token keyword">public</span> <span class="token keyword">static</span> <span class="token keyword">native</span> <span class="token keyword">void</span> <span class="token function">arraycopy</span> <span class="token punctuation">(</span> <span class="token class-name">Object</span> src <span class="token punctuation">,</span> <span class="token keyword">int</span> srcPos <span class="token punctuation">,</span> <span class="token class-name">Object</span> dest <span class="token punctuation">,</span> <span class="token keyword">int</span> destPos <span class="token punctuation">,</span> <span class="token keyword">int</span> length <span class="token punctuation">)</span> <span class="token punctuation">;</span> |
Here we see that System.arraycopy is a native method written in C++, which copies the elements of the source array src
to the destination array dest
System.arraycopy() is used a lot, so it is optimized. memory and make it available through the Bionic Libc memmove() function. If the platform is available it will be used, otherwise Dalvik_java_lang_System_arraycopy will be used instead.
It turns out that the things that we have often used for a long time have been built extremely complex and elaborate, abstract under many layers but have to dig deep and learn very carefully to understand.