Mistakes when slicing slices in Golang

Tram Ho

Welcome back to the series on common mistakes in Golang. In this article, we will learn about a cause of memory leaks related to slices and arrays in Go. Unlike C/C++, Go has a garbage collector (GC), so we don't have to allocate or release memory manually. However, precisely because we rely on the GC, we need to understand how it works to prevent unintentional memory leaks. In this article, we will look at memory leaked by slicing a slice.

1. Scenario

Suppose we have a consumer service that receives data in the form of slices, where the first 5 positions of each slice hold the type of the data. The service takes the first 5 elements of the data to perform some function.

Below is a simple example.

  • For simplicity, I initialize dataReceived with a size of 1MB (you can imagine that we receive 1MB of data from a request or another service :v).
  • After receiving dataReceived, I get the first 5 elements of the slice by slicing it (dataReceived[:5]).
  • Finally, I use runtime.KeepAlive to keep typeData from being collected by the GC. This stands in for storing typeData in the program's in-memory cache.
  • After each step, I print the memory of the process to see how much space it currently consumes.
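A minimal sketch of a program matching the steps above might look like this. The names getTypeOfData and printAlloc appear in this article; the helper allocMB, the step labels, and the use of runtime.MemStats.Alloc to measure memory are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// getTypeOfData returns the first 5 bytes of data by slicing.
// The returned slice shares its underlying array with data.
func getTypeOfData(data []byte) []byte {
	return data[:5]
}

// allocMB (hypothetical helper) returns the bytes of live heap objects,
// converted to whole megabytes with integer division.
func allocMB() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.Alloc / 1024 / 1024
}

// printAlloc prints the current heap usage next to a step label.
func printAlloc(step string) {
	fmt.Printf("%-22s %d MB\n", step, allocMB())
}

func main() {
	printAlloc("start")

	// Pretend we received 1MB of data from a request.
	dataReceived := make([]byte, 1024*1024)
	printAlloc("after receiving data")

	typeData := getTypeOfData(dataReceived)

	// Force a collection, then measure again.
	runtime.GC()
	printAlloc("after GC")

	// Keep typeData reachable up to this point, as a cache would.
	runtime.KeepAlive(typeData)
}
```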

The code looks fine and is pretty easy to understand. Before running the program, try to guess how much memory it uses at the start and at the end. Theoretically, dataReceived has about 1 million elements, which takes up 1MB of memory, while typeData has 5 elements and takes up about 5 bytes. Let's run the program to see if that is correct.

[Image: program output showing the memory used after each step]

Surprisingly, the memory after receiving the data and the memory after the GC runs are the same. So typeData does not cost 5 bytes as we expected. Does that mean typeData is holding on to the full 1MB? If so, a service that keeps 1,000 values like this needs 1GB of memory. At this point, either the code has a problem or our assumption above is wrong. Let's find out which.

2. How slices work

First, we need to understand how slices work.

Slices in Golang are fat pointers. You can read this article to understand more about fat pointers. A slice value consists of three parts: a pointer to an underlying array, a length, and a capacity.

Basically, you can understand that when we have a slice, we have a pointer to the underlying array of that slice.
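To make the three parts of a slice visible, here is a small sketch. The sliceInfo type and the describe function are hypothetical names used only for illustration; they mirror what a slice carries and are not part of Go's API.

```go
package main

import "fmt"

// sliceInfo (hypothetical) mirrors the three parts of a []byte value.
type sliceInfo struct {
	first    *byte // pointer to the first element in the underlying array
	length   int   // number of elements accessible through the slice
	capacity int   // elements available from first to the end of the array
}

// describe extracts the three parts of s.
// It assumes s is non-empty, since &s[0] panics on an empty slice.
func describe(s []byte) sliceInfo {
	return sliceInfo{first: &s[0], length: len(s), capacity: cap(s)}
}

func main() {
	s := make([]byte, 3, 10)
	info := describe(s)
	fmt.Printf("ptr=%p len=%d cap=%d\n", info.first, info.length, info.capacity)
}
```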

3. Reason

After understanding how slices work, let’s go back to visualize the problem above.

When dataReceived is initialized, an array of about 1 million elements is allocated in memory, and dataReceived holds a pointer to it.

Then we create one more slice, typeData, from dataReceived by slicing. When slicing, Go does not create a new underlying array. Instead, the new slice points to the existing underlying array. As a result, although typeData has a length of only 5 elements, its capacity is still about 1M elements.
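The sharing described here can be checked directly. The sketch below uses a hypothetical helper, sharesArray, which compares the addresses of the first elements of two non-empty slices.

```go
package main

import "fmt"

// sharesArray (hypothetical helper) reports whether a and b start at the
// same element of the same underlying array. Both must be non-empty.
func sharesArray(a, b []byte) bool {
	return &a[0] == &b[0]
}

func main() {
	dataReceived := make([]byte, 1024*1024)
	typeData := dataReceived[:5]

	fmt.Println(len(typeData), cap(typeData))        // 5 1048576
	fmt.Println(sharesArray(dataReceived, typeData)) // true

	// A write through one slice is visible through the other.
	typeData[0] = 'A'
	fmt.Println(dataReceived[0] == 'A') // true
}
```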

Finally, when the process ends, we keep typeData alive with runtime.KeepAlive, while dataReceived itself is no longer referenced. But since the underlying array of dataReceived is still pointed to by typeData, the GC cannot collect it. The array stays in memory until no slice points to it any more. This is why we still see 1MB in memory at the end of the process. So what is the solution to this problem?

4. Solution

The two slices above point to the same underlying array, so giving each slice its own underlying array solves the problem. The GC can then collect the array behind dataReceived completely and keep only the one behind typeData.

To implement this solution, we copy the elements into a new slice instead of slicing as before.

Because we copy into a new 5-element slice, typeData has length 5 and capacity 5 instead of 1M, so it keeps 5 bytes in memory instead of 1MB.

Modify the getTypeOfData function.
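A minimal sketch of the copy-based version, assuming the data is a []byte as in the scenario (the exact signature of the original function is an assumption):

```go
package main

import "fmt"

// getTypeOfData copies the first 5 bytes of data into a freshly allocated
// slice, so the result no longer references data's underlying array.
// If data has fewer than 5 elements, the remaining bytes stay zero.
func getTypeOfData(data []byte) []byte {
	typeData := make([]byte, 5)
	copy(typeData, data)
	return typeData
}

func main() {
	dataReceived := make([]byte, 1024*1024)
	typeData := getTypeOfData(dataReceived)
	fmt.Println(len(typeData), cap(typeData)) // 5 5
}
```

Once nothing else references dataReceived, the GC is free to reclaim its 1MB array, because typeData now has its own 5-byte array.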

Run it again and see the results.

Oops, the result after updating the code is 0 MB. This is because my printAlloc() function divides the number of bytes by 1024*1024 to convert it to MB, so a value this small shows up as 0. I will modify the function a bit so that it prints the number of bytes instead.
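The effect can be sketched as follows, assuming the conversion uses integer division and that memory is read from runtime.MemStats.Alloc. The helper names toMB and allocBytes are hypothetical.

```go
package main

import (
	"fmt"
	"runtime"
)

// toMB (hypothetical) converts bytes to whole megabytes.
// Integer division drops the fraction, so anything under 1MB becomes 0.
func toMB(b uint64) uint64 {
	return b / 1024 / 1024
}

// allocBytes (hypothetical) returns the bytes of live heap objects.
func allocBytes() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.Alloc
}

// printAlloc prints heap usage in bytes, so small values stay visible.
func printAlloc(step string) {
	fmt.Printf("%s: %d bytes\n", step, allocBytes())
}

func main() {
	fmt.Println(toMB(5608)) // 0
	printAlloc("start")
}
```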

Running it again, we get the following result.

When the process starts, the memory is 104800 bytes, and at the end it is 110408 bytes, a difference of 5608 bytes. Our 5 bytes of typeData are somewhere in those 5608 bytes. The rest comes from other allocations that happen behind the scenes while the program runs, so the numbers will differ slightly from run to run. But it's nice to see that the memory used is no longer 1MB as before, isn't it :v.

5. Recap

In short, keep in mind that slicing a large slice or array can lead to memory leaks. The underlying array is not collected by the GC as long as any slice still points to it, so we can end up keeping a very large underlying array in memory while using only a few of its elements. Copying the elements we need into a new slice avoids this situation.

6. References

Harsanyi, T. (2022) 100 Go Mistakes and How to Avoid Them. Shelter Island: Manning Publications.


Source: Viblo