Find out how Ruby uses memory

Tram Ho

After working with Ruby for a while, I had never thought about how the language uses memory until one day a customer reported that Sidekiq was taking up more than 5 GB of RAM while no job was actually running. So why does memory grow or shrink when we run our code, and how can we use as little of it as possible? Let’s find out in this article.

What is a garbage collector?

When you create an object in Ruby, the computer uses memory to store it. If you create many objects at the same time, the computer has to use a lot of memory. Because RAM is limited (how much depends on the machine's configuration), at some point you will run out of RAM for storing objects or for other tasks. You then have to free the areas of memory that are no longer in use so that memory is available for other work.

In C, memory is often managed at run time with the two standard library functions malloc() and free(). The developer has to call them by hand: malloc() to obtain memory for a variable and free() to release it once the variable is no longer needed. Manually allocating and freeing memory takes a lot of programmer effort, especially in object-oriented programming, where objects are used and reused many times and it is not clear when their memory should be released. So there needs to be an automated process that collects memory that is no longer used and releases it.

Ruby, like many other programming languages, already has a component that handles this, called the “garbage collector”. In Ruby it is exposed through the GC module. The garbage collector automatically detects and frees “useless” memory, that is, memory that still holds a variable or object that is no longer used. With the GC's support, you no longer have to worry about allocating and freeing memory while coding.
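As a small illustration of the GC module mentioned above, the sketch below triggers a collection by hand and reads a few counters. It only uses GC.start, GC.count and GC.stat; the variable names are mine.

```ruby
# Number of garbage collections that have run so far in this process.
runs_before = GC.count

# Ask Ruby to run a full garbage collection right now.
GC.start

runs_after = GC.count

# GC.stat returns a hash of counters describing the state of the heap.
stats = GC.stat
puts "GC runs: #{runs_before} -> #{runs_after}"
puts "Objects allocated so far: #{stats[:total_allocated_objects]}"
puts "Objects freed so far:     #{stats[:total_freed_objects]}"
```

In normal code you rarely call GC.start yourself; it is useful here only to make the collector's work visible.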

Object Retention

The most obvious way to make Ruby use more RAM is to retain objects. Constants in Ruby are never cleaned up by the garbage collector, so if a constant holds a reference to an object, that object can never be released.
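A minimal sketch of this kind of retention, assuming a constant named RETAINED (the name is illustrative) that collects 100,000 strings:

```ruby
# The constant keeps a reference to the array, and the array keeps a
# reference to every string pushed into it, so none of them can be freed.
RETAINED = []

100_000.times do
  RETAINED << "a string"
end

puts "Strings held by the constant: #{RETAINED.size}"
```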

If we run the above code and inspect GC.stat(:total_freed_objects), which returns the number of objects Ruby has released, we see very little change in the result.

Ruby has created 100,000 copies of “a string”, but they cannot be released. Objects cannot be freed while they are referenced by a global object. This applies to constants, global variables, modules and classes.

If we do the same thing without retaining any object:
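A sketch of the non-retaining variant, under the assumption that the strings are only assigned to a block-local variable (foo is an illustrative name). The GC.start call is added here so the effect shows up immediately in the counter:

```ruby
freed_before = GC.stat(:total_freed_objects)

100_000.times do
  # Nothing outside the block keeps a reference to this string.
  foo = "a string"
end

# Force a collection so the unreferenced strings are reclaimed now.
GC.start

freed = GC.stat(:total_freed_objects) - freed_before
puts "Objects freed: #{freed}"
```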

Number of objects freed: 101112. Memory usage is also much smaller, about 6 MB compared to 12 MB when a reference to the objects is retained. The number of retained objects can be worked out with the help of GC.stat(:total_allocated_objects): number of objects retained = total_allocated_objects - total_freed_objects.
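The formula above can be wrapped in a small helper. This is a sketch: retained_objects and kept are names I made up, and the exact numbers will differ between Ruby versions and machines.

```ruby
# Objects that have been allocated but not (yet) freed.
def retained_objects
  GC.stat(:total_allocated_objects) - GC.stat(:total_freed_objects)
end

GC.start                      # start from a freshly collected heap
before = retained_objects

# Keep 100,000 strings alive through a local variable.
kept = Array.new(100_000) { "a string" }

GC.start                      # these strings survive because `kept` refers to them
after = retained_objects

puts "Objects retained by this snippet: roughly #{after - before}"
```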

Retain objects to increase speed

Every Ruby developer is familiar with DRY, or “Don’t repeat yourself”. This is as true for object allocation as it is for code. Sometimes it makes sense to retain objects for reuse instead of recreating them many times. Ruby has this feature built in for strings. If you call freeze on a string literal, the interpreter knows that you do not plan to modify that string and can reuse it. Here is an example:
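A minimal sketch of such an example, assuming a constant named RETAINED_FROZEN (an illustrative name). The last lines check that every element is the very same object:

```ruby
RETAINED_FROZEN = []

100_000.times do
  # A frozen string literal is reused instead of being allocated again.
  RETAINED_FROZEN << "a string".freeze
end

distinct = RETAINED_FROZEN.map(&:object_id).uniq.size
puts "References stored: #{RETAINED_FROZEN.size}"
puts "Distinct string objects: #{distinct}"
```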

Running this code, the number of objects freed does not change much, but memory usage is extremely low. Instead of storing 100,000 different objects, Ruby stores one string object and 100,000 references to it. Besides reducing memory, this also reduces run time, because Ruby spends less time creating objects and allocating memory.

You can do the same with any other object by assigning it to a constant. This is already a common pattern for external connections, such as Redis:
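The sketch below shows the pattern with a hypothetical FakeConnection class standing in for a real client, so that it runs without any gem installed. In an application the constant would typically hold something like a Redis client instead.

```ruby
# Hypothetical stand-in for an external connection; it only counts how
# many instances have been created.
class FakeConnection
  class << self
    attr_accessor :instances_created
  end
  self.instances_created = 0

  def initialize
    self.class.instances_created += 1
  end
end

# Created once and kept alive by the constant for the life of the process.
RETAINED_CONNECTION = FakeConnection.new

# Every caller reuses the same object instead of building a new one.
def connection
  RETAINED_CONNECTION
end

3.times { connection }
puts "Connections created: #{FakeConnection.instances_created}"
```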

Because a constant holds a reference to the Redis connection, the connection is never released by the garbage collector. When the constant is reused elsewhere in the code, Ruby does not spend time creating a new object or allocating and releasing memory for it.

Short Lived Objects

Most objects in Ruby are short-lived, meaning that shortly after they are created nothing references them any more. For example, see this code snippet:
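A minimal sketch of such a call, assuming a hypothetical User class that stands in for a real database model (a real ORM does far more work than this). It also counts the objects allocated by a single call:

```ruby
# Hypothetical stand-in for a model class; it only builds a SQL string.
class User
  def self.where(conditions)
    # Each step here creates intermediate arrays and strings.
    clauses = conditions.map { |column, value| "#{column} = '#{value}'" }
    "SELECT * FROM users WHERE #{clauses.join(' AND ')}"
  end
end

allocated_before = GC.stat(:total_allocated_objects)
sql = User.where(name: "schneems")
allocated = GC.stat(:total_allocated_objects) - allocated_before

puts sql
puts "Objects allocated by one call: #{allocated}"
```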

On the surface, it seems to need only a few objects (a hash, the symbol :name and the string “schneems”). However, when you call it, many intermediate objects are created to build the correct SQL statement, use a prepared statement if available, and more. Many of these objects live only as long as the methods that create them are executing. Why should we care about creating objects if they are not retained?

When a method takes long enough to run, the objects it creates live for a fairly long time, and memory grows over time. If the GC runs while those objects are still referenced, Ruby may need to request more memory.

How does Ruby increase memory?

When the objects in use no longer fit in the memory Ruby has already allocated, it must request more. Requesting memory from the operating system is an “expensive” operation, so Ruby tries to do it infrequently. Instead of requesting a few KB at a time, it allocates a larger block than it needs right now. You can tune this amount by setting the environment variable RUBY_GC_HEAP_GROWTH_FACTOR.

For example, if Ruby is using 100 MB and you set RUBY_GC_HEAP_GROWTH_FACTOR=1.1, then the next time Ruby allocates memory it will grow to 110 MB. After a Ruby application starts, its memory keeps growing by the same percentage until it reaches a point where the whole program can run within the allocated memory. A lower value for this variable means the GC runs and memory is allocated more often, but the process approaches its maximum memory usage more slowly. A larger value means fewer GC runs, but the process may end up holding more memory than it actually needs.
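The trade-off can be sketched with a few lines of arithmetic. This treats the factor as a plain multiplier on total memory, as the example above does; growth_steps is a made-up helper, and 1.8 is just a larger factor chosen for comparison.

```ruby
# Returns the successive heap sizes (in MB) until `needed_mb` fits.
def growth_steps(start_mb, factor, needed_mb)
  sizes = [start_mb.to_f]
  while sizes.last < needed_mb
    # Size after n growth steps: start * factor^n.
    sizes << (start_mb * factor**sizes.length).round(2)
  end
  sizes
end

slow = growth_steps(100, 1.1, 200)
fast = growth_steps(100, 1.8, 200)

puts "factor 1.1: #{slow.length - 1} growth steps, ends at #{slow.last} MB"
puts "factor 1.8: #{fast.length - 1} growth steps, ends at #{fast.last} MB"
```

With the smaller factor the process grows in many small steps and overshoots the 200 MB target only slightly; with the larger factor it gets there in two steps but ends up well above what it needs.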

When optimizing a website, many developers come to believe that Ruby never frees memory. This is not entirely true: Ruby does free memory. We will talk about this later.

Consider the following example to see the effect of objects not being retained on memory:
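A sketch of such a method follows. The name make_an_array and the 10 million strings come from the text; the count parameter is my addition so the effect can be tried at a smaller scale first.

```ruby
# Builds a large array of strings and then throws it away.
def make_an_array(count = 10_000_000)
  array = []
  count.times do
    array << "a string"
  end
  # Return nil so the caller never gets a reference to the array.
  nil
end

# Scaled-down run; call make_an_array with no argument for the full
# 10 million strings.
make_an_array(100_000)
```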

When we call this method, 10 million strings are created. When the method exits, the strings are no longer referenced by anything and will be released by the garbage collector. However, while the method is running, Ruby must allocate additional memory to make room for 10,000,000 strings. This requires more than 500 MB of memory!

If the GC cannot reclaim enough slots to store the objects while the array is being built, Ruby has to allocate more memory, which makes the process's allocated memory grow rapidly. Ruby holds on to this memory for a while: if the process needed that much memory once, it may need it again, and freeing and reallocating it each time would be very expensive. The memory is released gradually, but slowly. If you’re concerned about performance, it’s best to create as few objects as possible.

Modify in place to speed up

One trick I’ve used to speed up a program and cut down on object allocations is to modify state instead of creating new objects. For example, here is a code snippet taken from the mime-types gem:
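The snippet below is a sketch in the spirit of that code rather than a verbatim copy from the gem. The captures array and its contents are made up for illustration.

```ruby
# Illustrative input; in the gem this would come from a regex match.
captures = ["Application", "X-Custom"]

normalized = captures.map do |e|
  # downcase returns a new string, and gsub returns another new string.
  e.downcase.gsub(/x-/, "")
end

puts normalized.inspect
```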

In this code, each call to downcase and gsub creates a new string object, which costs time and memory. To avoid this, we can modify the string in place:
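A sketch of the in-place variant, again with made-up input. Note that downcase! and gsub! return nil when they change nothing, so the block has to return the string explicitly.

```ruby
# .dup gives us unfrozen copies that are safe to mutate.
captures = ["Application".dup, "X-Custom".dup]

normalized = captures.map do |e|
  e.downcase!          # mutates e; returns nil if nothing changed
  e.gsub!(/x-/, "")    # mutates e; returns nil if nothing matched
  e                    # return the string itself, not the bang method's result
end

puts normalized.inspect
```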

The code is definitely more verbose, but it’s also much faster because we modify the current string instead of creating new strings.

Note: You do not need a constant to store the regex, because in Ruby all regex literals are frozen.

In-place modifications can get you into trouble. You may end up modifying a variable that you did not realize was also used elsewhere, resulting in a subtle and hard-to-find regression. Before doing this kind of optimization, make sure you have good tests.
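A small sketch of how this can go wrong, with a hypothetical normalize! helper. Two variables point at the same string, so mutating one silently changes the other:

```ruby
# Hypothetical helper that mutates its argument in place.
def normalize!(name)
  name.downcase!
  name
end

original = "Content-Type".dup
header = original            # same object, not a copy

normalize!(header)
puts original                # the "untouched" variable has changed too

# Working on a copy keeps the caller's string intact.
safe_original = "Content-Type".dup
normalized = normalize!(safe_original.dup)
puts safe_original
```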

It would be a mistake to think that “objects are slow”. Using objects correctly can make a program easier to understand and optimize. Even the fastest tools and techniques, when used inefficiently, will cause delays.

The best way to free up memory

As mentioned earlier, Ruby does release memory, albeit slowly. After the make_an_array method has made memory swell, you can watch Ruby release it by running:
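One way to observe this is to trigger the GC repeatedly and sample the heap after each run. The sketch below is a bounded, scaled-down version: the smaller default count, the five iterations and the use of GC.stat(:heap_allocated_pages) are my choices, and the numbers you see will depend on the Ruby version.

```ruby
# Scaled-down version of the method discussed above.
def make_an_array(count = 1_000_000)
  array = []
  count.times { array << "a string" }
  nil
end

make_an_array

pages = []
5.times do
  GC.start
  # Number of heap pages Ruby currently holds.
  pages << GC.stat(:heap_allocated_pages)
end

puts "Heap pages after each GC run: #{pages.inspect}"
```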

Very slowly, the application's memory will decrease. Ruby frees a small number of empty pages (groups of slots) at a time when too much memory has been allocated. In most applications, such as web apps, certain actions make the system allocate a fair amount of memory. When those actions occur frequently, Ruby cannot release memory fast enough to keep the application small. It is better to create as few objects as possible.


Source: Viblo