The power of Python itertools

Tram Ho

Let’s explore two Python libraries, itertools and more_itertools, and see how to take advantage of them to process data. There are many great Python libraries, but few come close to what the built-in itertools and the third-party more_itertools provide. Between them, these two libraries cover almost everything you need when processing or iterating over data in Python. At first glance, however, the functions in these libraries may not seem that useful, so let’s take a look at the most interesting ones, including examples of how to get the most out of them!

1. Compress

You have quite a few options when filtering sequences, one of which is compress, which takes an iterable and a boolean selector and outputs the items of the iterable whose corresponding element in the selector is true.

We can use this to apply the result of filtering one sequence to another, for instance to build a list of dates whose corresponding count is greater than 3.
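The idea can be sketched as follows. The dates and counts are made-up sample data, chosen only to illustrate the call:

```python
from itertools import compress

# Hypothetical sample data: each count belongs to the date at the same index.
dates = [
    "2020-01-01", "2020-02-04", "2020-02-01", "2020-01-24",
    "2020-01-08", "2020-02-10", "2020-02-15", "2020-02-11",
]
counts = [1, 4, 3, 8, 0, 7, 9, 2]

# Build the boolean selectors from one sequence...
selectors = [count > 3 for count in counts]

# ...and use them to filter the other one.
busy_dates = list(compress(dates, selectors))
print(busy_dates)
# ['2020-02-04', '2020-01-24', '2020-02-10', '2020-02-15']
```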

2. Accumulate

As the name suggests, we use this function to accumulate the results of some (binary) function. Examples include a running maximum or running factorials.

If you are not interested in the intermediate results, you can use functools.reduce instead, which keeps only the final value and is more memory efficient.
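A minimal sketch of both approaches, using arbitrary sample numbers:

```python
from functools import reduce
from itertools import accumulate
import operator

data = [3, 4, 1, 3, 5, 6, 9, 0, 1]  # arbitrary sample values

# Running maximum: each output is the largest value seen so far.
running_max = list(accumulate(data, max))
print(running_max)         # [3, 4, 4, 4, 5, 6, 9, 9, 9]

# Running product of 1..5, i.e. 1!, 2!, 3!, 4!, 5!
running_factorials = list(accumulate(range(1, 6), operator.mul))
print(running_factorials)  # [1, 2, 6, 24, 120]

# reduce keeps only the final value and never builds the intermediate list.
final_max = reduce(max, data)
print(final_max)           # 9
```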

3. Cycle

This function takes an iterable and creates an infinite cycle from it. This can be useful, for example, in a game where players take turns. cycle is also handy for anything else that has to repeat indefinitely.
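Here is a small sketch of the turn-taking case. The player names are invented, and islice is used only to take a finite number of turns from the otherwise endless iterator:

```python
from itertools import cycle, islice

players = ["John", "Ben", "Martin", "Peter"]  # hypothetical player names

# cycle never stops on its own, so never call list() on it directly.
turns = cycle(players)

# Take the first six turns.
first_six = list(islice(turns, 6))
print(first_six)
# ['John', 'Ben', 'Martin', 'Peter', 'John', 'Ben']

# The iterator keeps its position: the next turn is Martin's.
print(next(turns))  # Martin
```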

4. Tee

The last function from the itertools module is tee, which creates multiple iterators from one, allowing us to remember what came before. An example of that is the pairwise function from the itertools recipes (and also more_itertools), which returns pairs of values from the input iterable (the previous and the current value).

This function is useful whenever you need multiple separate pointers into the same stream of data. Be careful when using it, though, as it can be quite expensive in terms of memory. Also note that you should not use the original iterable after calling tee on it, because any items consumed through it will be silently missed by the new tee objects.
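The pairwise recipe looks roughly like this. On Python 3.10 and later the same behaviour is available directly as itertools.pairwise, so this version is mainly useful to show how tee works:

```python
from itertools import tee

def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), (s2, s3), ..."""
    a, b = tee(iterable)  # two independent iterators over the same data
    next(b, None)         # advance the second one by a single step
    return zip(a, b)      # pair each value with its successor

pairs = list(pairwise([1, 2, 5, 7]))
print(pairs)  # [(1, 2), (2, 5), (5, 7)]
```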

5. Divide

First up from more_itertools is divide. As the name suggests, it divides an iterable into a given number of sub-iterables. The sub-iterables may not all have the same length, since that depends on the number of elements being divided and the number of sub-iterables.
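A short sketch with seven made-up items split three ways. The except branch is not part of more_itertools; it is a simplified stand-in, written for this example, so the snippet still runs where the package is not installed:

```python
try:
    from more_itertools import divide
except ImportError:
    # Simplified stand-in (an assumption, not the library's code):
    # the first len % n parts each receive one extra element.
    def divide(n, iterable):
        seq = list(iterable)
        size, extra = divmod(len(seq), n)
        parts, start = [], 0
        for i in range(n):
            stop = start + size + (1 if i < extra else 0)
            parts.append(iter(seq[start:stop]))
            start = stop
        return parts

data = ["first", "second", "third", "fourth", "fifth", "sixth", "seventh"]

# divide returns a list of iterators, so materialise each one to inspect it.
parts = [list(part) for part in divide(3, data)]
print(parts)
# [['first', 'second', 'third'], ['fourth', 'fifth'], ['sixth', 'seventh']]
```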

6. Partition

With this function, we also split an iterable, but this time using a predicate.

For example, we can split a list of dates into recent and old ones using a simple lambda. We can also partition files based on their extensions, again using a lambda that splits each filename into name and extension and checks whether the extension is in a list of allowed extensions.
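Both cases can be sketched as below. The dates, cutoff, filenames, and allowed extensions are invented sample values. Note that partition returns the items for which the predicate is false first and the items for which it is true second. The except branch is a stand-in based on the itertools recipe, used only when more_itertools is not installed:

```python
import os
from datetime import datetime

try:
    from more_itertools import partition
except ImportError:
    # Stand-in following the partition recipe in the itertools docs.
    from itertools import filterfalse, tee

    def partition(pred, iterable):
        t1, t2 = tee(iterable)
        return filterfalse(pred, t1), filter(pred, t2)

# Hypothetical sample dates and cutoff.
dates = [
    datetime(2015, 1, 15), datetime(2020, 1, 16), datetime(2020, 1, 17),
    datetime(2019, 2, 1), datetime(2020, 2, 2), datetime(2018, 2, 4),
]
cutoff = datetime(2020, 1, 1)

# False items (old) come first, True items (recent) second.
old, recent = partition(lambda d: d >= cutoff, dates)
old, recent = list(old), list(recent)

# Hypothetical filenames and allowed extensions.
files = ["foo.jpg", "bar.exe", "baz.gif", "text.txt", "data.bin"]
ALLOWED_EXTENSIONS = {".jpg", ".txt", ".gif"}

rejected, allowed = partition(
    lambda name: os.path.splitext(name)[1] in ALLOWED_EXTENSIONS, files
)
rejected, allowed = list(rejected), list(allowed)
print(allowed)   # ['foo.jpg', 'baz.gif', 'text.txt']
print(rejected)  # ['bar.exe', 'data.bin']
```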

7. Consecutive_groups

If you need to find runs of consecutive numbers, dates, letters, booleans, or any other orderable objects, consecutive_groups can help.

Suppose we have a list of dates, some of which are consecutive. To be able to pass these dates to consecutive_groups, we must first convert them into ordinal numbers. Then, using a list comprehension, we iterate over the groups of consecutive dates created by consecutive_groups and convert them back to datetime.datetime using the map and fromordinal functions.
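The steps described above might look like this. The dates are sample values, and the except branch is a simplified stand-in (an assumption written for this example) so the snippet also runs without more_itertools installed:

```python
import datetime

try:
    from more_itertools import consecutive_groups
except ImportError:
    from itertools import groupby

    # Simplified stand-in: consecutive values share the same
    # (index - value) difference, so groupby can split on it.
    def consecutive_groups(iterable, ordering=lambda x: x):
        keyed = groupby(
            enumerate(iterable), key=lambda pair: pair[0] - ordering(pair[1])
        )
        for _, group in keyed:
            yield (item for _, item in group)

# Hypothetical sample dates; some of them are consecutive days.
dates = [
    datetime.datetime(2020, 1, 15),
    datetime.datetime(2020, 1, 16),
    datetime.datetime(2020, 1, 17),
    datetime.datetime(2020, 2, 1),
    datetime.datetime(2020, 2, 2),
    datetime.datetime(2020, 2, 4),
]

# Convert to ordinals so that "consecutive" means "differs by one".
ordinals = [d.toordinal() for d in dates]

# Consume each group before moving on; the groups share one underlying iterator.
groups = [
    list(map(datetime.datetime.fromordinal, group))
    for group in consecutive_groups(ordinals)
]
print([len(group) for group in groups])  # [3, 2, 1]
```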

8. Side_effect

Let’s say you need to cause a side effect while iterating over a list of items. This side effect could be, for example, logging, writing to a file, or counting the number of events that occurred.
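A minimal sketch of the counting case. The event names and the record helper are hypothetical, and the except branch is a simplified stand-in used only when more_itertools is not installed:

```python
from collections import Counter

try:
    from more_itertools import side_effect
except ImportError:
    # Simplified stand-in: call func on each item, then pass it through.
    def side_effect(func, iterable):
        for item in iterable:
            func(item)
            yield item

event_counts = Counter()

def record(event):
    """Hypothetical helper: count how often each event occurs."""
    event_counts[event] += 1

events = ["navigate", "click", "keypress", "click", "scroll"]  # sample data

# side_effect is lazy: record only runs as the items are consumed.
processed = list(side_effect(record, events))

total_events = sum(event_counts.values())
print(total_events)           # 5
print(event_counts["click"])  # 2
```

The items themselves pass through unchanged, so side_effect can be dropped into an existing pipeline without affecting its output.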


Source : Viblo