What is an iterator in Python?

Iterators are Python objects for traversing a sequence of elements.

Understanding Python Iterators: A Beginner’s Guide

1. Introduction

In Python, iterators and iterables form the backbone of iteration. If you’ve ever used a for loop, you’ve benefited from these powerful concepts. This blog will unravel the mystery behind them, setting the stage for understanding generators, which we’ll explore in the next blog.

2. What is a Sequence?

A sequence is a container that holds items in order (for example list, str, and tuple) and supports indexing and slicing with integer indices. Examples:

  • nums = [1, 2, 3]
  • message = "Hello, World!"
  • coordinate = (5, 10)
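
To make "indexing and slicing" concrete, here is a minimal sketch that reuses the three sequences above. The variable names on the left (first_num, last_char, and so on) are only illustrative.

```python
nums = [1, 2, 3]
message = "Hello, World!"
coordinate = (5, 10)

# Indexing uses a single integer; negative indices count from the end.
first_num = nums[0]
last_char = message[-1]

# Slicing returns a new sequence of the same type.
greeting = message[:5]
x_only = coordinate[:1]

print(first_num, last_char, greeting, x_only)
```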

3. What is an Iterator?

An iterator is an object representing a stream of data, yielding one element at a time.
The iterator objects themselves are required to support the following two methods, which together form the iterator protocol:

  • __iter__(): Return the iterator object itself. This is required to allow both containers and iterators to be used with the for and in statements.
  • __next__(): Return the next item from the container (one at a time). If there are no further items, raise the StopIteration exception.
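
The two methods above can be observed on a built-in iterator. The sketch below is only an illustration: it uses a string iterator and the built-in iter() and next() functions, and the variable names are made up for the example.

```python
letters = iter("ab")  # a str is iterable; iter() hands back an iterator

# __iter__() on an iterator returns the iterator itself.
same_object = iter(letters) is letters

# __next__() (called through next()) returns one item at a time.
first = next(letters)
second = next(letters)

# Once the items run out, StopIteration is raised.
try:
    next(letters)
    exhausted = False
except StopIteration:
    exhausted = True

print(same_object, first, second, exhausted)
```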

4. What is an Iterable?

An iterable is an object that produces an iterator when passed to the built-in iter() function. Examples of iterables include sequences like list, string, and tuple.

An iterable object must implement the __iter__() method, which returns an iterator object (one that supports the iterator protocol described above).

Note that an iterable does not hand out its elements by itself. A for loop (or an explicit call to iter()) first obtains an iterator from it, and that iterator is what actually produces the elements one at a time.

5. What is the Difference Between Iterators and Iterables?

Every iterator is an iterable, but not every iterable is an iterator.

For an object to be an iterator, it is required to support the iterator protocol.

For example, a Python list is an iterable because it supports the __iter__() method: applying the built-in iter() function to a list returns an iterator, which can then be iterated over. The list itself does not support the __next__() method, so it is not an iterator.

NOTE: iter() and next() are built-in functions that can be applied to objects which implement the __iter__() and __next__() methods respectively.

  • Iterators must implement both __iter__() and __next__().
  • Iterables implement only __iter__(), which returns an iterator.
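
One way to check the distinction is with the abstract base classes in the standard library's collections.abc module, which test for exactly these methods. This is a small sketch rather than the only way to tell the two apart; the checks dictionary is just for display.

```python
from collections.abc import Iterable, Iterator

nums = [1, 2, 3]
nums_it = iter(nums)

checks = {
    # A list has __iter__() but no __next__().
    "list is Iterable": isinstance(nums, Iterable),
    "list is Iterator": isinstance(nums, Iterator),
    # A list iterator has both __iter__() and __next__().
    "list_iterator is Iterable": isinstance(nums_it, Iterable),
    "list_iterator is Iterator": isinstance(nums_it, Iterator),
}

for label, result in checks.items():
    print(f"{label}: {result}")
```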

6. Examples:

nums = [1, 2, 3]          # nums is an iterable
nums_it = iter(nums)      # nums_it is an iterator 

print(type(nums))     # Outputs: <class 'list'>
print(type(nums_it))  # Outputs: <class 'list_iterator'>

nums is a sequence (list) and, because it supports the __iter__() method, it is an iterable. Calling iter(nums) returns an iterator. Note that nums itself is not an iterator: calling next() on the list will throw an error.

>>> print(next(nums))  # nums.__next__() fails too: a list has no __next__() method

Traceback (most recent call last):
  File "<input>", line 1, in <module>
TypeError: 'list' object is not an iterator

But calling next() on nums_it will return an element from the iterator:

>>> next(nums_it)  # or nums_it.__next__()
1
>>> next(nums_it)
2
>>> next(nums_it)
3
>>> next(nums_it)  # No further item left in the container
Traceback (most recent call last):
  File "<input>", line 1, in <module>
StopIteration
>>>

7. How Iterators Work in Python?

When you use a for loop on a container:

for num in nums:
    print(num)

behind the scenes, Python:

  1. calls iter(nums) to get an iterator.
  2. calls next() repeatedly on the iterator returned in step 1 to fetch elements until StopIteration is raised.
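
The two steps above can be written out by hand as a while loop. This is a rough equivalent for illustration, not what the interpreter literally executes; the seen list is only there to record what was visited.

```python
nums = [1, 2, 3]
seen = []

it = iter(nums)          # step 1: get an iterator from the iterable
while True:
    try:
        num = next(it)   # step 2: fetch the next element
    except StopIteration:
        break            # the iterator is exhausted, so the loop ends
    seen.append(num)
    print(num)
```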

8. Iterator Exhaustion

What happens when we do for num in nums_it: ?

Well, it is a bit different from for num in nums: (or for num in iter(nums):) in the sense that nums_it is exhausted once its last element has been fetched by next(nums_it). So, starting from a fresh nums_it = iter(nums), if you run the loop below twice, it prints nothing the second time:

for num in nums_it:
    print(num)      # Outputs: 1 2 3

for num in nums_it:
    print(num)      # Outputs nothing, iterator is exhausted

Iterators are exhausted after one pass, unlike iterables, which can be re-iterated by creating a new iterator.
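
The contrast can be seen side by side in the short sketch below, which uses list() to drain each iterator. The variable names are illustrative.

```python
nums = [1, 2, 3]
nums_it = iter(nums)

first_pass = list(nums_it)     # consumes every element of the iterator
second_pass = list(nums_it)    # the same iterator has nothing left to give

# The iterable itself is untouched: a new iterator starts from the beginning.
fresh_pass = list(iter(nums))

print(first_pass, second_pass, fresh_pass)
```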

9. Custom Iterator and Iterable Class

Classes can exhibit iterator behaviour by implementing both the __iter__() and __next__() methods of the iterator protocol.

In the example below, ForwardCounter and BackwardCounter implement the iterator protocol, making them both iterators and iterables.
On the other hand, Counter is an iterable class which only supports the __iter__() method. This method returns an iterator object of either class ForwardCounter or BackwardCounter.

Forward Counter:

class ForwardCounter:
    """
    A forward counter class that counts from 1 up to a specified limit.

    This class implements the iterator protocol, making its objects both 
    iterators and iterables. You can use it directly in a `for` loop or 
    with `iter()` and `next()`.

    Attributes:
        limit: The maximum number up to which the counter will iterate.
        current: The current value of the counter during iteration.
    """

    def __init__(self, limit):
        """
        Initializes the ForwardCounter with an upper limit.

        Args:
            limit: The maximum number up to which the counter will iterate.
        """
        if limit < 1:
            raise ValueError("Limit must be a positive integer.")
        self.limit = limit
        self.current = 0

    def __iter__(self):
        """
        Returns the iterator object itself.

        Returns:
            ForwardCounter: The iterator instance.
        """
        return self

    def __next__(self):
        """
        Returns the next number in the counter sequence.

        Returns:
            int: The next number in the sequence.

        Raises:
            StopIteration: When the counter reaches its upper limit.
        """
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current

# Example run:
fw_counter = ForwardCounter(5)

print(next(fw_counter))  # Outputs: 1

for num in fw_counter:
    print(num)  # Outputs: 2 3 4 5

Backward Counter:

class BackwardCounter:
    """
    A backward counter class that counts down from a specified number to 1.

    This class implements the iterator protocol, making its objects both 
    iterators and iterables. It can be used directly in a `for` loop or 
    with `iter` and `next`.

    Attributes:
        current: The current value of the counter during iteration.
        limit: The minimum number to stop at.
    """

    def __init__(self, start):
        """
        Initializes the BackwardCounter with the starting number.

        Args:
            start: The number to start counting down from.
        """
        if start < 1:
            raise ValueError("Start must be a positive integer.")
        self.current = start + 1
        self.limit = 1

    def __iter__(self):
        """
        Returns the iterator object itself.

        Returns:
            BackwardCounter: The iterator instance.
        """
        return self

    def __next__(self):
        """
        Returns the next number in the counter sequence.

        Raises:
            StopIteration: When the counter reaches its lower limit.
        
        Returns:
            int: The next number in the sequence.
        """
        if self.current <= self.limit:
            raise StopIteration
        self.current -= 1
        return self.current


# Example run:
bw_counter = BackwardCounter(5)

print(next(bw_counter))  # Outputs: 5

for num in bw_counter:
    print(num)           # Outputs: 4 3 2 1

Counter:

class Counter:
    """
    A flexible counter class that provides forward or backward counting 
    based on the mode.

    This class implements the iterable protocol by defining the `__iter__` 
    method, which returns an iterator object based on the mode.

    Attributes:
        mode: The mode of counting. 0 for forward, 1 for backward.
        limit: The maximum number for counting.
    """

    def __init__(self, mode=0, limit=10):
        """
        Initializes the Counter with a mode and a limit.

        Args:
            mode: The mode of counting. 0 for forward, 1 for backward.
            limit: The maximum number for counting.
        """
        if mode not in (0, 1):
            raise ValueError("Mode must be 0 (forward) or 1 (backward).")
        if limit < 1:
            raise ValueError("Limit must be a positive integer.")
        self.mode = mode
        self.limit = limit

    def __iter__(self):
        """
        Returns an iterator object based on the mode.

        Returns:
            ForwardCounter or BackwardCounter: An iterator for the specified mode.
        """
        if self.mode == 0:
            return ForwardCounter(self.limit)
        else:
            return BackwardCounter(self.limit)


# Example run:
counter = Counter(1, 5) 

for num in counter:
    print(num)        # Outputs: 5 4 3 2 1


# print(next(counter))        # Will produce TypeError

counter_it = iter(counter)
print(next(counter_it))   # Outputs: 5

10. Why is __iter__() Important for Iterators?

The example

for i in fw_it:
    print(i)

(where fw_it is a ForwardCounter instance, such as fw_counter above) demonstrates why the __iter__() method is essential for an iterator (and why an iterator should also be iterable).

As already explained above, when you use a statement like for i in fw_it:, Python internally calls the iter() function on the fw_it object. Even if fw_it is already an iterator, the for loop still goes through iter(), which calls the object's __iter__() method.

If the __iter__() method is missing, the fw_it object will not be recognized as an iterable, and the for loop will raise a TypeError. This is because for loops require objects to support the iterable protocol, even for iterators.

To see this in action, try removing the __iter__() method from the ForwardCounter class and then running for i in fw_it:. You'll encounter an error.
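
Here is a minimal sketch of that experiment. NextOnly is a hypothetical, stripped-down copy of ForwardCounter written for this illustration; it keeps __next__() but deliberately leaves out __iter__().

```python
class NextOnly:
    """Hypothetical counter that has __next__() but no __iter__()."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        self.current += 1
        return self.current


next_only = NextOnly(3)

# next() only needs __next__(), so this still works.
first = next(next_only)

# A for loop calls iter() first, which fails without __iter__().
try:
    for i in next_only:
        print(i)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(first, error_message)
```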

11. Real-World Application of Iterators

Iterators can be very useful when dealing with large datasets, as they help optimize memory usage by processing one item at a time instead of loading everything into memory at once.

Imagine you have a file with millions of records that you need to read and process. A naive approach might look like this:

with open("huge_file.txt") as f:
    for line in f.readlines():
        do_some_processing(line)

This loads the entire file into memory as a list of lines, which can exhaust your system memory and lead to crashes. A more efficient approach is to iterate over the file object directly:

with open("huge_file.txt") as f:   # The file object f is an iterator
    for line in f:
        do_some_processing(line)

This will read the file line by line without loading the entire file into the system’s memory.
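
To check that a file object really behaves as an iterator, the sketch below writes a tiny temporary file and reads it back with iter() and next(). The file name sample.txt and its three lines are made up for the example; a real dataset would of course be far larger.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical small file standing in for a huge one.
    path = os.path.join(tmp_dir, "sample.txt")
    with open(path, "w") as f:
        f.write("first\nsecond\nthird\n")

    with open(path) as f:
        # A file object is its own iterator.
        is_own_iterator = iter(f) is f

        # Lines are produced one at a time, on demand.
        first_line = next(f)

        # Count the remaining lines without building a list of them.
        line_count = 1 + sum(1 for _ in f)

print(is_own_iterator, repr(first_line), line_count)
```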

12. Summary

  • Iterables: Objects that can be passed to iter() to obtain an iterator.
  • Iterators: Objects that implement __iter__() and __next__() to produce elements.
  • Using iterators can be memory efficient when dealing with large data.

This post is licensed under CC BY 4.0 by the author.