User-defined iterators in Python
Iterable classes are one of the features which make Python code more readable. Simply put, they let you iterate over a container a la:

    for s in ("Spam", "Eggs"):
        print s

Here, s iterates over the tuple, printing the words one by one:
Spam
Eggs
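Under the hood, a for loop asks the object for an iterator and then pulls values from it until StopIteration is raised. A rough sketch of that protocol, spelled out by hand with the built-in iter() and next() functions (the names words, iterator and collected are illustrative only):

```python
words = ("Spam", "Eggs")

# Roughly what `for s in words:` does behind the scenes.
iterator = iter(words)        # asks the tuple for an iterator
collected = []
while True:
    try:
        s = next(iterator)    # fetch the next value from the iterator
    except StopIteration:     # raised once the iterator is exhausted
        break
    collected.append(s)
    print(s)
```

Everything that follows is about what iter() hands back for a user-defined class.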
Now comes the interesting part: How do I make my own classes iterable? The official Python Tutorial gives a working example for how to do it:

    class Reverse:
        "Iterator for looping over a sequence backwards"
        def __init__(self, data):
            self.data = data
            self.index = len(data)

        def __iter__(self):
            return self

        def next(self):
            if self.index == 0:
                raise StopIteration
            self.index = self.index - 1
            return self.data[self.index]

    value = Reverse('spam')
    for char in value:
        print char

Output:
m
a
p
s
The example appeared perfectly fine to a beginner like me. However, since I’m just kinda twisted in the head, I added a new line in the for loop:

    value = Reverse('spam')
    for char in value:
        if char in value:
            print char

Which resulted in the (quite unexpected) output:
That’s it. Nothing. Even though the code should make perfect sense and does work in case of built-in types. For example:

    tup = ("Spam", "Eggs")
    for s in tup:
        if s in tup:
            print s

Which cheerfully gives:
Spam
Eggs
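The built-in case works because a tuple hands out a separate iterator every time one is requested. A small sketch to check this (the variable names are made up for illustration):

```python
tup = ("Spam", "Eggs")

first = iter(tup)
second = iter(tup)

# Each call to iter() on a tuple yields a separate iterator object...
independent = first is not second

# ...so advancing one does not affect the other.
next(first)                    # consumes "Spam" from `first` only
second_value = next(second)    # `second` still starts at "Spam"

# An iterator, by contrast, returns itself when asked for an iterator.
same_object = iter(first) is first
```

So the nested "s in tup" test scans with its own iterator and leaves the outer loop's position alone.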
The culprit in the case of the tutorial’s example for user-defined iterators? After toying around with the code sample a little, here’s what I pinned down:
- On the nested line, where another iterator is required, the Reverse class is supposed to return an instance of an iterator which defines the next() method for returning successive values.
- Since the Reverse class returns only itself in this scenario, the self.index variable is shared among all iterators of Reverse('spam').
- As a result, the nested membership test keeps calling Reverse.next() until it raises StopIteration, which leaves nothing for the outer loop either.
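The shared state can be observed directly. The sketch below repeats the tutorial's class with one assumption of mine: the method is named __next__ (the Python 3 spelling) and aliased to next, so that it should run on either version. The helper names shares_state, found and so on are illustrative.

```python
class Reverse:
    "Iterator for looping over a sequence backwards (as in the tutorial)"
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self              # the instance is its own iterator

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

    next = __next__              # Python 2 spelling of the same method

value = Reverse('spam')

shares_state = iter(value) is value   # no separate iterator object exists

first = next(value)          # 'm'; the index drops from 4 to 3
found = first in value       # scans 'a', 'p', 's' through the same index
index_after = value.index    # the membership test drained the object
leftovers = list(value)      # nothing is left for an outer loop
```

Because Reverse defines no __contains__ method, the in operator falls back to iterating over the object, and that iteration uses up the very index the outer loop depends on.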
Once I understood the underlying problem, some further head-scratching and a can of malted drink resulted in the solutions:
- Return a copy for the iterative functions instead of the instance itself:

    import copy

    class Reverse:
        "Iterator for looping over a sequence backwards"
        def __init__(self, data):
            self.data = data
            self.index = len(data)

        def __iter__(self):
            return copy.copy(self)

        def next(self):
            if self.index == 0:
                raise StopIteration
            self.index = self.index - 1
            return self.data[self.index]

    value = Reverse('spam')
    for char in value:
        if char in value:
            print char

Pro: Less strain on the programmer, only a couple of extra lines of code are needed.
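Con: copy.copy() makes a shallow copy, so the underlying data is shared rather than duplicated, but every iteration still creates a whole new Reverse instance carrying all of its attributes, not just the index.
- Use another class: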

    class Reverse:
        "Iterator for looping over a sequence backwards"
        def __init__(self, data):
            self.data = data

        def __iter__(self):
            return ReverseIter(self)

    class ReverseIter:
        def __init__(self, inst):
            self.inst = inst
            self.index = len(self.inst.data)

        def next(self):
            if self.index == 0:
                raise StopIteration
            self.index = self.index - 1
            return self.inst.data[self.index]

    value = Reverse('spam')
    for char in value:
        if char in value:
            print char

Pro: The container is only referenced, never copied, and each iterator keeps nothing but its own index, so the memory cost per iterator is small.
Con: Not everyone likes defining new classes.
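On the cost of the first solution: copy.copy() performs a shallow copy, which duplicates the instance but not the objects it refers to. A quick check, assuming the copy-based Reverse from the first solution (written with __next__ aliased to next so it should run on Python 2 or 3; the list big is just a stand-in for a large container):

```python
import copy

class Reverse:
    "Iterator for looping over a sequence backwards"
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return copy.copy(self)   # shallow copy: new instance, same data

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

    next = __next__              # Python 2 spelling of the same method

big = list(range(100000))
value = Reverse(big)
it = iter(value)

new_object = it is not value             # a distinct Reverse instance
shared_data = it.data is value.data      # the list itself is not copied
last_item = next(it)                     # advances only the copy
original_index = value.index             # the original is untouched
```

The per-iteration cost is therefore one extra small object in both solutions; a deep copy (copy.deepcopy) would be the expensive variant.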
Both solutions worked equally well and resulted in the same output (the expected one this time):
m
a
p
s
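For those who like neither copying nor a second class, a third option (not from the tutorial, just a sketch of a common alternative) is to write __iter__ as a generator. Each call then produces a fresh generator object whose index lives in a local variable:

```python
class Reverse:
    "Iterable for looping over a sequence backwards"
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        # Every call creates a new generator with its own local index.
        index = len(self.data)
        while index > 0:
            index = index - 1
            yield self.data[index]

value = Reverse('spam')
printed = []
for char in value:
    if char in value:        # uses a second, independent generator
        printed.append(char)
        print(char)
```

The trade-off is that Reverse itself is no longer an iterator here, only an iterable, so calling next() on the instance directly would not work.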
The choice between the two depends solely on the programmer’s preference. As a side note, after equating Python programming with carnal activities in a few of my previous posts, I’m gonna take it to the next step and finally tag this post accordingly.
Tags: Classes, Code, Example, Flag 42, Iterators, Object Oriented Programming, OOP, Open Source, Python, S**, Technology
I was able to use this info to make one of my user-defined classes iterable. Which is to say that I was able to iterate through it with a “for x in foobar” statement.
But, apparently, this approach does not allow you to use the ‘[]’ index selection syntax, as in foobar[2]. Is there some additional method you have to add to your defined class for this, or is that strictly for Python built-in types?
Comment by Ray Wood — December 14, 2010 @ 2:36 am
Ray, you have to define the __getitem__ method to support reading with the [] operator (and __setitem__ if you also want to assign through it). Here’s an example:
Comment by krkhan — December 14, 2010 @ 4:48 am
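For the indexing question above, a minimal sketch of how the [] syntax can be supported. Foobar is a hypothetical class invented for this illustration, not the code from the comment thread:

```python
class Foobar:
    "Hypothetical container, used only for illustration"
    def __init__(self, data):
        self.data = list(data)

    def __getitem__(self, index):
        # Called for reads such as foobar[2]
        return self.data[index]

    def __setitem__(self, index, value):
        # Called for assignments such as foobar[2] = 'A'
        self.data[index] = value

    def __iter__(self):
        # Delegates to the list, which returns a fresh iterator each time
        return iter(self.data)

foobar = Foobar('spam')
third = foobar[2]            # read through __getitem__
foobar[2] = 'A'              # write through __setitem__
after = list(foobar)
```

Defining __getitem__ covers reads; __setitem__ is only needed when items should be assignable.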