Inspirated

 
 

May 14, 2009

User-defined iterators in Python

Filed under: Blog — krkhan @ 5:19 am

Iterable classes are one of the features which make Python code more readable. Simply put, they let you iterate over a container a la:

for s in ("Spam", "Eggs"):
	print s

Here, s iterates over the tuple, printing the words one by one:

Spam
Eggs
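
Under the hood, a for loop asks the container for an iterator and then pulls values from it until the iterator signals that it has run dry. A rough sketch of that protocol, written in Python 3 syntax (where print is a function); the helper name manual_for is made up purely for illustration:

```python
def manual_for(iterable):
    """Collect items the way a for loop would fetch them (illustrative helper)."""
    seen = []
    iterator = iter(iterable)       # what `for` does first: ask for an iterator
    while True:
        try:
            item = next(iterator)   # fetch the next value
        except StopIteration:       # raised by the iterator when it is exhausted
            break
        seen.append(item)
    return seen

words = manual_for(("Spam", "Eggs"))
for s in words:
    print(s)
```

Everything that follows in this post boils down to which object iter() hands back.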

Now comes the interesting part: How do I make my own classes iterable? The official Python Tutorial gives a working example for how to do it:

class Reverse:
	"Iterator for looping over a sequence backwards"
	def __init__(self, data):
		self.data = data
		self.index = len(data)
 
	def __iter__(self):
		return self
 
	def next(self):
		if self.index == 0:
			raise StopIteration
		self.index = self.index - 1
		return self.data[self.index]
 
value = Reverse('spam')
for char in value:
	print char

Output:

m
a
p
s

The example appeared perfectly fine to a beginner like me. However, since I’m just kinda twisted in the head, I added a new line in the for loop:

value = Reverse('spam')
for char in value:
	if char in value:
		print char

Which resulted in the (quite unexpected) output:

 

That’s it. Nothing. Even though the code should make perfect sense and does work in case of built-in types. For example:

1
2
3
4
tup = ("Spam", "Eggs")
for s in tup:
	if s in tup: 
		print s

Which cheerfully gives:

Spam
Eggs

The culprit in the tutorial’s example for user-defined iterators? After toying around with the code sample a little, here’s what I pinned down:

  • On the nested line, where a second iterator is required, the Reverse class is supposed to return an instance of an iterator which defines the next() method for returning successive values.
  • Since Reverse.__iter__() returns only the instance itself, the self.index variable is shared among all iterators of Reverse('spam').
  • As a result, the nested condition uses up the remaining items while searching for char, and Reverse.next() raises StopIteration as soon as the outer loop asks for its next value.
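
The shared state is easy to observe directly. Below is a minimal sketch using a Python 3 port of the tutorial class (next() is spelled __next__ there, and print is a function); the variable names first, second and found are only there for illustration:

```python
class Reverse:
    """Python 3 port of the tutorial's iterator (next() becomes __next__)."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return self              # every caller gets the very same object

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

value = Reverse('spam')
first = iter(value)              # what the outer for loop asks for
second = iter(value)             # what the nested `in` test asks for
print(first is second)           # both names refer to one iterator

char = next(first)               # the outer loop takes 'm'
found = char in value            # the test consumes 'a', 'p', 's' looking for 'm'
print(char, found, value.index)  # index is now 0, so the outer loop stops
```

Because the class defines no __contains__ method, the in test falls back to iterating, which is exactly what drains the shared index.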

Once I understood the underlying problem, some further head-scratching and a can of malted drink resulted in the solutions:

  • Return a copy for the iterative functions instead of the instance itself:
    
    import copy
     
    class Reverse:
    	"Iterator for looping over a sequence backwards"
    	def __init__(self, data):
    		self.data = data
    		self.index = len(data)
     
    	def __iter__(self):
    		return copy.copy(self)
     
    	def next(self):
    		if self.index == 0:
    			raise StopIteration
    		self.index = self.index - 1
    		return self.data[self.index]
     
    value = Reverse('spam')
    for char in value:
    	if char in value:
    		print char

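    Pro: Less strain on the programmer; only a couple of extra lines of code are needed.
    Con: Every new iterator copies all of the instance’s attributes. copy.copy() is shallow, so the data itself is shared rather than duplicated, but a container that needed a deep copy would make this expensive.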

  • Use another class:
    
    class Reverse:
    	"Iterator for looping over a sequence backwards"
    	def __init__(self, data):
    		self.data = data
     
    	def __iter__(self):
    		return ReverseIter(self)
     
    class ReverseIter:
    	def __init__(self, inst):
    		self.inst = inst
    		self.index = len(self.inst.data)
     
    	def next(self):
    		if self.index == 0:
    			raise StopIteration
    		self.index = self.index - 1
    		return self.inst.data[self.index]
     
    value = Reverse('spam')
    for char in value:
    	if char in value:
    		print char

    Pro: The container is never copied; each iterator carries only a reference to it plus its own index, so there is less burden on memory.
    Con: Not everyone likes defining new classes.
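
To see how much state each approach really duplicates, here is a small check. It is only a sketch, in Python 3 syntax (so next() becomes __next__). The class names CopyReverse, SplitReverse and SplitReverseIter are invented here to keep the two variants apart, and an __iter__ method has been added to the helper iterator, which the listing above does not have, so that it can be looped over on its own:

```python
import copy

class CopyReverse:
    """Variant 1: __iter__ hands out a shallow copy of the instance."""
    def __init__(self, data):
        self.data = data
        self.index = len(data)

    def __iter__(self):
        return copy.copy(self)   # new object, same underlying data

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index -= 1
        return self.data[self.index]

class SplitReverse:
    """Variant 2: the container only creates iterators."""
    def __init__(self, data):
        self.data = data

    def __iter__(self):
        return SplitReverseIter(self)

class SplitReverseIter:
    def __init__(self, inst):
        self.inst = inst
        self.index = len(inst.data)

    def __iter__(self):          # added here: lets the iterator be iterated itself
        return self

    def __next__(self):
        if self.index == 0:
            raise StopIteration
        self.index -= 1
        return self.inst.data[self.index]

letters = ['s', 'p', 'a', 'm']
for cls in (CopyReverse, SplitReverse):
    value = cls(letters)
    a, b = iter(value), iter(value)
    next(a)                                   # advancing one iterator...
    print(cls.__name__, a.index, b.index)     # ...leaves the other untouched
    print([c for c in value if c in value])   # nested use works

shared = iter(CopyReverse(letters)).data is letters
print(shared)   # copy.copy is shallow: the list itself is not duplicated
```

In both variants the per-iterator state is just an index; the data is referenced, not cloned.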

Both solutions worked equally well and resulted in the same output (the expected one this time):

m
a
p
s

Which solution to pick depends solely on the programmer’s preference. As a side note, after equating Python programming with carnal activities in a few of my previous posts, I’m gonna take it to the next step and finally tag this post accordingly.


May 3, 2009

“All methods in Python are effectively virtual”

Filed under: Blog — krkhan @ 8:07 pm

Dive Into Python really is one of the best programming books I have ever laid my hands on. Short, concise and to-the-point. The somewhat unorthodox approach of presenting an alien-looking program at the start of each chapter and then gradually building towards making it comprehensible is extraordinarily captivating. With that said, here’s an excerpt from the chapter introducing Python’s object orientation framework:

Guido, the original author of Python, explains method overriding this way: “Derived classes may override methods of their base classes. Because methods have no special privileges when calling other methods of the same object, a method of a base class that calls another method defined in the same base class, may in fact end up calling a method of a derived class that overrides it. (For C++ programmers: all methods in Python are effectively virtual.)” If that doesn’t make sense to you (it confuses the hell out of me), feel free to ignore it. I just thought I’d pass it along.

If you were able to comprehend the full meaning of that paragraph in a single go, you’re probably (a) Guido van Rossum himself (b) Don Knuth (c) Pinocchio.

None of which happens to be my identity, so it took me a few dozen re-reads to grasp the idea. It brought back memories of an interesting question that I used to ask students while I was working as a teaching assistant for the C++ course: “What is a virtual function?” The answer always involved pointers and polymorphism, completely ignoring the impact virtual functions have on inheritance in referential/non-pointer scenarios. (Considering that most C++ books never attempt to portray the difference either, I didn’t blame the students much.) Confused again? Here’s some more food for thought: Python does not even have pointers, so what do these perpetually virtual functions really entail in its universe? Let’s make everything peachy with a nice example.

Consider a Base class in C++ which defines three functions:

  • hello()
  • hello_non_virtual()
  • hello_virtual()

The first function, i.e., hello() calls the latter two (hello_non_virtual() and hello_virtual()). Now, we inherit a Derived class from the Base, and override the functions:

  • hello_non_virtual()
  • hello_virtual()

Note that the hello() function is not defined in the Derived class. Now, what happens when someone calls Derived::hello()? The answer:

[Figure: Mechanism of virtual function invocation]

Since Derived::hello() does not exist, Base::hello() is called instead, which in turn calls hello_non_virtual() and hello_virtual(). For the non-virtual function call, the Base::hello_non_virtual() function is executed. For the virtual function call, the overridden Derived::hello_virtual() is called instead.

Here’s the test code for C++:

#include <iostream>

using namespace std;

class Base {
public:
	void hello()
	{
		cout<<"Hello called from Base"<<endl;
		hello_non_virtual();
		hello_virtual();
	}

	void hello_non_virtual()
	{
		cout<<"Hello called from non-virtual Base function"<<endl;
	}

	virtual void hello_virtual()
	{
		cout<<"Hello called from virtual Base function"<<endl;
	}
};

class Derived : public Base {
public:
	void hello_non_virtual()
	{
		cout<<"Hello called from non-virtual Derived function"<<endl;
	}

	void hello_virtual()
	{
		cout<<"Hello called from virtual Derived function"<<endl;
	}
};

int main()
{
	Derived d;

	d.hello();

	return 0;
}
 
And its output:
 
 
Hello called from Base
Hello called from non-virtual Base function
Hello called from virtual Derived function
 
 
 
Similarly, a Python program to illustrate the statement "all methods in Python are effectively virtual":
 
 
class Base:
	def hello(self):
		print "Hello called from Base"
 
		self.hello_virtual()
 
	def hello_virtual(self):
		print "Hello called from virtual Base function"
 
class Derived(Base):
	def hello_virtual(self):
		print "Hello called from virtual Derived function"
 
d = Derived()
d.hello()

Output:

Hello called from Base
Hello called from virtual Derived function
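
Python has no keyword for switching this behaviour off, because self.method() is always looked up on the actual class of the object. The closest thing to a C++-style non-virtual call is naming the class explicitly. A small sketch of that idea in Python 3 syntax; the method names greet and hello_pinned and the returned strings are made up for this illustration:

```python
class Base:
    def hello(self):
        # Looked up on type(self), so an override in a subclass wins.
        return self.greet()

    def hello_pinned(self):
        # Naming the class bypasses the override, much like a non-virtual call.
        return Base.greet(self)

    def greet(self):
        return "Base"

class Derived(Base):
    def greet(self):
        return "Derived"

d = Derived()
print(d.hello())         # Derived
print(d.hello_pinned())  # Base
```

So the always-virtual behaviour is simply the default lookup rule, and opting out of it has to be spelled out by hand.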

I hope this clears up the always-virtual concept for other Python newcomers as well. As far as my experience with the language itself is concerned, Python is s**; simple as that. A mere two days after picking up my first Python book, I have fallen in love with its elegance, simplicity and overall highly addictive nature.
