I got a bit angry when I noticed that the iterator protocol has been changed for python 3k. I didn’t like it at first, and I still think that it will be the worst compatibility nit of the whole transition, but at least I found a nice positive side of it.
The python iterator protocol used to be simple: an object is iterable if iter(object) returns an iterator. An iterator has an __iter__ method, which returns self, and a next method, which returns the next value or raises StopIteration.
What is the change? In python 3.0 the next method will be renamed to __next__, and there will be a new builtin called next(iterator, default) that calls __next__ and returns its value. When the iterator is exhausted, it returns default, or raises StopIteration if default is missing.
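To make the protocol concrete, here is a minimal sketch of an iterator class. The name Countdown is made up for illustration, and the next = __next__ alias is just one common way to keep the same class usable under the old protocol:

```python
class Countdown(object):
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # New protocol: return the next value or raise StopIteration.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    # Alias so that code written for the old protocol (obj.next()) still works.
    next = __next__


print(list(Countdown(3)))           # the for/list machinery calls __next__
print(next(Countdown(0), "empty"))  # exhausted at once, so the default is returned
```

The second call shows the default argument in action: no exception escapes, the sentinel comes back instead.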
While I was testing python3k and getting errors because of the change, I got more and more angry; it looked quite gratuitous. It happened while I was trying to understand the concept of a generator, exploring a series of variations on syntax as well as old articles. One of the papers, General ways to traverse collections, has a nice section on generators in Icon, Python and Scheme. I was trying to reproduce the first Icon example:
sentence := "Store it in the neighboring harbor"
if (i := find("or", sentence)) > 5 then write(i)
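It is supposed to print the first value of i that matches, i.e. the first position of "or" greater than 5. That is 22 when counting from 0 as python does (Icon counts string positions from 1, so it would report 23). I couldn’t find a good way to do this in python until I noticed the new builtin. Then it was obvious (still a bit less readable, as regular expressions have no dedicated syntax in python):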
>>> import re
>>> sentence = "Store it in the neighboring harbor"
>>> next( (i.start() for i in re.finditer("or",sentence) if i.start()>5) )
22
>>> next( (i.start() for i in re.finditer("or",sentence) if i.start()>32),-1)
-1
We can even use the default value to avoid the exception and return a sentinel or an empty value. With this addition, python gets one idiom I was missing: get me the first value that fulfills a condition. (expr(i) for i in iterable if condition(i)) is a filter. On it:
- all will tell me if all the expr(i) are true (filtered by condition(i))
- any will tell me if at least one expr(i) is true
- list will build a list of the results (set and dict versions too)
- next will return the first one, and evaluate only those needed to find it. If there is none, it either fails with StopIteration or returns the value passed as second argument to next
Borrowing the generator example from Wikipedia, slightly changed to avoid the list there:
>>> def primes():
...     n = 2
...     p = []
...     while True:
...         if not any( n % f == 0 for f in p ):
...             yield n
...             p.append( n )
...         n += 1
...
>>> next(i for i in primes() if i>10000)
10007
will return the first prime greater than 10000. Dealing with infinite generators is a bit tricky, but at least now python will have a nice, readable idiom for just the first.
In Python 2, you can write the last line as
(i for i in primes() if i > 10000).next()
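The four consumers of a filter mentioned earlier (all, any, list, next) can be tried side by side. This is only a sketch reusing the sentence example; the helper name matches is invented here, and it builds a fresh generator on every call because a generator can be consumed only once:

```python
import re

sentence = "Store it in the neighboring harbor"


def matches(minimum):
    """Hypothetical helper: positions of "or" greater than minimum."""
    return (m.start() for m in re.finditer("or", sentence) if m.start() > minimum)


positions = list(matches(5))                    # evaluates every match
all_even = all(p % 2 == 0 for p in matches(5))  # stops at the first odd position, if any
any_late = any(p > 30 for p in matches(5))      # stops at the first position above 30
first = next(matches(5))                        # evaluates only up to the first match
missing = next(matches(32), -1)                 # no match, so the default comes back

print(positions, all_even, any_late, first, missing)
```

Only list has to walk the whole filter; all, any and next stop as soon as the answer is known, which is what makes them usable on infinite generators.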