Tuesday, May 20, 2008

Evolution of a Python programmer

It is interesting to look back at how one evolves as a Python programmer. I ran across a blog which discussed this in general terms here:

http://www.vetta.org/2008/05/scipy-the-embarrassing-way-to-code/

Now I have some actual examples of this evolution, from basically programming a direct copy from FORTRAN, to being truly pythonic. Over time it evolved into true python and here is how it happened:


So to start we have a direct translation of some FORTRAN Code for doing composition flash calculations. Solve for equilibrium composition by root bisection where the main calculation is built into subroutine


def cFlash2(X, K, nC):
EPS = 1.0e-8
TOL = 1.0e-6
MAXITER = 100
La = 0.0
Lb = 1.0
f0 = 0
for i in range(nC):
f0+=(K[i]-1)*X[i]/K[i]
f1 = 0
for i in range(nC):
f1+=(K[i]-1)*X[i]
if (f0>0) or (f1<0):
print "No VLE"
return -1
print f0, f1
for i in range(MAXITER):
L=(La+Lb)/2.
if abs(La-Lb) return L
f = 0.0
for j in range(nC):
kk =(K[j]-1)*X[j]/(L+K[j]*(1-L))
f=f+kk
if (f>EPS):
Lb=L
elif (f<-EPS):
La=L
else:
return L


So we get a bit pythonic. Define a separate function, arrays know their size, tuples for multiple definitions. Return None if we can't find a solution



def f(L, X, K):
fx=0
for j in range(len(X)):
fx+=(K[j]-1)*X[j]/(L+K[j]*(1-L))
return fx

def cFlash(X, K):
EPS, TOL = 1.0e-6, 1.0e-8
MAXITER = 100
La, Lb = 0.0,1.0
f0,f1 = f(0,X,K), f(1,X,K)
if (f0>0) or (f1<0):
print "No VLE"
return None
for i in range(MAXITER):
L=(La+Lb)/2.
if abs(La-Lb) return L
fx = f(L,X,K)
if (fx>EPS):
Lb=L
elif (fx<-EPS):
La=L
else:
return L


Now define the function locally so we don't contaminate the global namespace and can access the function parameters. Use enumerate to access list index. Raise an exception if root can't be found.


def cFlash(X, K):
def f(L):
fx=0
for j,k in enumerate(K):
fx+=(k-1)*X[j]/(L+k*(1-L))
return fx
EPS, TOL = 1.0e-6, 1.0e-8
MAXITER = 100
La, Lb = 0.0,1.0
f0,f1 = f(0), f(1)
if (f0>0) or (f1<0):
raise VLEError
for i in range(MAXITER):
L=(La+Lb)/2.
if abs(La-Lb) return L
fx = f(L)
if (fx>EPS):
Lb=L
elif (fx<-EPS):
La=L
else:
return L


Now we are using a canned Brent solver routine, which is much faster and more robust. Error handling is all in the canned routine


def cFlash(X, K):
def f(L):
fx=0
for j,k in enumerate(K):
fx+=(k-1)*X[j]/(L+k*(1-L))
return fx
return solver.zbrent(f,0,1)


List comprehensions, zip, sum. Reduce FP operations in inner product term from 6 to 4 by rearrangement



def cFlash(X, K):
def f(L):
return sum([x/(k/(k-1)-L) for x,k in zip(X,K) if k!=1.])
return solver.zbrent(f,0,1)



Now going *too* far: functional version using lambda


def cFlash(X, K):
return solver.zbrent(lambda L:sum([x/(k/(k-1)-L) for x,k in zip(X,K) if k!=1.]),0,1)



In fact, why bother with a function, just inline the call! Actually, the last version is reaching the point of being obscure and offers no efficiency gains over the previous version. The lambda is just syntactic sugar (GVR wants to get rid of it, and I probably agree, though it is useful for some callback situations where the function call is trivial and defining the function makes the code messier than just the lambda, and I do use it in Tkinter code a lot)

Final production version: precompute the ratios k/(k-1) reducing the FP count to two


def cFlash1(X, K):
XK1 = [(x, k/(k-1.)) for x,k in zip(X,K) if k!=1]
def f(L):
return sum([x/(k1-L) for x,k1 in XK1])
return solver.zbrent(f,0,1)



Anyway that sequence basically shows the progression that I have made in using python. In doing so, there have been a number of optimizations made that are transparent when you write in the functional form. The code is perfectly adequate for the present purpose, but if I needed additional speed, I would simply rewrite the core function using NumPy for the array manipulations. In that case the overall speed would approach that of compiled c++, combined with the transparency of python.

But agreed, looking back at ones old code can be embarassing!

Tuesday, May 13, 2008

The most useful function we never learned at school

In school math we learned all about sin() and cos() and tan() and log() and a few other trigonometric and special functions. But one of the most useful functions I have ever come across never got taught and perhaps it should.

I have programmed variants of this function in just about every language I have ever programmed in, and I use it all over the place. And it is a function that has applications in real life in just about every field.

The function is just a linear map. If we have something that takes one value at some time and a different value at another time, and the value is changing in the simplest possible way between the two times (ie linearly), what is the value at some intermediate or later time.

In general, if we have a linear function of a variable x with value y1 when x=x1 and value y2 when x= x2, what is the value in general?

A long time ago (ie FORTAN days), I would write a function

y = li(x, x1, y1, x2, y2)

with five parameters and then call this whenever I wanted to map from world to screen coordinates or any other application. Now very often I want to have the same linear map available for many many calls (like in doing computer graphics), so I would end up writing another function

y = li1(x, a, b)

where I could precompute a and b. Or even keeping a and b as global variables, setting them and just being able to call a function with a single parameter.

Now fast forward to Python. I figured out early on I could just do this

def li(x1,y1, x2,y2):
b = (y2-y1)/(x2-x1)
a = y1 - x1*b
def f(x):
return a+b*x
return f


so then I could create a function taking only a single x value and evaluate that to my hearts content:

>>> f = li(0., 3., 1., 2.)
>>> f(4)
-1.0
>>> f(2)
1.0
>>> f(1)
2.0
>>> f(0)
3.0
>>>


or if I wanted a single call

y = li(2., 2., 3., 5.)(4.5)



Apparently I had discovered closures, which turn out to be a good thing in functional languages and computer science. My li() function (LinearInterpolate) sets up a closure, encapsulating the parameters, and allows the actual function to be called while keeping these values (in this case the precomputed intercept and slope, for those who did traditional cartesian geometry at high school)

Anyway, this linear map function turns up all over the place in coding and is such a useful animal that it should be taught in school. Trigonometry has its uses but straight linear mappings are probably much more common in real life! If you bank balance is $335.47 in January and $221.38 in March, and you don't change your spending or earning habits, what will it be in September?


>>> lin(1, 335.47, 3, 221.38) (9)
-120.89
>>>


So it is likely that you will change your spending habits (unless you have arranged an overdraft)