Using decorators with python

Decorators are quite often used in python code. It makes stuff lot more easier. Consider this awesome decorator in django.

@login_required
def some_view(request):
    # do stuff here

login_required is said to be a decorator, and some_view is said to be the decorated function. This decorator ensures that the user is logged in before this view is processed – otherwise they are redirected to another page ( In fact – the page to be redirected can be specified through the decorator itself )

How does this work?

Before seeing how decorators work, there are some stuff that you need to know.

Closures

Python supports the concept of closures. Consider the following.

def outer_function(args):
    def inner_function(more_args):
        # some stuff here
        # notice that both args and more_args 
        # are accessible here

        print args, more_args
        pass
    return inner_function

fun = outer_function(1)
fun(2) # prints 1 2

Everytime I call the function outer_function a new definition of inner_function is created and it is returned. The fun thing is, even though the scope of args is only within the definition of outer_function it is available when inner_function is executed.

That was confusing. Let me give a better, practical example.

Functional programming languages have a concept of partial functions. Lets make a partial function.

def partial(function, *args, **kwargs):
    def ret(*iargs, **ikwargs):
        return function(*(args+iargs), **dict(ikwargs, **kwargs))
    return ret

Now I’m going to create a partial function out of add, which takes two parameters and returns their sum.

def add(x,y):
    return x+y
>>> add(5,6)
11

The partial function I create will be called add5, and will take a single parameter, add that to 5 and return the sum. To create a partial function, we call the function partial as follows

>>> add5 = partial(add, 5) 
# the first parameter to be supplied to add is 5
>>> add5(6)
11
>>> add5(10)
20

Notice how the value we passed in partial gets “stored” somehow? This works because the returned function remembers all its variables in its scope and the enclosing scopes. This can be confirmed by printing the value of locals() within the returned function.

...  
      def ret(*iargs, **ikwargs):
        print locals()
        return function(*(args+iargs), **dict(ikwargs, **kwargs))
    return ret
...
>>> add5 = partial(add, 5)
>>> add5(6)
{'args': (5,), 'iargs': (6,), 'kwargs': {}, 'function': <function add at 0x92e1844>>, 'ikwargs': {}}
11

so clearly, the value passed to partial is accessible to the inner function. keep this in mind, we’re going to use this concept in decorators.

Why decorate?

Decorating a function is often required for modifying the result of function itself, or for separating out stuff that is not related to the action performed by the function.

Let us see examples for both.

A very trivial example to consider will be, suppose you have a series of functions which returns a string. And the string thus returned has to be converted to upper cased. A decorator, uppercase, can be written as follows.

def uppercase(func):
    def ret(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return ret
 

def string_function(string__):
    # do some stuff with string, and return it
    return string__

string_function = uppercase(string_function) # apply the decorator

What the heck? where’s the @thing I’ve seen for decorators?

@uppercase
def string_function(..):
   ...

is equivalent to

def string_function(..):
   ...

string_function = uppercase(string_function)

So you see, @uppercase is just syntactic sugar for calling the function uppercase with the function to be decorated as the argument. Personally, whenever I see a decorator in @decorator pattern, I mentally parse it as func = decorator(func)

Lets see another example. This is a very widely used application of the decorators. We’re going to write a memoize decorator.

Memoize? What the heck is that?

Memoization is a really cool technique which can be applied for pure functions. Pure functions – as you might know, always gives same output for the same input values – so its often useful to store the output for already computed inputs. When we get an input which is already computed, we can simply retrieve the output from the storage, instead of computing it all over again. Think of it as a cache.

Lets consider an example. Fibonacci series is a well known series. If you don’t know what it is, you should check it out, its fun!

A naive recursion implementation to find nth item of Fibonacci series [ with n starting from 0 ]could be written as follows.

 def fib(n):
    if n == 0 or n == 1:
        return n
    else:
        return fib(n-1)+fib(n-2)

Lets run this for bigger numbers.

In [28]: %timeit fib(35)
1 loops, best of 3: 8.56 s per loop

Of course, you may get a different output, but that’s incredibly long for me. Lets think of some ways to reduce the time.

fib(5) can be expressed as fib(3)+fib(4) which in turn can be expressed as fib(2)+fib(1)+fib(3)+fib(2). See how the pattern repeats – we need to calculate fib for lower values again and again.

Given that fib itself is a pure function, why not store the values as we compute? Lets apply memoization in this case.

A memoize decorator

Our strategy to attack this problem is to have a dictionary, with each key having value as function(key). As we compute newer function(key) values, we store an entry key: function(key) in the dictionary.

def memoize(function):
    cache = {}
    def compute(key):
        if key not in cache:
            cache[key] = function(key)
        return cache[key]
    return compute

Now, lets apply this memoize decorator to our fibonacci function

@memoize
def fib(n):
    ...

Time to run the tests again.

In [37]: %timeit fib(35)
1000000 loops, best of 3: 345 ns per loop

Yaaay! look at the improvement, from 8ish seconds, it has come down to 345 nanoseconds. So there you go, that’s the kind of performance boost you can get by using memoization.

That’s it for now.

Advertisements
Standard

Using *args (reloaded)

I had already written a post on using the asterisk operator [ splat operator anyone ? ] and the double asterisk operator. The problem was, my understanding of the concept wasn’t too clear. Here goes another attempt at it.

Asterisk operator ( * ) can be used in two ways.

1. Getting variable number of arguments into a function

Scenario : You are writing a simple function to chain together arbitrary number of lists to form a single list (by the way itertools.chain does exactly this, but returns a iterator instead). Here, you do not know in advance how many parameters are going to be passed.

def chain(*args):
    result = []
    for arg in args:
        result.extend(arg)
    return result


In  : chain([1,2,3], [4,5,6])
Out : [1,2,3,4,5,6]

What happens here is, all the arguments passed are wrapped into a tuple – args in our case. Now all we need to do is take out the arguments one by one from that tuple, using the loop and adding it to our result list.

2. Unpacking argument lists

In this case, if our arguments are wrapped up in a list or a tuple – the asterisk operator can be used to unpack them.

Scenario : You have a nested list and want to find the sum of each position in lists and create a new list. Add 1st element of list 1, list 2 and list 3, Add 2nd element and so on. [[1,2,3],[4,5,6],[7,8,9]] should provide output as [12,15,18]

To tackle this, we use the zip function, it creates tuples of elements from each corresponding position of the list.


In : zip([1,2,3],[4,5,6])
Out : [(1, 4), (2, 5), (3, 6)]

Clearly, zip takes a number of lists, not a nested list. But our input is a nested list. Asterisk operator to the rescue! . We can use the asterisk operator to unpack our nested list.

def inner_sum(nested_list):
    result = []
    for tup in zip(*nested_list):
        result.append(sum(tup))
    return result

In  : inner_sum([[1,2,3],[4,5,6],[7,8,9]])
Out : [12, 15, 18]

Fun fun fun

Scenario: Transposing a matrix ( essentially a nested list ). A very concise solution would be to use zip with asterisk operator, along with a list comprehension.


In  : matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
In  : [ list(x) for x in zip(*matrix) ]
Out : [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

This is similar to how we worked out the inner_sum function, except that we create a new list out of a tuple.

Hopefully that clears up the asterisk operator’s usage.

Standard

Python memory model – I

Coming from a C/C++ background, Python looked deceptively familiar for me – and I made a big mistake here. I assumed that python uses the same memory model as C. This is first of a series of posts to try and summarize in broad terms how python’s memory model works.

Btw, if you are a newbie, you should definitely check ipython out. It’s an awesome interpreter with a lot of useful features, like say tabbing for autocompletion etc. Fun thing to learn python with.

The very first thing is to NOT compare C with Python. They both are completely different languages with different idioms, so its best not to compare them.

Everything in python is an object – yes, everything. Numbers are objects. Strings are objects. Lists are objects. Classes are objects. And objects have an identity. Lets find out more about identity of objects.

In python, the id() function can be used for checking identity of an object.

In [1]: id(5)
Out[1]: 151765128

In [2]: id('string')
Out[2]: 3075578752L

In [3]: id([1,2,3])
Out[3]: 156305932

Objects have values too. The values of the object may or may not change ( more on that later ) but the identity of an object cannot change.

That wasn’t so interesting… I know, I know. Well, my point is, in Python everything is an object and has an identity. Please re-read that last line – because this is important.

Lets move onto variables [ in python, it is referred to as names ]. In python, we’d do something like x = 5. What happens here is x gets mapped to the object which has the value 5.
Or, you could also say, x is bound to object having value 5.

x —-> [ object with value 5 ]

Everything’s good till now? awesome. Lets talk more about values now.

In Python, objects can be of two types with respect to values. Mutable and immutable. If you can modify the object and make it have a new value, then the object is mutable and otherwise immutable (Note: There are some caveats – an immutable container may contain a mutable object, but the content of container itself cannot be modified )

Integers, Floats, strings, tuples etc are immutable in python – that is you cannot modify the value of the object.

Wait, what the heck, then how am I able to do the following?

In [1]: name = 5

In [2]: name = 6

In [3]: name
Out[3]: 6

I hear you ask, didn’t I change the value of ‘name’ now?
Nope, you didn’t.

lets view the operation in terms of names and objects.

name = 5
name ---------> [ Object with value 5 ]
============================================================================================
name = 6
name ----x [ Object with value 5 ]
|
|------------> [ Object with value 6 ]


Okay, I’m not so good with asciiart now. Well, the point is, ‘name’ gets rebound to object with value 6. So what you are doing is not changing the value of object, but constructing a new object with value 6.

Whee! I hope that helped. If you like to read long technical documents then this link from docs.python will be very useful for you. I’ll be back with mutable objects in my next post.

Standard