itemgetter, attrgetter and the key keyword argument

One thing which consistently amazes me is Python’s flexibility and its ability to do things concisely. Case in point: suppose you have a list of tuples like [(1,2), (3,4), (5,0)]

The task is to find the tuple which has the smallest value in the second position.

import sys
list_of_tuples = [(1,2), (3,4), (5,0)]
min_tuple = None
minimum = sys.maxint
for pair in list_of_tuples:
    x,y = pair
    if y < minimum:
        minimum = y  # remember the smallest second element seen so far
        min_tuple = pair
print min_tuple

Sure, this works, but it is verbose and easy to get wrong (forget to update minimum inside the loop and you silently get the last tuple instead). They say that when in Rome, do as the Romans do. Let’s do it the Python way.

def snd(pair):
    x,y = pair
    return y

list_of_tuples = [(1,2), (3,4), (5,0)]
min(list_of_tuples, key=snd)

Nice, this code looks much better. Now the intent is clear. The function snd takes out the second element, and that element is used for comparison.
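For a one-off key function, a lambda does the same job as snd without a separate def. The snippet below is only a small sketch of that alternative; the names smallest and largest are made up for illustration.

```python
list_of_tuples = [(1, 2), (3, 4), (5, 0)]

# The lambda plays the role of snd: it picks out the second element,
# and min/max compare the tuples by that value only.
smallest = min(list_of_tuples, key=lambda pair: pair[1])   # (5, 0)
largest = max(list_of_tuples, key=lambda pair: pair[1])    # (3, 4)
```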

Using Itemgetter

operator.itemgetter is a really cool thing to have. It can be used to get a particular item from a sequence.

The usage of itemgetter is not really obvious, here’s an example.

import operator
list__ = [1,2,3]
print operator.itemgetter(1)(list__) ## prints 2

What’s going on here? itemgetter(position) returns a callable, that takes a sequence, and returns the value at position in sequence. itemgetter could be roughly implemented as

def __itemgetter(position):
    def applier(sequence):
        return sequence.__getitem__(position)
    return applier

Note: of course this works only for a single element, whereas operator.itemgetter can take a series of elements (or even slices) and __getitem__ gets applied for all of them – that part is omitted for simplicity. An equivalent source is provided in documentation – so interested readers can go through that.

So what happens? The outer function takes a position, and the inner function takes the sequence and closes over position. The inner function is then returned, and it is, of course, a callable.
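The multi-item and slice behaviour mentioned in the note can be tried out directly. This is a short sketch using only operator.itemgetter itself; the variable names are arbitrary.

```python
import operator

row = ['a', 'b', 'c', 'd']

# Several positions at once: the callable returns a tuple of the items.
first_and_third = operator.itemgetter(0, 2)
pair = first_and_third(row)                        # ('a', 'c')

# A slice object works too, because it is simply handed to __getitem__.
middle = operator.itemgetter(slice(1, 3))(row)     # ['b', 'c']
```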

Let’s see how this helps with our original problem.

import operator

list_of_tuples = [(1,2), (3,4), (5,0)]
min(list_of_tuples, key=operator.itemgetter(1))

We specify the key function as operator.itemgetter(1), which returns a callable, which then gets applied to every element of list_of_tuples. Pretty nifty eh?

Using Attrgetter

operator.attrgetter works more or less similarly to itemgetter, except that it looks up an attribute instead of an index.
Suppose you have a class which goes like

class Student(object):
    def __init__(self, id, name, marks):
        self.id = id
        self.name = name
        self.marks = marks
    
    def __str__(self):
        return '%s has marks %s' %(self.name, self.marks)

And say we have a list of Student instances, named students.
The objective is to find the student with the maximum marks. We can use the max function here, which luckily supports the key parameter as well.

students = [ Student(0, 'Foo', 30), Student(1, 'Bar', 95), Student(2, 'Baz', 80)]

best_student = max(students, key=operator.attrgetter('marks')) # don't forget the quotes
print best_student

Tell me you’re not impressed by Python. attrgetter gets the value of the given attribute ( note that it looks the attribute up by name, much like the built-in getattr(), so you need to pass a string ) and max can use that value as the key for performing the comparison.
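In the same spirit as the rough itemgetter above, attrgetter for a single attribute name could be sketched with getattr. simple_attrgetter and Point are hypothetical names used only for this illustration; the real operator.attrgetter also accepts several names and dotted names, which this sketch leaves out.

```python
import operator

def simple_attrgetter(name):
    # Hypothetical helper: handles one plain attribute name only.
    def applier(obj):
        return getattr(obj, name)
    return applier

class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(3, 4)

# Both callables look up the attribute called 'y' on the object they are given.
same = simple_attrgetter('y')(p) == operator.attrgetter('y')(p)   # True
```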

Remember, key can be used with max, min and sorted. Make use of it. Along with itemgetter and attrgetter, comparisons can be done very easily.
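As a quick illustration of key with sorted, here is a sketch that reuses the data from this post. The Student class is repeated so the snippet stands on its own; by_second, ranked and names are names chosen for the example.

```python
import operator

list_of_tuples = [(1, 2), (3, 4), (5, 0)]

# Sort the tuples by their second element.
by_second = sorted(list_of_tuples, key=operator.itemgetter(1))
# [(5, 0), (1, 2), (3, 4)]

class Student(object):
    def __init__(self, id, name, marks):
        self.id = id
        self.name = name
        self.marks = marks

students = [Student(0, 'Foo', 30), Student(1, 'Bar', 95), Student(2, 'Baz', 80)]

# Highest marks first.
ranked = sorted(students, key=operator.attrgetter('marks'), reverse=True)
names = [student.name for student in ranked]   # ['Bar', 'Baz', 'Foo']
```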


Using *args (reloaded)

I had already written a post on using the asterisk operator [ splat operator anyone ? ] and the double asterisk operator. The problem was, my understanding of the concept wasn’t too clear. Here goes another attempt at it.

Asterisk operator ( * ) can be used in two ways.

1. Getting variable number of arguments into a function

Scenario : You are writing a simple function to chain together an arbitrary number of lists to form a single list (by the way, itertools.chain does exactly this, but returns an iterator instead). Here, you do not know in advance how many parameters are going to be passed.

def chain(*args):
    result = []
    for arg in args:
        result.extend(arg)
    return result


In  : chain([1,2,3], [4,5,6])
Out : [1,2,3,4,5,6]

What happens here is, all the arguments passed are wrapped into a tuple – args in our case. Now all we need to do is take out the arguments one by one from that tuple, using the loop and adding it to our result list.
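To make the "wrapped into a tuple" part visible, here is a small sketch. collect is a hypothetical helper that just returns whatever args was bound to; chain is the function from above, compared against itertools.chain.

```python
import itertools

def collect(*args):
    # Hypothetical helper: hands back whatever args was bound to.
    return args

packed = collect([1, 2, 3], [4, 5, 6])    # ([1, 2, 3], [4, 5, 6])
nothing = collect()                       # ()

def chain(*args):
    result = []
    for arg in args:
        result.extend(arg)
    return result

eager = chain([1, 2, 3], [4, 5, 6], [7])
# itertools.chain yields the same elements lazily; list() consumes the iterator.
lazy_as_list = list(itertools.chain([1, 2, 3], [4, 5, 6], [7]))
```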

2. Unpacking argument lists

In this case, if our arguments are wrapped up in a list or a tuple – the asterisk operator can be used to unpack them.

Scenario : You have a nested list and want to find the sum of each position in lists and create a new list. Add 1st element of list 1, list 2 and list 3, Add 2nd element and so on. [[1,2,3],[4,5,6],[7,8,9]] should provide output as [12,15,18]

To tackle this, we use the zip function, which creates tuples from the elements at each corresponding position of the lists.


In : zip([1,2,3],[4,5,6])
Out : [(1, 4), (2, 5), (3, 6)]

Clearly, zip takes a number of lists, not a nested list. But our input is a nested list. Asterisk operator to the rescue! We can use the asterisk operator to unpack our nested list, so that each inner list is passed to zip as a separate argument.

def inner_sum(nested_list):
    result = []
    for tup in zip(*nested_list):
        result.append(sum(tup))
    return result

In  : inner_sum([[1,2,3],[4,5,6],[7,8,9]])
Out : [12, 15, 18]
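One way to convince yourself of what the asterisk does here is to spell the call out by hand. The sketch below wraps zip in list() so that it behaves the same whether zip returns a list or an iterator; the variable names are illustrative.

```python
nested_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# zip(*nested_list) is the same call as passing each inner list by hand.
unpacked = list(zip(*nested_list))
spelled_out = list(zip(nested_list[0], nested_list[1], nested_list[2]))

# The loop in inner_sum can also be written as a list comprehension.
column_sums = [sum(column) for column in zip(*nested_list)]   # [12, 15, 18]
```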

Fun fun fun

Scenario: Transposing a matrix ( essentially a nested list ). A very concise solution would be to use zip with asterisk operator, along with a list comprehension.


In  : matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
In  : [ list(x) for x in zip(*matrix) ]
Out : [[1, 4, 7], [2, 5, 8], [3, 6, 9]]

This is similar to how we worked out the inner_sum function, except that we create a new list out of a tuple.
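One caveat worth knowing: zip stops at the shortest input, so this transpose trick silently drops elements if the rows have different lengths. A small sketch, with transpose as a hypothetical helper name:

```python
def transpose(matrix):
    # Hypothetical helper wrapping the one-liner from above.
    return [list(column) for column in zip(*matrix)]

matrix = [[1, 2, 3], [4, 5, 6]]
flipped = transpose(matrix)            # [[1, 4], [2, 5], [3, 6]]
round_trip = transpose(flipped)        # [[1, 2, 3], [4, 5, 6]]

# Rows of unequal length: the third column is dropped because
# the middle row has no third element.
ragged = [[1, 2, 3], [4, 5], [6, 7, 8]]
truncated = transpose(ragged)          # [[1, 4, 6], [2, 5, 7]]
```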

Hopefully that clears up the asterisk operator’s usage.


Python memory model – I

Coming from a C/C++ background, Python looked deceptively familiar to me, and I made a big mistake here: I assumed that Python uses the same memory model as C. This is the first of a series of posts that try to summarize in broad terms how Python’s memory model works.

Btw, if you are a newbie, you should definitely check ipython out. It’s an awesome interpreter with a lot of useful features, like say tabbing for autocompletion etc. Fun thing to learn python with.

The very first thing is to NOT compare C with Python. They are completely different languages with different idioms, so it’s best not to compare them.

Everything in Python is an object. Yes, everything. Numbers are objects. Strings are objects. Lists are objects. Classes are objects. And objects have an identity. Let’s find out more about the identity of objects.

In python, the id() function can be used for checking identity of an object.

In [1]: id(5)
Out[1]: 151765128

In [2]: id('string')
Out[2]: 3075578752L

In [3]: id([1,2,3])
Out[3]: 156305932

Objects have values too. The values of the object may or may not change ( more on that later ) but the identity of an object cannot change.
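A small sketch of the difference between identity and value, using a list because its value can change. The actual numbers returned by id() will differ from run to run, so the snippet only compares them.

```python
numbers = [1, 2, 3]
before = id(numbers)
numbers.append(4)            # the value changes...
after = id(numbers)
same_identity = before == after      # ...but the identity does not: True

# Two distinct objects can hold equal values.
other = [1, 2, 3, 4]
equal_values = numbers == other      # True  (compares values)
same_object = numbers is other       # False (compares identities)
```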

That wasn’t so interesting… I know, I know. Well, my point is, in Python everything is an object and has an identity. Please re-read that last line – because this is important.

Let’s move on to variables [ in Python, they are referred to as names ]. In Python, we’d do something like x = 5. What happens here is that x gets mapped to the object which has the value 5.
Or, you could also say, x is bound to the object having value 5.

x ---> [ object with value 5 ]

Everything’s good till now? Awesome. Let’s talk more about values now.

In Python, objects can be of two types with respect to values. Mutable and immutable. If you can modify the object and make it have a new value, then the object is mutable and otherwise immutable (Note: There are some caveats – an immutable container may contain a mutable object, but the content of container itself cannot be modified )

Integers, floats, strings, tuples etc. are immutable in Python; that is, you cannot modify the value of the object.
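Here is a short sketch of what "cannot modify" looks like in practice, including the container caveat mentioned above. The variable names are made up for the example.

```python
point = (1, 2)
try:
    point[0] = 10              # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

word = 'string'
try:
    word[0] = 'S'              # neither do strings
    string_changed = True
except TypeError:
    string_changed = False

# The caveat: a tuple may hold a mutable object. The tuple still refers
# to the same list, but the list itself can change.
container = ([1, 2], 'x')
container[0].append(3)         # container is now ([1, 2, 3], 'x')
```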

Wait, what the heck, then how am I able to do the following?

In [1]: name = 5

In [2]: name = 6

In [3]: name
Out[3]: 6

I hear you ask, didn’t I change the value of ‘name’ now?
Nope, you didn’t.

Let’s view the operation in terms of names and objects.

name = 5
name ---------> [ Object with value 5 ]
============================================================================================
name = 6
name ----x [ Object with value 5 ]
|
|------------> [ Object with value 6 ]


Okay, I’m not so good with ASCII art. Well, the point is, ‘name’ gets rebound to the object with value 6. So what you are doing is not changing the value of the object with value 5, but binding ‘name’ to a different object, one with value 6.
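The rebinding can be observed with a second name pointing at the same object. This is just a sketch; alias is a name introduced for the illustration.

```python
name = 5
alias = name                   # a second name bound to the same object
shared = alias is name         # True

name = 6                       # rebinds name; the object with value 5 is untouched
alias_value = alias            # still 5
rebound = name is not alias    # True: the two names now refer to different objects
```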

Whee! I hope that helped. If you like to read long technical documents then this link from docs.python will be very useful for you. I’ll be back with mutable objects in my next post.


Enabling media url in development server – Django

To make the {{ MEDIA_URL }} variable work in templates while using the development server (the media context processor exposes it in upper case), you need to do a few things, listed below.

1. Add the following to your urls.py

if settings.DEBUG:
    urlpatterns += patterns('',
        (r'^media/(?P<path>.*)$', 'django.views.static.serve', {'document_root': settings.MEDIA_ROOT, 'show_indexes':True}),
)

Don’t forget to add from django.conf import settings

2. Add to TEMPLATE_CONTEXT_PROCESSORS in settings.py the following
'django.core.context_processors.media',

3. Set MEDIA_ROOT and MEDIA_URL in settings.py.

4. Make sure you are using RequestContext to pass the context to the template, ie

  return render_to_response('template.html', {}, context_instance=RequestContext(request))
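For step 3, the two settings might look something like the fragment below. This is only a sketch: PROJECT_DIR and the directory names are placeholders, not values from any real project. The one thing that should line up is that MEDIA_URL matches the ^media/ prefix used in the URL pattern from step 1.

```python
import os

# Placeholder: replace with the real location of your project.
PROJECT_DIR = '/path/to/project'

# Filesystem directory the development server reads the files from.
MEDIA_ROOT = os.path.join(PROJECT_DIR, 'media')

# URL prefix under which those files are served; matches r'^media/' in urls.py.
MEDIA_URL = '/media/'
```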

That’s it, you should now be able to use {{ MEDIA_URL }} in templates served by the development server. Thanks to folks at stackoverflow for the information.


Using *args and **kwargs in python

Please read the update to this post: Using *args (reloaded)

So we’ve been actively working on our latest project in django – and I came across this weird syntax in views
def view_function(request, *args, **kwargs)

Digging in further, I found that *args stands for argument list, and **kwargs stands for keyword argument list ( well, only the asterisk and double asterisk matter, the names can be anything ). So, why is this useful again?

For example:

def foo(*args):
  for i, arg in enumerate(args):
    print "Argument ", i, " : ", arg

and call it by

foo("a", "b", "c")

Simply put, we now have a way to call a function with an arbitrary number of parameters: foo accepts any number of positional arguments, and inside the function they arrive as the tuple args.

So, that’s about *args. Now what is **kwargs? It is the keyword argument list: any keyword arguments the caller passes that do not match a named parameter are collected into a dictionary, mapping each keyword to its value.

For example:


class Foo(object):
  def __init__(self, value):
    print value

class DerivedFoo(Foo):
  def __init__(self, *args, **kwargs):
    print 'DerivedFoo'
    super(DerivedFoo, self).__init__(*args, **kwargs)

myFoo = DerivedFoo("Calling Foo through DerivedFoo")

The super built-in can be used to call the methods of the superclass, in this case the __init__ of class Foo. The parameters which are passed to the derived class can be forwarded to the base class as shown above. This can be used to extend the behavior of the base class without knowing anything about its signature.
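The class example forwards its arguments without ever looking at them. To see what actually ends up in args and kwargs, a tiny sketch helps; describe and the keyword names used here are hypothetical.

```python
def describe(*args, **kwargs):
    # Hypothetical helper: returns what was collected, untouched.
    return args, kwargs

positional, named = describe(1, 2, colour='red', size=3)
# positional is the tuple (1, 2)
# named is the dictionary {'colour': 'red', 'size': 3}

# The double asterisk also works at the call site: it unpacks a
# dictionary into keyword arguments.
options = {'colour': 'blue', 'size': 7}
no_positional, forwarded = describe(**options)
```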
