dataHans: August 2013

Wednesday 21 August 2013

Should "inplace" be standard to reduce clutter, keystrokes and errors?

Every time I do something like replacing missing values in a pandas dataframe, I have to either write the name of the object twice or explicitly specify "inplace". Now, I know there are good reasons for this, but I wonder whether it might be practical to do it the other way around: make "inplace" the standard when using methods on objects, so that if you want to avoid changing the object, this has to be explicitly specified.

Why? To save keystrokes (important in itself). It also reduces the probability of silly errors, like mistyping the name of the object in one of the two places.

Example: I wanted to drop a row because it had been added to the original dataset by mistake. I wrote:

cost_in_age = cost_in_age.drop(cost_in_age.index[105])

A shorter way, which does not work now, but is equally (or more) intuitive, would be:

cost_in_age.drop(cost_in_age.index[105])

Sometimes it is possible to do the latter, but then one has to specify that it is to be done "inplace". My point is that when we call a method on an object, we should not have to specify this.
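For reference, here is a minimal sketch of the two spellings that work today. It uses a small made-up frame in place of my real cost_in_age data (the column name, the values and the row position are assumptions for illustration only):

```python
import pandas as pd

# Made-up stand-in for the real cost_in_age data (hypothetical values)
cost_in_age = pd.DataFrame({"cost": [10.0, 12.5, 99.0, 14.0]})

# Spelling 1: drop() returns a new frame, which has to be assigned to a name
reassigned = cost_in_age.drop(cost_in_age.index[2])

# Spelling 2: ask pandas to change the object itself; the call then returns None
result = cost_in_age.drop(cost_in_age.index[2], inplace=True)

print(result)             # None
print(len(cost_in_age))   # 3
```

Both spellings end up with the same three rows; the only difference is whether the object name is typed twice or the extra keyword is typed once.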

OK, I know there are probably good reasons why it is like this. Methods and functions return things, and unless we explicitly specify that the returned output should be put in an object, it is simply returned.
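A small sketch of what "simply returned" means in practice, using the missing-values case from the start of the post. The frame and its column are made up for illustration:

```python
import pandas as pd

# Made-up frame with one missing value (hypothetical data)
df = pd.DataFrame({"cost": [10.0, None, 14.0]})

# Calling the method without assignment builds a new frame and throws it away
df.fillna(0)
missing_after_bare_call = int(df["cost"].isna().sum())

# Only when the result is bound to a name do we get to keep it
filled = df.fillna(0)
missing_in_filled = int(filled["cost"].isna().sum())

print(missing_after_bare_call)  # 1, the original is untouched
print(missing_in_filled)        # 0
```

So the short form does run without any error; it just has no lasting effect, which is probably why the mistake is so easy to make.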

But still, it does not look technically impossible to make it the other way. Inplace as standard? Benefits would be a lot of saved typing and reduced errors. Cost?