Archive for the ‘R programming’ Category

Overhead cost of a function call

October 2, 2011

Recently I wanted to apply unit testing to my R programs. The first thing I need to do is chop the code into functions, every few lines, so that I can test each of them.

A question came to mind: what is the overhead cost of a function call? To answer it, I wrote the following:
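The definitions of f, g, cmpf, cmpg and cmpg2 are not shown in this post. The sketch below is only a guess at what they might look like, assuming f and g are trivial addition functions and the cmp* versions come from cmpfun in the compiler package. The function bodies and the optimize setting for cmpg2 are assumptions, not the original code.

```r
library(compiler)  # ships with base R; provides cmpfun()

# Assumed definitions: the originals are not shown in the post.
f <- function(x, y) x + y          # thin wrapper around `+`
g <- function(x, y) {              # same work, with braces and an explicit return()
  return(x + y)
}

cmpf  <- cmpfun(f)                 # byte-compiled version of f
cmpg  <- cmpfun(g)                 # byte-compiled version of g
# Guess: g compiled at the highest optimization level.
cmpg2 <- cmpfun(g, options = list(optimize = 3))
```

On R 3.4.0 and later the JIT byte-compiler is enabled by default, so the gap between the plain and the compiled versions may be smaller than in the table that follows.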




library(rbenchmark)
benchmark(1 + 2, f(1, 2), g(1, 2), cmpf(1, 2), cmpg(1, 2), cmpg2(1, 2),
          replications = 1000000,
          columns = c("test", "replications", "elapsed", "relative"),
          order = "relative")

          test replications elapsed relative
1       1 + 2      1000000    4.00    1.000
4  cmpf(1, 2)      1000000    4.34    1.085
2     f(1, 2)      1000000    4.82    1.205
5  cmpg(1, 2)      1000000    5.44    1.360
3     g(1, 2)      1000000    5.68    1.420

The results suggest two things:

  1. The overhead of a plain function call is about 0.82 seconds per 1,000,000 calls (4.82 for f(1, 2) against 4.00 for 1 + 2), i.e. less than a microsecond per call.
  2. If we compile the function, the overhead drops to about 0.34 seconds per 1,000,000 calls (4.34 for cmpf(1, 2)).

I don’t know whether that is a huge cost, but I believe the benefit of cleaner code with unit tests is worth more than that!

Categories: R programming

Call by reference in R

September 11, 2011

Sometimes it is convenient to use “call by reference” evaluation inside an R function. For example, if you want a function to have multiple return values, you can either return a list of values and split it afterwards, or return the values through the arguments.
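As a point of comparison, the list approach can be sketched as follows. min_max and the variable names are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: returns two values bundled in a named list.
min_max <- function(v) {
  list(min = min(v), max = max(v))
}

res <- min_max(c(3, 8, 1))
lo <- res$min   # split the list afterwards
hi <- res$max
```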

For some reason (I would like to know why, too), R does not support call by reference. The first reason that comes to my mind is safety: if functions could use call by reference, it would be harder to trace and debug the code (you would have to examine the details of each function to find out which one changed the value of your variables). In fact, R already behaves like “call by reference” as long as the value of the argument is not changed; it makes a copy of the argument only when the value is modified. So we can expect no efficiency gain (at least not a significant one) even if we could do call by reference.
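The copy-on-modify behaviour is easy to observe from the caller’s side. A small sketch, with a hypothetical function bump:

```r
# Hypothetical function that modifies its argument locally.
bump <- function(v) {
  v[1] <- 100   # the copy is made here, at the first modification
  v
}

orig <- c(1, 2, 3)
changed <- bump(orig)
# orig is untouched; only `changed` carries the modification
```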

Anyway, it is always good to know how to do a “pseudo call by reference” in R (you can choose whether or not to use it, for whatever reason). The trick is to use the eval.parent function. You can add code that replaces the argument’s value in the parent environment, so that the function looks as if it implements the call-by-reference evaluation strategy. Here are some examples of how to do it:

valX <- 51
set(valX, 10)
>[1] 10
valX <- 51
>[1] 52
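The function definitions are missing from the listing above. A minimal sketch of how set could be written with eval.parent and substitute, together with a hypothetical add-one helper (addOne is an assumed name) built the same way:

```r
# Assumed definition of set(): substitute() swaps x for the caller's
# expression (valX) and value for the expression passed for it, so the
# parent frame ends up evaluating `valX <- 10`.
set <- function(x, value) {
  eval.parent(substitute(x <- value))
}

valX <- 51
set(valX, 10)
valX            # now 10

# Hypothetical add-one helper: the parent frame evaluates `valX <- valX + 1`.
addOne <- function(x) {
  eval.parent(substitute(x <- x + 1))
}

valX <- 51
addOne(valX)
valX            # now 52
```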

Note that you cannot change the value of x inside the function. If you change the value of x, a new object is created, and the substitute function replaces x with that new value instead of the name of the caller’s variable, so this method won’t work. For example:

valX <- 51
>Error in 52 <- 52 : invalid (do_set) left-hand side to assignment
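The function that produced this error is not shown. A sketch that reproduces the same kind of failure, using a hypothetical addOneBroken and tryCatch so that the script keeps running:

```r
# Hypothetical version that modifies x before calling substitute().
addOneBroken <- function(x) {
  x <- x + 1                        # x is now an ordinary local value, 52
  eval.parent(substitute(x <- x))   # the expression becomes `52 <- 52`
}

valX <- 51
# Capture the error message instead of stopping.
msg <- tryCatch(addOneBroken(valX), error = function(e) conditionMessage(e))
msg    # the "invalid ... left-hand side to assignment" message
valX   # unchanged
```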

If you want to change the value of x inside the function, copy x to a new object and work with the new object instead of x. At the end of the function, replace the value of x in the parent environment with the new object.

valX <- 51
>[1] 52
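Again the function itself is missing here. A sketch of the workaround, assuming a hypothetical addOneFixed that keeps its result in a separate local variable xx:

```r
# Hypothetical fixed version: x is never reassigned, so substitute()
# still sees the caller's expression (valX) for it.
addOneFixed <- function(x) {
  xx <- x + 1                        # work on a new object
  eval.parent(substitute(x <- xx))   # the expression becomes `valX <- 52`
}

valX <- 51
addOneFixed(valX)
valX   # now 52
```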

A more formal way to do call by reference is to use the R.oo package.
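The post does not show any R.oo code, so the sketch below does not use its API. It only illustrates the general idea of reference semantics with a plain base R environment, which, as far as I know, is also what R.oo objects are built on. counter and addOneEnv are hypothetical names.

```r
# Not the R.oo API: a plain environment used as a mutable container.
counter <- new.env()
counter$value <- 51

addOneEnv <- function(env) {
  env$value <- env$value + 1   # modifies the caller's object in place
  invisible(env)
}

addOneEnv(counter)
counter$value   # now 52
```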

Another way to implement

Categories: R programming