Why are loops slow in R?

Answer

Loops in R are slow for the same reason any interpreted language is slow: every operation carries around a lot of extra baggage.

Look at R_execClosure in eval.c (the function the interpreter runs whenever you call a user-defined function). It's nearly 100 lines long and performs all sorts of operations: creating an environment for execution, assigning arguments into that environment, and so on.

Think how much less happens when you call a function in C (push the args onto the stack, jump, pop the args).
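
A small sketch of the per-call work described above, visible from R itself. The function f here is made up for illustration; it simply returns the environment that R built for that call.

```r
# Every call to a closure gets a fresh environment with the arguments bound in it.
f <- function(x) environment()

e1 <- f(1)
e2 <- f(2)

identical(e1, e2)       # FALSE: two calls, two separate environments
get("x", envir = e1)    # 1
get("x", envir = e2)    # 2
```

None of that bookkeeping exists in a plain C call, which is part of why the per-call cost is so different.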

So that is why you get timings like these (as joran pointed out in the comments, it's not actually apply that's fast; it's the internal C loop in mean that's fast. apply is just regular old R code):

A = matrix(as.numeric(1:100000))

Using a loop (0.342 seconds):

system.time({
    Sum = 0
    for (i in seq_along(A)) {
        Sum = Sum + A[[i]]
    }
    Sum
})

Using sum (unmeasurably small):

sum(A)

It's a little disconcerting because, asymptotically, the loop is just as good as sum: both touch each element once. There's no algorithmic reason for the loop to be slow; it simply does far more work on each iteration.
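
To check that claim on your own machine, here is a minimal sketch that wraps the loop in a function (loop_sum is a name chosen for illustration), confirms it returns the same value as sum, and times both. The timings will vary with hardware and R version, so none are shown.

```r
A <- matrix(as.numeric(1:100000))

# The same element-by-element loop, wrapped in a function.
loop_sum <- function(v) {
    total <- 0
    for (i in seq_along(v)) {
        total <- total + v[[i]]
    }
    total
}

# Both approaches do one addition per element and agree on the result.
identical(loop_sum(A), sum(A))    # TRUE

# Elapsed seconds for each; only the constant factor differs.
loop_time <- system.time(loop_sum(A))[["elapsed"]]
sum_time  <- system.time(sum(A))[["elapsed"]]
c(loop = loop_time, sum = sum_time)
```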

So consider:

# 0.370 seconds
system.time({
    I = 0
    while (I < 100000) {
        10
        I = I + 1
    }
})

# 0.743 seconds -- double the time just adding parentheses
system.time({
    I = 0
    while (I < 100000) {
        ((((((((((10))))))))))
        I = I + 1
    }
})

(That example was discovered by Radford Neal)

This happens because ( in R is an operator, and R actually has to look up its name every time you use it:

> `(` = function(x) 2
> (3)
[1] 2
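
If you want to try this without breaking parentheses for the rest of your session, the following sketch does the redefinition inside local(), so the override disappears afterwards. It also shows that a parenthesized expression parses to a call to the function named (.

```r
# `(` is an ordinary (primitive) function that R finds by name.
is.primitive(`(`)          # TRUE
`(`(3)                     # same as (3)

# The parsed form of (3) is a call whose first element is the symbol `(`.
e <- quote((3))
as.character(e[[1]])       # "("

# Shadow `(` only inside local(), leaving the global session untouched.
shadowed <- local({
    `(` <- function(x) 2
    (3)
})
shadowed                   # 2
(3)                        # 3 again outside local()
```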

More generally, interpreted operations (in any language) involve more steps. Of course, those steps provide benefits as well: you couldn't do that ( trick in C.
