Performance of grepl() versus a cruder method using substr() and ==

I wanted to test whether the name of a parameter begins with max. First I used a rather crude substr(var, start, stop) and then I had a better idea: using grepl. But was the cruder version really the worse one?
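As a minimal sketch of the two approaches, here they are applied to a hypothetical vector of parameter names (params is made up for illustration and is not from my real code). Both should return the same logical vector:

```r
# Hypothetical parameter names, for illustration only
params <- c("max_152", "min_152", "maximum", "xmax_3")

# Regular expression: "^" anchors the pattern at the start of the string
with_grepl <- grepl("^max", params)

# Cruder version: extract characters 1 to 3 and compare them with "max"
with_substr <- substr(params, 1, 3) == "max"

print(with_grepl)
print(with_substr)

# Check that both methods agree on every name
print(identical(with_grepl, with_substr))
```

Note that "xmax_3" is rejected by both methods because max is not at the start of the name.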

Here is a little test using the function microbenchmark() from the package of the same name.

> library(microbenchmark)
> times <- microbenchmark(grepl("^max", "max_152"), substr("max_152", 1, 3) == "max", times = 1e3)
> times
Unit: nanoseconds
                             expr  min   lq     mean median     uq   max neval
         grepl("^max", "max_152") 3980 4255 4863.691 4539.5 4941.0 30161  1000
 substr("max_152", 1, 3) == "max"  995 1205 1492.998 1340.5 1579.5 16659  1000

The printed summary contains the following information:
expr: the tested expressions
neval: how many times each expression has been evaluated
min, lq, mean, median, uq, max: respectively the minimum, lower quartile, mean, median, upper quartile and maximum time for one evaluation, here in nanoseconds. Note that the max value is not representative: it is far above the upper quartile and most likely reflects a single slow run.
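To make these columns concrete, here is a base R sketch that computes the same kind of statistics from a vector of timings. The timings are made up for illustration (they are not measured), and fivenum() is only my assumption for a reasonable way to get the quartiles; the package may compute them slightly differently.

```r
# Made-up timings in nanoseconds, for illustration only (not measured)
timings <- c(1000, 1200, 1300, 1400, 1600, 16000)

# fivenum() returns minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(timings)

# Arrange the statistics in the same order as the printed summary
stats <- c(min = five[1], lq = five[2], mean = mean(timings),
           median = five[3], uq = five[4], max = five[5],
           neval = length(timings))
print(stats)
```

With these made-up values, the single slow run pulls the mean (3750) well above the median (1350), which is why the median is a safer basis for comparison than the mean or the max.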

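And finally... grepl is not efficient at all! On this test its median time is more than three times that of substr() (4539.5 versus 1340.5 nanoseconds).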
