Articles

Showing posts from December 2016

Reading the ... parameter both inside a function and in an interactive run

The ... parameter is very useful in a function, but it is annoying at the debugging stage because it generates an error when the body of the function is run directly at the console. With this line, you can use ... even in direct use without an error:

p3p <- tryCatch(list(...), error=function(e) list())

Let us look at some examples. If you run this in interactive mode, you will get:

> p3p <- tryCatch(list(...), error=function(e) list())
> p3p
list()

Now let us use it within a function:

> essai <- function(r, ...) {
+   p3p <- tryCatch(list(...), error=function(e) list())
+   return(p3p)
+ }
> essai()
list()
> essai(k=100)
$k
[1] 100

Original idea by Marc Girondot, with improvements by Bert Gunter <bgunter.4567@gmail.com> and William Dunlap <wdunlap@tibco.com>.

Parallel computing on both Windows and Unix computers

In the parallel package, a very useful function is mclapply(); it can be used as a direct replacement for lapply(), but it runs the code in parallel on several cores. However, do not forget to set the number of cores to use, because the default is only 2. You can do:

library(parallel)
options(mc.cores = detectCores())
system.time(out <- mclapply(as.list(1:10), FUN=function(x) {x*2}))

or directly:

system.time(out <- mclapply(as.list(1:10), FUN=function(x) {x*2}, mc.cores = detectCores()))

However, this does not run in parallel on Windows because forking is not available there. On Windows you can use:

options(cl.cores = detectCores())
cl <- makeCluster(getOption("cl.cores", 2))
# If you need another package inside the parallel function, use
# invisible(clusterEvalQ(cl = cl, library(xxxxxx)))
system.time(out <- parLapply(cl=cl, X=as.list(1:10), fun=function(x) {x*2}))
stopCluster(cl)

Do not forget to stop the cluster before opening another one.
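The two approaches can be combined in one small wrapper that picks the right one for the current platform. This is only a sketch: par_lapply() and its cores argument are hypothetical names introduced here, not part of the parallel package, and the default of 2 cores is an arbitrary choice.

```r
library(parallel)

# Hypothetical helper (not part of the parallel package): use forking where
# it is available and a socket cluster on Windows.
par_lapply <- function(X, FUN, ..., cores = 2L) {
  if (.Platform$OS.type == "windows") {
    cl <- makeCluster(cores)
    # Stop the cluster on exit, even if FUN fails part-way through.
    on.exit(stopCluster(cl), add = TRUE)
    parLapply(cl = cl, X = X, fun = FUN, ...)
  } else {
    mclapply(X, FUN, ..., mc.cores = cores)
  }
}

out <- par_lapply(as.list(1:10), function(x) x * 2, cores = 2L)
```

Using on.exit() means the cluster is closed automatically, so there is no risk of forgetting stopCluster() before opening another one.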

Take care with the span parameter when you use the loess() function

The loess() function lets you interpolate data while knowing very little about the data you want to interpolate. By default, the smoothing parameter (span) is set to 0.75. However, this value is not always ideal. Consider these very simple data, exp(0:10): a span of 0.75 clearly gives a poor fit, while a value of 0.5 is much better.
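The comparison can be reproduced with a few lines. This is a minimal sketch that assumes the x values are 0:10, since the post only gives the y values; the plotting choices are arbitrary.

```r
# Assumption: x runs from 0 to 10; the post only specifies y = exp(0:10).
x <- 0:10
y <- exp(x)

fit_075 <- loess(y ~ x, span = 0.75)  # default span
fit_050 <- loess(y ~ x, span = 0.5)

# Draw both smooths over the data on a fine grid.
grid <- data.frame(x = seq(0, 10, by = 0.1))
plot(x, y, pch = 19, las = 1)
lines(grid$x, predict(fit_075, newdata = grid), lty = 2)
lines(grid$x, predict(fit_050, newdata = grid), lty = 1)
legend("topleft", legend = c("span = 0.75", "span = 0.5"), lty = c(2, 1))

# Sum of squared residuals as a crude numerical summary of each fit.
ssr <- c(s075 = sum(residuals(fit_075)^2), s050 = sum(residuals(fit_050)^2))
ssr
```

The residual sums of squares are only a rough guide; it is still worth looking at the plot to check the fit.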

On the distribution of the maximum

As a biologist (being human), it is very tempting to claim that we have found the "biggest" of something. A typical example can be seen here: Fretey, J., A. H. Nibam, and D. Ngnamaloba. 2016. A new clutch size record from an olive ridley sea turtle nest in Cameroon. African Sea Turtle Newsletter 6:15-16. But as statisticians, what can we say about the "maximum"?
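One standard fact bears on this question: for n independent observations with cumulative distribution function F, the maximum satisfies P(max <= x) = F(x)^n, so the expected maximum grows with the number of observations. The sketch below illustrates this with a normal distribution whose mean and standard deviation are purely hypothetical, not estimates for any real species.

```r
set.seed(1)

# Hypothetical clutch-size distribution: illustrative values only.
mu <- 100
sigma <- 20

# Average observed maximum over many simulated samples of n nests.
mean_max <- function(n, reps = 2000) {
  mean(replicate(reps, max(rnorm(n, mean = mu, sd = sigma))))
}

sizes <- c(10, 100, 1000)
expected_max <- sapply(sizes, mean_max)
names(expected_max) <- sizes
expected_max

# Exact probability that the maximum of n values exceeds a threshold x:
# 1 - F(x)^n.
p_exceed <- function(x, n) 1 - pnorm(x, mean = mu, sd = sigma)^n
p_exceed(160, sizes)
```

Under these assumptions, a new "record" may reflect nothing more than a larger number of nests examined.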