Retrieve data from a data.frame
The subset command is a convenient way to retrieve data from a data.frame. For example, let us retrieve the size for a specific stage in this data frame:
> library(embryogrowth)
> s <- TSP.list[[1]]
> s
   stages metric
1       8     NA
2       9     NA
3      10     NA
4      11     NA
5      12     NA
6      13     NA
7      14  0.016
8      15  0.023
9      16  0.035
10     17  0.044
11     18  0.057
12     19  0.072
13     20  0.090
14     21  0.140
15     22  0.240
16     23  0.340
17     24  0.550
18     25  0.750
19     26  1.000
The size at stage 20 is retrieved with subset as follows:
> subset(x = s, subset=stages==20, select="metric", drop = TRUE)
[1] 0.09
An alternative solution for this simple case is:
> s$metric[s$stages == 20]
[1] 0.09
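For readers who do not have embryogrowth installed, both calls can be checked on a hand-built copy of the table. This is only a minimal sketch: the data frame below is retyped from the printout above, not loaded from the package.

```r
# Hand-built copy of the printed table (retyped, not loaded from embryogrowth)
s <- data.frame(
  stages = 8:26,
  metric = c(rep(NA, 6), 0.016, 0.023, 0.035, 0.044, 0.057, 0.072,
             0.090, 0.140, 0.240, 0.340, 0.550, 0.750, 1.000)
)

# The same lookup written both ways
a <- subset(x = s, subset = stages == 20, select = "metric", drop = TRUE)
b <- s$metric[s$stages == 20]

# Both should hold the single value found at stage 20
identical(a, b)
```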
Let us measure the relative speed of both solutions. subset is very useful for complex situations, but simple lookups are faster with a direct comparison.
> system.time(expr = {
+ for (i in 1:100000) {
+ g <- subset(x = s, subset=stages==20, select="metric", drop = TRUE)
+ }
+ }
+ )
   user  system elapsed
  3.124   0.099   3.247
> system.time(expr = {
+ for (i in 1:100000) {
+ g <- s$metric[s$stages == 20]
+ }
+ }
+ )
   user  system elapsed
  1.331   0.045   1.387
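One case where subset pays for its extra cost is a condition on a column that contains NA. The sketch below again uses a hand-built copy of the table (an assumption, retyped from the printout) and compares a filter on metric written three ways. subset drops the rows where the condition evaluates to NA, whereas direct logical indexing returns a row of NA for each of them unless the index is wrapped in which().

```r
# Hand-built copy of the printed table (retyped, not loaded from embryogrowth)
s <- data.frame(
  stages = 8:26,
  metric = c(rep(NA, 6), 0.016, 0.023, 0.035, 0.044, 0.057, 0.072,
             0.090, 0.140, 0.240, 0.340, 0.550, 0.750, 1.000)
)

# subset(): rows where the condition is NA are silently dropped
by_subset <- subset(s, metric < 0.03)

# Direct logical indexing: each NA in the index produces a row filled with NA
by_index <- s[s$metric < 0.03, ]

# which() keeps only the TRUE positions, so the NA rows disappear again
by_which <- s[which(s$metric < 0.03), ]

nrow(by_subset)  # rows for stages 14 and 15 only
nrow(by_index)   # the same two rows plus six all-NA rows
nrow(by_which)
```

With a column free of NA, such as stages in this table, the three forms return the same rows and the direct comparison stays the quicker choice.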