Calculates the importance of each feature (column) in x
for predicting the time ordering of the samples.
Usage
gene_importances(
  x,
  time,
  num_permutations = 0,
  ntree = 10000,
  ntree_perm = ntree / 10,
  mtry = ncol(x) * 0.01,
  num_threads = 1,
  ...
)
Arguments
- x
A numeric matrix or a data frame with M rows (one per sample) and P columns (one per feature).
- time
A numeric vector containing the inferred time points of each sample along a trajectory, as returned by infer_trajectory.
- num_permutations
The number of permutations to test against for calculating the p-values (default: 0).
- ntree
The number of trees to grow (default: 10000).
- ntree_perm
The number of trees to grow for each of the permutations (default: ntree / 10).
- mtry
The number of variables randomly sampled at each split (default: 1% of features).
- num_threads
The number of threads to use (default: 1).
- ...
Extra parameters passed to ranger.
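Two of the defaults above are expressions rather than constants: ntree_perm depends on ntree, and mtry depends on the number of columns in x. R evaluates default arguments lazily, so changing ntree also changes ntree_perm unless it is set explicitly. The sketch below only illustrates how those documented expressions resolve; resolve_defaults is a hypothetical stand-in written for this illustration, not part of the package, and it fits no model.

```r
# Hypothetical stand-in: it only mirrors the documented default expressions
# and does not fit any model.
resolve_defaults <- function(x,
                             ntree = 10000,
                             ntree_perm = ntree / 10,
                             mtry = ncol(x) * 0.01) {
  list(ntree = ntree, ntree_perm = ntree_perm, mtry = mtry)
}

# 300 samples (rows) by 500 features (columns), the same shape as in Examples
x <- matrix(0, nrow = 300, ncol = 500)

# All defaults: ntree_perm is 10000 / 10, mtry is 1% of 500 columns
defaults <- resolve_defaults(x)

# Overriding ntree alone also shrinks ntree_perm, because the default
# expression ntree / 10 is evaluated with the supplied ntree
custom <- resolve_defaults(x, ntree = 1000)

str(defaults)
str(custom)
```

With few features, 1% of the columns can fall below 1 (for example, 50 columns give 0.5), so setting mtry explicitly may be worth considering in that case.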
Examples
dataset <- generate_dataset(num_genes=500, num_samples=300, num_groups=4)
expression <- dataset$expression
group_name <- dataset$sample_info$group_name
space <- reduce_dimensionality(expression, ndim=2)
traj <- infer_trajectory(space)
# set ntree to at least 1000!
gene_importances(expression, traj$time, num_permutations = 0, ntree = 1000)
#> # A tibble: 500 × 3
#> gene importance pvalue
#> <chr> <dbl> <lgl>
#> 1 Gene98 0.171 NA
#> 2 Gene429 0.153 NA
#> 3 Gene335 0.149 NA
#> 4 Gene417 0.146 NA
#> 5 Gene150 0.140 NA
#> 6 Gene319 0.134 NA
#> 7 Gene407 0.129 NA
#> 8 Gene295 0.129 NA
#> 9 Gene66 0.123 NA
#> 10 Gene82 0.122 NA
#> # ℹ 490 more rows
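The pvalue column in the output above is NA because num_permutations = 0. As a rough illustration of what a permutation test against a shuffled time vector looks like, here is a minimal base R sketch. It rests on several assumptions: the importance score is a simple stand-in (absolute Spearman correlation, not a random forest), the toy data are simulated, and the p-value formula is one common empirical estimator. None of this is claimed to be how gene_importances computes its p-values internally.

```r
set.seed(1)

# Toy data: 100 samples, 20 features; only Gene1 follows the time ordering
n <- 100
time <- seq(0, 1, length.out = n)
x <- matrix(rnorm(n * 20), nrow = n, ncol = 20)
x[, 1] <- time + rnorm(n, sd = 0.1)
colnames(x) <- paste0("Gene", seq_len(ncol(x)))

# Stand-in importance score (NOT a random forest):
# absolute Spearman correlation of each column with time
score <- function(x, time) abs(cor(x, time, method = "spearman"))[, 1]

observed <- score(x, time)

# Null distribution: recompute the score after shuffling the time vector.
# The result is a features-by-permutations matrix.
num_permutations <- 200
null_scores <- replicate(num_permutations, score(x, sample(time)))

# Empirical p-value per feature; the +1 terms keep p-values above zero
pvalue <- (rowSums(null_scores >= observed) + 1) / (num_permutations + 1)

result <- data.frame(gene = colnames(x), importance = observed, pvalue = pvalue)
result <- result[order(-result$importance), ]
head(result, 3)
```

The smallest p-value this estimator can return is 1 / (num_permutations + 1), so the number of permutations bounds the resolution of the test.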