Calculate the importance of a feature — gene

Calculates the importance of each feature (column) in x for predicting the time ordering of the samples, using a random forest fitted with ranger.
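
As a rough illustration of the idea, the sketch below regresses a time vector on a small feature matrix with ranger and reads off the variable importances. The toy data, the feature names and the choice of importance = "impurity" are assumptions made for this example only; it is not the function's actual implementation, and it needs the ranger package to be installed.

```r
# Hypothetical toy data: 60 samples, 5 features; only feature "a" tracks time.
set.seed(1)
time <- seq(0, 1, length.out = 60)
x <- matrix(
  rnorm(60 * 5),
  nrow = 60,
  dimnames = list(NULL, c("a", "b", "c", "d", "e"))
)
x[, "a"] <- time + rnorm(60, sd = 0.05)

# Regress time on all features with a random forest.
# Assumption: impurity-based importance is used here for illustration.
fit <- ranger::ranger(
  dependent.variable.name = "time",
  data = data.frame(x, time = time),
  num.trees = 500,
  mtry = 2,
  importance = "impurity",
  num.threads = 1
)

# Named numeric vector of importances, most important feature first.
imp <- sort(fit$variable.importance, decreasing = TRUE)
print(imp)
```

In this toy setting the feature constructed from time should rank first, while the pure-noise features receive much smaller importances.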

Usage

gene_importances(
  x,
  time,
  num_permutations = 0,
  ntree = 10000,
  ntree_perm = ntree/10,
  mtry = ncol(x) * 0.01,
  num_threads = 1,
  ...
)

Arguments

x: A numeric matrix or a data frame with M rows (one per sample) and P columns (one per feature).
time: A numeric vector containing the inferred time points of each sample along a trajectory as returned by infer_trajectory.
num_permutations: The number of permutations to test against when calculating the p-values (default: 0). With 0 permutations, no p-values are calculated and the pvalue column is NA.
ntree: The number of trees to grow (default: 10000).
ntree_perm: The number of trees to grow for each of the permutations (default: ntree / 10).
mtry: The number of variables randomly sampled at each split (default: 1% of features).
num_threads: The number of threads to use (default: 1).
...: Extra parameters passed to ranger.
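
The defaults for ntree_perm and mtry are derived from the other arguments, so it can help to work them out for a concrete input. The sketch below does this for a hypothetical 300 by 500 matrix (the size used in the Examples) and for a smaller, equally hypothetical 50-feature matrix.

```r
# Hypothetical input: 300 samples by 500 features.
x <- matrix(0, nrow = 300, ncol = 500)

ntree <- 10000
ntree_perm <- ntree / 10   # trees grown for each permutation
mtry <- ncol(x) * 0.01     # 1% of the features are tried at each split

# With fewer than 100 features, the default expression drops below 1.
x_small <- matrix(0, nrow = 300, ncol = 50)
mtry_small <- ncol(x_small) * 0.01

print(c(ntree_perm = ntree_perm, mtry = mtry, mtry_small = mtry_small))
```

For inputs with fewer than 100 features it may therefore be safer to pass mtry explicitly rather than rely on the default.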

Value

A data frame with one row per feature and the columns gene, importance and pvalue, giving the importance of each feature for the given timeline.
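
A typical next step is to pick the most important features from the returned data frame. The sketch below uses a stand-in data frame built from the first rows of the output printed in the Examples section; the names gimp and top_genes are hypothetical.

```r
# Stand-in for the returned data frame (values copied from the Examples output).
gimp <- data.frame(
  gene = c("Gene98", "Gene429", "Gene335", "Gene417", "Gene150"),
  importance = c(0.171, 0.153, 0.149, 0.146, 0.140),
  pvalue = NA,
  stringsAsFactors = FALSE
)

# Order by decreasing importance and keep the names of the top 3 features.
gimp <- gimp[order(gimp$importance, decreasing = TRUE), ]
top_genes <- head(gimp$gene, 3)
print(top_genes)
```

The selected names could then be used to subset the columns of x, for example x[, top_genes], assuming the column names of x match the gene column.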

Examples

dataset <- generate_dataset(num_genes = 500, num_samples = 300, num_groups = 4)
expression <- dataset$expression
group_name <- dataset$sample_info$group_name
space <- reduce_dimensionality(expression, ndim = 2)
traj <- infer_trajectory(space)
# set ntree to at least 1000!
gene_importances(expression, traj$time, num_permutations = 0, ntree = 1000)
#> # A tibble: 500 × 3
#>    gene    importance pvalue
#>    <chr>        <dbl> <lgl> 
#>  1 Gene98       0.171 NA    
#>  2 Gene429      0.153 NA    
#>  3 Gene335      0.149 NA    
#>  4 Gene417      0.146 NA    
#>  5 Gene150      0.140 NA    
#>  6 Gene319      0.134 NA    
#>  7 Gene407      0.129 NA    
#>  8 Gene295      0.129 NA    
#>  9 Gene66       0.123 NA    
#> 10 Gene82       0.122 NA    
#> # ℹ 490 more rows
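
In the call above, num_permutations = 0 leaves the pvalue column as NA. Setting num_permutations to a positive number asks for p-values based on permutations. One common way to turn permutation results into an empirical p-value is sketched below; the numbers are hypothetical and the formula is a generic convention, not necessarily the one this function uses.

```r
# Hypothetical numbers: observed importance of one feature, and the
# importances obtained after shuffling the time vector 9 times.
observed <- 0.15
permuted <- c(0.02, 0.05, 0.01, 0.16, 0.03, 0.04, 0.02, 0.06, 0.03)

# Share of permutations whose importance is at least as large as the
# observed one, with a +1 correction so the estimate is never exactly 0.
pvalue <- (1 + sum(permuted >= observed)) / (1 + length(permuted))
print(pvalue)
```

With this convention the smallest attainable p-value is 1 / (1 + number of permutations), so more permutations are needed to resolve smaller p-values.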