Skip to contents

Cluster query cells based on which reference cells they tend to mapped to

Usage

get_metric_clusters(
  vesalius_assay,
  use_cost = "feature",
  cluster_method = "hclust",
  trial = NULL,
  group_identity = NULL,
  ref_cells = NULL,
  query_cells = NULL,
  top_nn = 30,
  h = 0.75,
  k = NULL,
  nn = 30,
  resolution = 1,
  verbose = TRUE,
  ...
)

Arguments

vesalius_assay

vesalius_assay object after mapping a query onto a reference.

use_cost

character vector describing which cost matrices should be used to compare cells

cluster_method

character string - which method should be used for clustering (hclust, louvain, leiden)

trial

character string defining which trial should be used for clustering if any. If NULL, will search for "Cells".

ref_cells

character vector with reference cell barcodes (by default will use all barcodes)

query_cells

character with query cell barcodes (by default will use all barcodes)

top_nn

int - how many cells should be used to define clustering similarity (see details)

h

numeric - normalized height to use as hclust cutoff [0,1]

k

int - number of cluster to obtain from hclust

nn

int - number of nearest neighbors to use when creating graph for community clustering algorithms

resolution

numeric - clustering resolution to be parsed to community clustering algorithms

verbose

logical - print output message

group_identitiy

character vector - which specific substes of trial should be used for clustering By default will use all labels present.

Value

vesalius_assay with clustering results

Details

Once we have mapped cells between sample, we can identify which cells tend to map to the same group of cells. To achieve this, we first create a cost matrix that will serve as a basis to find similar-mapping instances. The cost matrix can be constructed from any cost matrix that was used during the mapping phase. Next, for each query cell we extract the top_nn cells in the reference with lowest cost. Using the ordered index as a character label, we compute a jaccard index between overlapping labels. Query cells with a high jaccard index tend to map to the same reference cells. We then use the reciprocal to define a distance between cells and cluster cells based on this distance. The same approach is used for every clustering method provided. This function will add a new column with the metric clustering results.