past.Utils.preprocess

past.Utils.preprocess(adata, min_cells=3, target_sum=None, is_filter_MT=False, n_tops=None, gene_method='hvg', normalize=True)

Preprocess data for downstream analysis

Parameters:
  • adata – dataset in anndata format to be preprocessed

  • min_cells – minimum number of cells in which a gene must be expressed for the gene to be kept

  • target_sum – total gene expression that each cell is normalized to

  • is_filter_MT – whether or not to filter mitochondrial genes

  • n_tops – number of spatially variable genes (SVGs) or highly variable genes (HVGs) to select; if None, keep all genes

  • gene_method – strategy used to select genes: if ‘hvg’, select HVGs with the ‘seurat_v3’ method from the scanpy package; if ‘gearyc’, select SVGs using Geary’s C statistic; otherwise keep all genes

  • normalize – whether or not to normalize the gene expression matrix
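
As a rough illustration of what min_cells and target_sum describe, the following NumPy sketch filters genes and rescales cells on a toy count matrix. It is not the code used by past.Utils.preprocess: the helper name filter_and_normalize is hypothetical, and the fallback to the median per-cell total when target_sum is None is an assumption borrowed from scanpy's normalize_total.

```python
import numpy as np


def filter_and_normalize(counts, min_cells=3, target_sum=None):
    """Hypothetical helper: counts is a (cells x genes) array of raw counts."""
    counts = np.asarray(counts, dtype=float)

    # Keep genes that are expressed (count > 0) in at least `min_cells` cells.
    keep = (counts > 0).sum(axis=0) >= min_cells
    filtered = counts[:, keep]

    # Total expression per cell after gene filtering.
    totals = filtered.sum(axis=1)

    # Assumption: with target_sum=None, use the median non-zero cell total.
    if target_sum is None:
        target_sum = np.median(totals[totals > 0])

    # Scale each cell so its total equals target_sum; empty cells stay at zero.
    scale = np.divide(target_sum, totals, out=np.zeros_like(totals), where=totals > 0)
    return filtered * scale[:, None], keep


# Toy matrix: 4 cells (rows) x 4 genes (columns).
counts = np.array([
    [1, 0, 4, 0],
    [2, 0, 0, 3],
    [3, 5, 2, 0],
    [4, 0, 6, 0],
])
normalized, keep = filter_and_normalize(counts, min_cells=3, target_sum=10.0)
print(keep)                    # which genes survive the min_cells filter
print(normalized.sum(axis=1))  # per-cell totals after normalization
```

In this toy case only the first and third genes are expressed in three or more cells, so the other two are dropped before each cell is rescaled.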

Returns:

preprocessed dataset in anndata format

Return type:

scanpy.anndata
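
For readers unfamiliar with the ‘gearyc’ option of gene_method, the sketch below computes Geary's C for a single gene and ranks genes by it. It only illustrates the statistic: the function names gearys_c and select_top_genes are hypothetical, the chain-shaped neighbor matrix is made up for the example, and how past builds its spatial weights or ranks genes internally is not stated in this page. The sketch assumes that lower C (stronger positive spatial autocorrelation) marks a gene as more spatially variable.

```python
import numpy as np


def gearys_c(x, w):
    """Geary's C for values x (one per spot) and spatial weight matrix w."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    n = x.size
    # Squared differences between every pair of spots.
    diff_sq = (x[:, None] - x[None, :]) ** 2
    numerator = (n - 1) * (w * diff_sq).sum()
    # Note: a gene with constant expression makes the denominator zero.
    denominator = 2.0 * w.sum() * ((x - x.mean()) ** 2).sum()
    return numerator / denominator


def select_top_genes(expr, w, n_tops):
    """Hypothetical helper: indices of the n_tops genes with the lowest C."""
    scores = np.array([gearys_c(expr[:, j], w) for j in range(expr.shape[1])])
    return np.argsort(scores)[:n_tops], scores


# Four spots on a line; each spot neighbors the adjacent spots (assumed layout).
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

smooth = np.array([1.0, 2.0, 3.0, 4.0])       # gradual spatial gradient
alternating = np.array([1.0, 4.0, 1.0, 4.0])  # neighbors always differ

expr = np.column_stack([smooth, alternating])
top, scores = select_top_genes(expr, w, n_tops=1)
print(scores)  # C below 1 for the smooth gene, above 1 for the alternating one
print(top)
```

Values of C below 1 indicate that neighboring spots have similar expression, values near 1 indicate no spatial structure, and values above 1 indicate that neighbors tend to differ.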