buddi.preprocessing.utils

Module Attributes

CellDf

scRNA-seq data processing utilities

Functions

apply_sample_wise_noise(expr[, ...])

Helper function to apply noise to a gene expression vector.

generate_counts_from_props(prop_df[, ...])

Helper function that generates a count matrix based on a proportion matrix and the total number of cells.

generate_log_normal_counts(cell_order, ...)

Generates a count vector by sampling from a log-normal distribution.

generate_prop_from_counts(count_df)

Helper function that generates a proportion matrix based on a count matrix.

generate_random_similar_props(num_samp, props_df)

Helper function that generates a proportion matrix where each sample's cell-type proportions are correlated with a given base proportion vector.

generate_single_celltype_dominant_props(...)

Helper function to generate a proportion matrix where each row represents a sample in which one cell type dominates while other cell types have a small background presence.

generate_true_counts(in_adata, num_cells, ...)

Helper function that generates a count vector based on the true cell type proportions in an AnnData object.

get_cell_type_sum(cell_adata, num_cells)

Helper function that generates the pseudobulk gene expression for one cell type from the cell-type-specific subset of the AnnData object and the number of cells to sample.

get_true_proportions(in_adata, cell_type_col)

Helper function that calculates the true proportion of cell types in the given AnnData object.

subset_adata_by_cell_type(in_adata, ...[, ...])

Constructs a dictionary mapping each cell type to a subset of the AnnData object containing only cells of that type.
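The summaries above keep returning to a few ideas: converting between counts and proportions, drawing counts from a log-normal distribution, and building a sample in which one cell type dominates. The sketch below illustrates those ideas with the Python standard library only. It is not the buddi implementation and does not call the buddi API: every function name, parameter, and default here (props_from_counts, counts_from_props, log_normal_counts, single_celltype_dominant_props, mu, sigma, background) is hypothetical, and plain dictionaries stand in for the DataFrame and AnnData objects the real functions presumably accept.

```python
import random


def props_from_counts(counts):
    """Normalize one sample's per-cell-type counts so they sum to 1."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("sample has no cells")
    return {cell_type: n / total for cell_type, n in counts.items()}


def counts_from_props(props, num_cells):
    """Scale proportions to integer counts that sum to num_cells."""
    raw = {ct: p * num_cells for ct, p in props.items()}
    counts = {ct: int(v) for ct, v in raw.items()}  # round down first
    # Hand out the cells lost to rounding, largest fractional part first.
    leftover = max(num_cells - sum(counts.values()), 0)
    by_remainder = sorted(raw, key=lambda ct: raw[ct] - counts[ct], reverse=True)
    for ct in by_remainder[:leftover]:
        counts[ct] += 1
    return counts


def log_normal_counts(cell_order, num_cells, rng, mu=5.0, sigma=1.0):
    """Draw one log-normal weight per cell type, then rescale to num_cells.

    mu and sigma are illustrative defaults, not values taken from buddi.
    """
    weights = {ct: rng.lognormvariate(mu, sigma) for ct in cell_order}
    return counts_from_props(props_from_counts(weights), num_cells)


def single_celltype_dominant_props(cell_order, dominant, background=0.01):
    """Give every non-dominant cell type a small background share."""
    if dominant not in cell_order:
        raise ValueError("dominant cell type is not in cell_order")
    props = {ct: background for ct in cell_order}
    props[dominant] = 1.0 - background * (len(cell_order) - 1)
    return props


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same draw
    cell_order = ["T cell", "B cell", "NK cell"]
    counts = log_normal_counts(cell_order, 1000, rng)
    print(counts, props_from_counts(counts))
    print(single_celltype_dominant_props(cell_order, "B cell"))
```

Distributing the rounding remainder by largest fractional part is one way to keep the total equal to the requested number of cells; the library may resolve rounding differently.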

Classes

AnnData([X, obs, var, uns, obsm, varm, ...])

An annotated data matrix.