mclumi.deduplicate.monomer package
Submodules
mclumi.deduplicate.monomer.Adjacency module
- class mclumi.deduplicate.monomer.Adjacency.adjacency
Bases: object
- decompose(cc_sub_dict)
- Parameters
cc_sub_dict
- umi_tools(connected_components, df_umi_uniq_val_cnt, graph_adj)
Examples
umi_tools adjacency wrap
- Parameters
connected_components
df_umi_uniq_val_cnt
graph_adj
- umi_tools_(df_umi_uniq_val_cnt, cc, graph_adj)
umi_tools adjacency
- Parameters
df_umi_uniq_val_cnt – unique UMI counts
cc – connected_components
graph_adj – the adjacency list of a graph
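The adjacency idea behind umi_tools_ can be illustrated with a minimal sketch. It assumes that the counts are available as a plain dict (the real method takes df_umi_uniq_val_cnt, whose exact type is not stated here) and that graph_adj maps each node to its neighbours. The function name adjacency_dedup and the greedy strategy are illustrative assumptions, not the mclumi implementation: the most abundant remaining UMI absorbs its direct neighbours, and this repeats until every node in the component is accounted for.

```python
def adjacency_dedup(cc, umi_counts, graph_adj):
    """Greedy sketch of adjacency-style grouping (hypothetical helper).

    cc:         iterable of UMI nodes in one connected component
    umi_counts: dict mapping node -> read count (assumed format)
    graph_adj:  dict mapping node -> iterable of neighbouring nodes
    """
    # Visit nodes from most to least abundant; ties are broken by name.
    ordered = sorted(cc, key=lambda u: (-umi_counts[u], u))
    remaining = set(cc)
    clusters = []
    for node in ordered:
        if node not in remaining:
            continue  # already absorbed by a more abundant node
        # The representative takes itself plus its unassigned direct neighbours.
        members = {node} | (set(graph_adj.get(node, ())) & remaining)
        clusters.append((node, sorted(members)))
        remaining -= members
    return clusters


counts = {"AAAA": 10, "AAAT": 3, "AATT": 2}
graph = {"AAAA": ["AAAT"], "AAAT": ["AAAA", "AATT"], "AATT": ["AAAT"]}
result = adjacency_dedup(["AAAA", "AAAT", "AATT"], counts, graph)
```

Here AAAA absorbs AAAT, and AATT, which is two edges away from AAAA, becomes its own representative.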
mclumi.deduplicate.monomer.Build module
mclumi.deduplicate.monomer.Cluster module
mclumi.deduplicate.monomer.DedupBasic module
- class mclumi.deduplicate.monomer.DedupBasic.dedupBasic(bam_fpn, ed_thres, method, mode='external', mcl_fold_thres=None, inflat_val=2.0, exp_val=2, iter_num=100, is_sv=True, sv_fpn='./dedup.bam', verbose=False)
Bases: object
- bamids(df_row, by_col)
- correct(umi)
- decompose(list_nd)
- Parameters
list_nd
- diffDedupReadCountPos(df_row, by_col)
- Parameters
df_row – object - a pandas-like df row
by_col – str - a column name in question
- Returns
the total count of deduplicated reads per position
- Return type
int
- diffDedupUniqCountPos(df_row, by_col)
- Parameters
df_row – object - a pandas-like df row
by_col – str - a column name in question
- Returns
the sum of deduplicated unique UMI counts per position
- Return type
int
- edave(df_row, by_col)
- eds_(df_row, by_col)
- evaluate()
- length(df_val)
- Parameters
df_val – list - a python list
- Returns
the length of the list
- Return type
int
- markSingleUMI(df_val)
- umimax(df_row, by_col)
mclumi.deduplicate.monomer.DedupGene module
- class mclumi.deduplicate.monomer.DedupGene.dedupGene(bam_fpn, ed_thres, method, gene_assigned_tag, gene_is_assigned_tag, mode='internal', mcl_fold_thres=None, inflat_val=2.0, exp_val=2, iter_num=100, is_sv=True, sv_fpn='./dedup.bam', verbose=False)
Bases: object
- bamids(df_row, by_col)
- correct(umi)
- decompose(list_nd)
- Parameters
list_nd
- diffDedupReadCountPos(df_row, by_col)
- diffDedupUniqCountPos(df_row, by_col)
- edave(df_row, by_col)
- eds_(df_row, by_col)
- evaluate()
- length(df_val)
- markSingleUMI(df_val)
- umimax(df_row, by_col)
mclumi.deduplicate.monomer.DedupPos module
- class mclumi.deduplicate.monomer.DedupPos.dedupPos(bam_fpn, ed_thres, method, mode='external', pos_tag='PO', mcl_fold_thres=None, inflat_val=2.0, exp_val=2, iter_num=100, is_sv=True, sv_fpn='./dedup.bam', verbose=False)
Bases: object
- bamids(df_row, by_col)
- correct(umi)
- decompose(list_nd)
- Parameters
list_nd
- diffDedupReadCountPos(df_row, by_col)
- diffDedupUniqCountPos(df_row, by_col)
- edave(df_row, by_col)
- eds_(df_row, by_col)
- evaluate()
- length(df_val)
- markSingleUMI(df_val)
- umimax(df_row, by_col)
mclumi.deduplicate.monomer.DedupSC module
- class mclumi.deduplicate.monomer.DedupSC.dedupSC(bam_fpn, ed_thres, method, gene_assigned_tag, gene_is_assigned_tag, mode='internal', mcl_fold_thres=None, inflat_val=2.0, exp_val=2, iter_num=100, is_sv=True, sv_fpn='./dedup.bam', verbose=False)
Bases: object
- bamids(df_row, by_col)
- decompose(list_nd)
- Parameters
list_nd
- diffDedupReadCountPos(df_row, by_col)
- diffDedupUniqCountPos(df_row, by_col)
- edave(df_row, by_col)
- eds_(df_row, by_col)
- evaluate()
- length(df_val)
- markSingleUMI(df_val)
- umimax(df_row, by_col)
mclumi.deduplicate.monomer.Directional module
- class mclumi.deduplicate.monomer.Directional.directional
Bases: object
- decompose(cc_sub_dict)
- Parameters
cc_sub_dict
- dfs(node, node_val_sorted, node_set_remaining, graph_adj)
- Parameters
node
node_val_sorted
node_set_remaining
graph_adj
- dictTo2d(x)
- Parameters
x
- formatApvsDisapv(cc_dict)
Input format for the Directional method in umi-tools.
- Parameters
cc_dict
- formatCCS(cc_dict)
- Parameters
cc_dict
- umi_tools(connected_components, df_umi_uniq_val_cnt, graph_adj)
- Parameters
connected_components
df_umi_uniq_val_cnt
graph_adj
- umi_tools_(df_umi_uniq_val_cnt, cc, graph_adj)
- Parameters
df_umi_uniq_val_cnt
cc
graph_adj
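As a rough illustration of the directional idea, the sketch below follows the rule published for UMI-tools: a node a may absorb a neighbour b only if count(a) >= 2 * count(b) - 1. The function name directional_dedup and the dict-based count format are assumptions made for this example; the actual dfs and umi_tools_ methods may differ in data structures and traversal details.

```python
def directional_dedup(cc, umi_counts, graph_adj):
    """Sketch of directional grouping within one component (hypothetical helper).

    cc:         iterable of UMI nodes in one connected component
    umi_counts: dict mapping node -> read count (assumed format)
    graph_adj:  dict mapping node -> iterable of neighbouring nodes
    """
    # Start from the most abundant nodes; ties are broken by name.
    ordered = sorted(cc, key=lambda u: (-umi_counts[u], u))
    remaining = set(cc)
    clusters = []
    for root in ordered:
        if root not in remaining:
            continue  # already merged into an earlier cluster
        remaining.discard(root)
        members = [root]
        stack = [root]
        # Depth-first traversal along edges that satisfy the count rule.
        while stack:
            parent = stack.pop()
            for child in sorted(graph_adj.get(parent, ())):
                if child not in remaining:
                    continue
                if umi_counts[parent] >= 2 * umi_counts[child] - 1:
                    remaining.discard(child)
                    members.append(child)
                    stack.append(child)
        clusters.append(members)
    return clusters


counts = {"AAAA": 100, "AAAT": 40, "AATT": 30}
graph = {"AAAA": ["AAAT"], "AAAT": ["AAAA", "AATT"], "AATT": ["AAAT"]}
result = directional_dedup(["AAAA", "AAAT", "AATT"], counts, graph)
```

AAAT joins AAAA because 100 >= 2 * 40 - 1, while AATT stays separate because 40 < 2 * 30 - 1.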
mclumi.deduplicate.monomer.MarkovClustering module
- class mclumi.deduplicate.monomer.MarkovClustering.markovClustering(inflat_val, exp_val, iter_num)
Bases: object
- cluster(cc_adj_mat)
- Parameters
cc_adj_mat
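For orientation, the following is a minimal pure-Python sketch of the generic Markov clustering loop that the constructor arguments inflat_val, exp_val and iter_num suggest: expansion (matrix power), inflation (element-wise power followed by column normalisation), repeated up to iter_num times. The helper names, the tol parameter, the added self-loops and the way clusters are read from the rows are all assumptions for this example; cluster(cc_adj_mat) itself may rely on a dedicated library and behave differently.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (all-zero columns stay zero)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adj, inflat_val=2.0, exp_val=2, iter_num=100, tol=1e-9):
    """Generic MCL sketch on a square 0/1 adjacency matrix (list of lists)."""
    n = len(adj)
    # Add self-loops, then make the matrix column-stochastic.
    m = [[float(adj[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _step in range(iter_num):
        prev = m
        # Expansion: raise the matrix to the power exp_val.
        expanded = m
        for _power in range(exp_val - 1):
            expanded = matmul(expanded, m)
        # Inflation: element-wise power, then renormalise the columns.
        m = normalize_columns([[x ** inflat_val for x in row] for row in expanded])
        if all(abs(m[i][j] - prev[i][j]) < tol
               for i in range(n) for j in range(n)):
            break  # converged
    # Each non-empty row lists the members of one cluster; drop duplicates.
    clusters = []
    for row in m:
        members = tuple(j for j, x in enumerate(row) if x > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return clusters


# Two separate triangles: nodes 0-2 and nodes 3-5.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(adj)
```

A higher inflat_val generally yields finer clusters, which is why it is exposed as a tuning parameter.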
- decompose(list_nd)
- Parameters
list_nd
- Returns
{
}
- dfclusters(connected_components, graph_adj)
- Parameters
connected_components – connected components in dict format:
{
'cc0': […], # nodes
'cc1': […],
'cc2': […],
…
'ccn': […],
}
graph_adj – the adjacency list of a graph
- Returns
a pandas DataFrame in which each connected component is decomposed into smaller connected subcomponents.
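To make the connected_components input format concrete, here is a small sketch that derives such a dict from an adjacency list with a breadth-first search. The function name find_connected_components is hypothetical, and mclumi may build this structure in another way; only the 'cc0', 'cc1', … key layout is taken from the parameter description above.

```python
from collections import deque


def find_connected_components(graph_adj):
    """Return {'cc0': [...], 'cc1': [...], ...} for an adjacency-list graph."""
    seen = set()
    components = {}
    for start in graph_adj:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from start.
        queue = deque([start])
        seen.add(start)
        nodes = []
        while queue:
            node = queue.popleft()
            nodes.append(node)
            for neighbour in graph_adj.get(node, ()):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components["cc{}".format(len(components))] = nodes
    return components


graph_adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": ["E"], "E": ["D"]}
ccs = find_connected_components(graph_adj)
```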
- graph_cc_adj(cc, graph_adj)
- Parameters
cc – The first parameter.
graph_adj – The second parameter.
- keyToNode(list_2d, keymap)
- Parameters
list_2d
keymap
- keymap(graph_adj, reverse=False)
- Parameters
graph_adj
reverse
- matrix(graph_adj, key_map)
- Parameters
graph_adj
key_map
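One plausible reading of keymap and matrix is that the first assigns integer indices to graph nodes and the second uses those indices to build an adjacency matrix. The sketch below works under that assumption; the function names build_keymap and adjacency_matrix, the list-of-lists matrix and the meaning given to reverse (index to node instead of node to index) are illustrative guesses rather than documented behaviour.

```python
def build_keymap(graph_adj, reverse=False):
    """Map node -> index, or index -> node when reverse is True (assumed meaning)."""
    keys = list(graph_adj)
    if reverse:
        return {i: k for i, k in enumerate(keys)}
    return {k: i for i, k in enumerate(keys)}


def adjacency_matrix(graph_adj, key_map):
    """Build a square 0/1 matrix from an adjacency list and a node -> index map."""
    n = len(key_map)
    mat = [[0] * n for _ in range(n)]
    for node, neighbours in graph_adj.items():
        for neighbour in neighbours:
            mat[key_map[node]][key_map[neighbour]] = 1
    return mat


graph_adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
key_map = build_keymap(graph_adj)
mat = adjacency_matrix(graph_adj, key_map)
```

The reversed map is what a step such as keyToNode would need to translate cluster indices back into node names.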
- maxval_ed(df_mcl_ccs, df_umi_uniq_val_cnt, umi_uniq_mapped_rev, thres_fold)
- Parameters
df_mcl_ccs
df_umi_uniq_val_cnt
umi_uniq_mapped_rev
thres_fold
- maxval_ed_(mcl_clusters_per_cc, df_umi_uniq_val_cnt, umi_uniq_mapped_rev, thres_fold)
# for k1, v1 in mcl_sub_clust_max_val_weights.items():
#     for k2, v2 in mcl_sub_clust_max_val_weights.items():
#         if k1 != k2:
#             edh = hamming().general(
#                 umi_uniq_mapped_rev[k1],
#                 umi_uniq_mapped_rev[k2],
#             )
#             if edh <= thres_fold:
#                 mcl_sub_clust_max_val_graph[k1].add(k2)
#                 mcl_sub_clust_max_val_graph[k2].add(k1)
#                 approval.append([k1, k2])
#             else:
#                 disapproval.append([k1, k2])
- Parameters
mcl_clusters_per_cc
df_umi_uniq_val_cnt
umi_uniq_mapped_rev
thres_fold
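The commented-out fragment in the maxval_ed_ description compares pairs of nodes by Hamming distance and sorts them into approved and disapproved pairs. A self-contained sketch of that comparison might look as follows. The hamming function is a stand-in written for this example (the fragment calls hamming().general, whose exact behaviour is not documented here), split_pairs is a hypothetical name, and each unordered pair is visited once rather than twice as in the fragment.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def split_pairs(nodes, umi_uniq_mapped_rev, thres_fold):
    """Split node pairs by whether their UMIs are within thres_fold mismatches.

    nodes:               node identifiers to compare
    umi_uniq_mapped_rev: dict mapping node identifier -> UMI sequence
    thres_fold:          maximum Hamming distance for a pair to be approved
    """
    approval, disapproval = [], []
    for k1, k2 in combinations(nodes, 2):
        edh = hamming(umi_uniq_mapped_rev[k1], umi_uniq_mapped_rev[k2])
        if edh <= thres_fold:
            approval.append([k1, k2])
        else:
            disapproval.append([k1, k2])
    return approval, disapproval


umis = {0: "AAAA", 1: "AAAT", 2: "TTTT"}
approval, disapproval = split_pairs([0, 1, 2], umis, thres_fold=1)
```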
- maxval_val(df_mcl_ccs, df_umi_uniq_val_cnt, thres_fold)
- Parameters
df_mcl_ccs
df_umi_uniq_val_cnt
thres_fold
- maxval_val_(mcl_clusters_per_cc, df_umi_uniq_val_cnt, thres_fold)
- Parameters
mcl_clusters_per_cc
df_umi_uniq_val_cnt
thres_fold
- sort_vals(df_umi_uniq_val_cnt, cc)
- Parameters
df_umi_uniq_val_cnt
cc