mgkit.snps.conv_func module

Wrappers to use some of the general functions of the snps package in a simpler way.

mgkit.snps.conv_func.get_full_dataframe(snp_data, taxonomy, min_num=3, index_type=None, filters=None)

New in version 0.1.12.

Changed in version 0.2.2: added filters argument

Returns a DataFrame with the pN/pS of the given SNPs data.

Shortcut for using combine_sample_snps(), using filters from get_default_filters().

Parameters:
  • snp_data (dict) – dictionary sample->GeneSyn of SNPs data
  • taxonomy – Uniprot Taxonomy
  • min_num (int) – minimum number of samples in which a valid pN/pS is found
  • index_type (str, None) – type of index to return
  • filters (iterable) – list of filters to apply, otherwise uses the default filters
Returns:

pandas.DataFrame of pN/pS values. The index type is None (gene-taxon)

Return type:

DataFrame
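The fallback to default filters when filters is None, and the effect of min_num, can be illustrated with a minimal standard-library sketch. Everything in it is a hypothetical stand-in: `default_filters`, `combine` and the `(gene, taxon, pnps)` tuple layout are assumptions made for illustration, not MGKit's implementation or data model.

```python
# Hypothetical stand-ins: none of these names come from MGKit.

def default_filters():
    """Return the filters used when the caller passes filters=None."""
    # Keep only records whose pN/pS value is present.
    return [lambda record: record[2] is not None]


def combine(snp_data, min_num=3, filters=None):
    """Collect per-sample values for each key, keeping well-covered keys.

    snp_data maps sample -> list of (gene, taxon, pnps) tuples.
    """
    if filters is None:
        filters = default_filters()

    table = {}
    for sample, records in snp_data.items():
        for record in records:
            # A record is kept only if every filter accepts it.
            if all(check(record) for check in filters):
                gene, taxon, pnps = record
                table.setdefault((gene, taxon), {})[sample] = pnps

    # Drop rows that have a value in fewer than min_num samples.
    return {key: row for key, row in table.items() if len(row) >= min_num}


snp_data = {
    "s1": [("K1", 10, 0.5), ("K2", 10, 0.1)],
    "s2": [("K1", 10, 0.7), ("K2", 10, None)],
    "s3": [("K1", 10, 0.6)],
}
result = combine(snp_data, min_num=3)
```

In this toy data only the ("K1", 10) row has values in three samples, so it is the only row that survives min_num=3.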

mgkit.snps.conv_func.get_gene_map_dataframe(snp_data, taxonomy, gene_map, min_num=3, index_type='gene', filters=None)

New in version 0.1.11.

Changed in version 0.2.2: added filters argument

Returns a DataFrame with the pN/pS of the given SNPs data, mapping all taxa to the gene map.

Shortcut for using combine_sample_snps(), using filters from get_default_filters() and map_gene_id() as the gene_func parameter.

Parameters:
  • snp_data (dict) – dictionary sample->GeneSyn of SNPs data
  • taxonomy – Uniprot Taxonomy
  • min_num (int) – minimum number of samples in which a valid pN/pS is found
  • gene_map (dict) – dictionary of mapping for the gene_ids in SNPs data
  • index_type (str, None) – type of index to return
  • filters (iterable) – list of filters to apply, otherwise uses the default filters
Returns:

pandas.DataFrame of pN/pS values. The index type is ‘gene’

Return type:

DataFrame
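The idea of binding a gene map to a mapping callable, so that it can be passed as a single-argument gene_func, can be sketched with functools.partial. This is only an illustration: `map_gene` is a hypothetical stand-in rather than MGKit's map_gene_id(), and the layout of `gene_map` (each gene id mapped to a list of target ids) is an assumption.

```python
import functools


# Hypothetical stand-in for a gene-mapping callable; not MGKit's code.
def map_gene(gene_id, gene_map):
    """Return the mapped ids for gene_id, or an empty list if unmapped."""
    return list(gene_map.get(gene_id, []))


# Assumed layout: each gene id maps to a list of target ids.
gene_map = {
    "K00001": ["EC1.1.1.1"],
    "K00002": ["EC1.1.1.2", "EC1.1.1.1"],
}

# Bind the map once so the resulting callable takes only a gene id.
gene_func = functools.partial(map_gene, gene_map=gene_map)

mapped = {gene: gene_func(gene) for gene in ["K00001", "K00002", "K99999"]}
```

In this sketch a gene id missing from the map yields an empty list, so it contributes nothing to the mapped result.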

mgkit.snps.conv_func.get_gene_taxon_dataframe(snp_data, taxonomy, gene_map, min_num=3, rank='genus', index_type=None, filters=None)

New in version 0.1.12.

Changed in version 0.2.2: added filters argument

Todo

edit docstring

Returns a DataFrame with the pN/pS of the given SNPs data, mapping gene ids through the gene map and taxa to the specified rank.

Shortcut for using combine_sample_snps(), using filters from get_default_filters(), map_gene_id() as the gene_func parameter and map_taxon_id_to_rank() as the taxon_func parameter.

Parameters:
  • snp_data (dict) – dictionary sample->GeneSyn of SNPs data
  • taxonomy – Uniprot Taxonomy
  • min_num (int) – minimum number of samples in which a valid pN/pS is found
  • gene_map (dict) – dictionary of mapping for the gene_ids in SNPs data
  • rank (str) – taxon rank to map. Valid ranks are found in mgkit.taxon.TAXON_RANKS
  • index_type (str, None) – type of index to return
  • filters (iterable) – list of filters to apply, otherwise uses the default filters
Returns:

pandas.DataFrame of pN/pS values. The default index type is None (gene-taxon)

Return type:

DataFrame
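A call to this function might look like the sketch below, which builds one table per rank. The helper name `gene_taxon_tables` and its `ranks` argument are hypothetical; the keyword arguments follow the signature documented here, and the inputs (snp_data, taxonomy, gene_map) are assumed to be loaded elsewhere.

```python
def gene_taxon_tables(snp_data, taxonomy, gene_map, ranks=("genus", "order")):
    """Return a dict rank -> DataFrame, one call per requested rank.

    Hypothetical helper, not part of MGKit.
    """
    # Imported lazily so that defining this helper does not require MGKit.
    from mgkit.snps import conv_func

    return {
        rank: conv_func.get_gene_taxon_dataframe(
            snp_data, taxonomy, gene_map, min_num=3, rank=rank
        )
        for rank in ranks
    }
```

Passing filters=None (the default) leaves the choice of filters to the wrapper, as described for the filters parameter.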

mgkit.snps.conv_func.get_rank_dataframe(snp_data, taxonomy, min_num=3, rank='order', index_type='taxon', filters=None)

New in version 0.1.11.

Changed in version 0.2.2: added filters argument

Returns a DataFrame with the pN/pS of the given SNPs data, mapping all taxa to the specified rank. Taxa above that rank are not included.

Shortcut for using combine_sample_snps(), using filters from get_default_filters() and map_taxon_id_to_rank() as the taxon_func parameter, with include_higher set to False.

Parameters:
  • snp_data (dict) – dictionary sample->GeneSyn of SNPs data
  • taxonomy – Uniprot Taxonomy
  • min_num (int) – minimum number of samples in which a valid pN/pS is found
  • rank (str) – taxon rank to map. Valid ranks are found in mgkit.taxon.TAXON_RANKS
  • index_type (str, None) – type of index to return
  • filters (iterable) – list of filters to apply, otherwise uses the default filters
Returns:

pandas.DataFrame of pN/pS values. The index type is ‘taxon’

Return type:

DataFrame
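Mapping taxa to a rank while leaving out higher taxa can be pictured as walking up a taxonomy tree until the requested rank is reached. The sketch below uses a toy taxonomy and a hypothetical `map_to_rank` function; it is an assumption about the general idea, not MGKit's map_taxon_id_to_rank() or its taxonomy class.

```python
# Toy taxonomy: taxon id -> (rank, parent id). Hypothetical data.
TAXA = {
    1: ("phylum", None),
    2: ("order", 1),
    3: ("family", 2),
    4: ("genus", 3),
    5: ("genus", 3),
}


def map_to_rank(taxon_id, rank, include_higher=False):
    """Return the ancestor of taxon_id at rank, or None if there is none."""
    current = taxon_id
    while current is not None:
        taxon_rank, parent = TAXA[current]
        if taxon_rank == rank:
            return current
        current = parent
    # No ancestor at the requested rank: in this toy tree the taxon
    # sits above that rank, so it is kept only if include_higher is set.
    return taxon_id if include_higher else None


mapped = {taxon_id: map_to_rank(taxon_id, "order") for taxon_id in TAXA}
```

Here the family and both genera collapse onto the order with id 2, while the phylum maps to None and would be left out of the result.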