mgkit.mappings.enzyme module

New in version 0.1.14.

EC mappings

mgkit.mappings.enzyme.ENZCLASS_REGEX = '^(\\d)\\. ?([\\d-]+)\\. ?([\\d-]+)\\. ?([\\d-]+) +(.+)\\.': Regular expression used to extract the descriptions of the higher-level enzyme classes from the file enzclass.txt on expasy
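As a quick check of what the pattern captures, the sketch below applies it to two sample lines. The lines are hand-written in the layout that enzclass.txt is assumed to use (class fields separated by dots and optional spaces, then a description ending with a period); they are not read from the actual file.

```python
import re

# Same pattern as ENZCLASS_REGEX, written as a raw string
ENZCLASS_REGEX = r'^(\d)\. ?([\d-]+)\. ?([\d-]+)\. ?([\d-]+) +(.+)\.'

# Illustrative lines in the assumed enzclass.txt layout
lines = [
    '1. -. -.-  Oxidoreductases.',
    '1.14.11.-  With 2-oxoglutarate as one donor.',
]

# Each match yields four class fields followed by the description;
# the final period is consumed by the pattern and not captured
matches = [re.search(ENZCLASS_REGEX, line).groups() for line in lines]
for groups in matches:
    print(groups)
```

The optional space after each dot lets the same pattern handle both single-digit fields padded with a space and two-digit fields written without one.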

mgkit.mappings.enzyme.LEVEL1_NAMES = {1: 'oxidoreductases', 2: 'transferases', 3: 'hydrolases', 4: 'lyases', 5: 'isomerases', 6: 'ligases'}: Top-level classification names

mgkit.mappings.enzyme.change_mapping_level(ec_map, level=3)

New in version 0.1.14.

Given a dictionary whose values are dictionaries, each containing a key named ec whose value is an iterable of EC numbers, returns an iterator that can be used to build a dictionary with the same top-level keys, whose values are sets of the transformed EC numbers.

Parameters:
    ec_map (dict) – dictionary generated by `mgkit.net.uniprot.get_gene_info()`
    level (int) – number from 1 to 4, to specify the level of the mapping, passed to `get_enzyme_level()`
Yields:	tuple – a tuple (gene_id, set(ECs)), which can be passed to dict to make a dictionary

Example

>>> from mgkit.net.uniprot import get_gene_info
>>> from mgkit.mappings.enzyme import change_mapping_level
>>> ec_map = get_gene_info('Q9HFQ1', columns='ec')
>>> ec_map
{'Q9HFQ1': {'ec': '1.1.3.4'}}
>>> dict(change_mapping_level(ec_map, level=2))
{'Q9HFQ1': {'1.1'}}

mgkit.mappings.enzyme.get_enzyme_full_name(ec_id, ec_names, sep=', ')

New in version 0.2.1.

From an EC identifier and a dictionary of names, builds a name (comma-separated by default) that identifies the function of the enzyme.

Parameters:
    ec_id (str) – EC identifier
    ec_names (dict) – a dictionary of names that can be produced using `parse_expasy_file()`
    sep (str) – string used to join the names
Returns:	the enzyme classification name
Return type:	str
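The sketch below illustrates one way such a name could be assembled. It is not the library's implementation: `full_name_sketch` is a hypothetical helper, and both the key format of `ec_names` (truncated EC identifiers such as '1.1') and the rule of joining the names of every level are assumptions.

```python
def full_name_sketch(ec_id, ec_names, sep=', '):
    # Hypothetical helper: look up the name of each level of the EC
    # identifier, from the most general to the most specific, and join them
    fields = ec_id.split('.')
    names = []
    for level in range(1, len(fields) + 1):
        key = '.'.join(fields[:level])
        # Levels without a known name are skipped (assumed behaviour)
        if key in ec_names:
            names.append(ec_names[key])
    return sep.join(names)


# Hand-written names, keyed by truncated EC identifier (assumed format)
ec_names = {
    '1': 'Oxidoreductases',
    '1.1': 'Acting on the CH-OH group of donors',
    '1.1.3': 'With oxygen as acceptor',
}

name = full_name_sketch('1.1.3.4', ec_names)
print(name)
```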

mgkit.mappings.enzyme.get_enzyme_level(ec, level=4)

New in version 0.1.14.

Returns an enzyme class at a specific level, between 1 and 4 (by default the most specific, 4).

Parameters:
    ec (str) – a string representing an EC number (e.g. 1.2.4.10)
    level (int) – from 1 to 4, the level of specificity to return in the enzyme classification
Returns:	the EC number at the requested specificity
Return type:	str

Example

>>> from mgkit.mappings.enzyme import get_enzyme_level
>>> get_enzyme_level('1.1.3.4', 1)
'1'
>>> get_enzyme_level('1.1.3.4', 2)
'1.1'
>>> get_enzyme_level('1.1.3.4', 3)
'1.1.3'
>>> get_enzyme_level('1.1.3.4', 4)
'1.1.3.4'

mgkit.mappings.enzyme.get_mapping_level(ec_map, level=3)

New in version 0.3.0.

Given a dictionary whose values are iterables of EC numbers, returns an iterator that can be used to build a dictionary with the same top-level keys, whose values are sets of the transformed EC numbers.

Parameters:
    ec_map (dict) – dictionary mapping genes to EC numbers
    level (int) – number from 1 to 4, to specify the level of the mapping, passed to `get_enzyme_level()`
Yields:	tuple – a tuple (gene_id, set(ECs)), which can be passed to dict to make a dictionary
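The transformation described above can be sketched without the library. `mapping_level_sketch` is a hypothetical stand-in that truncates each EC number to the requested level, assuming that truncation is all that `get_enzyme_level()` does; the gene identifiers and EC numbers are hand-written sample data.

```python
def mapping_level_sketch(ec_map, level=3):
    # Hypothetical stand-in: yield (gene_id, set of truncated EC numbers)
    for gene_id, ec_ids in ec_map.items():
        yield gene_id, {'.'.join(ec.split('.')[:level]) for ec in ec_ids}


# Sample input in the documented shape: gene -> iterable of EC numbers
ec_map = {
    'gene1': ['1.1.3.4', '1.1.3.6', '2.7.7.6'],
    'gene2': ['3.2.1.4'],
}

# The generator is passed to dict() to build the new mapping
level3 = dict(mapping_level_sketch(ec_map))
level1 = dict(mapping_level_sketch(ec_map, level=1))
print(level3)
print(level1)
```

Because the values are sets, EC numbers that become identical after truncation (here 1.1.3.4 and 1.1.3.6 at level 3) collapse into a single element.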

mgkit.mappings.enzyme.parse_expasy_dat(expasy_dat, keep_empty=False, skip_comments=True, skip_codes=None)

New in version 0.4.2.

Parses the information in the enzyme.dat file from expasy, a flat file containing the information about the enzyme classification.

It can be downloaded at: ftp://ftp.expasy.org/databases/enzyme/enzyme.dat

Parameters:
    expasy_dat (str) – file name or handle to an enzyme.dat file
    keep_empty (bool) – if True, empty sections are kept; they are removed by default
    skip_comments (bool) – if True, comments (lines starting with CC) are not returned
    skip_codes (set, tuple) – set, tuple or list of codes, used to skip specific parts of the file in the same way as skip_comments
Yields:	dict – a dictionary for each entry in the file, where the keys are the codes and the values are the lines included in the file
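To make the roles of the parameters concrete, here is a minimal sketch of a parser for this kind of flat file. It is not the library's code: `parse_section_sketch` and `parse_dat_sketch` are hypothetical names, and the layout (a two-letter code, three spaces, then the content, with entries terminated by a `//` line) and the choice of storing the values as lists of lines are assumptions.

```python
import io


def parse_section_sketch(lines, skip_codes=()):
    # Hypothetical: group the lines of one entry by their two-letter code
    entry = {}
    for line in lines:
        code, _, value = line.partition('   ')
        if code in skip_codes:
            continue
        entry.setdefault(code, []).append(value.strip())
    return entry


def parse_dat_sketch(handle, keep_empty=False, skip_comments=True,
                     skip_codes=None):
    # Hypothetical: yield one dictionary per '//'-terminated entry
    skip = set(skip_codes or ())
    if skip_comments:
        skip.add('CC')
    section = []
    for line in handle:
        line = line.rstrip('\n')
        if not line.strip():
            continue
        if line.startswith('//'):
            entry = parse_section_sketch(section, skip)
            section = []
            # Entries left with no lines are dropped unless keep_empty is set
            if entry or keep_empty:
                yield entry
        else:
            section.append(line)


# Hand-written sample in the assumed layout, not taken from the real file
sample = io.StringIO(
    "CC   Header comment, not an enzyme entry\n"
    "//\n"
    "ID   1.1.1.1\n"
    "DE   alcohol dehydrogenase.\n"
    "CC   -!- A comment line.\n"
    "//\n"
)

entries = list(parse_dat_sketch(sample))
print(entries)
```

In this sketch the first section contains only comments, so with the default arguments it ends up empty and is discarded, which is one plausible reading of how keep_empty and skip_comments interact.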

mgkit.mappings.enzyme.parse_expasy_dat_section(expasy_dat_section, skip_comments=True, skip_codes=None)

New in version 0.4.2.

Parses a single entry of the enzyme.dat file from expasy. Used internally by mgkit.mappings.enzyme.parse_expasy_dat(), which passes the other arguments through.

Returns:	dictionary with the entry, with keys being the codes of the entry and the values the lines
Return type:	dict

mgkit.mappings.enzyme.parse_expasy_file(file_name)

Changed in version 0.4.2: changed to work on python 3.x

Used to load enzyme descriptions from the file enzclass.txt on expasy.

The FTP URL for enzclass.txt is: ftp://ftp.expasy.org/databases/enzyme/enzclass.txt
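As an illustration of how the descriptions might be collected into a dictionary, the sketch below applies the ENZCLASS_REGEX pattern to each line of a file handle. `parse_enzclass_sketch` is a hypothetical function, not the library's implementation; the key format (class fields joined by dots, with `-` placeholders dropped) and the sample lines are assumptions.

```python
import io
import re

# Same pattern as ENZCLASS_REGEX, written as a raw string
ENZCLASS_REGEX = r'^(\d)\. ?([\d-]+)\. ?([\d-]+)\. ?([\d-]+) +(.+)\.'


def parse_enzclass_sketch(handle):
    # Hypothetical: map truncated EC identifiers to their descriptions
    names = {}
    for line in handle:
        match = re.search(ENZCLASS_REGEX, line)
        if match is None:
            # Header and blank lines do not match and are ignored
            continue
        *levels, description = match.groups()
        # Drop the '-' placeholders, so '1. 1. -.-' becomes '1.1'
        ec_id = '.'.join(level for level in levels if level != '-')
        names[ec_id] = description
    return names


# Hand-written sample in the assumed enzclass.txt layout
sample = io.StringIO(
    "A header line that does not match\n"
    "1. -. -.-  Oxidoreductases.\n"
    "1. 1. -.-   Acting on the CH-OH group of donors.\n"
    "1. 1. 3.-    With oxygen as acceptor.\n"
)

names = parse_enzclass_sketch(sample)
print(names)
```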