mgkit.mappings.enzyme module

New in version 0.1.14.

EC mappings

mgkit.mappings.enzyme.ENZCLASS_REGEX = '^(\\d)\\. ?([\\d-]+)\\. ?([\\d-]+)\\. ?([\\d-]+) +(.+)\\.'

Regular expression used to extract the descriptions of the higher-level enzyme classes from the file enzclass.txt on ExPASy.
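As an illustration of what the expression captures, the following is a minimal sketch that applies it to a few lines in the layout used by enzclass.txt. The pattern is copied from the value above, written as a raw string. The helper `parse_line` and the choice of a dotted string as the key are assumptions made for this example, not necessarily what mgkit does internally.

```python
import re

# Same pattern as ENZCLASS_REGEX above, written as a raw string
ENZCLASS_REGEX = r'^(\d)\. ?([\d-]+)\. ?([\d-]+)\. ?([\d-]+) +(.+)\.'

# A few lines in the layout used by enzclass.txt
lines = [
    "1. -. -.-  Oxidoreductases.",
    "1. 1. -.-   Acting on the CH-OH group of donors.",
    "1. 1. 1.-    With NAD(+) or NADP(+) as acceptor.",
]


def parse_line(line):
    """Hypothetical helper: return (ec_prefix, description) or None."""
    match = re.search(ENZCLASS_REGEX, line)
    if match is None:
        return None
    # The first four groups are the EC levels, the last is the description
    *levels, description = match.groups()
    # Levels not defined for this class are written as '-' in the file
    ec_prefix = ".".join(level for level in levels if level != "-")
    return ec_prefix, description


parsed = [parse_line(line) for line in lines]
for item in parsed:
    print(item)
```

The trailing period of each description is consumed by the final `\.` in the pattern, so it is not part of the captured text.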

mgkit.mappings.enzyme.LEVEL1_NAMES = {1: 'oxidoreductases', 2: 'transferases', 3: 'hydrolases', 4: 'lyases', 5: 'isomerases', 6: 'ligases'}

Top level classification names

mgkit.mappings.enzyme.change_mapping_level(ec_map, level=3)[source]

New in version 0.1.14.

Given a dictionary whose values are dictionaries containing a key named ec, whose value is an iterable of EC numbers, returns an iterator that can be used to build a dictionary with the same top-level keys and, as values, sets of the EC numbers transformed to the requested level.

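Parameters:
  • ec_map (dict) – dictionary mapping genes to dictionaries that contain an ec key
  • level (int) – number from 1 to 4, to specify the level of the mapping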
Yields:

tuple – a tuple (gene_id, set(ECs)), which can be passed to dict to make a dictionary

Example

>>> from mgkit.net.uniprot import get_gene_info
>>> from mgkit.mappings.enzyme import change_mapping_level
>>> ec_map = get_gene_info('Q9HFQ1', columns='ec')
>>> ec_map
{'Q9HFQ1': {'ec': '1.1.3.4'}}
>>> dict(change_mapping_level(ec_map, level=2))
{'Q9HFQ1': {'1.1'}}
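
The same transformation can be sketched without network access or mgkit itself. The code below is a minimal stand-in, not the library's implementation: `truncate_ec` and `nested_to_level` are hypothetical names, the gene identifiers are made up, and each ec value is assumed to be a list of EC strings.

```python
def truncate_ec(ec, level):
    """Keep only the first `level` components of a dotted EC number."""
    return ".".join(ec.split(".")[:level])


def nested_to_level(ec_map, level=3):
    """Yield (gene_id, set of truncated ECs) from a nested mapping."""
    for gene_id, info in ec_map.items():
        yield gene_id, {truncate_ec(ec, level) for ec in info["ec"]}


# Hypothetical input: gene identifiers mapped to dictionaries with an 'ec' key
ec_map = {
    "gene1": {"ec": ["1.1.3.4", "1.1.1.1"]},
    "gene2": {"ec": ["2.7.7.7"]},
}

result = dict(nested_to_level(ec_map, level=2))
print(result)
```

Because the values are sets, EC numbers that become identical after truncation are stored only once per gene.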

mgkit.mappings.enzyme.get_enzyme_full_name(ec_id, ec_names, sep=', ')[source]

New in version 0.2.1.

From an EC identifier and a dictionary of names, builds a name (comma-separated by default) that identifies the function of the enzyme.

Parameters:
  • ec_id (str) – EC identifier
  • ec_names (dict) – a dictionary of names that can be produced using parse_expasy_file()
  • sep (str) – string used to join the names
Returns:

the enzyme classification name

Return type:

str
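
One way such a name could be assembled is sketched below. This is an assumption-laden illustration rather than mgkit's code: `full_name` is a hypothetical function, and `ec_names` is assumed to be keyed by dotted EC prefixes such as '1' or '1.1.3', which may differ from the keys produced by parse_expasy_file().

```python
def full_name(ec_id, ec_names, sep=", "):
    """Join the names of every level of `ec_id` that is found in `ec_names`."""
    parts = ec_id.split(".")
    # Build the prefixes '1', '1.1', '1.1.3', ... up to the full identifier
    prefixes = [".".join(parts[:n]) for n in range(1, len(parts) + 1)]
    return sep.join(ec_names[p] for p in prefixes if p in ec_names)


# Assumed layout of the names dictionary (dotted EC prefix -> description)
ec_names = {
    "1": "Oxidoreductases",
    "1.1": "Acting on the CH-OH group of donors",
    "1.1.3": "With oxygen as acceptor",
}

name = full_name("1.1.3.4", ec_names)
print(name)
```

Levels that are missing from the dictionary are skipped in this sketch, so an identifier with no fourth-level entry still produces a name built from the first three levels.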

mgkit.mappings.enzyme.get_enzyme_level(ec, level=4)[source]

New in version 0.1.14.

Returns an enzyme class at a specific level, between 1 and 4 (by default the most specific, 4).

Parameters:
  • ec (str) – a string representing an EC number (e.g. 1.2.4.10)
  • level (int) – from 1 to 4, the level of specificity to return in the enzyme classification
Returns:

the EC number at the requested specificity

Return type:

str

Example

>>> from mgkit.mappings.enzyme import get_enzyme_level
>>> get_enzyme_level('1.1.3.4', 1)
'1'
>>> get_enzyme_level('1.1.3.4', 2)
'1.1'
>>> get_enzyme_level('1.1.3.4', 3)
'1.1.3'
>>> get_enzyme_level('1.1.3.4', 4)
'1.1.3.4'

mgkit.mappings.enzyme.get_mapping_level(ec_map, level=3)[source]

New in version 0.3.0.

Given a dictionary whose values are iterables of EC numbers, returns an iterator that can be used to build a dictionary with the same top-level keys and, as values, sets of the EC numbers transformed to the requested level.

Parameters:
  • ec_map (dict) – dictionary mapping genes to EC numbers
  • level (int) – number from 1 to 4, to specify the level of the mapping, passed to get_enzyme_level()
Yields:

tuple – a tuple (gene_id, set(ECs)), which can be passed to dict to make a dictionary
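
The behaviour described above can be illustrated with a small self-contained sketch. It does not call mgkit: `truncate_ec` and `flat_to_level` are hypothetical stand-ins written for this example, and the gene identifiers are made up.

```python
def truncate_ec(ec, level):
    """Keep only the first `level` components of a dotted EC number."""
    return ".".join(ec.split(".")[:level])


def flat_to_level(ec_map, level=3):
    """Yield (gene_id, set of truncated ECs) from a gene -> ECs mapping."""
    for gene_id, ecs in ec_map.items():
        yield gene_id, {truncate_ec(ec, level) for ec in ecs}


# Hypothetical input: each gene maps directly to an iterable of EC numbers
ec_map = {
    "gene1": ["1.1.3.4", "1.1.3.21"],
    "gene2": ["4.2.1.1"],
}

level3 = dict(flat_to_level(ec_map))
print(level3)
```

Both EC numbers of gene1 share the same third-level class, so its set contains a single element after the transformation.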

mgkit.mappings.enzyme.parse_expasy_dat(expasy_dat, keep_empty=False, skip_comments=True, skip_codes=None)[source]

New in version 0.4.2.

Parses the information in the enzyme.dat file on ExPASy, a flat file containing the information about the enzyme classification.

It can be downloaded at: ftp://ftp.expasy.org/databases/enzyme/enzyme.dat

Parameters:
  • expasy_dat (str) – file name or handle to an enzyme.dat file
  • keep_empty (bool) – if True, empty sections are kept; they are removed by default
  • skip_comments (bool) – used to avoid returning comments (lines starting with CC) in the file
  • skip_codes (set, tuple, list) – codes of specific parts of the file to skip, in the same way as skip_comments
Yields:

dict – dictionary with each entry in the file, where the keys are the codes and the values are the lines included in the file
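
To make the shape of the yielded dictionaries concrete, here is a minimal sketch of a parser for the same flat-file layout: two-letter line codes followed by their content, with entries terminated by a `//` line. It is not mgkit's implementation. `iter_entries` is a hypothetical function, and `SAMPLE` is a short, abbreviated excerpt used only for illustration.

```python
import io

# Abbreviated excerpt in the enzyme.dat layout, for illustration only
SAMPLE = """\
ID   1.1.1.1
DE   alcohol dehydrogenase.
AN   aldehyde reductase.
CC   -!- Acts on primary or secondary alcohols.
//
ID   1.1.1.2
DE   alcohol dehydrogenase (NADP(+)).
//
"""


def iter_entries(handle, skip_codes=("CC",)):
    """Yield one dict per entry, mapping each line code to its list of lines."""
    entry = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith("//"):
            # End of an entry: emit it if it holds anything
            if entry:
                yield entry
            entry = {}
            continue
        # The two-letter code is separated from the content by three spaces
        code, _, value = line.partition("   ")
        if code in skip_codes:
            continue
        entry.setdefault(code, []).append(value.strip())
    if entry:
        yield entry


entries = list(iter_entries(io.StringIO(SAMPLE)))
for entry in entries:
    print(entry)
```

Passing an empty tuple as `skip_codes` keeps the CC lines, which mirrors the role that skip_comments and skip_codes play in the parameters above.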

mgkit.mappings.enzyme.parse_expasy_dat_section(expasy_dat_section, skip_comments=True, skip_codes=None)[source]

New in version 0.4.2.

Parses an entry of the enzyme.dat file in expasy, used internally by mgkit.mappings.enzyme.parse_expasy_dat(), with the other arguments being passed over from it.

Returns:

dictionary with the entry, with keys being the codes of the entry and the values the lines

Return type:

dict

mgkit.mappings.enzyme.parse_expasy_file(file_name)[source]

Changed in version 0.4.2: changed to work on Python 3.x

Used to load enzyme descriptions from the file enzclass.txt on expasy.

The FTP url for enzclass.txt is: ftp://ftp.expasy.org/databases/enzyme/enzclass.txt