mgkit.kegg module¶
Module containing classes and functions to access Kegg data
-
class
mgkit.kegg.
KeggClientRest
(cache=None)¶ Bases:
future.types.newobject.newobject
Changed in version 0.3.1: added a cache attribute for some methods
Kegg REST client
The class includes methods and data to use the REST API provided by Kegg. At the moment it provides methods for the ‘link’, ‘list’ and ‘get’ operations.
-
api_url
= 'http://rest.kegg.jp/'¶
-
cache
= None¶
-
contact
= None¶
-
conv
(target_db, source_db, strip=True)¶ New in version 0.3.1.
Kegg Help:
http://rest.kegg.jp/conv/<target_db>/<source_db>
(<target_db> <source_db>) = (<kegg_db> <outside_db>) | (<outside_db> <kegg_db>)
For gene identifiers:
<kegg_db> = <org>
<org> = KEGG organism code or T number
<outside_db> = ncbi-proteinid | ncbi-geneid | uniprot
For chemical substance identifiers:
<kegg_db> = drug | compound | glycan
<outside_db> = pubchem | chebi
http://rest.kegg.jp/conv/<target_db>/<dbentries>
For gene identifiers:
<dbentries> = database entries involving the following <database>
<database> = <org> | genes | ncbi-proteinid | ncbi-geneid | uniprot
<org> = KEGG organism code or T number
For chemical substance identifiers:
<dbentries> = database entries involving the following <database>
<database> = drug | compound | glycan | pubchem | chebi
Examples
>>> kc = KeggClientRest()
>>> kc.conv('ncbi-geneid', 'eco')
{'eco:b0217': {'ncbi-geneid:949009'}, 'eco:b0216': {'ncbi-geneid:947541'}, 'eco:b0215': {'ncbi-geneid:946441'}, 'eco:b0214': {'ncbi-geneid:946955'}, 'eco:b0213': {'ncbi-geneid:944903'}, ...
>>> kc.conv('ncbi-proteinid', 'hsa:10458+ece:Z5100')
{'10458': {'NP_059345'}, 'Z5100': {'AAG58814'}}
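The dictionary-of-sets shape returned in the examples above can be illustrated with a minimal sketch. It assumes that each line of the REST response holds two tab-separated identifiers; parse_pairs is a hypothetical helper written for this illustration and is not part of mgkit.

```python
from collections import defaultdict


def parse_pairs(text):
    """Group tab-separated 'key<TAB>value' lines into a dict of sets.

    Hypothetical helper: the two-column, tab-separated layout is an
    assumption about the response body, not taken from mgkit itself.
    """
    mapping = defaultdict(set)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines, e.g. a trailing newline
        key, value = line.split('\t')[:2]
        mapping[key].add(value)
    return dict(mapping)


# Sample text built from the identifiers shown in the example above
sample = 'eco:b0217\tncbi-geneid:949009\neco:b0216\tncbi-geneid:947541\n'
pairs = parse_pairs(sample)
```

Using sets as values means a key that appears on several lines simply accumulates all of its targets.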
-
cpd_desc_re
= <_sre.SRE_Pattern object>¶
-
cpd_re
= <_sre.SRE_Pattern object at 0x4544a20>¶
-
empty_cache
(methods=None)¶ New in version 0.3.1.
Empties the cache completely, or only for the specified method(s)
Parameters: methods (iterable, str) – string or iterable of strings that are part of the cache. If None, the cache is fully emptied
-
find
(query, database, options=None, strip=True)¶ New in version 0.3.1.
Kegg Help:
http://rest.kegg.jp/find/<database>/<query>
<database> = pathway | module | ko | genome | <org> | compound | glycan | reaction | rclass | enzyme | disease | drug | dgroup | environ | genes | ligand
<org> = KEGG organism code or T number
http://rest.kegg.jp/find/<database>/<query>/<option>
<database> = compound | drug
<option> = formula | exact_mass | mol_weight
Examples
>>> kc = KeggClientRest()
>>> kc.find('CH4', 'compound')
{'C01438': 'Methane; CH4'}
>>> kc.find('K00844', 'genes', strip=False)
{'tped:TPE_0072': 'hexokinase; K00844 hexokinase [EC:2.7.1.1]', ...
>>> kc.find('174.05', 'compound', options='exact_mass')
{'C00493': '174.052823', 'C04236': '174.052823', 'C16588': '174.052823', 'C17696': '174.052823', 'C18307': '174.052823', 'C18312': '174.052823', 'C21281': '174.052823'}
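The two URL forms in the help text above can be sketched as a small builder. This is only an illustration of the URL layout: build_find_url is a hypothetical function, and raising ValueError for an option the help text does not list is a choice made for this sketch, not documented mgkit behaviour.

```python
API_URL = 'http://rest.kegg.jp/'  # same value as KeggClientRest.api_url
FIND_OPTIONS = ('formula', 'exact_mass', 'mol_weight')
OPTION_DATABASES = ('compound', 'drug')


def build_find_url(query, database, options=None):
    """Compose a 'find' URL following the help text (hypothetical helper)."""
    url = '{0}find/{1}/{2}'.format(API_URL, database, query)
    if options is None:
        return url
    # Per the help text, <option> is only defined for compound and drug
    if database not in OPTION_DATABASES or options not in FIND_OPTIONS:
        raise ValueError('option not supported for this database')
    return '{0}/{1}'.format(url, options)


plain_url = build_find_url('CH4', 'compound')
mass_url = build_find_url('174.05', 'compound', options='exact_mass')
```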
-
get_entry
(k_id, option=None)¶ Changed in version 0.3.1: this is now cached
The method abstracts the use of the ‘get’ operation in the Kegg API
Parameters:
-
get_ids_names
(target='ko', strip=True)¶ New in version 0.1.13.
Changed in version 0.3.1: the call is now cached
Returns a dictionary with the names/descriptions of all the ids of a specific target (ko, path, cpd, etc.)
If strip=True the id will be stripped of the module abbreviation (e.g. md:M00002->M00002)
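The effect of strip=True can be illustrated with a short sketch. The helper names are hypothetical and the description string is a placeholder; only the md:M00002->M00002 transformation comes from the text above.

```python
def strip_prefix(kegg_id):
    """Drop everything up to the first ':' ('md:M00002' -> 'M00002')."""
    return kegg_id.split(':', 1)[-1]


def strip_keys(names):
    """Return a copy of an id->description dict with stripped ids."""
    return {strip_prefix(key): value for key, value in names.items()}


# Placeholder description, used for illustration only
stripped = strip_keys({'md:M00002': 'example description'})
```

An id that carries no abbreviation is returned unchanged, so the helper can be applied more than once without harm.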
-
get_ortholog_pathways
()¶ Gets ortholog pathways, replacing ‘map’ with ‘ko’ in the id
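The id substitution mentioned for get_ortholog_pathways can be sketched as follows. map_to_ko is a hypothetical helper, and replacing only the first occurrence of ‘map’ is an assumption of this sketch.

```python
def map_to_ko(pathway_id):
    """Replace the first 'map' in a pathway id with 'ko'."""
    return pathway_id.replace('map', 'ko', 1)


ko_id = map_to_ko('map00010')
```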
-
get_pathway_links
(pathway)¶ Returns a dictionary with the mappings KO->compounds for a specific Pathway or module
-
get_reaction_equations
(ids, max_len=10)¶ Get the equation for the reactions
-
id_prefix
= {'C': 'cpd', 'K': 'ko', 'R': 'rn', 'k': 'map', 'm': 'path'}¶
-
ko_desc_re
= <_sre.SRE_Pattern object>¶
-
link
(target, source, options=None)¶ New in version 0.2.0.
Implements “link” operation in Kegg REST
-
link_ids
(target, kegg_ids, max_len=50)¶ Changed in version 0.3.1: removed strip and cached the results
The method abstracts the use of the ‘link’ operation in the Kegg API
The target parameter can be one of the following:
pathway | brite | module | disease | drug | environ | ko | genome | <org> | compound | glycan | reaction | rpair | rclass | enzyme
<org> = KEGG organism code or T number
Parameters:
Return dict: dictionary mapping requested id to target id(s)
-
list_ids
(k_id)¶ The method abstracts the use of the ‘list’ operation in the Kegg API
The k_id parameter can be one of the following:
pathway | brite | module | disease | drug | environ | ko | genome | <org> | compound | glycan | reaction | rpair | rclass | enzyme
<org> = KEGG organism code or T number
Parameters: k_id (str) – kegg database to get list of ids
Return list: list of ids in the specified database
-
load_cache
(file_handle)¶ New in version 0.3.1.
Loads the cache from file
-
rn_eq_re
= <_sre.SRE_Pattern object>¶
-
rn_name_re
= <_sre.SRE_Pattern object>¶
-
write_cache
(file_handle)¶ New in version 0.3.1.
Write the cache to file
-
-
class
mgkit.kegg.
KeggCompound
(cp_id=None, description='')¶ Bases:
future.types.newobject.newobject
Kegg compound
-
__eq__
(other)¶
>>> KeggCompound('test') == KeggCompound('test')
True
>>> KeggCompound('test') == 1
False
-
__ne__
(other)¶
>>> KeggCompound('test') != KeggCompound('test1')
True
>>> KeggCompound('test') != 1
True
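The comparison behaviour shown in these doctests (equal ids compare equal, any other type compares unequal) can be reproduced with a minimal sketch. KeggItem is a hypothetical stand-in class; how mgkit actually implements the comparison is not shown on this page.

```python
class KeggItem(object):
    """Hypothetical stand-in for an id-based Kegg object (not an mgkit class)."""

    def __init__(self, item_id=None, description=''):
        self.item_id = item_id
        self.description = description

    def __eq__(self, other):
        # Objects of a different type are never equal
        if not isinstance(other, type(self)):
            return False
        return self.item_id == other.item_id

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Keep hashing consistent with equality
        return hash(self.item_id)


same = KeggItem('test') == KeggItem('test')
```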
-
-
class
mgkit.kegg.
KeggData
(fname=None, gen_maps=True)¶ Bases:
future.types.newobject.newobject
Deprecated since version 0.3.4.
-
gen_ko_map
()¶
-
gen_maps
()¶
-
get_cp_names
()¶
-
get_ko_cp_links
(path_filter=None, description=False)¶
-
get_ko_cp_links_alt
(direction='out', description=False)¶
-
get_ko_names
()¶
-
get_ko_pathway_map
(black_list=None)¶
-
get_ko_pathways
(ko_id)¶
-
get_ko_rn_links
(path_filter=None, description=False)¶
-
get_pathway_ko_map
(black_list=None)¶
-
get_rn_cp_links
(path_filter=None, description=False)¶
-
get_rn_names
()¶
-
load_data
(fname)¶
-
maps
= None¶
-
pathways
= None¶
-
save_data
(fname)¶
-
-
class
mgkit.kegg.
KeggMapperBase
(fname=None)¶ Bases:
future.types.newobject.newobject
Deprecated since version 0.3.4.
Base object for Kegg mapping classes
-
get_id_map
()¶ Returns a mapping->KOs dictionary (a reverse mapping to get_ko_map)
-
get_id_names
()¶ Returns a copy of the mapping names
-
get_ko_map
()¶ Returns a copy of the KO->mapping dictionary
-
static
ko_to_mapping
(ko_id, query, columns, contact=None)¶ Returns the mappings for the supplied KO. It can be used for any id; both the query format and the returned columns are free-form. The only restriction is that the response must be tab-separated, since that is the format that is parsed.
Parameters:
Note: each mapping in the column is separated by a ;
-
load_data
(fname)¶ Loads mapping data from disk
-
save_data
(fname)¶ Saves mapping data to disk
-
-
class
mgkit.kegg.
KeggModule
(entry=None, old=False)¶ Bases:
future.types.newobject.newobject
New in version 0.1.13.
Used to extract information from a pathway module entry in Kegg
The entry, as a string, can be either passed at instance creation or with KeggModule.parse_entry()
-
classes
= None¶
-
compounds
= None¶
-
entry
= ''¶
-
find_submodules
()¶ New in version 0.3.0.
Returns the possible submodules, as a list of tuples where the elements are the first and last compounds in a submodule
-
first_cp
¶ Returns the first compound in the module
-
last_cp
¶ Returns the last compound in the module
-
name
= ''¶
-
parse_entry
(entry)¶ Parses a Kegg module entry and changes the instance values. By default the reaction IDs are substituted with the KO IDs
-
parse_entry2
(entry)¶ New in version 0.3.0.
Parses a Kegg module entry and changes the instance values. By default the reaction IDs are NOT substituted with the KO IDs.
-
static
parse_reaction
(line, ko_ids=None)¶ Changed in version 0.3.0: cleaned the parsing
Parses the lines with the reactions and substitutes reaction IDs with the corresponding KO IDs if provided
-
reactions
= None¶
-
to_edges
(id_only=None)¶ Changed in version 0.3.0: added id_only and changed to reflect changes in reactions
Returns the reactions as edges that can be supplied to build a graph.
Parameters: id_only (None, iterable) – if None the returned edges are for the whole module, if an iterable (converted to a set), only edges for those reactions are returned
Yields: tuple – the elements are the compounds and reactions in the module
-
-
class
mgkit.kegg.
KeggOrtholog
(ko_id=None, description='', reactions=None)¶ Bases:
future.types.newobject.newobject
Kegg Ortholog gene
-
__eq__
(other)¶
>>> KeggOrtholog('test') == KeggOrtholog('test')
True
>>> KeggOrtholog('test') == 1
False
-
__ne__
(other)¶
>>> KeggOrtholog('test') != KeggOrtholog('test1')
True
>>> KeggOrtholog('test') != 1
True
-
-
class
mgkit.kegg.
KeggPathway
(path_id=None, description=None, genes=None)¶ Bases:
future.types.newobject.newobject
Kegg Pathway
-
__eq__
(other)¶
>>> KeggPathway('test') == KeggPathway('test')
True
>>> KeggPathway('test') == 1
False
-
__ne__
(other)¶
>>> KeggPathway('test') != KeggPathway('test1')
True
>>> KeggPathway('test') != 1
True
-
-
class
mgkit.kegg.
KeggReaction
(entry)¶ Bases:
future.types.newobject.newobject
Changed in version 0.3.1: reworked, only stores the equation
Kegg Reaction, used for parsing the equation line
-
left_cp
= None¶
-
right_cp
= None¶
-
rn_id
= None¶
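The equation parsing that KeggReaction and mgkit.kegg.parse_reaction() are described as performing can be illustrated with a rough sketch. It assumes an equation line of the form 'C00001 + 2 C00002 <=> C00003', where the two sides are separated by '<=>' and compound ids start with one of the given prefixes. split_equation is a hypothetical function, and raising ValueError when the separator is missing is a choice made for this sketch only.

```python
def split_equation(line, prefix=('C', 'G')):
    """Return (left, right) sets of ids from a reaction equation string.

    Hypothetical sketch: the '<=>' separator and whitespace-delimited
    tokens are assumptions about the equation format.
    """
    sides = line.split('<=>')
    if len(sides) != 2:
        # Error condition chosen for this sketch, not taken from mgkit
        raise ValueError('expected exactly one "<=>" separator')
    # Keep only tokens that look like ids; '+' and coefficients are dropped
    return tuple(
        {token for token in side.split() if token.startswith(prefix)}
        for side in sides
    )


left_cp, right_cp = split_equation('C00001 + 2 C00002 <=> C00003')
```

Returning sets discards stoichiometric coefficients and duplicate ids, which matches the "left and right components as sets" wording of parse_reaction.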
-
-
mgkit.kegg.
download_data
(fname='kegg.pickle', contact=None)¶ Deprecated since version 0.3.4.
-
mgkit.kegg.
parse_reaction
(line, prefix=('C', 'G'))¶ New in version 0.3.1.
Parses a reaction equation from Kegg, returning the left and right components. Needs testing
Parameters: line (str) – reaction string
Returns: left and right components as sets
Return type: tuple
Raises: ValueError – if the