Simple fasta parser and a few utility functions

Changed in version 0.1.13: now returns uppercase sequences

Loads a fasta file and returns a generator of tuples, in which the first element is the name of the sequence and the second is the sequence itself.

Parameters: file_handle (str, file) – fasta file to open; a file name or a file handle is expected
Yields: tuple – first element is the sequence name/header, the second element is the sequence
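
The behaviour described above can be illustrated with a minimal sketch. It is not the library's implementation: the name `parse_fasta_sketch` is hypothetical, and the handling of blank lines is an assumption. The uppercase conversion follows the note for version 0.1.13.

```python
import io


def parse_fasta_sketch(file_handle):
    """Yield (name, sequence) tuples from a file name or an open handle."""
    # Hypothetical helper, written only to illustrate the description above
    opened_here = isinstance(file_handle, str)
    handle = open(file_handle) if opened_here else file_handle
    name, chunks = None, []
    try:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # assumption: blank lines are ignored
            if line.startswith(">"):
                # A new header closes the previous record, if any
                if name is not None:
                    yield name, "".join(chunks).upper()
                name, chunks = line[1:], []
            elif name is not None:
                chunks.append(line)
        # Emit the last record of the file
        if name is not None:
            yield name, "".join(chunks).upper()
    finally:
        if opened_here:
            handle.close()


example = io.StringIO(">seq1 first\nacgt\nACGT\n>seq2\nttaa\n")
records = list(parse_fasta_sketch(example))
```

Because the function is a generator, sequences are read lazily and a large file does not need to fit in memory.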

New in version 0.3.4.

Loads all fasta files from a list or iterable
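
One way to picture this is a generator that chains the records of each file in turn. The sketch below is an assumption about the behaviour, not the library's code: `load_many_sketch` and `tiny_loader` are hypothetical names, and the stand-in loader only handles single-line sequences.

```python
import io


def load_many_sketch(files, loader):
    """Yield every record produced by loader(f), for each f in files."""
    for file_handle in files:
        yield from loader(file_handle)


def tiny_loader(handle):
    # Hypothetical stand-in parser: single-line sequences only
    name = None
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
        elif line and name is not None:
            yield name, line.upper()
            name = None


handles = [io.StringIO(">a\nac\n"), io.StringIO(">b\ngt\n")]
all_records = list(load_many_sketch(handles, tiny_loader))
```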

New in version 0.3.1.

Reads a Prodigal amino acid fasta file and yields a dictionary with basic information about the sequences.

Parameters: file_handle (str, file) – passed to load_fasta()
Yields: dict – dictionary with the information contained in the header; the last of the attributes is put into the key attr, while the rest are transformed to other keys: seq_id, seq, start, end (genomic), strand, ordinal of
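
As a rough illustration, a Prodigal protein header separates its fields with " # ": sequence ID, start, end, strand and a final attribute string. The sketch below splits one header into the keys listed above. The function name is hypothetical, the conversion of the strand to "+"/"-" is an assumption, and only the keys whose meaning is clear from the description are filled in.

```python
def parse_prodigal_header_sketch(header, seq):
    """Turn one Prodigal header and its sequence into a dictionary."""
    # Split at most 4 times so the attribute string stays in one piece
    seq_id, start, end, strand, attr = header.split(" # ", 4)
    return {
        "seq_id": seq_id,
        "seq": seq,
        "start": int(start),
        "end": int(end),
        # Assumption: Prodigal's 1 / -1 is mapped to "+" / "-"
        "strand": "+" if strand.strip() == "1" else "-",
        "attr": attr,
    }


header = "contig1_1 # 3 # 98 # -1 # ID=1_1;partial=10;start_type=ATG"
info = parse_prodigal_header_sketch(header, "MKV")
```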

New in version 0.3.1.

Renames the header of the sequences using name_func, which is called on each header. By default, the behaviour is to keep the header to the left of the first space (BLAST behaviour).
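
A minimal sketch of this renaming, assuming the records arrive as (name, sequence) tuples. The name `rename_records_sketch` is hypothetical; only the default behaviour (keep the text left of the first space) is taken from the description.

```python
def rename_records_sketch(records, name_func=None):
    """Yield (new_name, sequence) with name_func applied to each header."""
    if name_func is None:
        # Default: keep only the part of the header before the first space
        def name_func(header):
            return header.split(" ", 1)[0]
    for name, seq in records:
        yield name_func(name), seq


records = [("seq1 Escherichia coli", "ACGT"), ("seq2", "TTAA")]
default_names = list(rename_records_sketch(records))
upper_names = list(rename_records_sketch(records, name_func=str.upper))
```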

New in version 0.1.13.

Splits a fasta file into a series of smaller files.

  • file_handle (file, str) – fasta file with the input sequences
  • name_mask (str) – file name template for the split files, more information is found in
  • num_files (int) – number of files in which to distribute the sequences
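
The splitting can be sketched as follows. Two points are assumptions, since the description does not state them: the sequences are distributed round-robin, and name_mask is treated as a `str.format` template that receives the file index. The function name is hypothetical.

```python
import os
import tempfile


def split_records_sketch(records, name_mask, num_files):
    """Distribute (name, seq) records over num_files output files."""
    # Assumption: name_mask is a str.format template taking the file index
    paths = [name_mask.format(index) for index in range(num_files)]
    handles = [open(path, "w") for path in paths]
    try:
        for index, (name, seq) in enumerate(records):
            # Assumption: round-robin distribution of the sequences
            handles[index % num_files].write(">{}\n{}\n".format(name, seq))
    finally:
        for handle in handles:
            handle.close()
    return paths


records = [("a", "AC"), ("b", "GT"), ("c", "TT")]
with tempfile.TemporaryDirectory() as tmp_dir:
    mask = os.path.join(tmp_dir, "part-{0}.fa")
    paths = split_records_sketch(records, mask, 2)
    contents = []
    for path in paths:
        with open(path) as handle:
            contents.append(handle.read())
```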

Writes a fasta sequence to a file. If file_handle is a string, the file will be opened using write_mode ('a' by default).

  • file_handle – file handle or string.
  • name (str) – header to write for the sequence
  • seq (str) – sequence to write
  • wrap (int) – number of characters per line when wrapping the sequence, 60 by default. If None, the sequence will be written on a single line
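
The parameters above can be illustrated with a short sketch. The name `write_record_sketch` is hypothetical and the code is not the library's implementation; the defaults wrap=60 and write_mode='a' are the ones given in the documentation.

```python
import io


def write_record_sketch(file_handle, name, seq, wrap=60, write_mode="a"):
    """Write one sequence in fasta format to a file name or open handle."""
    opened_here = isinstance(file_handle, str)
    handle = open(file_handle, write_mode) if opened_here else file_handle
    try:
        handle.write(">{}\n".format(name))
        if wrap is None:
            # No wrapping: the whole sequence goes on a single line
            handle.write(seq + "\n")
        else:
            # Write the sequence in chunks of at most `wrap` characters
            for start in range(0, len(seq), wrap):
                handle.write(seq[start:start + wrap] + "\n")
    finally:
        if opened_here:
            handle.close()


buffer = io.StringIO()
write_record_sketch(buffer, "seq1", "ACGTACGTAC", wrap=4)
wrapped = buffer.getvalue()

single = io.StringIO()
write_record_sketch(single, "seq1", "ACGTACGTAC", wrap=None)
unwrapped = single.getvalue()
```

With an append mode as the default, repeated calls with the same file name add sequences to the file instead of overwriting it.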