mgkit.workflow.sampling_utils module

New in version 0.3.1.

Resampling Utilities

sample command

This command samples sequences from a Fasta or FastQ file, based on a probability defined by the user (-r parameter, 0.001 or 1 / 1000 by default), up to a maximum number of sequences (-x parameter, 100,000 by default). By default 1 sample is extracted, but as many as desired can be taken by using the -n parameter.
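The idea of sampling by probability with a cap on the number of sequences can be sketched as follows. This is only an illustration and not necessarily how mgkit implements the command: the function name `sample_records`, its arguments and the per-record random draw are assumptions made for the example.

```python
import random


def sample_records(records, prob=0.001, max_seq=100000, rng=None):
    """Yield records kept with probability `prob`, stopping at `max_seq`.

    Hypothetical helper, not part of mgkit: `prob` mirrors the -r
    parameter and `max_seq` mirrors the -x parameter.
    """
    rng = rng if rng is not None else random.Random()
    kept = 0
    for record in records:
        # Stop as soon as the requested maximum has been reached
        if kept >= max_seq:
            break
        # rng.random() is in [0, 1), so the record is kept with chance `prob`
        if rng.random() < prob:
            kept += 1
            yield record


# Toy (header, sequence) records used only for the demonstration
records = [("seq{}".format(i), "ACGT") for i in range(10)]
# With prob=1.0 every record passes, so only the cap limits the output
all_kept = list(sample_records(records, prob=1.0, max_seq=3))
# With prob=0.0 no record can pass
none_kept = list(sample_records(records, prob=0.0))
```

Because the function is a generator, the input is read in a single pass and never has to fit in memory, which matters for large sequence files.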

The input sequence file can either be passed on the standard input or as the last parameter on the command line. By default a Fasta file is expected, unless the -q parameter is passed.

The -p parameter specifies the prefix to be used for the output files, which can be gzipped using the -z parameter.
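A minimal sketch of writing one sample to a prefixed, optionally gzipped file is shown below. The file naming scheme (`<prefix>-<index>.fa`, with `.gz` appended when compressed) and the helpers `open_sample_file` and `write_fasta` are assumptions for illustration only; the names actually produced by the command may differ.

```python
import gzip
import os
import tempfile


def open_sample_file(prefix, index, compress=False):
    """Open the output file for sample number `index` in text mode.

    Hypothetical naming scheme, not taken from mgkit.
    """
    name = "{}-{:05d}.fa".format(prefix, index)
    if compress:
        # "wt" opens the gzip stream for writing text
        return gzip.open(name + ".gz", "wt")
    return open(name, "w")


def write_fasta(handle, records):
    """Write (header, sequence) tuples in Fasta format."""
    for header, seq in records:
        handle.write(">{}\n{}\n".format(header, seq))


# Demonstration in a temporary directory
out_dir = tempfile.mkdtemp()
prefix = os.path.join(out_dir, "sample")
with open_sample_file(prefix, 0, compress=True) as handle:
    write_fasta(handle, [("seq1", "ACGT"), ("seq2", "TTGA")])

# Read the compressed file back to check its content
with gzip.open(prefix + "-00000.fa.gz", "rt") as handle:
    written = handle.read()
```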

mgkit.workflow.sampling_utils.main()

Main function

mgkit.workflow.sampling_utils.sample_command(options)
mgkit.workflow.sampling_utils.set_parser()

Sets command line arguments parser

mgkit.workflow.sampling_utils.set_sample_parser(parser)