mgkit.counts.glm module

New in version 0.3.3.

GLM models with metagenomes and metatranscriptomes. Experimental

mgkit.counts.glm.fit_lowess_interpolate(endog, exog, frac=0.2, it=3, kind='slinear')

Fits a lowess for the passed endog (Y) and exog (X) and returns an interpolated function that describes it. The first four arguments are passed to statsmodels.api.nonparametric.lowess(), while the last one is passed to scipy.interpolate.interp1d().

Parameters:
  • endog (array) – array of the dependent variable (Y)
  • exog (array) – array of the independent variable (X)
  • frac (float) – fraction of the number of elements to use when fitting (0.0-1.0)
  • it (int) – number of iterations to fit the lowess
  • kind (str) – type of interpolation to use
Returns:

interpolated function representing the lowess fitted from the data passed

Return type:

func

mgkit.counts.glm.lowess_ci_bootstrap(endog, exog, num=100, frac=0.2, it=3, alpha=0.05, delta=0.0, min_value=0.001, kind='slinear')

Bootstraps a lowess for the dependent (endog) and independent (exog) arguments.

Parameters:
  • endog (array) – dependent variable (Y)
  • exog (array) – independent variable (X)
  • num (int) – number of iterations for the bootstrap
  • frac (float) – fraction of the array to use when fitting
  • it (int) – number of iterations used to fit the lowess
  • alpha (float) – significance level used for the confidence intervals of the bootstrap
  • delta (float) – passed to statsmodels.api.nonparametric.lowess()
  • min_value (float) – minimum value for the function to avoid out of bounds
  • kind (str) – type of interpolation passed to scipy.interpolate.interp1d()
Returns:

the first element is the function describing the lower bound of the confidence interval, the second element is the function for the upper bound, and the last one is the function for the mean

Return type:

tuple

Note

Performance increases with the value of delta.

mgkit.counts.glm.optimise_alpha_scipy(formula, data, mean_func, q1_func, q2_func)

New in version 0.4.0.

Used to find an optimal alpha parameter for the Negative Binomial distribution used in statsmodels, using the lowess functions from lowess_ci_bootstrap().

Parameters:
Returns:

alpha value for the Negative Binomial

Return type:

float

mgkit.counts.glm.optimise_alpha_scipy_function(args, formula, data, criterion='aic')

New in version 0.4.0.

mgkit.counts.glm.variance_to_alpha(mu, func, min_alpha=0.001)

Based on the variance defined for the Negative Binomial in statsmodels:

var = mu + alpha * (mu ** 2)

Parameters:
  • mu (float) – mean to calculate the alphas for
  • func (func) – function that returns the variance of the mean
  • min_alpha (float) – value of alpha if the func goes out of bounds
Returns:

value of alpha for the passed mean

Return type:

float
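
Solving the variance relation var = mu + alpha * (mu ** 2) for alpha gives alpha = (var - mu) / (mu ** 2). The sketch below applies that inversion. It is not the module's source: the name variance_to_alpha_sketch is hypothetical, and the fallback rules (a ValueError from func, a non-positive mean, or an alpha below min_alpha) are assumptions about what "out of bounds" means.

```python
def variance_to_alpha_sketch(mu, func, min_alpha=0.001):
    """Hypothetical inversion of var = mu + alpha * mu ** 2."""
    # Assumption: a non-positive mean cannot be inverted, so fall back
    if mu <= 0:
        return min_alpha
    try:
        var = func(mu)
    except ValueError:
        # Assumption: func (e.g. an interp1d object) raises ValueError
        # when mu lies outside the range it was fitted on
        return min_alpha
    alpha = (var - mu) / mu ** 2
    # Assumption: alpha is never allowed to drop below min_alpha
    return max(alpha, min_alpha)


def overdispersed(mu):
    # Variance function with a known alpha of 0.5
    return mu + 0.5 * mu ** 2


def underdispersed(mu):
    # Variance below the mean, which would give a negative alpha
    return 0.5 * mu


def out_of_bounds(mu):
    raise ValueError("mu outside the fitted range")


print(variance_to_alpha_sketch(10.0, overdispersed))   # 0.5
print(variance_to_alpha_sketch(10.0, underdispersed))  # 0.001
print(variance_to_alpha_sketch(10.0, out_of_bounds))   # 0.001
```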