pydepr package

Submodules

pydepr.regression module

pydepr.regression

PyDePr is a set of tools for processing machine learning models for inferring equipment degradation. This module preprocesses data and develops regression models.

copyright:
  (c) 2017 Eric Strong.
license:

Refer to LICENSE.txt for more information.

class pydepr.regression.RegressionCurve(model_type='ridge')

Bases: object

add_filter(data, low, high, merge_type='outer')

Adds data filters. If filter data already exists, the new data is merged with it using an outer join by default.

Parameters:
  • data – A pandas DataFrame of filter data.
  • low – Values below the “low” parameter will be filtered out
  • high – Values above the “high” parameter will be filtered out
  • merge_type – If data already exists, it will be merged with an ‘inner’ or ‘outer’ join. Refer to pandas.concat for more information about this behavior.
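
A minimal sketch of the low/high filtering described above, written in plain pandas rather than with PyDePr itself. The column names and values are hypothetical, and treating the bounds as inclusive is an assumption; the source only states that values below “low” or above “high” are filtered out.

```python
import pandas as pd

# Hypothetical data: one filter variable and one input variable on a shared index.
index = pd.date_range("2017-01-01", periods=5, freq="D")
filter_data = pd.DataFrame({"load_mw": [5.0, 50.0, 120.0, 80.0, 200.0]}, index=index)
inputs = pd.DataFrame({"inlet_temp": [10.0, 11.0, 12.0, 13.0, 14.0]}, index=index)

low, high = 40.0, 150.0

# Keep only rows whose filter value lies within [low, high] (bounds assumed inclusive).
mask = filter_data["load_mw"].between(low, high)
filtered_inputs = inputs[mask]

print(filtered_inputs)
```

Here the rows with a load of 5.0 and 200.0 fall outside the limits, so three rows of input data remain.
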
add_filter_edna(tag, start_date, end_date, low, high, desc_as_label=True, custom_label=None, merge_type='outer')

Adds data filters. If filter data already exists, the new data is merged with it using an outer join by default.

This helper function will pull data from eDNA, to be used as filters to the RegressionCurve. It is strongly recommended that you use the same start_date and end_date for all data, and that you ensure that data actually exists during the time period of interest.

Parameters:
  • tag – The full Site.Service.Tag eDNA tagname
  • start_date – Must be in the format mm/dd/yy hh:mm:ss
  • end_date – Must be in the format mm/dd/yy hh:mm:ss
  • low – Values below the “low” parameter will be filtered out
  • high – Values above the “high” parameter will be filtered out
  • desc_as_label – If True, use the eDNA description as the label of the variable in the pandas DataFrame
  • custom_label – Supply a custom variable label, as a string
  • merge_type – If data already exists, it will be merged with an ‘inner’ or ‘outer’ join. Refer to pandas.concat for more information about this behavior.
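
The mm/dd/yy hh:mm:ss format required for start_date and end_date can be produced with the standard library, as sketched below. The helper name `to_edna_date` is hypothetical, and reading “hh” as a 24-hour clock is an assumption that the source does not confirm.

```python
from datetime import datetime

# Format string for mm/dd/yy hh:mm:ss (24-hour clock assumed).
EDNA_DATE_FORMAT = "%m/%d/%y %H:%M:%S"


def to_edna_date(moment):
    """Format a datetime as an mm/dd/yy hh:mm:ss string."""
    return moment.strftime(EDNA_DATE_FORMAT)


start_date = to_edna_date(datetime(2017, 1, 5, 8, 30, 0))
end_date = to_edna_date(datetime(2017, 2, 5, 8, 30, 0))

# Round-trip check: parsing the string recovers the original timestamp.
parsed = datetime.strptime(start_date, EDNA_DATE_FORMAT)

print(start_date, end_date)
```

Building both strings from the same helper makes it easier to reuse one start_date and end_date for every tag, as recommended above.
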
add_input(data, merge_type='outer')

Adds values to the X data. If X data already exists, the new data is merged with it using an outer join by default.

Parameters:
  • data – A pandas DataFrame of input data.
  • merge_type – If data already exists, it will be merged with an ‘inner’ or ‘outer’ join. Refer to pandas.concat for more information about this behavior.
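
The difference between the two merge_type values can be seen directly with pandas.concat. This is a sketch with hypothetical column names, and it assumes a column-wise concatenation (axis=1) on a shared time index; the source only points to pandas.concat without giving the exact call.

```python
import pandas as pd

# Hypothetical existing X data and a new input whose timestamps only partly overlap.
existing = pd.DataFrame(
    {"flow": [1.0, 2.0, 3.0]},
    index=pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-03"]),
)
new = pd.DataFrame(
    {"pressure": [10.0, 20.0, 30.0]},
    index=pd.to_datetime(["2017-01-02", "2017-01-03", "2017-01-04"]),
)

# Outer join: keeps every timestamp and fills the gaps with NaN.
outer = pd.concat([existing, new], axis=1, join="outer")

# Inner join: keeps only the timestamps present in both frames.
inner = pd.concat([existing, new], axis=1, join="inner")

print(outer)
print(inner)
```

The outer join keeps all four timestamps at the cost of two missing values, while the inner join keeps only the two timestamps both frames share.
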
add_input_edna(tag, start_date, end_date, desc_as_label=True, custom_label=None, merge_type='outer')

Adds values to the X data. If X data already exists, the new data is merged with it using an outer join by default.

This helper function will pull data from eDNA, to be used as inputs to the RegressionCurve. It is strongly recommended that you use the same start_date and end_date for all data, and that you ensure that data actually exists during the time period of interest.

Parameters:
  • tag – The full Site.Service.Tag eDNA tagname
  • start_date – Must be in the format mm/dd/yy hh:mm:ss
  • end_date – Must be in the format mm/dd/yy hh:mm:ss
  • desc_as_label – If True, use the eDNA description as the label of the variable in the pandas DataFrame
  • custom_label – Supply a custom variable label, as a string
  • merge_type – If data already exists, it will be merged with an ‘inner’ or ‘outer’ join. Refer to pandas.concat for more information about this behavior.
add_output(data)

Adds values to the Y data. WARNING: if Y data already exists, it will be overwritten by the new data. Only one Y variable is currently supported.

Parameters:
  • data – A pandas DataFrame of output data.
add_output_edna(tag, start_date, end_date, desc_as_label=True, custom_label=None)

Adds values to the Y data. WARNING: if Y data already exists, it will be overwritten by the new data. Only one Y variable is currently supported.

This helper function will pull data from eDNA, to be used as outputs to the RegressionCurve. It is strongly recommended that you use the same start_date and end_date for all data, and that you ensure that data actually exists during the time period of interest.

Parameters:
  • tag – The full Site.Service.Tag eDNA tagname
  • start_date – Must be in the format mm/dd/yy hh:mm:ss
  • end_date – Must be in the format mm/dd/yy hh:mm:ss
  • desc_as_label – If True, use the eDNA description as the label of the variable in the pandas DataFrame
  • custom_label – Supply a custom variable label, as a string
build_equation()

Builds an equation and a corrected equation based on the results from the constructed model. WARNING: “run” must be called first.

Returns:

A tuple containing the equation and the y-corrected equation.

calculate_metrics(warn=3, alarm=4)

Calculates performance metrics for the performance curve. WARNING: “run” must be called first.

Parameters:
  • warn – Number of standard deviations for the warning limit
  • alarm – Number of standard deviations for the alarm limit
Returns:

an array of: [R^2, MAE, EV, Warn limit, Alarm limit]
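
One plausible reading of these five values, sketched with NumPy. The function name `sketch_metrics` is hypothetical. That MAE means mean absolute error, that EV means explained variance, and that the limits are multiples of the residual standard deviation are all assumptions; the PyDePr implementation may compute them differently.

```python
import numpy as np


def sketch_metrics(y, y_hat, warn=3, alarm=4):
    """Return [R^2, MAE, EV, warn limit, alarm limit] under the stated assumptions."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    residuals = y - y_hat

    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot

    # Mean absolute error (assumed meaning of MAE).
    mae = np.mean(np.abs(residuals))

    # Explained variance (assumed meaning of EV).
    ev = 1.0 - np.var(residuals) / np.var(y)

    # Limits as multiples of the residual standard deviation (assumption).
    sigma = np.std(residuals)
    return [r2, mae, ev, warn * sigma, alarm * sigma]


metrics = sketch_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
print(metrics)
```

With the default warn=3 and alarm=4, a residual standard deviation of 0.5 gives limits of 1.5 and 2.0 in this example.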

plot_validation(warn=3, alarm=4, title=None, save_fig=False, fig_size=(20, 15))

Creates a multi-plot to be used for model validation. WARNING: “run” must be called first.

Plot descriptions:
  1. Residuals vs. Time
  2. Residuals vs. Primary Explanatory Factor
  3. Y vs. Yhat Plot
  4. Histogram of the Residuals
  5. Actual Y vs. X
  6. Corrected Y vs. X
  7. Histogram of Actual Y
  8. Histogram of Corrected Y

Parameters:
  • warn – Number of standard deviations for the warning limit
  • alarm – Number of standard deviations for the alarm limit
  • title – an optional title for the plot
  • save_fig – if True, the figure is saved to a file named after the title
  • fig_size – the size of the plot
Returns:

either a figure plotted in the console, or a figure that is saved to a file

run(model_type=None)

This method will run the performance curve analysis for the initialized data.

Parameters:
  • model_type – Override the regression model chosen during initialization. Choices include Lasso, ElasticNet, Ridge, and LassoLars.

pydepr.waveform module

pydepr.waveform

PyDePr is a set of tools for processing degradation models. This module contains tools for processing and validating waveforms.

copyright:
  (c) 2017 Eric Strong.
license:

Refer to LICENSE.txt for more information.

class pydepr.waveform.Waveform(data)

Bases: object

static list_from_edna(tag, start_date, end_date, start_val=-7291290)

Returns a list of waveforms from eDNA.

Parameters:
  • tag – the eDNA tag from which to pull the waveform
  • start_date – the beginning of the data pull
  • end_date – the end of the data pull
  • start_val – value in history that defines the start of the array
Returns:

a pandas DataFrame, with each row as a single array
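
The role of start_val can be illustrated with a short sketch. It assumes that the history is a flat sequence of values in which each occurrence of start_val marks the beginning of a new waveform; the source only says that start_val “defines the start of the array”, so the actual eDNA layout may differ. The function name `split_waveforms` and the sample values are hypothetical.

```python
import pandas as pd

START_VAL = -7291290  # default marker value from the list_from_edna signature


def split_waveforms(values, start_val=START_VAL):
    """Split a flat sequence into waveforms, starting a new one at each marker."""
    waveforms = []
    current = None
    for value in values:
        if value == start_val:
            # A marker closes the previous waveform (if any) and opens a new one.
            if current is not None:
                waveforms.append(current)
            current = []
        elif current is not None:
            # Values seen before the first marker are ignored.
            current.append(value)
    if current is not None:
        waveforms.append(current)
    # One row per waveform, matching the documented return shape.
    return pd.DataFrame(waveforms)


history = [START_VAL, 1, 2, 3, START_VAL, 4, 5, 6]
frame = split_waveforms(history)
print(frame)
```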

Module contents

pydepr

PyDePr is a set of tools for processing degradation models.

copyright:
  (c) 2017 by Eric Strong.
license:

Refer to LICENSE.txt for more information.