pydepr package

Submodules

pydepr.regression module

pydepr.regression

PyDePr is a set of tools for processing machine learning models that infer equipment degradation. This module provides tools for preprocessing data and developing regression models.

copyright:	2017 Eric Strong.
license:	Refer to LICENSE.txt for more information.

class pydepr.regression.RegressionCurve(model_type='ridge')

Bases: object

add_filter(data, low, high, merge_type='outer')

Adds data filters. If filter data already exists, the new data is merged with the existing data, using an outer join by default.

Parameters:

data – A pandas DataFrame of filter data.
low – Values below the “low” parameter will be filtered out
high – Values above the “high” parameter will be filtered out
merge_type – If data already exists, it will be merged with an ‘inner’ or ‘outer’ join. Refer to pandas.concat for more information about this behavior.
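The merge and filter behavior described above can be illustrated with plain pandas. This is a minimal sketch, not PyDePr's implementation: the column names and values are made up, and treating the “low” and “high” limits as inclusive is an assumption.

```python
import pandas as pd

# Two hypothetical filter signals with partially overlapping timestamps.
idx_a = pd.to_datetime(["2017-01-01 00:00", "2017-01-01 01:00", "2017-01-01 02:00"])
idx_b = pd.to_datetime(["2017-01-01 01:00", "2017-01-01 02:00", "2017-01-01 03:00"])
load = pd.DataFrame({"load": [10.0, 55.0, 90.0]}, index=idx_a)
temp = pd.DataFrame({"temp": [20.0, 25.0, 30.0]}, index=idx_b)

# An outer join keeps every timestamp seen in either frame (gaps become NaN).
outer = pd.concat([load, temp], axis=1, join="outer")

# An inner join keeps only the timestamps present in both frames.
inner = pd.concat([load, temp], axis=1, join="inner")

# Keep rows whose "load" value lies within [low, high] (inclusive bounds assumed).
low, high = 20.0, 80.0
kept = inner[(inner["load"] >= low) & (inner["load"] <= high)]
```

With an outer join, rows that exist in only one signal survive the merge with missing values, so an inner join may be the safer choice when every filter must be defined at every timestamp.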

add_filter_edna(tag, start_date, end_date, low, high, desc_as_label=True, custom_label=None, merge_type='outer')

Adds data filters. If filter data already exists, the new data is merged with the existing data, using an outer join by default.

This helper function pulls data from eDNA to be used as filters for the RegressionCurve. It is strongly recommended that you use the same start_date and end_date for all data, and that you ensure data actually exists during the time period of interest.

Parameters:

tag – The full Site.Service.Tag eDNA tagname
start_date – Must be in the format mm/dd/yy hh:mm:ss
end_date – Must be in the format mm/dd/yy hh:mm:ss
low – Values below the “low” parameter will be filtered out
high – Values above the “high” parameter will be filtered out
desc_as_label – If True, use the eDNA description as the label of the variable in the pandas DataFrame
custom_label – Supply a custom variable label, as a string
merge_type – If data already exists, it will be merged with an ‘inner’ or ‘outer’ join. Refer to pandas.concat for more information about this behavior.
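The start_date and end_date strings can be produced from datetime objects with the standard library. This is a small sketch: the helper name to_edna_string is hypothetical, and reading “hh” as a 24-hour clock is an assumption.

```python
from datetime import datetime

# strftime pattern for mm/dd/yy hh:mm:ss (24-hour clock assumed).
EDNA_FORMAT = "%m/%d/%y %H:%M:%S"


def to_edna_string(moment):
    """Format a datetime as an mm/dd/yy hh:mm:ss string."""
    return moment.strftime(EDNA_FORMAT)


# Build one pair of strings and reuse it for every eDNA pull.
start_date = to_edna_string(datetime(2017, 1, 1, 0, 0, 0))
end_date = to_edna_string(datetime(2017, 3, 31, 23, 59, 59))
```

Building the two strings once and passing the same pair to each of the eDNA helper methods is one way to follow the recommendation to keep the time period identical for all data.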

add_input(data, merge_type='outer')

Adds values to the X data. If X data already exists, the new data is merged with the existing data, using an outer join by default.

Parameters:

data – A pandas DataFrame of input data.
merge_type – If data already exists, it will be merged with an ‘inner’ or ‘outer’ join. Refer to pandas.concat for more information about this behavior.

add_input_edna(tag, start_date, end_date, desc_as_label=True, custom_label=None, merge_type='outer')

Adds values to the X data. If X data already exists, the new data is merged with the existing data, using an outer join by default.

This helper function pulls data from eDNA to be used as inputs to the RegressionCurve. It is strongly recommended that you use the same start_date and end_date for all data, and that you ensure data actually exists during the time period of interest.

Parameters:

tag – The full Site.Service.Tag eDNA tagname
start_date – Must be in the format mm/dd/yy hh:mm:ss
end_date – Must be in the format mm/dd/yy hh:mm:ss
desc_as_label – If True, use the eDNA description as the label of the variable in the pandas DataFrame
custom_label – Supply a custom variable label, as a string
merge_type – If data already exists, it will be merged with an ‘inner’ or ‘outer’ join. Refer to pandas.concat for more information about this behavior.

add_output(data)

Adds values to the Y data. WARNING: if Y data already exists, it will be overwritten by the new data. Only one Y variable is currently supported.

Parameters:	data – A pandas DataFrame of output data.

add_output_edna(tag, start_date, end_date, desc_as_label=True, custom_label=None)

Adds values to the Y data. WARNING: if Y data already exists, it will be overwritten by the new data. Only one Y variable is currently supported.

This helper function pulls data from eDNA to be used as the output of the RegressionCurve. It is strongly recommended that you use the same start_date and end_date for all data, and that you ensure data actually exists during the time period of interest.

Parameters:

tag – The full Site.Service.Tag eDNA tagname
start_date – Must be in the format mm/dd/yy hh:mm:ss
end_date – Must be in the format mm/dd/yy hh:mm:ss
desc_as_label – If True, use the eDNA description as the label of the variable in the pandas DataFrame
custom_label – Supply a custom variable label, as a string

build_equation()

Builds an equation and a corrected equation from the results of the fitted model. WARNING: “run” must be called first.

Returns:	A tuple containing the equation and the y-corrected equation.
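As a rough illustration of what an equation built from a fitted linear model could look like, the sketch below formats an intercept and a set of coefficients as a string. The function name, coefficient values, and output format are all hypothetical; the actual format returned by build_equation, and how the y-corrected equation is derived, are not specified here.

```python
def format_linear_equation(intercept, coefficients, target="y"):
    """Render a linear model as a readable equation string."""
    terms = [f"{intercept:.4g}"]
    for name, coef in coefficients.items():
        # Carry the sign separately so negative terms read naturally.
        sign = "+" if coef >= 0 else "-"
        terms.append(f"{sign} {abs(coef):.4g}*{name}")
    return f"{target} = " + " ".join(terms)


# Hypothetical intercept and coefficients, for illustration only.
equation = format_linear_equation(1.5, {"load": 0.25, "ambient_temp": -0.1})
```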

calculate_metrics(warn=3, alarm=4)

Calculates performance metrics for the performance curve. WARNING: “run” must be called first.

Parameters:

warn – Number of standard deviations for the warning limit
alarm – Number of standard deviations for the alarm limit

Returns:	An array of [R^2, MAE, EV, warn limit, alarm limit]
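The returned metrics can be sketched in plain Python. This is an illustration under stated assumptions, not PyDePr's code: “EV” is assumed to mean explained variance, and the warning and alarm limits are assumed to be multiples of the standard deviation of the residuals. The function name sketch_metrics is hypothetical.

```python
from statistics import mean, pstdev


def sketch_metrics(y_true, y_pred, warn=3, alarm=4):
    """Return [R^2, MAE, EV, warn limit, alarm limit] for paired observations."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    y_mean = mean(y_true)

    ss_res = sum(r ** 2 for r in residuals)          # residual sum of squares
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot

    mae = mean([abs(r) for r in residuals])          # mean absolute error

    # Explained variance: 1 - Var(residuals) / Var(y), population variances.
    res_mean = mean(residuals)
    var_res = mean([(r - res_mean) ** 2 for r in residuals])
    var_y = ss_tot / len(y_true)
    ev = 1 - var_res / var_y

    # Limits as multiples of the residual standard deviation (assumed).
    sigma = pstdev(residuals)
    return [r2, mae, ev, warn * sigma, alarm * sigma]


metrics = sketch_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
```

R^2 and explained variance coincide whenever the residuals have zero mean, as they do in this toy data; they diverge when the model is biased.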

plot_validation(warn=3, alarm=4, title=None, save_fig=False, fig_size=(20, 15))

Creates a multi-plot to be used for model validation. WARNING: “run” must be called first.

Plot descriptions:

1. Residuals vs. Time
2. Residuals vs. Primary Explanatory Factor
3. Y vs. Yhat Plot
4. Histogram of the Residuals
5. Actual Y vs. X
6. Corrected Y vs. X
7. Histogram of Actual Y
8. Histogram of Corrected Y

Parameters:

warn – Number of standard deviations for the warning limit
alarm – Number of standard deviations for the alarm limit
title – An optional title for the plot
save_fig – If True, the figure will be saved to a file whose filename is the same as the title
fig_size – The size of the plot

Returns:	Either a figure plotted in the console, or a figure that is saved to a file

run(model_type=None)

Runs the performance curve analysis on the data that has been added.

Parameters:	model_type – Overrides the regression model chosen during initialization. Choices include Lasso, ElasticNet, Ridge, and LassoLars
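The listed choices match estimator class names in scikit-learn's linear_model module. The sketch below shows one way a model_type string could be mapped to such an estimator. Whether PyDePr uses scikit-learn internally, and whether it matches names case-insensitively, are assumptions; MODEL_CHOICES and make_model are hypothetical names.

```python
from sklearn.linear_model import ElasticNet, Lasso, LassoLars, Ridge

# Hypothetical lookup from a model_type string to an estimator class.
MODEL_CHOICES = {
    "ridge": Ridge,
    "lasso": Lasso,
    "elasticnet": ElasticNet,
    "lassolars": LassoLars,
}


def make_model(model_type="ridge"):
    """Return an unfitted estimator for the given name (case-insensitive)."""
    key = model_type.lower()
    if key not in MODEL_CHOICES:
        raise ValueError(f"Unknown model_type: {model_type!r}")
    return MODEL_CHOICES[key]()


# Toy data: one input column, output roughly 2*x + 1.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 3.0, 5.0, 7.0]

model = make_model("Ridge").fit(X, y)
predictions = model.predict([[4.0]])
```

All four estimators apply regularization, so the fitted coefficients are shrunk relative to an ordinary least-squares fit.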

pydepr.waveform module

pydepr.waveform

PyDePr is a set of tools for processing degradation models. This module contains tools for processing and validating waveforms.

copyright:	2017 Eric Strong.
license:	Refer to LICENSE.txt for more information.

class pydepr.waveform.Waveform(data)

Bases: object

static list_from_edna(tag, start_date, end_date, start_val=-7291290)

Returns a list of waveforms from eDNA.

Parameters:

tag – The eDNA tag to pull the waveform
start_date – The beginning of the data pull
end_date – The end of the data pull
start_val – Value in history that defines the start of the array

Returns:	A pandas DataFrame, with each row as a single array
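The role of start_val can be pictured as a sentinel that marks where each waveform begins in a flat history of values. The sketch below splits such a history into separate arrays. It is an illustration only: how eDNA actually stores waveforms is an assumption, and split_waveforms and the sample data are hypothetical.

```python
START_VAL = -7291290  # default sentinel from the list_from_edna signature


def split_waveforms(values, start_val=START_VAL):
    """Split a flat sequence into lists, starting a new list at each sentinel."""
    waveforms = []
    current = None  # None until the first sentinel has been seen
    for value in values:
        if value == start_val:
            if current is not None:
                waveforms.append(current)
            current = []
        elif current is not None:
            current.append(value)
        # Values before the first sentinel belong to no waveform and are dropped.
    if current is not None:
        waveforms.append(current)
    return waveforms


history = [9.9, START_VAL, 0.1, 0.2, 0.3, START_VAL, 0.4, 0.5, 0.6]
waveforms = split_waveforms(history)
```

Each inner list corresponds to one row in the sense of the return value described above; passing the nested list to pandas.DataFrame would give one waveform per row.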

Module contents

pydepr

PyDePr is a set of tools for processing degradation models.

copyright:	2017 by Eric Strong.
license:	Refer to LICENSE.txt for more information.