Predictive Maintenance using Machine learning (LSTM python)

Jouneid Raza
5 min read · Feb 3, 2020

Predictive Maintenance.

Predictive maintenance is a process that estimates an asset's remaining useful life, or how likely it is to develop a fault in the coming days. With that estimate we can schedule preventive maintenance in time and protect both our time and our assets from a major failure.

“It automates the mechanism of identifying the potential equipment failure and can recommend actions to solve the problem.”

Machines are replacing human labor and can work faster and more accurately. But they need maintenance, and a single failure can stop an assembly line or shut down the whole process. To avoid these situations, artificial intelligence helps staff keep an eye on performance, stop a machine in time, carry out preventive maintenance, and save time and money.

AI is bringing a revolution to this field: we can now detect that a machine is about to stop, whatever the reason. In industry, we can predict gas leaks by studying the pipes, and we can prevent accidents by mining a machine's historical data for patterns that predict future problems, which saves money and human lives as well.

Data

Data is the most important and crucial part of this topic. The more correct and relevant data we have for our problem, the closer we get to answering our question. We can use an available open source dataset for the initial phase and our own custom dataset for the specific problem.

How can we find the best data for our problem?

We need to study our problem, develop an understanding of what we are aiming to answer, and identify the stakeholders in the problem. This is the part where we should spend the most quality time: for better accuracy and reliable results we need accurate and relevant data, which can be time series sensor data or comprehensive reports from past analyses. The validity and granularity of the data are an asset to our project.

“Our model will only be as good as our data.”

For example, if we want to predict a failure in a car to avoid an accident, we need to study as much of its sensor data as we can get. This helps us understand the vehicle's normal behavior and how the different sensor readings affect its actual performance. The same logic carries over to solutions in other industries, such as transformer failure prediction.

How predictive maintenance works in a transformer.

The flow is the same whether the asset is a car or a transformer. In the car example, the sensors are live and connected to a machine learning model, continuously sending time series data. The model is trained on historical data for the given use case and can flag an anomaly before it affects the car's performance. The recommendation can be an alert or an indication that tells our stakeholders to carry out preventive maintenance on time.
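As a rough sketch of that live loop, the code below keeps a rolling buffer of the latest readings and scores it each time a new reading arrives. The window length, the toy readings, and `score_window` (a simple threshold rule standing in for a trained model) are all assumptions made for illustration.

```python
from collections import deque

WINDOW = 5  # assumed window length


def score_window(window):
    """Hypothetical stand-in for a trained model: flags the window when the
    latest reading is far above the mean of the earlier ones."""
    earlier = list(window)[:-1]
    baseline = sum(earlier) / len(earlier)
    return window[-1] > 1.5 * baseline


def monitor(readings, window=WINDOW):
    """Yield (index, reading) for every reading the stand-in model flags."""
    buffer = deque(maxlen=window)  # keeps only the most recent `window` readings
    for i, value in enumerate(readings):
        buffer.append(value)
        # Score only once the buffer holds a full window.
        if len(buffer) == window and score_window(buffer):
            yield i, value


# Made-up sensor stream with one spike.
stream = [10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 16.0, 10.1]
alerts = list(monitor(stream))
print(alerts)
```

In a real deployment the stand-in rule would be replaced by a call to the trained model, and each alert would be routed to whoever schedules the maintenance.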

Objective.

By following this approach we can achieve two things.
1. Protect our asset from failure by carrying out preventive maintenance in time.

2. Predict the remaining useful life, that is, how much longer the machine can run properly.

Machine learning.

Once we have the data, we only need computation to find patterns or detect anomalies in our scenario. Machine learning models help us do this job efficiently.

“We can predict the failure status by using classification algorithms.

We can predict the remaining useful life by using regression techniques”.
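To illustrate how the two targets relate, the sketch below derives both from the same run-to-failure log. The column names (`unit`, `cycle`), the toy values, and the 3-cycle warning horizon are assumptions for illustration and are not part of the transformer dataset used later.

```python
import pandas as pd

# Toy run-to-failure log: each unit runs until its last recorded cycle, then fails.
# Column names and values are hypothetical.
log = pd.DataFrame({
    "unit":  [1, 1, 1, 1, 1, 2, 2, 2],
    "cycle": [1, 2, 3, 4, 5, 1, 2, 3],
})

# Regression target: remaining useful life = cycles left before the unit's last cycle.
log["RUL"] = log.groupby("unit")["cycle"].transform("max") - log["cycle"]

# Classification target: 1 when fewer than `horizon` cycles remain.
horizon = 3  # assumed warning window
log["Faulty"] = (log["RUL"] < horizon).astype(int)

print(log)
```

A regression model would be trained on `RUL`, a classifier on `Faulty`. The choice of horizon is a business decision: it should be long enough to leave time for the maintenance itself.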

Predictive maintenance using LSTM.

You can use your own custom dataset for this example. The target variable ‘Faulty’ should be binary (1, 0), and the input training set ‘X’ can contain any number of numerical features.

import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import confusion_matrix,accuracy_score
from keras.models import Sequential
from keras.layers import Dense, Dropout, LSTM, Activation
from keras.callbacks import EarlyStopping
import matplotlib.pyplot as plt
plt.style.use('ggplot')
%matplotlib inline

features_col_name = ['Meter_No', 'KVA Rating', 'month', 'Current_L1', 'Current_L2',
'Current_L3', 'voltage_L1', 'Voltage_L2', 'Voltage_L3', 'PF_L1',
'PF_L2', 'PF_L3', 'Avg_Current', 'Avg_Voltage', 'Average_PF',
'Avg KVA Monthly', 'Avg Peaks Amps', 'Current_Unb', 'DTS_jobs']
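target_col_name = 'Faulty'

# df_train and df_test are assumed to be DataFrames you have already loaded,
# each containing the feature columns above plus the 'Faulty' column.
# Fit the scaler on the training set only, then reuse it on the test set.
sc = MinMaxScaler()
df_train[features_col_name] = sc.fit_transform(df_train[features_col_name])
df_test[features_col_name] = sc.transform(df_test[features_col_name])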

# function to generate sliding windows of shape (seq_length, n_features)
def gen_sequence(id_df, seq_length, seq_cols):
    # Left-pad with seq_length-1 rows of zeros so early rows still get a full window.
    df_zeros = pd.DataFrame(np.zeros((seq_length - 1, id_df.shape[1])), columns=id_df.columns)
    # DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent call.
    id_df = pd.concat([df_zeros, id_df], ignore_index=True)
    data_array = id_df[seq_cols].values
    num_elements = data_array.shape[0]
    lstm_array = []
    for start, stop in zip(range(0, num_elements - seq_length), range(seq_length, num_elements)):
        lstm_array.append(data_array[start:stop, :])
    return np.array(lstm_array)

# function to generate labels: one label per window, taken from the row right after it
def gen_label(id_df, seq_length, seq_cols, label):
    df_zeros = pd.DataFrame(np.zeros((seq_length - 1, id_df.shape[1])), columns=id_df.columns)
    id_df = pd.concat([df_zeros, id_df], ignore_index=True)
    num_elements = id_df.shape[0]
    y_label = []
    for start, stop in zip(range(0, num_elements - seq_length), range(seq_length, num_elements)):
        y_label.append(id_df[label].iloc[stop])
    return np.array(y_label)

# timestamp or window size
seq_length = 50
seq_cols = features_col_name

# generate X_train: build windows separately for each 'month' group, then stack them
X_train = np.concatenate([gen_sequence(df_train[df_train['month'] == m], seq_length, seq_cols)
                          for m in df_train['month'].unique()])
print(X_train.shape)
# generate y_train
y_train = np.concatenate([gen_label(df_train[df_train['month'] == m], seq_length, seq_cols, 'Faulty')
                          for m in df_train['month'].unique()])
print(y_train.shape)
# generate X_test
X_test = np.concatenate([gen_sequence(df_test[df_test['month'] == m], seq_length, seq_cols)
                         for m in df_test['month'].unique()])
print(X_test.shape)
# generate y_test
y_test = np.concatenate([gen_label(df_test[df_test['month'] == m], seq_length, seq_cols, 'Faulty')
                         for m in df_test['month'].unique()])
print(y_test.shape)

nb_features = X_train.shape[2]
timestamp = seq_length

# two stacked LSTM layers with dropout, followed by a sigmoid output for the binary target
model = Sequential()
model.add(LSTM(
    input_shape=(timestamp, nb_features),
    units=100,
    return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(
    units=50,
    return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(units=1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

# hold out 5% of the training windows for validation; stop as soon as val_loss stops improving
model.fit(X_train, y_train, epochs=100, batch_size=200, validation_split=0.05, verbose=1,
          callbacks=[EarlyStopping(monitor='val_loss', min_delta=0, patience=0, verbose=0, mode='auto')])
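# training metrics
scores = model.evaluate(X_train, y_train, verbose=1, batch_size=200)
print('Accuracy: {}'.format(scores[1]))

# predict_classes() was removed from recent Keras releases;
# threshold the sigmoid output at 0.5 to get class labels instead
y_pred = (model.predict(X_test) > 0.5).astype(int).ravel()
print('Accuracy of model on test data: ', accuracy_score(y_test, y_pred))
print('Confusion Matrix: \n', confusion_matrix(y_test, y_pred))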
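
To make the windowing logic above easier to follow, here is a minimal sketch that applies the same padding and slicing to a tiny NumPy array. The helper name `make_windows` and the toy values are assumptions for illustration. The loop bounds mirror `gen_sequence` and `gen_label`.

```python
import numpy as np


def make_windows(features, labels, seq_length):
    """Mirror of the windowing logic above on plain arrays (a sketch)."""
    n_rows, n_feat = features.shape
    # Left-pad with seq_length-1 zero rows so early rows still get a full window.
    padded_x = np.vstack([np.zeros((seq_length - 1, n_feat)), features])
    padded_y = np.concatenate([np.zeros(seq_length - 1), labels])
    total = padded_x.shape[0]
    X, y = [], []
    for start, stop in zip(range(0, total - seq_length), range(seq_length, total)):
        X.append(padded_x[start:stop])  # rows start .. stop-1
        y.append(padded_y[stop])        # label of the row right after the window
    return np.array(X), np.array(y)


# Toy data: 6 rows, 2 features; values are made up.
features = np.arange(1, 13, dtype=float).reshape(6, 2)
labels = np.array([0, 0, 0, 1, 1, 1])

X, y = make_windows(features, labels, seq_length=3)
print(X.shape, y.shape)  # (samples, seq_length, n_features) and (samples,)
print(X[0])              # first window: two padding rows, then the first real row
print(y)
```

Two things stand out. Six input rows yield five windows, because each window is paired with the label of the row that follows it. The first windows also consist mostly of zero padding, which is worth keeping in mind when a group has far fewer rows than `seq_length`.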

Logistic Regression

from sklearn import metrics
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# df is assumed to be the full dataset (features plus the 'Faulty' column)
# drop the Faulty column to get the feature matrix X
X = df.drop('Faulty', axis=1)
X.Meter_No = pd.to_numeric(X.Meter_No)
# assign the Faulty column to the target variable y
y = df['Faulty']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=100)
# Model object
log_reg = LogisticRegression()
# Fit the model on the training set
log_reg.fit(X_train, y_train)
# Predict the results on the testing set
y_pred = log_reg.predict(X_test)
# Calculate the weighted F1 score (per-class F1 averaged by class support)
f1score = f1_score(y_test, y_pred, average='weighted')
print("Logistic regression f1_score is : %f" % f1score)
cnf_matrix = metrics.confusion_matrix(y_test, y_pred)
cnf_matrix

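Random Forest Classifier.

from sklearn.ensemble import RandomForestClassifier

# Make a RandomForestClassifier object; class_weight="balanced" reweights the
# classes inversely to their frequency in the training data
rfc = RandomForestClassifier(class_weight="balanced")
# Fit the model on X_train, y_train (the split created in the logistic regression step)
rfc.fit(X_train, y_train)
# Predict on the testing set X_test
rfc_predict = rfc.predict(X_test)
rfc_predict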
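
The random forest above is trained with balanced class weights, which suggests faulty units are the minority class. In that setting accuracy alone can be misleading, so the sketch below computes precision, recall and F1 by hand with NumPy to make the arithmetic visible. The labels and predictions are made up for illustration and do not come from the dataset.

```python
import numpy as np

# Made-up ground truth: 8 healthy units (0) and 2 faulty ones (1).
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
# A hypothetical model that catches one fault and raises one false alarm.
y_hat = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 0])

# Confusion matrix cells, treating "faulty" (1) as the positive class.
tp = int(np.sum((y_true == 1) & (y_hat == 1)))
fp = int(np.sum((y_true == 0) & (y_hat == 1)))
fn = int(np.sum((y_true == 1) & (y_hat == 0)))
tn = int(np.sum((y_true == 0) & (y_hat == 0)))

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)   # of the units flagged faulty, how many really were
recall = tp / (tp + fn)      # of the truly faulty units, how many were flagged
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here the accuracy is 0.80 even though the model misses half of the faults, which is why recall and F1 on the faulty class are worth reporting alongside it.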
---

Decision Trees
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Function to split the dataset into features/target and train/test sets
def splitdataset(balance_data):
    # The first 19 columns are the features; the last column is assumed to be the target
    X = balance_data.iloc[:, 0:19]
    Y = balance_data.iloc[:, -1]
    # Splitting the dataset into train and test
    X_train, X_test, y_train, y_test = train_test_split(
        X, Y, test_size=0.3, random_state=100)
    return X, Y, X_train, X_test, y_train, y_test

# Function to perform training with the Gini index.
def train_using_gini(X_train, X_test, y_train):
    # Creating the classifier object
    clf_gini = DecisionTreeClassifier(criterion="gini", random_state=100,
                                      max_depth=3, min_samples_leaf=5)
    # Performing training
    clf_gini.fit(X_train, y_train)
    return clf_gini

# Function to perform training with entropy.
def train_using_entropy(X_train, X_test, y_train):
    # Decision tree with entropy
    clf_entropy = DecisionTreeClassifier(
        criterion="entropy", random_state=100,
        max_depth=3, min_samples_leaf=5)
    # Performing training
    clf_entropy.fit(X_train, y_train)
    return clf_entropy

# Function to make predictions
def prediction(X_test, clf_object):
    # Prediction on the test set with the given classifier
    y_pred = clf_object.predict(X_test)
    print("Predicted values:")
    print(y_pred)
    return y_pred

# Function to calculate accuracy
def cal_accuracy(y_test, y_pred):
    print("Confusion Matrix: ",
          confusion_matrix(y_test, y_pred))
    print("Accuracy : ",
          accuracy_score(y_test, y_pred) * 100)
    print("Report : ",
          classification_report(y_test, y_pred))

# Driver code
def main():
    # Building Phase (df is assumed to be the already loaded dataset)
    X, Y, X_train, X_test, y_train, y_test = splitdataset(df)
    clf_gini = train_using_gini(X_train, X_test, y_train)
    clf_entropy = train_using_entropy(X_train, X_test, y_train)

    # Operational Phase
    print("Results Using Gini Index:")
    # Prediction using gini
    y_pred_gini = prediction(X_test, clf_gini)
    cal_accuracy(y_test, y_pred_gini)

    print("Results Using Entropy:")
    # Prediction using entropy
    y_pred_entropy = prediction(X_test, clf_entropy)
    cal_accuracy(y_test, y_pred_entropy)

# Calling main function
if __name__ == "__main__":
    main()

---

Conclusion

We experimented with LSTM, random forest, decision tree, and logistic regression models to predict the status of our asset, with Faulty (0, 1) as our target variable. Time series data is really helpful and provides accurate results in these solutions.

Cheers :)

Feel free to contact me at:
LinkedIn https://www.linkedin.com/in/junaidraza52/
Whatsapp +92–3225847078
Instagram https://www.instagram.com/iamjunaidrana/

Written by Jouneid Raza

With 7 years of industry expertise, I am a seasoned software developer specializing in data science and data engineering with diverse domain experiences.
