An automated open-channel deficiency rating classification model based on machine learning in Los Angeles County

J. Li

Abstract

Introduction: An open channel is a concrete waterway that is widely used in many regions. Like other concrete structures, open channels suffer from many deficiencies, such as scour, cracking, spalling, and tilting. Regular inspections are needed to identify and rate these deficiencies. A rating represents the severity of a deficiency and is often associated with a description and suggested actions. Table 1 shows an example of spall rating classification. Manually classifying deficiency ratings is often difficult, since it requires both expertise and extensive physical inspection. Automating deficiency rating has therefore been an industry challenge. Machine learning (ML) is a type of artificial intelligence (AI) that can learn from data, identify patterns, and make decisions. In this study, we evaluate how well ML can provide accurate rating classifications of channel deficiencies without human intervention. The developed ML model is described in Methodology, followed by preliminary Model Results and a Conclusion.

Methodology: Since the goal is to rate defects or deficiencies, a supervised ML model is selected. Specifically, we train a convolutional neural network (CNN) on 80% of the total available data, then validate and test the model on the remaining 20%. For each deficiency, the total available data consists of thousands of pre-labeled deficiency photos. Table 2 lists the total available photos for each tested deficiency. Figure 1 provides the proposed modelling flowchart.

The CNN is one of the most popular neural networks in computer vision. Compared with traditional neural networks, a CNN consists of not only fully connected layers but also convolutional layers and pooling layers. This structure allows a CNN to use spatial pixel interaction information and to reduce model complexity, which makes it especially suitable for image classification.
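
To make the convolutional and pooling layers concrete, the following is a minimal pure-Python sketch, not the study's implementation. The function names, the 6x6 single-channel input, and the 3x3 filter values are all made up for illustration. It shows only how a filter sliding over an input, followed by pooling, shrinks the spatial dimensions.

```python
# Illustrative only: one hand-written 3x3 filter and 2x2 max pooling on a
# tiny single-channel "image". Sizes, values, and names are assumptions.

def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most CNN libraries, the kernel is not flipped, so this is
    strictly a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a
    complete window are dropped."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[i * size + di][j * size + dj]
                for di in range(size) for dj in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Made-up 6x6 input and a filter that responds to horizontal intensity changes.
image = [[(r * 6 + c) % 7 for c in range(6)] for r in range(6)]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

features = convolve2d(image, kernel)  # 6x6 input -> 4x4 feature map
pooled = max_pool2d(features)         # 4x4 feature map -> 2x2 summary
print(len(features), len(features[0]), len(pooled), len(pooled[0]))
```

In a trained CNN the filter values are learned rather than fixed by hand, and each layer applies many filters across all three color channels.
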
The convolutional layer parses the input photo. Each pixel of a color photo carries three brightness intensity values, one for each of the red, green, and blue ('RGB') channels. Figure 2 gives an illustrative example using a tree photo. After the photo is parsed into its RGB channels, the convolutional layer applies convolution operations to these channels using several 'filters' that contain trainable weights. Each filter moves over the input layer until all pixels have been covered. The convolution reduces the dimensions of the input layer. A pooling layer typically follows a convolutional layer and further decreases the input dimension. The pooling layer also summarizes regional pixel information, so the model becomes robust to variations in object position. Figure 3 shows an example CNN that predicts whether a photo shows a tree, a streetlamp, or a stop sign.

In this study, we further adopt a 'transfer learning' technique: a pre-trained CNN, ResNet-50, is followed by a trainable fully connected layer that yields the deficiency rating predictions. The model is developed with the open-source TensorFlow and Keras libraries.

Model Results: Results are briefly discussed in this abstract. Figure 4 shows the loss and accuracy curves over 70 epochs for cracking as an example, with class weights considered in training. As training proceeds, the validation loss generally decreases and the validation accuracy generally increases before reaching a relatively stable condition, which indicates that the model has converged.

Figure 5 plots the confusion matrix for cracking based on the cracking test data. There are three ratings for cracking (R1, R2, and R3). Each row represents the model prediction, while each column represents the true label. For example, 333 photos with a true R2 label are also predicted as R2.
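
The row and column convention described above can be sketched in a few lines of pure Python. This is an illustration only: the labels below are made up and are not the study's data, and the function names are hypothetical. Note that rows index the prediction and columns index the true label, which is the transpose of the layout some libraries use by default.

```python
# Illustrative only: made-up labels, not the cracking test data.
RATINGS = ["R1", "R2", "R3"]


def confusion_matrix(true_labels, predicted_labels, classes=RATINGS):
    """Count photos per (prediction, true label) pair.

    matrix[row][col]: row = predicted rating, col = true rating.
    """
    index = {c: k for k, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[p]][index[t]] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def precision_recall(matrix, k):
    """Precision and recall for class k under the row = prediction layout."""
    predicted_as_k = sum(matrix[k])              # row sum
    truly_k = sum(row[k] for row in matrix)      # column sum
    precision = matrix[k][k] / predicted_as_k if predicted_as_k else 0.0
    recall = matrix[k][k] / truly_k if truly_k else 0.0
    return precision, recall


y_true = ["R1", "R2", "R2", "R3", "R2", "R1"]
y_pred = ["R1", "R2", "R3", "R3", "R2", "R2"]

cm = confusion_matrix(y_true, y_pred)
print(cm)                        # diagonal cells are the correct predictions
print(accuracy(cm))
print(precision_recall(cm, 1))   # precision and recall for R2
```
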
Based on the confusion matrix, the model accuracy, precision, and recall are calculated. In this study, we select accuracy and macro-averaged F1 score as the two evaluation metrics. Accuracy is defined as the number of correct predictions divided by the total number of predictions, and it assumes that all classes are equally important. The F1 score is the harmonic mean of precision and recall, a tradeoff between the two; the macro-averaged F1 score is the unweighted mean of the per-class F1 scores. It is useful when both false negatives and false positives matter.

For some deficiencies, the rating classes are imbalanced: one rating class may have only several photos while another has thousands. In this case, the F1 score is preferred over accuracy since it accounts for the model's ability to predict the minority class. One way to increase the F1 score is to incorporate class weights into the loss function, where the class weights are the inverse of the data ratio among the rating classes. Table 3 shows the class weight information for each deficiency.

Based on the class weights, Table 4 shows the test results for all four deficiencies, where 60%-70% accuracy can generally be achieved. Comparing the no-class-weight model with the class-weight model, accuracy is generally lowered by 2% in exchange for a 0.02 increase in F1 score. Due to the limited data (Table 2), tilting has the lowest accuracy. Also, because its classes are relatively balanced (Table 3), the F1 score does not increase for tilting when moving from the no-class-weight model to the class-weight model.

Conclusion: This abstract presents the method used to develop a CNN-based machine learning model with the objective of assigning open-channel deficiency ratings. Four deficiencies were selected (cracking, spalling, tilting, and vegetation), and the preliminary results indicate a general accuracy of 60%-70% and an F1 score of 0.4-0.5. In general, the achieved accuracy is positively related to the amount of training data.
Adding class weights to the model can increase the F1 score when the deficiency rating classes are imbalanced. Further discussion will be provided in the paper that follows.
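
The interplay between imbalanced classes, class weights, and macro-averaged F1 score can be illustrated with a small pure-Python sketch. The abstract does not give the exact weighting formula, so the one below (total count divided by the number of classes times the class count, a common "balanced" heuristic) is an assumption, as are the function names and the made-up labels.

```python
# Illustrative only: the weighting formula and labels are assumptions,
# not values from the study.
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class inversely to its share of the data.

    Assumed formula: n_samples / (n_classes * class_count).
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def macro_f1(true_labels, predicted_labels):
    """Unweighted mean of the per-class F1 scores."""
    pairs = list(zip(true_labels, predicted_labels))
    classes = sorted(set(true_labels) | set(predicted_labels))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Made-up imbalanced data: a model that always predicts the majority rating.
y_true = ["R1"] * 8 + ["R2"] * 2
y_pred = ["R1"] * 10

weights = inverse_frequency_weights(y_true)
acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(weights)                     # the minority class gets the larger weight
print(acc, macro_f1(y_true, y_pred))
```

In this toy case the always-majority model reaches 80% accuracy, yet its macro-averaged F1 score is well below that because the minority rating is never predicted. Weights of this kind would then scale each photo's contribution to the training loss.
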

This paper was presented at the WEF Stormwater Summit in Minneapolis, Minnesota, June 27-29, 2022.

Speaker: Li, Jinshu

Presentation time: 10:45:00 - 11:15:00

Session time: 08:30:00 - 12:15:00

Session number: 11

Session location: Hyatt Regency Minneapolis

Topic: Information Technology, Machine Learning, Open Channel

Author(s)