Abstract:
|
[EN] Water main breaks can jeopardize the safe delivery of clean water and incur significant costs. To mitigate these risks, water main breaks have been predicted through physical and statistical approaches. The latter are less complex and can provide satisfactory results with less data. While many factors can contribute to breaks, the factors applied in previous studies depended on local data availability. Because other studies have focused on a few systems at a time, a broad comparison of factor importance has not been possible. This limits the understanding of the impact of different factors on water main deterioration. The present study identifies the most important factors driving water main breaks across 13 Canadian water systems. Twenty-eight factors describing physical, historical, protection, environmental, and operational attributes were compiled and cleaned. Availability of each attribute differed by system. To evaluate the importance of both numerical and categorical attributes together, two approaches were tested: categorical principal component analysis (CATPCA) and recursive feature elimination with cross-validation (RFECV). The target variable in both cases was yearly break status, either broken or non-broken. While CATPCA provides the contribution of each attribute to the target, RFECV provides a tuned predictive model with selected attributes. The RFECV approach was applied with Random Forest and XGBoost models, two types of machine learning models that have been shown to produce accurate results in water main break prediction. Results from both approaches showed that physical and historical attributes are generally important across all systems. Other types of data, i.e., protection and operational, are less available. When protection data were available, they were shown to be even more important than physical and historical attributes.
Specifically, with CATPCA, lining age and lining material were found to have a higher contribution to break status than pipe age and lining status. With RFECV, lining age and lining material were also included in the best models, in particular for systems with a greater percentage of lined pipes. These results indicate that the choice and timing of lining are key in extending the service life of water mains. Furthermore, these data should be collected if protection practices are in place, to more accurately predict deterioration and future costs. The results also point to an opportunity to collect more operational data. Among attributes collected by only one utility, pipe pressure, roughness, and dead-end status were found to be important in both CATPCA and RFECV. Thus, pipe dissipation and water stagnation could lead to greater pipe deterioration. Further studies are required to quantify the impacts of different pressure ranges and network designs on deterioration.
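To illustrate the RFECV step described above, the following is a minimal sketch using scikit-learn's RFECV with a Random Forest classifier. It is not the study's code: the data are synthetic, the attribute names (pipe_age, diameter, material, noise) and their distributions are invented for illustration, and the fold count, scoring metric, and forest size are assumptions. XGBoost is omitted to keep the example dependency-free beyond scikit-learn and NumPy.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(0)
n = 400

# Hypothetical attributes; names and distributions are invented for illustration.
pipe_age = rng.uniform(0, 100, n)          # numerical, years
diameter = rng.choice([150, 200, 300], n)  # numerical, mm
material = rng.integers(0, 3, n)           # categorical, ordinal-encoded
noise = rng.normal(size=n)                 # attribute unrelated to breaks

# Synthetic yearly break status (1 = broken, 0 = non-broken), driven by age only.
p_break = 1 / (1 + np.exp(-(pipe_age - 60) / 10))
y = (rng.uniform(size=n) < p_break).astype(int)

X = np.column_stack([pipe_age, diameter, material, noise])
names = ["pipe_age", "diameter", "material", "noise"]

# Recursively drop the least important attribute, scoring each subset
# with stratified cross-validation, and keep the best-scoring subset.
selector = RFECV(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    step=1,
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
    scoring="roc_auc",
)
selector.fit(X, y)

selected = [name for name, keep in zip(names, selector.support_) if keep]
print("Selected attributes:", selected)
```

In this toy setup the break probability depends only on pipe age, so that attribute should survive elimination. With real utility data, categorical attributes would need an explicit encoding choice, and the class imbalance typical of yearly break records would likely affect the choice of scoring metric.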
|