Catalogue Search | MBRL
37 result(s) for "Langousis, Andreas"
A Brief Review of Random Forests for Water Scientists and Practitioners and Their Recent History in Water Resources
by Langousis, Andreas; Tyralis, Hristos; Papacharalampous, Georgia
in Algorithms, Artificial intelligence, Big Data
2019
Random forests (RF) is a supervised machine learning algorithm, which has recently started to gain prominence in water resources applications. However, existing applications are generally restricted to the implementation of Breiman’s original algorithm for regression and classification problems, while numerous developments could be also useful in solving diverse practical problems in the water sector. Here we popularize RF and their variants for the practicing water scientist, and discuss related concepts and techniques, which have received less attention from the water science and hydrologic communities. In doing so, we review RF applications in water resources, highlight the potential of the original algorithm and its variants, and assess the degree of RF exploitation in a diverse range of applications. Relevant implementations of random forests, as well as related concepts and techniques in the R programming language, are also covered.
Journal Article
Super ensemble learning for daily streamflow forecasting: large-scale demonstration and comparison with multiple machine learning algorithms
by Langousis, Andreas; Tyralis, Hristos; Papacharalampous, Georgia
in Algorithms, Artificial Intelligence, Autoregressive models
2021
Daily streamflow forecasting through data-driven approaches is traditionally performed using a single machine learning algorithm. Existing applications are mostly restricted to examination of few case studies, not allowing accurate assessment of the predictive performance of the algorithms involved. Here, we propose super learning (a type of ensemble learning) by combining 10 machine learning algorithms. We apply the proposed algorithm in one-step-ahead forecasting mode. For the application, we exploit a big dataset consisting of 10-year long time series of daily streamflow, precipitation and temperature from 511 basins. The super ensemble learner improves over the performance of the linear regression algorithm by 20.06%, outperforming the “hard to beat in practice” equal weight combiner. The latter improves over the performance of the linear regression algorithm by 19.21%. The best performing individual machine learning algorithm is neural networks, which improves over the performance of the linear regression algorithm by 16.73%, followed by extremely randomized trees (16.40%), XGBoost (15.92%), loess (15.36%), random forests (12.75%), polyMARS (12.36%), MARS (4.74%), lasso (0.11%) and support vector regression (−0.45%). Furthermore, the super ensemble learner outperforms exponential smoothing and autoregressive integrated moving average (ARIMA). These latter two models improve over the performance of the linear regression algorithm by 13.89% and 8.77%, respectively. Based on the obtained large-scale results, we propose super ensemble learning for daily streamflow forecasting.
Journal Article
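As an illustration of the "equal weight combiner" and the "improves over ... by X%" figures mentioned in the abstract above, the following is a minimal sketch. The abstract does not state which error metric was used, so RMSE is an assumption here, and the function names and toy numbers are hypothetical rather than taken from the paper.

```python
import math

def rmse(pred, obs):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

def equal_weight_combiner(forecasts):
    """Average the forecasts of several base learners, time step by time step."""
    n_models = len(forecasts)
    return [sum(step) / n_models for step in zip(*forecasts)]

def relative_improvement(score, benchmark_score):
    """Percentage improvement of an error score over a benchmark (positive = better)."""
    return 100.0 * (benchmark_score - score) / benchmark_score

# Hypothetical toy data, not taken from the paper.
observed = [10.0, 12.0, 11.0, 13.0]
model_a = [11.0, 13.0, 12.0, 14.0]  # biased high by 1
model_b = [9.0, 11.0, 10.0, 12.0]   # biased low by 1

combined = equal_weight_combiner([model_a, model_b])
improvement = relative_improvement(rmse(combined, observed), rmse(model_a, observed))
```

In this toy case the two opposite biases cancel, which is the intuition behind why a simple average can be hard to beat; a super learner would instead estimate the combination weights from data.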
A critical analysis of the shortcomings in spatial frequency analysis of rainfall extremes based on homogeneous regions and a comparison with a hierarchical boundaryless approach
by Deidda, Roberto; Langousis, Andreas; Hellies, Matteo
in Annual rainfall, Estimates, Extreme values
2021
We investigate and discuss limitations of the approach based on homogeneous regions (hereafter referred to as regional approach) in describing the frequency distribution of annual rainfall maxima in space, and compare its performance with that of a boundaryless approach. The latter is based on geostatistical interpolation of the at-site estimates of all distribution parameters, using kriging for uncertain data. Both approaches are implemented using a generalized extreme value theoretical distribution model to describe the frequency of annual rainfall maxima at a daily resolution, obtained from a network of 256 raingauges in Sardinia (Italy) with more than 30 years of complete recordings, and approximate density of 1 gauge per 100 km². We show that the regional approach exhibits limitations in describing local precipitation features, especially in areas characterized by complex terrain, where sharp changes to the shape and scale parameters of the fitted distribution models may occur. We also emphasize limitations and possible ambiguities arising when inferring the distribution of annual rainfall maxima at locations close to the interface of contiguous homogeneous regions. Through implementation of a leave-one-out cross-validation procedure, we evaluate and compare the performances of the regional and boundaryless approaches mimicking ungauged conditions, clearly showing the superiority of the boundaryless approach in describing local precipitation features, while avoiding abrupt changes of distribution parameters and associated precipitation estimates, induced by splitting the study area into contiguous homogeneous regions.
Journal Article
Explanation and Probabilistic Prediction of Hydrological Signatures with Statistical Boosting Algorithms
by Langousis, Andreas; Tyralis, Hristos; Papacharalampous, Georgia
in catchment hydrology, flow indices, flow metrics
2021
Hydrological signatures, i.e., statistical features of streamflow time series, are used to characterize the hydrology of a region. A relevant problem is the prediction of hydrological signatures in ungauged regions using the attributes obtained from remote sensing measurements at ungauged and gauged regions together with estimated hydrological signatures from gauged regions. The relevant framework is formulated as a regression problem, where the attributes are the predictor variables and the hydrological signatures are the dependent variables. Here we aim to provide probabilistic predictions of hydrological signatures using statistical boosting in a regression setting. We predict 12 hydrological signatures using 28 attributes in 667 basins in the contiguous US. We provide formal assessment of probabilistic predictions using quantile scores. We also exploit the statistical boosting properties with respect to the interpretability of derived models. It is shown that probabilistic predictions at quantile levels 2.5% and 97.5% using linear models as base learners exhibit better performance compared to more flexible boosting models that use both linear models and stumps (i.e., one-level decision trees). On the contrary, boosting models that use both linear models and stumps perform better than boosting with linear models when used for point predictions. Moreover, it is shown that climatic indices and topographic characteristics are the most important attributes for predicting hydrological signatures.
Journal Article
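The abstract above assesses probabilistic predictions "using quantile scores". A common form of the quantile score is the pinball loss; the sketch below assumes that form (the abstract does not give the formula), and the function name is hypothetical.

```python
def quantile_score(observed, predicted, tau):
    """Average pinball (quantile) loss of predicted quantiles at level tau.

    Lower values indicate better predictions of the tau-quantile.
    """
    total = 0.0
    for y, q in zip(observed, predicted):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(observed)

# Hypothetical toy values: one over-prediction and one under-prediction.
median_score = quantile_score([1.0, 3.0], [2.0, 2.0], tau=0.5)
upper_miss = quantile_score([3.0], [2.0], tau=0.975)
```

At a high level such as 97.5%, an observation above the predicted quantile is penalised far more heavily than one below it, which is what makes the score suitable for evaluating the quantile levels 2.5% and 97.5% named in the abstract.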
Probabilistic Hydrological Post-Processing at Scale: Why and How to Apply Machine-Learning Quantile Regression Algorithms
by Langousis, Andreas; Mamassis, Nikos; Papacharalampous, Georgia
in Algorithms, artificial intelligence, evapotranspiration
2019
We conduct a large-scale benchmark experiment aiming to advance the use of machine-learning quantile regression algorithms for probabilistic hydrological post-processing “at scale” within operational contexts. The experiment is set up using 34-year-long daily time series of precipitation, temperature, evapotranspiration and streamflow for 511 catchments over the contiguous United States. Point hydrological predictions are obtained using the Génie Rural à 4 paramètres Journalier (GR4J) hydrological model and exploited as predictor variables within quantile regression settings. Six machine-learning quantile regression algorithms and their equal-weight combiner are applied to predict conditional quantiles of the hydrological model errors. The individual algorithms are quantile regression, generalized random forests for quantile regression, generalized random forests for quantile regression emulating quantile regression forests, gradient boosting machine, model-based boosting with linear models as base learners and quantile regression neural networks. The conditional quantiles of the hydrological model errors are transformed to conditional quantiles of daily streamflow, which are finally assessed using proper performance scores and benchmarking. The assessment concerns various levels of predictive quantiles and central prediction intervals, while it is made both independently of the flow magnitude and conditional upon this magnitude. Key aspects of the developed methodological framework are highlighted, and practical recommendations are formulated. In technical hydro-meteorological applications, the algorithms should be applied preferably in a way that maximizes the benefits and reduces the risks from their use. This can be achieved by (i) combining algorithms (e.g., by averaging their predictions) and (ii) integrating algorithms within systematic frameworks (i.e., by using the algorithms according to their identified skills), as our large-scale results point out.
Journal Article
Leakages in Water Distribution Networks: Estimation Methods, Influential Factors, and Mitigation Strategies—A Comprehensive Review
by Langousis, Andreas; Kokosalakis, George; Deidda, Roberto
in Aquatic resources, Aquifers, Drinking water
2024
While only a minimal fraction of global water resources is accessible for drinking water production, their uneven distribution, combined with the impacts of the climate crisis, leads to challenges in water availability. Leakage in water distribution networks compounds these issues, resulting in significant economic losses and environmental risks. A coherent review of (a) the most widely applied water loss estimation techniques, (b) factors influencing them, and (c) strategies for their resilient reduction provides a comprehensive understanding of the current state of knowledge and practices in leakage management. This work aims to cover the most important leakage estimation methodologies, while also unveiling the factors that critically affect them, both internally and externally. Finally, a thorough discussion is provided regarding the current state-of-the-art techniques for leakage reduction at the municipal-wide level.
Journal Article
Undersampling in action and at scale: application to the COVID-19 pandemic
2020
It is the purpose of this short communication to analyze the possible caveats in the statistical interpretation of collected data, particularly in the light of decision-making concerning the current COVID-19 coronavirus pandemic. A mitigation of undersampling is proposed, based on re-scaling of statistics that can be considered reliable, such as deaths, and epidemic properties like mortality, that may be considered comparable between countries with similar levels of health care, which would not have reached a saturation level.
Journal Article
Probabilistic estimation of minimum night flow in water distribution networks: large-scale application to the city of Patras in western Greece
by Kokosalakis, George; Deidda, Roberto; Langousis, Andreas
in Average flow, Confidence intervals, Environmental research
2022
We introduce two alternative probabilistic approaches for minimum night flow (MNF) estimation in water distribution networks (WDNs), which are particularly suited to minimize noise effects, allowing for a better representation of the low flows during night hours, as well as the overall condition of the network. The strong point of both approaches is that they allow for confidence interval estimation of the observed MNFs. The first approach is inspired by filtering theory, and proceeds by identifying a proper scale for temporal averaging to filter out noise effects in the obtained MNF estimates. The second approach is more intuitive, as it estimates MNF as the average flow of the most probable low-consumption states of the night flows. The efficiency of the developed methods is tested in a large-scale real-world application, using flow-pressure data at 1-min temporal resolution for a 4-month winter period (i.e. November 2018–February 2019) from the water distribution network of the City of Patras (i.e. the third largest city in Greece). Patras’ WDN covers an area of approximately 27 km², consists of 700 km of pipeline serving approximately 213,000 consumers, and includes 86 Pressure Management Areas (PMAs) equipped with automated local stations for pressure regulation. Although conceptually and methodologically different, the two probabilistic approaches lead to very similar results, substantiating the robustness of the obtained findings from two independent standpoints, making them suitable for engineering applications and beyond.
Journal Article
Revisiting the Statistical Scaling of Annual Discharge Maxima at Daily Resolution with Respect to the Basin Size in the Light of Rainfall Climatology
2020
Over the years, several studies have been carried out to investigate how the statistics of annual discharge maxima vary with the size of basins, with diverse findings regarding the observed type of scaling (i.e., simple scaling vs. multiscaling), especially in cases where the data originated from regions with significantly different hydroclimatic characteristics. In this context, an important question arises on how one can effectively conclude on an approximate type of statistical scaling of annual discharge maxima with respect to the basin size. The present study aims at addressing this question, using daily discharges from 805 catchments located in different parts of the United Kingdom, with at least 30 years of recordings. To do so, we isolate the effects of the catchment area and the local rainfall climatology, and examine how the statistics of the standardized discharge maxima vary with the basin scale. The obtained results show that: (a) the local rainfall climatology is an important contributor to the observed statistics of peak annual discharges, and (b) when the effects of the local rainfall climatology are properly isolated, the scaling of the standardized annual discharge maxima with the area of the catchment closely follows that commonly met in actual rainfields, deviating significantly from the simple scaling rule. The aforementioned findings explain to a large extent the diverse results obtained by previous studies in the absence of rainfall information, shedding light on the approximate type of scaling of annual discharge maxima with the basin size.
Journal Article
ITSO: a novel inverse transform sampling-based optimization algorithm for stochastic search
by Bakas, Nikolaos P; Langousis, Andreas; Plevris, Vagelis
in Algorithms, Artificial intelligence, Business machines
2022
Optimization algorithms appear in the core calculations of numerous Artificial Intelligence (AI) and Machine Learning methods and Engineering and Business applications. Following recent works on AI’s theoretical deficiencies, a rigorous context for the optimization problem of a black-box objective function is developed. The algorithm stems directly from the theory of probability, instead of presumed inspiration. Thus the convergence properties of the proposed methodology are inherently stable. In particular, the proposed optimizer utilizes an algorithmic implementation of the n-dimensional inverse transform sampling as a search strategy. No control parameters are required to be tuned, and the trade-off between exploration and exploitation is, by definition, satisfied. A theoretical proof is provided, concluding that when falling into the proposed framework, either directly or incidentally, any optimization algorithm converges. The numerical experiments verify the theoretical results on the efficacy of the algorithm apropos reaching the sought optimum.
Journal Article
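The abstract above builds on inverse transform sampling. The sketch below shows only the basic one-dimensional technique, drawing exponential samples by inverting the cumulative distribution function; it is not the ITSO optimizer itself, and the choice of distribution, rate, seed, and function name are assumptions made for illustration.

```python
import math
import random

def sample_exponential(rate, rng):
    """Draw one sample from Exp(rate) by inverting its CDF F(x) = 1 - exp(-rate * x)."""
    u = rng.random()  # u ~ Uniform[0, 1)
    # Solving u = F(x) for x gives x = -ln(1 - u) / rate.
    return -math.log(1.0 - u) / rate

rng = random.Random(0)  # fixed seed so the run is reproducible
samples = [sample_exponential(2.0, rng) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)  # should be close to 1 / rate = 0.5
```

The same idea, mapping uniform random numbers through an inverse distribution function, is what the abstract describes as being extended to n dimensions and used as a search strategy.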