Capabilities of deep learning models on learning physical relationships: Case of rainfall-runoff modeling with LSTM

2022-01-01
Yokoo, Kazuki
Ishida, Kei
Ercan, Ali
Tu, Tongbi
Nagasato, Takeyoshi
Kiyama, Masato
Amagasaki, Motoki
ABSTRACT This study investigates the relationships that deep learning methods can identify between input and output data. As a case study, rainfall-runoff modeling in a snow-dominated watershed by means of a long short-term memory (LSTM) network is selected. Daily precipitation and mean air temperature were used as model inputs to estimate daily flow discharge. After model training and verification, two experimental simulations were conducted with hypothetical inputs instead of observed meteorological data to clarify the response of the trained model to the inputs. The first numerical experiment showed that even without input precipitation, the trained model generated flow discharge, particularly winter low flow and high flow during the snowmelt period. The effects of warmer and colder conditions on the flow discharge were also replicated by the trained model without precipitation. Additionally, the model reflected only 17-39% of the total precipitation mass during the snow accumulation period in the total annual flow discharge, revealing a strong lack of water mass conservation. The results of this study indicate that deep learning methods may not properly learn the explicit physical relationships between input and target variables, although they are still capable of maintaining strong goodness-of-fit results. (c) 2021 Published by Elsevier B.V.
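The mass-conservation check described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes that the experiment sets precipitation to zero during the snow accumulation period and compares the resulting drop in total simulated discharge with the precipitation mass removed. The function `reflected_precip_fraction`, the stand-in `toy_model`, and all input values are hypothetical; a trained LSTM would take the place of `toy_model`.

```python
import numpy as np


def reflected_precip_fraction(model, precip, temp, accumulation_mask):
    """Share of accumulation-period precipitation that appears in simulated discharge.

    model: callable (precip, temp) -> discharge, with precipitation and
    discharge expressed in the same depth units (e.g. mm/day).
    """
    precip = np.asarray(precip, dtype=float)
    temp = np.asarray(temp, dtype=float)
    mask = np.asarray(accumulation_mask, dtype=bool)

    # Simulation with the original inputs.
    baseline = model(precip, temp)

    # Hypothetical input: no precipitation during the accumulation period.
    precip_removed = np.where(mask, 0.0, precip)
    perturbed = model(precip_removed, temp)

    removed_mass = precip[mask].sum()
    discharge_change = baseline.sum() - perturbed.sum()
    return discharge_change / removed_mass


def toy_model(precip, temp):
    # Hypothetical stand-in for a trained network: it passes on 30% of the
    # precipitation and adds a temperature-driven component.
    return 0.3 * precip + 0.1 * np.maximum(temp, 0.0)


days = np.arange(365)
precip = np.full(365, 2.0)                           # synthetic, mm/day
temp = 10.0 * np.sin(2.0 * np.pi * (days - 100) / 365.0)  # synthetic, deg C
accumulation_mask = days < 90                        # assumed accumulation period

fraction = reflected_precip_fraction(toy_model, precip, temp, accumulation_mask)
print(f"fraction of removed precipitation reflected in discharge: {fraction:.2f}")
```

A model that conserves water mass would give a fraction close to 1 once evaporative losses are accounted for; the toy model returns 0.3 by construction because it passes on only 30% of the precipitation.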
SCIENCE OF THE TOTAL ENVIRONMENT

Citation Formats
K. Yokoo et al., “Capabilities of deep learning models on learning physical relationships: Case of rainfall-runoff modeling with LSTM,” SCIENCE OF THE TOTAL ENVIRONMENT, vol. 802, 2022. [Online]. Available: https://hdl.handle.net/11511/100154.