Assessing dengue forecasting methods: a comparative study of statistical models and machine learning techniques in Rio de Janeiro, Brazil

Chen, Xiang; Moraga, Paula

doi:10.1186/s41182-025-00723-7

Research
Open access
Published: 10 April 2025

Assessing dengue forecasting methods: a comparative study of statistical models and machine learning techniques in Rio de Janeiro, Brazil

Xiang Chen¹ &
Paula Moraga¹

Tropical Medicine and Health volume 53, Article number: 52 (2025) Cite this article

618 Accesses
1 Altmetric
Metrics details

Abstract

Background

Dengue is a mosquito-borne viral disease that poses a significant public health threat in tropical and subtropical regions worldwide. Accurate forecasting of dengue outbreaks is crucial for effective public health planning and intervention. This study aims to assess the predictive performance and computational efficiency of a number of statistical models and machine learning techniques for dengue forecasting, both with and without the inclusion of climate factors, to inform the design of dengue surveillance systems.

Methods

The dengue forecasting methods comparison in this study considers dengue cases in Rio de Janeiro, Brazil, as well as climate factors known to affect disease transmission. Employing a dynamic window approach, various statistical methods and machine learning techniques were used to generate weekly forecasts at several time horizons. Error measures, uncertainty intervals, and computational efficiency obtained with each method were compared. Statistical models considered were Autoregressive (AR), Moving Average (MA), Autoregressive Integrated Moving Average (ARIMA), and Exponential Smoothing State Space Model (ETS). In addition, models incorporating temperature and humidity as covariates, such as Vector Autoregression (VAR) and Seasonal ARIMAX (SARIMAX), were employed. Machine learning techniques evaluated were Random Forest, XGBoost, Support Vector Machine (SVM), Long–Short-Term Memory (LSTM) networks, and Prophet. Ensemble approaches that integrated the top performing methods were also considered. The evaluated methods also incorporated lagged climatic variables to account for delayed effects.

Results

Among the statistical models, ARIMA demonstrated the best performance using only historical case data, while SARIMAX significantly improved predictive accuracy by incorporating climate covariates. In general, the LSTM model, particularly when combined with climate covariates, proved to be the most accurate machine learning model, despite being slower to train and predict. For long-term forecasts, Prophet with climate covariates was the most effective. Ensemble models, such as the combination of LSTM and ARIMA, showed substantial improvements over individual models.

Conclusions

This study demonstrates the strengths and limitations of various methods for dengue forecasting across multiple timeframes. It highlights the best-performing statistical and machine learning methods, including their computational efficiency, underscoring the significance of machine learning techniques and the integration of climate covariates to improve forecasts. These findings offer valuable insights for public health officials, facilitating the development of dengue surveillance systems for more accurate forecasting and timely allocation of resources to mitigate dengue outbreaks.

Introduction

Dengue is a viral infection transmitted through the bites of infected female Aedes mosquitoes. The main vector for dengue, the Aedes aegypti mosquito, thrives in high temperature and humidity, conditions that also promote viral replication. The dengue virus exists in four distinct serotypes, namely, DENV-1, DENV-2, DENV-3, and DENV-4. Infection with any serotype can result in dengue fever, manifesting from mild febrile illness to severe forms, such as dengue hemorrhagic fever or dengue shock syndrome, which can be fatal.

Dengue represents a significant public health concern in tropical and subtropical regions in the world and is now established as endemic in over hundred of countries across Africa, the Americas, the Eastern Mediterranean, South–East Asia, and the Western Pacific [1]. Moreover, dengue cases are projected to increase and expand into new territories in the coming years as climate change alters epidemiological patterns [2]. Dengue not only affects population health but also places a significant economic burden on a global scale.

Currently, there is no universal treatment for dengue. However, various prevention strategies are available, including personal protection, chemical control, and environmental management of Aedes mosquitoes [3]. Enhanced surveillance is also crucial for early detection and response to dengue outbreaks allowing public health authorities to respond promptly and implement control measures to allocate resources in areas of higher risk and prevent further spread.

Forecasting dengue cases involves various methodologies that range from traditional statistical models to advanced machine learning techniques. Each method offers distinct advantages and limitations in terms of accuracy, complexity, and applicability. Traditional statistical methods, such as ARIMA (Autoregressive Integrated Moving Average) and seasonal decomposition, have been widely used for infectious disease forecasting [4,5,6,7]. These models are particularly effective at capturing linear relationships and seasonal patterns in time-series data. However, they often require stationary data and may struggle to handle non-linear interactions or abrupt changes in trends.

In recent years, machine learning models have gained prominence due to their ability to learn complex patterns from data without explicit programming. Techniques like Random Forests, Support Vector Machines (SVM), and neural networks, including Long–Short-Term Memory (LSTM) networks, offer powerful alternatives to traditional models. Researchers have applied these approaches to predict dengue cases and outbreaks in various regions, demonstrating their effectiveness in capturing disease transmission dynamics [8,9,10]. These methods can capture complex non-linear interactions and are generally more flexible in handling various data types and structures. Nevertheless, machine learning models often require large data sets for training and can be opaque, making interpretation challenging.

Incorporating climate and environmental information has been shown to enhance both statistical and machine learning models. Temperature, rainfall, and humidity have been linked to the breeding and survival rates of mosquitoes, thereby affecting dengue transmission rates [11]. Models that integrate these environmental factors tend to perform better in predicting dengue cases over longer time horizons, reflecting the complex interplay between the pathogen, host, and environment [12,13,14,15,16,17].

In this paper, we evaluate a number of statistical models and machine learning techniques for dengue forecasting. The aim is to determine which methods provide superior predictive performance and computational efficiency, which could inform the design of dengue surveillance systems. Several studies have compared the performance of various dengue prediction methods [8, 16,17,18,19,20,21]. However, these studies have several gaps which we aim to address in our evaluation. First, unlike previous studies which often focus on monthly predictions, we consider forecasts at a weekly resolution. This finer granularity provides a more precise and actionable scale for public health decision-making. In addition, in contrast to other works that typically use a fixed window approach to assess performance, we adopt a moving window strategy, where models are continuously trained with newly acquired data. This approach reflects what would happen in real-time scenarios, where data are continuously updated. Moreover, this approach allows us to capture the evolving patterns and trends of dengue, enhancing models’ robustness and accuracy over time. While social aspects, population density, human mobility, and socioeconomic conditions are also important factors influencing dengue transmission, this study focuses on climate data due to its easy availability.

Understanding the uncertainty associated with forecasts is critical for public health decision-making. This study computes uncertainty intervals and coverage probabilities, providing a more comprehensive understanding of the predictions’ reliability. Many other studies only report point predictions, which do not convey the inherent uncertainties in the forecasts. In addition to single models, our research also compares ensemble approaches that combine multiple forecasting methods that leverage the unique strengths of individual models to enhance predictive accuracy. The consideration of ensembles highlight the potential of these approaches for dengue forecasting.

In addition, we assess the models’ performance across several forecast horizons, from short-term (1–4 weeks) to long-term (8–12 weeks) predictions. In contrast to other studies that often focus on a single forecast horizon, this comprehensive evaluation allows us to understand the models’ strengths and limitations across different time scales, providing a more detailed picture of their forecasting capabilities. Finally, we report the computational time required to run each forecasting method. This aspect is particularly important in settings with limited resources, where the time and cost associated with running complex models can be prohibitive.

In our evaluation, we utilize weekly cases in Rio de Janeiro, Brazil, a region that faces frequent dengue epidemics. Dengue cases and climate variables were obtained from InfoDengue [22], a system that partially automates the collection, organization, and analysis of climate and epidemiological data of dengue and other arboviruses in municipalities across Brazil. The analysis covers data from the first epidemiological week of 2016 to the last epidemiological week of 2023, providing an 8-year span to capture the most recent dengue fever trends in Rio de Janeiro.

The methods considered in the analysis include statistical and machine learning methods known for their demonstrated effectiveness in time series forecasting applied to epidemiology. Specifically, we utilize the statistical methods Autoregressive (AR), Moving-Average (MA), ARIMA [23, 24], and Exponential Smoothing (ETS) [25], known for their robustness in analyzing time-dependent patterns. We also consider Seasonal ARIMA with eXogenous variables (SARIMAX) [24] and Vector Autoregression (VAR) [23, 26] models, which allow us to integrate external variables and assess their impact alongside historical disease trends. In addition, we explore machine learning techniques like Random Forest [27], XGBoost [28], SVM [29, 30], LSTM [31, 32], and Prophet [33]. These methods are selected for their advanced capabilities in modeling complex interactions and non-linear relationships [34, 35], applicable both with and without climate covariates. Finally, we consider ensemble approaches that combine statistical and machine learning approaches.

Dengue and climate data in Rio de Janeiro, Brazil

Rio de Janeiro, capital of the state of the same name situated in southeastern Brazil, is a vast urban area known for its high population density and subtropical climate (Fig. 1). With a population exceeding 16 million [36], the city is particularly susceptible to dengue fever, largely because of the conducive environment for the proliferation of Aedes aegypti mosquitoes. The warm, humid climate in the area provides an optimal breeding habitat for these mosquitoes, facilitating the replication of dengue viruses.

Dengue has been a persistent problem in Rio de Janeiro state, with significant outbreaks recorded since the reintroduction of the Aedes aegypti mosquito in 1977 [39]. The state has been one of the main hotspots for dengue in Brazil, accounting for a substantial portion of the country’s cases. Between 1986 and 2015, Rio de Janeiro experienced seven major epidemics, with the introduction of different dengue serotypes (DENV-1 to DENV-4) contributing to severe outbreaks. For example, the 2002 epidemic resulted in 288,245 reported cases, including 1,831 hemorrhagic cases and 91 deaths [39]. More recently, the simultaneous circulation of all four dengue serotypes since 2011 has further complicated the control efforts in the state. Rio de Janeiro city’s status as a major tourist destination, especially during summer months, exacerbates dengue transmission due to increased human movement and dense local mosquito populations, making it one of the most receptive regions for the introduction and persistence of dengue viruses.

Dengue is a notifiable disease within Brazil’s public health framework, requiring health workers to report suspected cases through a structured notification process. This process culminates in a national database managed by the Ministry of Health, which, despite the limitation that only a fraction of cases are laboratory-confirmed, provides invaluable incidence indicators for disease monitoring and response.

The InfoDengue system [22] offers a semi-automated pipeline for data collection, harmonization, and analysis at the municipal level. This system generates crucial indicators of the epidemiological situation of dengue and other arboviruses, such as Zika and Chikungunya, facilitating timely and informed public health responses. In addition to the reported disease cases, InfoDengue incorporates weather data, acknowledging the significant impact of climate on arbovirus transmission. Specifically, temperature and humidity data, sourced from airport weather stations and satellite imagery, provided by InfoDengue at a weekly resolution, aligned with the dengue case data, are provided to understand and predict disease spread patterns. These weekly data were directly used in our models without additional aggregation, ensuring consistency in temporal granularity across all variables.

Figure 2 displays the weekly cases of dengue reported in Rio de Janeiro from 2016 to 2023. As seen in the graph, dengue cases display significant variability with an average of approximately 295 cases, and notable peaks reaching up to 3127 cases. The months with higher dengue cases typically occur between March and May, consistent with known seasonal patterns in Brazil [40,41,42]. Figure 3 illustrates the patterns of median weekly temperature and humidity over the same period. Notably, temperature and humidity exhibit opposite trends, when temperature is high, humidity tends to be low, and vice versa. The median weekly temperature over the study period was $23.15^{\circ }\hbox {C}$, with values ranging from $15.60^{\circ }\hbox {C}$ to $30.55^{\circ }\hbox {C}$. Humidity levels showed an average of 79.36%, fluctuating between 54.20% and 91.90%, reflecting seasonal variations. Temperature tends to peak during the summer months, while humidity shows some variability across seasons. The fluctuations in these climate factors often align with changes in dengue cases, underscoring their importance in predicting disease transmission.

Methodology

We compare the predictive performance and computational time of a number of statistical methods and machine learning techniques to forecast dengue cases using a moving window strategy. In this strategy, also known as a rolling forecast or rolling window approach, the model is trained on a fixed-size segment of historical data to predict future values. As new data becomes available, the window moves forward by one or more timepoints, dropping the oldest data in the set and incorporating the most recent data for subsequent predictions. This method is particularly advantageous for forecasting tasks, where the relationship between past and future values may change over time.

For the purpose of predicting dengue cases in Rio de Janeiro, we implemented a moving window strategy with a fixed window size of 6 years (Fig. 4). This window size was selected to capture the long-term trends and seasonal patterns of dengue cases while providing a sufficiently large data set for model training. Since the window size influences model performance, it is treated as a tunable parameter rather than a fixed choice. This study prioritizes introducing forecasting methods over exhaustive parameter tuning, allowing health practitioners to adjust window sizes based on specific needs to enhance accuracy in real-world applications. Initially, the training window consisted of 2016-01-03 (the first epidemiological week in 2016) to 2021-12-26 (the last epidemiological week in 2021), with a window size of 312 weeks (52 weeks per year $\times$ 6 years) and was moved 1 week until 2023-12-24 (the last epidemiological week in 2023). The forecasting horizon was set to 1, 2, 3, 4, 8, and 12 weeks ahead, allowing us to evaluate the models’ performance over short-to-medium-term predictions. In the following sections, we describe the statistical methods and machine learning techniques employed, the procedure to compute the associated uncertainty intervals, and the performance evaluation metrics utilized for comparison.

Statistical models for dengue forecasting

The statistical models considered to predict dengue include Autoregressive (AR(1)), Moving Average (MA(1)), Autoregressive Integrated Moving Average (ARIMA) [23, 24], and Exponential Smoothing State Space Model (ETS) [25]. These models were chosen for their ability to capture various patterns in time series data, such as trends, seasonality, and autocorrelation. In addition, we use models that incorporate temperature and humidity as covariates to capture the influence of these factors on dengue transmission. These models are Vector Autoregression (VAR) [23, 26] and Seasonal ARIMAX (SARIMAX) [24]. Furthermore, to account for the delayed effects of climatic factors on dengue cases, VAR and SARIMAX models are also fitted using lagged variables ranging from 1 week to 4 weeks. Previous studies support this approach, indicating that temperature and humidity at a lag of 1 month are positively associated with dengue cases [43, 44]. Let $X_t$ represent the number of dengue cases at time $t$. The statistical models employed are specified as follows:

The Autoregressive Model (AR(1)) predicts future values based on a combination of past values. It is defined as
$$\begin{aligned} X_t = \phi X_{t-1} + \epsilon _t, \end{aligned}$$
where $\phi$ is the coefficient that measures the influence of the immediately preceding value ($X_{t-1}$), and $\epsilon _t$ is the white noise error term at time $t$. This model is particularly effective for data showing a strong correlation with its immediate past value.
The Moving Average Model (MA(1)) model captures the relationship between an observation and a residual error from a moving average model applied to lagged observations. It is represented as
$$\begin{aligned} X_t = \mu + \epsilon _t + \theta _1\epsilon _{t-1}, \end{aligned}$$
where $\mu$ is the mean of the series, $\epsilon _t$ is the white noise error term, and $\theta _1$ is the coefficient for the lagged error term. This model is effective in smoothing out short-term fluctuations and identifying patterns that persist over time.
The Autoregressive Integrated Moving Average (ARIMA) model combines the AR and MA models and integrates differencing to make the data stationary. It is denoted as ARIMA(p, d, q), where $p$ is the order of the AR term, $d$ is the degree of differencing, and $q$ is the order of the MA term. The general form of an ARIMA model is
$$\begin{aligned} \Delta ^d X_t = \phi _1 \Delta ^d X_{t-1} + \cdots + \phi _p \Delta ^d X_{t-p} + \epsilon _t + \theta _1 \epsilon _{t-1} + \cdots + \theta _q \epsilon _{t-q}, \end{aligned}$$
where $\Delta ^d$ denotes differencing $d$ times to achieve stationarity. ARIMA models are versatile and can model data with trends and seasonal components.
The Exponential Smoothing State Space Model (ETS) model accounts for trends and seasonality in the data through exponential smoothing. The general form of the ETS model can be written as
$$\begin{aligned} X_t = l_{t-1} + b_{t-1} + s_{t-m} + \epsilon _t, \end{aligned}$$
where $l_t$ is the level of the series, capturing the long-term average behavior; $b_t$ is the trend, indicating the direction and speed of the change; $s_t$ is the seasonal component, capturing periodic fluctuations; $m$ is the period of seasonality, and $\epsilon _t$ is the error term. ETS models are particularly useful for capturing complex seasonality patterns in time series data.
The VAR model captures the linear interdependencies among multiple time series. It generalizes the AR model by allowing for more than one evolving variable. All the variables in a VAR are treated symmetrically; each variable has an equation explaining its evolution based on its own lags and the lags of the other model variables. For a system of $n$ time series (in our case, dengue cases, temperature, and humidity), a VAR model of order $p$ (VAR(p)) can be written as follows:
$$\begin{aligned} V_t = A_1 V_{t-1} + \cdots + A_p V_{t-p} + \epsilon _t, \end{aligned}$$
where $V_t$ is a vector of time series variables including dengue cases, temperature, and humidity at time $t$, $A_i$ are coefficient matrices, and $\epsilon _t$ is a vector of error terms. VAR models are suitable for capturing the dynamic relationships between multiple time series, such as dengue cases, temperature, and humidity.
Seasonal ARIMAX (SARIMAX) extends the ARIMA model by incorporating exogenous variables (X) and seasonal components (S). It is a powerful and flexible model that can account for complex behaviors in time series data. It is represented as SARIMAX(p, d, q)(P, D, Q)_s, where $p, d, q$ are the non-seasonal parameters, $P, D, Q$ are the seasonal parameters, and $s$ is the length of the seasonal cycle. Let $X_t$ be the number of dengue cases at time t, and let $Z_t$ be the vector of exogenous variables (i.e., temperature and humidity), the SARIMAX model can be represented as
$$\begin{aligned} \Delta ^d \Delta _s^D X_t = \phi _1 \Delta ^d \Delta _s^D X_{t-1} + \cdots + \phi _p \Delta ^d \Delta _s^D X_{t-p} + \epsilon _t + \theta _1 \epsilon _{t-1} + \cdots + \theta _q \epsilon _{t-q} + \beta Z_t. \end{aligned}$$
SARIMAX is particularly effective in incorporating seasonal effects and external influences on dengue transmission.

Machine learning techniques for dengue forecasting

We also consider various machine learning techniques that have demonstrated significant potential in enhancing dengue forecasting by utilizing both historical dengue case data and additional climate covariates, such as temperature and humidity. Our selection includes a diverse array of models, each chosen for their unique strengths in dealing with complex epidemiological data, and their flexibility allows for the integration of lagged variables, providing a richer analysis that accounts for past data trends and environmental influences. These methods include decision tree-based models like Random Forest and XGBoost, as well as Support Vector Machine (SVM), Long–Short-Term Memory (LSTM) networks, and Prophet.

For this study, we utilize these machine learning methods in two scenarios: using only historical dengue case data, and incorporating temperature and humidity as covariates. Similar to the statistical methods, we also incorporate lagged variables from 1 to 4 weeks to capture delayed effects of climatic factors on dengue cases. The machine learning methods are specified as follows:

Random Forest [27], known for its simplicity and effectiveness, it is an ensemble learning method that constructs multiple decision trees during training and outputs the average prediction of the individual trees. Each tree in the forest is built from a bootstrap sample of the training data, and at each split, a random subset of features is considered. This method is effective in reducing overfitting and improving generalization. For dengue forecasting, Random Forest can model historical case data, making it suitable for capturing the stochastic nature of dengue transmission.
Extreme Gradient Boosting (XGBoost) [28] is an advanced implementation of gradient-boosted decision trees, which has gained popularity due to its speed and performance, which stems from its ability to do gradient boosting in a more efficient way, making it particularly suited for large data sets. It builds trees sequentially, where each subsequent tree focuses on correcting the errors of the previous trees. This method enhances model accuracy and robustness by combining multiple weak learners into a strong learner. XGBoost’s ability to handle missing values and incorporate regularization makes it well-suited for dengue forecasting.
Support Vector Machine (SVM) [29, 30] for regression, known as Support Vector Regression (SVR), attempts to find a function that deviates from the actual observed values by a value no greater than a specified margin while balancing the complexity of the model. SVR is particularly effective for capturing non-linear relationships in the data. When applied to dengue forecasting, SVR can model the complex and non-linear interactions between historical dengue cases and climatic factors.
Long–Short-Term Memory (LSTM) networks [31, 32] is a type of recurrent neural network capable of learning order dependence in sequence prediction problems. LSTM cells have internal mechanisms called gates that regulate the flow of information, making them highly effective for learning from sequences of data, such as weekly dengue case reports, where past information is crucial for predicting future events. LSTM networks can learn the temporal patterns in dengue case data and the influence of lagged climatic factors, such as temperature and humidity, which allows LSTMs to forecast future dengue cases with enhanced accuracy.
Prophet [33] is a forecasting tool that models time series data with strong seasonal effects. It is a procedure for forecasting time series data based on an additive model, decomposing the time series into trend, seasonality, and holiday effects. It works well with time series that have strong seasonal effects and historical trends, making it ideal for predicting dengue cases because of its robust handling of seasonal variations and its capability to model non-linear trends influenced by yearly and weekly cycles of dengue cases. By adjusting its parameters, Prophet can also integrate additional regressors, such as temperature and humidity, aligning closely with the seasonal patterns that affect dengue transmission.

Ensemble approaches

Aiming to enhance the accuracy and stability of dengue prediction techniques, we also employed ensemble approaches that averaged the forecast outputs from both the best-performing statistical models and machine learning techniques independently trained on historical data. This approach is designed to harness the complementary strengths of each model type, namely, the statistical model’s efficacy in capturing linear trends and seasonality, and the machine learning model’s ability to understand complex, non-linear relationships in data sequences. Thus, the ensemble approaches mitigate individual model prediction errors, potentially reducing the overall forecast variance without excessively complicating the model structure.

Uncertainty intervals

Adaptive conformal prediction [45,46,47,48] is a technique used to construct prediction intervals that account for time-dependent changes in the data distribution, making it particularly suitable for time series data. This method adjusts the prediction intervals dynamically based on nonconformity scores to provide more accurate and reliable intervals. We use adaptive conformal prediction to compute uncertainty intervals associated to each of the methods under consideration.

For each time $t$ and forecast horizon $h$, the nonconformity score is defined as the residual at time $t$ for $h$ steps ahead calculated as the difference between the actual and predicted values from the time series model:

$$\begin{aligned} \text {residual}_{t,h} = \text {actual}_{t+h} - \text {predicted}_{t,h}. \end{aligned}$$

For a window size $W$ and time $t$, residuals are considered from $t-W$ to $t$. Thus, the set of nonconformity scores within the window at time $t$ for a forecast horizon $h$ is

$$\begin{aligned} \{ \text {residual}_{t-W,h}, \text {residual}_{t-W+1,h}, \ldots , \text {residual}_{t,h} \}. \end{aligned}$$

The quantiles of these nonconformity scores are then computed to determine the prediction intervals. Specifically, the lower and upper bounds of the prediction interval at each time step $t$ for forecast horizon $h$ are given by

$$\begin{aligned} \begin{aligned} \text {lower}\_\text {bound}_{t,h}&= \text {predicted}_{t,h} - Q_{1 - \alpha /2, h}, \\ \text {upper}\_\text {bound}_{t,h}&= \text {predicted}_{t,h} + Q_{\alpha /2, h}, \end{aligned} \end{aligned}$$

where $Q_{1 - \alpha /2, h}$ and $Q_{\alpha /2, h}$ are the quantiles of the residuals within the window for the $h$-step ahead forecast, and $\alpha$ is the significance level (e.g., 0.05 for a 95% prediction interval). For the $k$th quantile at forecast horizon $h$, the corresponding value is given by

$$\begin{aligned} Q_{k, h} = \inf \left\{ q \in \mathbb {R} : \frac{1}{W} \sum _{i=t-W}^t \mathbb {I} \{ \text {nonconformity}\_\text {score}_{i,h} \le q \} \ge k \right\} , \end{aligned}$$

where $\mathbb {I}$ is the indicator function.

The prediction interval is then adjusted dynamically at each time step based on these scores. This adaptive update ensures that the prediction intervals remain accurate as the underlying data distribution evolves, providing more reliable intervals that can better account for time-dependent changes in the data.

Methods implementation

Statistical models AR(1), MA(1), ARIMA, SARIMAX, and ETS are implemented using the R package forecast [49]. For the ARIMA and SARIMAX model, we used the auto.arima() function, which automatically selects the optimal parameters and handles necessary transformations, such as differencing, to ensure stationarity and invertibility [49]. While the VAR model is implemented using the R package vars [50].

Machine learning models Random Forest and XGBoost are implemented using the Python libraries scikit-learn [51] and xgboost [28], respectively, both using 5,000 estimators. The SVM model is implemented using the SVR function from scikit-learn with a radial basis function (RBF) kernel and parameters set to kernel and parameters set to $C=100, \gamma =0.1$, and $\varepsilon =0.1$. The LSTM network is implemented using the TensorFlow library with a single-layer consisting of 1,000 units (or nodes), followed by a dense layer with 1 unit for output. The LSTM model uses ReLU activation, and the input data is scaled using MinMaxScaler before training. In addition, Prophet is implemented using the prophet library [52]. The choice of these parameters and model architectures was driven by the goal of maintaining simplicity to ensure a straightforward comparison between statistical and machine learning methods. Our paper focuses on introducing these methods for dengue forecasting, emphasizing general performance comparison rather than model optimization or parameter tuning, thus choosing basic architectures. In real-world applications, health workers and practitioners could perform further parameter optimization to improve accuracy and tailor models to specific contexts.

Performance evaluation metrics

To assess the predictive performance of the statistical models and machine learning techniques considered, we employed four primary metrics, namely, Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Squared Error (RMSE), and 95% Coverage Probability. These metrics provide a comprehensive view of the performance of the models, considering both the magnitude and direction of prediction errors, and help highlight the strengths and weaknesses in dengue forecasting methods. In addition, we also compare the computational efficiency of the forecasting techniques considered by measuring the duration that a technique takes to complete its training and prediction processes. This comparison provides insights on the practicality of deploying each method in real-world situations, where computational resources and response times are often limited. It also assists public health workers on which models to deploy, balancing the need between predictive accuracy and computational efficiency.

Let $y_i$ and $\hat{y}_i$ represent, respectively, the actual and forecast number of dengue cases, and let n be the number of observations. The Mean Absolute Error (MAE) measures the average magnitude of the errors in a set of predictions, without considering their direction. It is calculated as the average of the absolute differences between the forecasted values and the actual values:

$$\begin{aligned} \text {MAE} = \frac{1}{n} \sum _{i=1}^{n} |y_i - \hat{y}_i|. \end{aligned}$$

The Mean Absolute Percentage Error (MAPE) expresses the accuracy as a percentage, which provides a relative measure of the errors. It is calculated as the average of the absolute percentage differences between the predicted and actual values:

$$\begin{aligned} \text {MAPE} = \frac{100\%}{n} \sum _{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| . \end{aligned}$$

The Root Mean Squared Error (RMSE) is a quadratic scoring rule that measures the average magnitude of the error. It is defined as the square root of the average of squared differences between prediction and actual observation:

$$\begin{aligned} \text {RMSE} = \sqrt{\frac{1}{n} \sum _{i=1}^{n} (y_i - \hat{y}_i)^2}. \end{aligned}$$

The 95% Coverage Probability assesses the proportion of actual dengue cases that fall within the 95% prediction intervals provided by the models. A well-calibrated model should have approximately 95% of the observations within this interval. It is a measure of the reliability of the predictive intervals.

Results

In this section, we evaluate the predictive performance of the statistical models, machine learning techniques, and ensemble approaches considered. We provide the methods’ predictive accuracy in terms of MAE, RMSE, and MAPE across various forecast horizons. Then, we provide the 95% coverage probability and the average width of the uncertainty intervals of all models across various forecast horizons. A comparison of the computational efficiency of each method is also provided. In the Supplementary Information, Fig. S1 shows a comparison of the predictions obtained by the best performing methods for different time horizons. Figures S2 to S5 depict boxplots of the absolute error, $| y_i - \hat{y}_i |$, and absolute percentage errors, $\left| \frac{y_i - \hat{y}_i}{y_i} \right|$, obtained with each method. Specifically, Figures S2 and S3 correspond to methods that only use dengue cases, while Figures S4 and S5 show boxplots for the methods using covariates. In all figures, boxplots are shown in the order from best to worst approach forecasting method.

Performance of statistical models

Table 1 shows the accuracy metrics for the statistical models that did not include covariates, namely, AR(1), MA(1), ARIMA, and ETS. Among these models, ARIMA stands out as the most consistent performer across varying forecast horizons. Its ability to blend autoregressive (AR), differencing (I), and moving average (MA) components allows it to capture both short-term trends and long-term seasonal patterns effectively.

Table 1 Accuracy of the statistical models without covariates at various forecast horizons

Full size table

For 1-week ahead forecasts, ARIMA exhibited the lowest MAE at 78.26 cases and the lowest MAPE at 18.25%, indicating its superior performance for immediate forecasting compared to other models. The ETS model closely followed, with an MAE of 78.66 cases and a slightly higher MAPE of 19.17%. The AR(1) and MA(1) models showed slightly higher errors but were competitive in their forecasting ability. In terms of RMSE, ARIMA also had the lowest value (124.33), making it the most accurate at this horizon. The results for the 2-week forecast showed a similar trend to the 1-week forecast, reflecting consistent model performance across these short-term horizons.

Medium-term predictions of 3- and 4-week forecasts showed a similar pattern. For example, extending the forecast horizon to 4 weeks ahead, we observed an increase in the errors across all models, as expected due to the increasing uncertainty with longer forecast periods. ARIMA maintained a relatively lower MAE and MAPE, affirming its consistency in performance across the short-term forecast horizons. For instance, ARIMA’s MAE and MAPE at 4 weeks were 175.38 and 36.07%, respectively, which were the best among the considered models.

As the forecast horizon expanded to 8 and 12 weeks, all models’ performance naturally deteriorated, reflecting the inherent challenge of predicting dengue cases over longer periods. Notably, the ARIMA model’s performance remained robust relative to the other models, with the lowest MAE of 308.92 and 396.91 for 8 and 12 weeks ahead, respectively. However, the ETS model’s error metrics significantly increased, with a notable jump to 392.21 (MAE) and 89.16% (MAPE) at 8 weeks and 649.15 (MAE) and 150.31% (MAPE) at 12 weeks, suggesting that it may be less suitable for medium- and long-term forecasting in the absence of covariates.

In an effort to enhance the predictive accuracy of the statistical models, we considered SARIMAX and VAR models that included temperature and humidity covariates, both with and without lagged covariates. Results corresponding to these models are shown in Table 2.

Table 2 Accuracy of the statistical models with covariates temperature and humidity at various forecast horizons

Full size table

In the short-term forecast horizon of 1 week, SARIMAX outperformed others, achieving the lowest MAPE of 17.25% and an MAE of 79.24 cases. When considering lagged covariates, SARIMAX_Lag’s performance slightly declined, with an MAE of 81.80 cases and a MAPE of 25.23%. VAR showed competitive performance with an MAE of 78.10 cases but a higher MAPE of 25.49%.

As the horizon extends to 4 weeks, the predictive accuracy decreased across all models. SARIMAX continued to demonstrate strong performance, with an MAE of 173.52 cases and a MAPE of 36.59%. The inclusion of lagged covariates in SARIMAX_Lag resulted in an MAE of 160.28 and a MAPE of 39.96%, indicating a slight performance degradation. VAR exhibited an MAE of 179.45 cases and a MAPE of 55.57%, while VAR_Lag had an MAE of 167.53 and a MAPE of 48.79%.

For 8 and 12 weeks ahead forecasts, the performance of all models continued to decline. SARIMAX_Lag, however, consistently maintained a lower error rate relative to other models, with an MAE of 279.15 and 375.15 and an MAPE of 54.86% and 73.33% for 8 and 12 weeks ahead, respectively. VAR showed increased errors, with an MAE of 313.80 and 386.34 for 8 and 12 weeks, respectively. The VAR_Lag model’s performance was comparatively better, with an MAE of 295.42 and 386.34 for the same horizons.

Overall, incorporating temperature and humidity covariates improved the models’ predictive accuracy. The SARIMAX model, in particular, demonstrated superior performance compared to ARIMA, indicating that these covariates are valuable for short-term predictions. In addition, models incorporating lagged variables, such as SARIMAX_Lag and VAR_Lag, showed enhanced predictive capability for medium- and long-term forecasts. This suggests that lagged variables can significantly improve the accuracy of dengue case predictions over longer periods.

Performance of machine learning techniques

In this section, we describe the performance of the five machine learning techniques considered, namely, SVM, Random Forest, XGBoost, LSTM and Prophet. Table 3 shows the accuracy measures MAE, RMSE, and MAPE across various forecast horizons for the machine learning techniques that did not use covariates. Table 4 shows the accuracy metrics for the machine learning techniques that included temperature and humidity covariates. The results are provided for techniques that used covariates for the same week as well as for lagged covariates.

Table 3 Accuracy of the machine learning techniques considered without covariates at various forecast horizons

Full size table

Table 4 Accuracy of the machine learning techniques considered with covariates temperature and humidity at various forecast horizons

Full size table

For 1-week ahead forecasts, SVM exhibited the lowest MAE at 82.82 cases and the lowest MAPE at 19.15%, indicating its strong short-term predictive ability. LSTM also performed well, with an MAE of 85.93 cases and a MAPE of 24.49%, showing its competitive edge for immediate forecasting. Random Forest and XGBoost showed slightly higher errors but were still effective, with MAEs of 88.68 and 102.24 cases, respectively. Prophet, however, showed much higher errors, with an MAE of 218.66 cases and a MAPE of 53.08%, indicating its limitations in short-term forecasting.

Extending the forecast horizon to 4 weeks, we observed an increase in the errors across all models. LSTM model demonstrated the lowest MAE at 159.45 cases and a MAPE of 37.46%, indicating its robustness in medium-term forecasting. SVM, while having a higher MAE of 166.93, still maintained a relatively lower MAPE of 50.25%. XGBoost showed an MAE of 199.46 cases and a MAPE of 42.09%, performing better than Random Forest and Prophet at this horizon. Prophet continued to exhibit higher errors with an MAE of 270.79 cases and a MAPE of 61.08%, reinforcing its unsuitability for medium-term forecasts. The 3-week forecast results were similar to the 4-week outcomes, further supporting the observed trends in medium-term predictions.

For 8 and 12 weeks ahead forecasts, all models’ performance naturally deteriorated due to the increasing uncertainty over longer periods. LSTM continued to show the lowest errors with an MAE of 300.81 for 8 weeks ahead. However, for the 12-week horizon, Prophet exhibited the best performance with the lowest MAE of 351.92 cases and a MAPE of 70.84%. This indicates that while Prophet is not suitable for short-term forecasts, it shows strong predictive capabilities for very long-term forecasts.

Overall, the machine learning models’ performance varied across different forecast horizons, with SVM and LSTM models showing superior accuracy for short-term and medium-term predictions, respectively. Prophet, despite its poor short-term performance, demonstrated notable effectiveness in long-term forecasting.

Incorporating covariates, LSTM was the best performer, achieving the lowest MAE of 71.35 cases and a MAPE of 22.31% for 1-week ahead forecasts. This model continued to show the best performance across all forecast horizons up to 12 weeks, with an MAE of 410.96 cases and a MAPE of 79.92%. However, the other models showed significantly higher errors when covariates were included, particularly the tree-based models, such as Random Forest and XGBoost. For example, Random Forest with covariates had an MAE of 410.44 and a MAPE of 119.94% for 1 week ahead, which did not improve substantially over longer horizons.

Lagged covariates negatively impacted the performance of all machine learning models across all forecast horizons, worsening MAE, MAPE and RMSE values, indicating potential overfitting due to the redundant of lag data. The lagged models, including SVM-Lag, Random Forest-Lag, XGBoost-Lag, LSTM-Lag, and Prophet-Lag, exhibited higher errors across all forecast horizons compared to their counterparts without lagged covariates. For instance, SVM-Lag had an MAE of 188.99 cases and a MAPE of 50.10% for 1 week ahead, with the performance deteriorating over longer horizons to an MAE of 341.82 and a MAPE of 121.50% at 12 weeks. Similarly, the LSTM-Lag model, despite having some competitive metrics for short-term forecasts, showed significant error increases for longer horizons.

Prophet with covariates showed improved performance compared to its counterpart without covariates, particularly for long-term forecasts. For 12 week ahead, Prophet with covariates had an MAE of 342.47 and a MAPE of 61.86%, making it effective for very long-term predictions. However, like other models, its short-term performance remained relatively weak.

Overall, while incorporating temperature and humidity covariates improved the LSTM model’s predictive accuracy significantly, other models did not benefit similarly and, in many cases, performed worse. In addition, the use of lagged covariates generally led to poorer performance, suggesting that these models may suffer from overfitting when lagged variables are included. Prophet with covariates, although not performing well in the short-term, continued to show strong predictive capabilities for very long-term forecasts.

In summary, LSTM including covariates dominated the short- to mid-term horizons (i.e., 1–8 weeks), providing consistently low errors across these intervals. Prophet proved to be a reliable model for long-term forecasting, excelling at 12 weeks with its ability to identify and capture longer-term patterns in dengue case data. LSTM and Prophet demonstrate complementary strengths. Specifically, LSTM is highly effective for immediate and mid-term predictions, while Prophet is better suited for capturing long-term trends.

Performance ensemble approaches

In this section, we discuss the performance of the ensemble models, which combine statistical methods and machine learning approaches to enhance predictive accuracy. The choice of models for the ensembles was based on their individual performance metrics. By averaging the best-performing statistical models and machine learning models, we aimed to leverage the strengths of each model.

For predictions using dengue cases alone, we created an ensemble model by averaging the forecasts of the ARIMA and LSTM models. Table 5 presents the performance metrics for this ensemble model compared to the individual ARIMA and LSTM models.

Table 5 Accuracy of ensemble model of LSTM & ARIMA compared with ARIMA and LSTM without covariates at various forecast horizons

Full size table

The LSTM–ARIMA ensemble shows significant improvements over the individual ARIMA and LSTM models. For instance, for 1-week ahead forecasts, the ensemble model achieved the lowest MAE of 69.98 cases and a MAPE of 17.38%. This outperformed both ARIMA and LSTM models individually, which had MAEs of 78.26 and 85.93 cases, respectively. The ensemble model also exhibited the lowest RMSE of 108.27, indicating enhanced predictive accuracy.

As the forecast horizon extends, the LSTM–ARIMA ensemble continues to demonstrate superior performance, particularly noticeable at 4 week ahead, where the ensemble’s MAE and MAPE were 154.43 cases and 32.19%, respectively, compared to higher errors in the individual models. Although the performance of all models declines over longer horizons, the ensemble approach still maintains a comparative advantage, highlighting its robustness in dengue case forecasting.

For predictions including temperature and humidity covariates, we created several ensemble models. These included combinations of SARIMAX, SARIMAX-Lag, and VAR-Lag with LSTM, as well as a more complex ensembles LSTM-SARIMAX-VAR_Lag and LSTM-SARIMAX-Lag-VAR-Lag. Table 6 presents the performance metrics for these ensemble models compared to the individual models.

Table 6 Accuracy of ensemble model of LSTM, SARIMAX & VAR compared with the individual models with covariates temperature and humidity at various forecast horizons

Full size table

When including covariates, the LSTM–SARIMAX ensemble performed better than the individual LSTM and SARIMAX models in short to medium forecast horizons. For example, for 1-week ahead forecasts, the LSTM–SARIMAX ensemble achieved the lowest MAE of 65.24 cases and a MAPE of 15.82%, outperforming the individual LSTM and SARIMAX models. The ensemble also showed the lowest RMSE of 103.52, indicating a significant improvement in predictive accuracy. However, as the forecast horizon extends, the performance advantage of the ensemble models becomes less pronounced. For instance, at 12 weeks ahead, the complex ensemble of LSTM-SARIMAX-Lag-VAR-Lag demonstrated an MAE of 380.73 cases and a MAPE of 64.88%, which was not significantly better than the individual models. This suggests that while complex ensembles may show occasional improvements, these are not consistent across all metrics, hinting at potential overfitting or inefficiencies due to increased model complexity.

Overall, the ensemble approaches show promise, particularly in short to medium-term forecasts. By combining the strengths of multiple models, the ensembles can provide more accurate and reliable predictions. However, careful consideration must be given to the potential drawbacks of increased complexity and the risk of overfitting, especially in longer forecast horizons.

Uncertainty intervals

In this section, we consider the 95% uncertainty intervals for each of the individual models and the top performing ensemble approaches, and provide their corresponding 95% coverage probabilities and average interval widths. Tables 7 and 8 present the 95% coverage probabilities for models using dengue cases alone and including climate covariates, respectively. Tables 9 and 10 display the average width of the uncertainty intervals for models using dengue cases alone and including climate covariates, respectively.

Table 7 95% coverage probability of the uncertainty intervals of models using cases alone

Full size table

Table 8 95% coverage probability of the uncertainty intervals of models including covariates

Full size table

Table 9 Average width of the uncertainty intervals of models using cases alone

Full size table

Table 10 Average width of the uncertainty intervals of models including covariates

Full size table

As we can see from the results, all models exhibit an increase in average interval width over time, while their coverage probability remains relatively stable, near 90%. In the Supporting information, an illustration of the uncertainty intervals computed with SARIMAX and LSTM with covariates for various forecast horizons are shown in Figures S6 and S7. When using dengue case data alone, all models show consistent coverage probabilities across different forecast horizons. ARIMA, for instance, maintains a high coverage probability, averaging around 86.5% in predicting 1 week ahead and slightly increasing to 92.31% in predicting 3 weeks and 4 weeks ahead, then remain the same 86.54% at 12 weeks ahead. However, it shows one of the highest interval widths of 395.18 for 1- week ahead predictions, indicating some uncertainty in its predictions. The LSTM model, on the other hand, starts with a coverage probability of 80.77% for 1 week ahead, increasing to 86.54% by 12 weeks ahead. Its interval width is 343.56 for 1 week ahead, expanding to 1837.23 for 12 weeks ahead. Compared to ARIMA, LSTM demonstrates a slightly lower coverage probability but a more significant increase in interval width over time.

With the inclusion of climate covariates, we see improvements in the models’ performance. SARIMAX starts with a high coverage probability of 83.65% in 1 week ahead, which increases to 86.54% by 12 weeks ahead. Its interval width ranges from 392.05 in 1 week to 1983.28 in 12 weeks. Compared to ARIMA, SARIMAX shows slightly worse coverage probability but similar trends in interval width expansion. The LSTM model with climate covariates also starts strong with 80.77% coverage in 1 week and increases to 86.54% by 12 weeks. Its interval width ranges from 306.98 in week 1 to 1761.05 in week 12, showing an overall better performance than using dengue cases alone.

The ensemble models LSTM_ARIMA and LSTM_SARIMAX achieve a similar performance as the individual models across different weeks. The coverage probabilities are in the range of 82.69% to 92.31%. For LSTM_ARIMA, the interval width starts at 333.64 in week 1 and rises to 1881.55 by week 12. LSTM_SARIMAX has interval width ranging from 303.93 in 1 week to 1845.90 in 12 weeks.

Computational efficiency

In this section, we compare the computational efficiency of the statistical models and machine learning techniques used for dengue forecasting. The computation times reported are the total times taken to generate all weekly predictions from January 2022 to December 2023 across all forecast horizons, namely, 1, 2, 3, 4, 8, and 12 weeks. The models were run on a MacBook Pro (13-inch, 2020) with processor 2 GHz quad-core Intel Core i5, and memory 16 GB 3733 MHz LPDDR4X.

Figure S8 illustrates the computational time required for running each model. The left-hand side of the bar plot displays the computational times for models using only dengue cases. On the right-hand side, the bar plot shows the computational times for models incorporating temperature and humidity covariates. Computational times are on a square root scale to accommodate the order of magnitude differences.

For models using dengue cases alone, computational times vary significantly between statistical models and machine learning techniques. Statistical models, such as AR and MA, are exceptionally fast, taking less than 0.5 s each. This efficiency is primarily due to their simpler mathematical foundations, which involve fewer computations. ARIMA takes 16.16 s, which is significantly longer but still considerably faster compared to most machine learning models. The increased time for ARIMA is due to its more complex structure, which is capable of capturing intricate temporal dependencies.

In the realm of machine learning, LSTM, despite its superior predictive accuracy, has a notably high computation time of 724.83 s. This significant computational demand reflects the complexity of LSTM in modeling long-term dependencies and capturing intricate patterns in the data. Random Forest, another effective machine learning model, requires 650.03 s, also indicating its intensive computational nature. Prophet, with a computation time of 28.29 s, offers a more efficient alternative among machine learning models, providing a reasonable trade-off between computational cost and accuracy.

When incorporating temperature and humidity covariates, the computational times reflect similar trends. The SARIMAX model, which was highly effective in utilizing covariates for accurate predictions, takes 28.61 s. This is relatively efficient, considering the model’s ability to handle seasonality and multiple covariates. The VAR model, although simpler, is extremely fast at 1.29 s, making it suitable for quick, real-time forecasting.

Among the machine learning models, LSTM with covariates continues to require substantial computation time, clocking in at 767.93 s. This underscores the model’s complexity and its ability to integrate and learn from covariate data, enhancing predictive accuracy at a high computational cost. Prophet, taking 55.50 s, balances complexity and efficiency, making it a practical choice for many forecasting scenarios involving covariates.

In summary, statistical models like ARIMA and SARIMAX offer a good balance of computational efficiency and predictive accuracy, especially when covariates are involved. Machine learning techniques like LSTM, while providing superior accuracy, require significantly more computational resources. The choice between these models depends on the specific requirements of the forecasting task, with simpler statistical models being preferable for real-time applications in settings, where computational resources are limited, and more complex machine learning models suitable for scenarios, where predictive accuracy is paramount.

Discussion

This research assesses the predictive performance and computational efficiency of a number of statistical and machine learning techniques for dengue forecasting, both with and without the inclusion of climate covariates. The study utilizes dengue cases as well as temperature and humidity in Rio de Janeiro, Brazil, a region prone to dengue outbreaks, where data is readily available from the InfoDengue system [22]. The statistical models considered include AR(1), MA(1), ARIMA, ETS, VAR, and SARIMAX. Machine learning techniques utilized are SVM, Random Forest, XGBoost, LSTM, and Prophet. The study provides a thorough assessment of the forecasting performance of these methods, as well as ensemble approaches that combine individual methods, across various time frames.

Unlike other performance evaluations, we generate weekly predictions that assess the predictive accuracy of the methods at actionable scales. The flexibility in handling different forecast horizons (i.e., 1, 2, 3, 4, 8, and 12 weeks ahead) conveys the utility of the forecasting system for both immediate response planning and longer-term strategic interventions. In addition, we compute uncertainty intervals to convey the reliability of point estimates. Our evaluation also includes the computational efficiency of each model, which is an important consideration in resource-constrained environments.

In our implementation, we use a moving window strategy that allows models to continuously adapt to new data, capturing the evolving patterns and trends in dengue cases. This adaptability is crucial for accurate predictions, especially in the context of a disease influenced by various fluctuating factors, such as climate, population movement, and public health interventions. The results highlight the nuanced capabilities of each method, which can inform the implementation of dengue surveillance systems and the allocation of resources to combat dengue outbreaks.

Among the statistical models evaluated, ARIMA emerged as the best model when using only historical case data. Its simplicity, rapidity and robust predictive accuracy make it a reliable choice for short to medium-term forecasts. ARIMA’s ability to capture temporal dependencies effectively contributed to its strong performance. However, the inclusion of climate covariates significantly enhanced its predictive power through the SARIMAX model. By accounting for climate factors, such as temperature and humidity, SARIMAX provided a more comprehensive analysis, leading to improved accuracy. Moreover, the use of lagged covariates in SARIMAX further enhanced long-term prediction capabilities, addressing the inherent uncertainties associated with extended forecast horizons. These lags capture both immediate and cumulative climatic effects on dengue transmission by accounting for the timing of environmental influences on mosquito dynamics. In addition, they reflect data availability, as climate data is typically available only up to the current time. In specific public health settings, the lag duration can be appropriately adjusted based on local conditions and the availability of real-time climate data to ensure the most accurate and actionable forecasts.

The Long–Short-Term Memory (LSTM) model, particularly when combined with climate covariates, proved to be the most accurate machine learning model overall. LSTM’s recurrent neural network structure excels at handling non-linear temporal patterns, making it highly effective in capturing the complex dynamics influenced by climate factors. This resulted in consistently lower errors across all forecast horizons. Despite being slower to train and predict due to its computational complexity, LSTM’s superior accuracy makes it an unmatched choice for highly precise predictions. The model’s ability to integrate and learn from additional covariate data further bolstered its performance, especially in the context of medium to long-term forecasts. While we acknowledge that the use of city-level data resulted in a smaller data set, we employed a longer input window of 312 weeks and iterative training to generate sufficient training examples. This approach helped mitigate data limitations, demonstrating the applicability of machine learning models for dengue forecasting, even with limited data.

For long-term forecasts (i.e., 12 weeks), the Prophet model with climate covariates demonstrated the best accuracy. Prophet’s strength lies in its ability to identify and adapt to long-term patterns, making it particularly effective for distant predictions. The inclusion of climate covariates enabled Prophet to capture seasonal variations and other long-term trends more accurately. While Prophet may not match the short-term accuracy of models like LSTM or ARIMA, its performance in long-term forecasting highlights its utility in scenarios, where understanding and predicting extended trends are crucial. We considered a 12-week forecast as a “long-term forecast” that could be used by public health authorities to plan resources in advance. However, our approach could also be extended beyond this horizon, with appropriate consideration of increasing uncertainty.

In addition to evaluating individual models, we explored the potential benefits of ensemble approaches by combining the strengths of the best-performing statistical and machine learning models. By averaging forecasts from both statistical and machine learning models, the ensembles capitalized on the strengths of each method, resulting in more robust and reliable predictions. The use of ensemble models, particularly those combining ARIMA and LSTM for cases alone and SARIMAX with LSTM for models including covariates, demonstrates significant promise in advancing the accuracy and reliability of dengue predictions.

This study has some limitations that indicate areas for future research. First, we evaluate the methods in a single geographical location, namely, Rio de Janeiro. Future research could explore the generalizability of these models by applying them to different geographical areas with different climatic and socio-economic conditions. This would help to validate the models’ robustness and enhance their utility in diverse settings. In addition, challenges such as data scarcity, seasonality, and socio-economic disparities can influence dengue transmission and forecasting accuracy. In regions with incomplete or underreported data, uncertainty estimation techniques and external validation data sets may help improve model reliability. Furthermore, strong seasonal trends in dengue incidence could impact forecast stability, necessitating methods that explicitly capture temporal variations.

In addition, we considered temperature and humidity as covariates in some of the models. While these are critical factors for dengue transmission, additional variables such as population density, mobility patterns, socio-economic factors, and land use changes could also play significant roles in dengue dynamics [11]. Future studies should consider incorporating a broader range of predictive factors which could improve the models’ accuracy and provide a more comprehensive understanding of dengue transmission mechanisms.

Moreover, we primarily focused on the assessment of the predictive accuracy and computational efficiency of different models. While these are crucial aspects, future research could also examine the interpretability and usability of the models from a public health perspective [53]. Ensuring that the models are not only accurate but also interpretable is essential to implement strategies for disease prevention and control. Finally, exploring the use of spatial models that borrow information of close or connected regions [54,55,56,57,58], and advanced machine learning techniques, such as deep reinforcement learning [59, 60], could further enhance dengue forecasting capabilities. Future research should also explore the development of interactive, web-based platforms that allow users to visualize forecasts, interpret uncertainty ranges, and customize predictions based on evolving outbreak conditions [61].

To conclude, this study provides a thorough evaluation of dengue prediction methods, showcasing how statistical methods, advanced machine learning models, and climate covariates yield valuable insights for proactive disease surveillance. Our findings provide evidence to inform the development of more robust and comprehensive models that better support public health efforts. By leveraging dengue forecasts, officials can optimize resource allocation, implement timely control measures, and reduce the impact of dengue outbreaks on the population.

Availability of data and materials

All data used in this study is open access and freely available on the internet, see the “Methods” section for details. For reproducibility purposes, we provide a permanent GitHub repository with the codes, which can be found at https://github.com/ChenXiang1998/Assessing-Dengue-Forecasting-Methods.

References

World Health Organization. Dengue and severe dengue – Fact Sheet. 2024. https://www.who.int/news-room/fact-sheets/detail/dengue-and-severe-dengue. Accessed 25 May 2024.
Paz-Bailey G, Adams LE, Deen J, Anderson KB, Katzelnick LC. Dengue. Lancet. 2024;403(10427):667–82.
Article PubMed Google Scholar
Gavi, the Vaccine Alliance. Why we need integrated dengue management to achieve zero deaths; 2024. https://www.gavi.org/vaccineswork/why-we-need-integrated-dengue-management-achieve-zero-deaths. Accessed 25 May 2024.
Silawan T, Singhasivanon P, Kaewkungwal J, Nimmanitya S, Suwonkerd W, et al. Temporal patterns and forecast of dengue infection in Northeastern Thailand. Southeast Asian J Trop Med Public Health. 2008;39(1):90.
PubMed Google Scholar
Luz PM, Mendes BV, Codeço CT, Struchiner CJ, Galvani AP, et al. Time series analysis of dengue incidence in Rio de Janeiro, Brazil. Am Soc Trop Med Hyg. 2008;79:933.
Article Google Scholar
Cortes F, Martelli CMT, de Alencar Ximenes RA, Montarroyos UR, Junior JBS, Cruz OG, et al. Time series analysis of dengue surveillance data in two Brazilian cities. Acta Trop. 2018;182:190–7.
Article PubMed Google Scholar
Buczak AL, Baugher B, Moniz LJ, Bagley T, Babin SM, Guven E. Ensemble method for dengue prediction. PloS ONE. 2018;13(1):e0189988.
Article PubMed PubMed Central Google Scholar
Roster K, Connaughton C, Rodrigues FA. Machine-learning-based forecasting of dengue fever in Brazilian cities using epidemiologic and meteorological variables. Am J Epidemiol. 2022;191(10):1803–12.
Article PubMed Google Scholar
Kakarla SG, Kondeti PK, Vavilala HP, Boddeda GSB, Mopuri R, Kumaraswamy S, et al. Weather integrated multiple machine learning models for prediction of dengue prevalence in India. Int J Biometeorol. 2023;67(2):285–97.
Article PubMed Google Scholar
Zhao N, Charland K, Carabali M, Nsoesie EO, Maheu-Giroux M, Rees E, et al. Machine learning and dengue forecasting: comparing random forests and artificial neural networks for predicting dengue burden at national and sub-national scales in Colombia. PLOS Negl Trop Dis. 2020;14(9):1–16. https://doiorg.publicaciones.saludcastillayleon.es/10.1371/journal.pntd.0008056.
Article Google Scholar
Colón-González FJ, Gibb R, Khan K, Watts A, Lowe R, Brady OJ. Projecting the future incidence and burden of dengue in Southeast Asia. Nat Commun. 2023;14(1):5439.
Article PubMed PubMed Central Google Scholar
Gharbi M, Quenel P, Gustave J, Cassadou S, Ruche GL, Girdary L, et al. Time series analysis of dengue incidence in Guadeloupe, French West Indies: forecasting models using climate variables as predictors. BMC Infect Dis. 2011;11:1–13.
Article Google Scholar
Johansson MA, Reich NG, Hota A, Brownstein JS, Santillana M. Evaluating the performance of infectious disease forecasts: a comparison of climate-driven and seasonal dengue forecasts for Mexico. Sci Rep. 2016;6(1):33707.
Article PubMed PubMed Central Google Scholar
López MS, Gómez AA, Müller GV, Walker E, Robert MA, Estallo EL. Relationship between climate variables and dengue incidence in Argentina. Environ Health Perspect. 2023;131(5):057008.
Article PubMed PubMed Central Google Scholar
Pavani J, Bastos LS, Moraga P. Joint spatial modeling of the risks of co-circulating mosquito-borne diseases in ceará, brazil. Sp Spatio-temporal Epidemiol. 2023;47:100616.
Article Google Scholar
Salim NAM, Wah YB, Reeves C, Smith M, Yaacob WFW, Mudin RN, et al. Prediction of dengue outbreak in Selangor Malaysia using machine learning techniques. Sci Rep. 2021;11(1):939.
Article PubMed PubMed Central Google Scholar
Zhao X, Li K, Ang CKE, Cheong KH. A deep learning based hybrid architecture for weekly dengue incidences forecasting. Chaos Solitons Fractals. 2023;168:113170.
Article Google Scholar
Baquero OS, Santana LMR, Chiaravalloti-Neto F. Dengue forecasting in São Paulo city with generalized additive models, artificial neural networks and seasonal autoregressive integrated moving average models. PloS ONE. 2018;13(4):e0195065.
Article PubMed PubMed Central Google Scholar
Khaira U, Utomo PEP, Aryani R, Weni I. A comparison of SARIMA and LSTM in forecasting dengue hemorrhagic fever incidence in Jambi, Indonesia. J Phys Conf Ser. 2020;1566:012054.
Article Google Scholar
Mussumeci E, Coelho FC. Large-scale multivariate forecasting models for Dengue-LSTM versus random forest regression. Sp Spatio-temporal Epidemiol. 2020;35:100372.
Article Google Scholar
Nguyen VH, Tuyet-Hanh TT, Mulhall J, Minh HV, Duong TQ, Chien NV, et al. Deep learning models for forecasting dengue fever based on climate data in Vietnam. PLoS Negl Trop Dis. 2022;16(6):e0010509.
Article PubMed PubMed Central Google Scholar
Codeco C, Coelho F, Cruz O, Oliveira S, Castro T, Bastos L. Infodengue: a nowcasting system for the surveillance of arboviruses in Brazil. Revue d’Épidémiologie et de Santé Publique. 2018;66:S386.
Article Google Scholar
Hamilton JD. Time series analysis. Princeton: Princeton University Press; 1994.
Book Google Scholar
Box GEP, Jenkins GM, Reinsel GC, Ljung GM. Time series analysis: forecasting and control. Hoboken: John Wiley & Sons; 2015.
Google Scholar
Gardner ES. Exponential smoothing: the state of the art. J Forecast. 1985;4(1):1–28.
Article Google Scholar
Lütkepohl H. New introduction to multiple time series analysis. Berlin: Springer; 2005.
Book Google Scholar
Breiman L. Random forests. Mach Learn. 2001;45:5–32.
Article Google Scholar
Chen T, Guestrin C. Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd acm sigkdd international conference on knowledge discovery and data mining. 2016; 785-94.
Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20:273–97.
Article Google Scholar
Schölkopf B, Smola AJ. Learning with kernels: support vector machines, regularization, optimization, and beyond. Cambridge: MIT press; 2002.
Google Scholar
Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735–80.
Article PubMed Google Scholar
Greff K, Srivastava RK, Koutník J, Steunebrink BR, Schmidhuber J. LSTM: a search space odyssey. IEEE Trans Neural Netw Learn Syst. 2016;28(10):2222–32.
Article PubMed Google Scholar
Taylor SJ, Letham B. Forecasting at scale. Am Stat. 2018;72(1):37–45.
Article Google Scholar
Leung XY, Islam RM, Adhami M, Ilic D, McDonald L, Palawaththa S, et al. A systematic review of dengue outbreak prediction models: current scenario and future directions. PLOS Negl Trop Dis. 2023;17(2):e0010631.
Article PubMed PubMed Central Google Scholar
Cabrera M, Leake J, Naranjo-Torres J, Valero N, Cabrera JC, Rodríguez-Morales AJ. Dengue prediction in Latin America using machine learning and the one health perspective: a literature review. Trop Med Infect Dis. 2022;7(10):322.
Article PubMed PubMed Central Google Scholar
Instituto Brasileiro de Geografia e Estatística (IBGE). Brazilian Institute of Geography and Statistics. 2024. https://www.ibge.gov.br. Accessed 25 May 2024.
Pereira RHM, Gonçalves CN, et al. geobr: Loads Shapefiles of Official Spatial Data Sets of Brazil; 2019. GitHub repository. https://github.com/ipeaGIT/geobr. Accessed 25 May 2024.
de Geografia e Estatística (IBGE) IB. Brazilian Institute of Geography and Statistics; 2024. https://geoftp.ibge.gov.br/organizacao_do_territorio/malhas_territoriais/malhas_municipais/. Accessed 25 May 2024.
Xavier DR, Magalhães MdAFM, Gracie R, Reis ICd, Matos VPd, Barcellos C. Difusão espaço-tempo do dengue no Município do Rio de Janeiro, Brasil, no período de 2000–2013. Cadernos de Saúde Pública. 2017;33:e00186615.
Article PubMed Google Scholar
Junior JBS, Massad E, Lobao-Neto A, Kastner R, Oliver L, Gallagher E. Epidemiology and costs of dengue in Brazil: a systematic literature review. Int J Infect Dis. 2022;122:521–8.
Article PubMed Google Scholar
Gomes AF, Nobre AA, Cruz OG. Temporal analysis of the relationship between dengue and meteorological variables in the city of Rio de Janeiro, Brazil, 2001–2009. Cadernos de Saúde Pública. 2012;28(11):2189–97.
Article PubMed Google Scholar
Xavier LL, Honório NA, Pessanha JFM, Peiter PC. Analysis of climate factors and dengue incidence in the metropolitan region of Rio de Janeiro, Brazil. PLoS ONE. 2021;16(5):e0251403.
Article PubMed PubMed Central Google Scholar
Lu L, Lin H, Tian L, Yang W, Sun J, Liu Q. Time series analysis of dengue fever and weather in Guangzhou, China. BMC Public Health. 2009;9(1):395. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1471-2458-9-395.
Article PubMed PubMed Central Google Scholar
Kakarla SG, Caminade C, Mutheneni SR, Morse AP, Upadhyayula SM, Kadiri MR, et al. Lag effect of climatic variables on dengue burden in India. Epidemiol Infect. 2019;147:e170.
Article PubMed PubMed Central Google Scholar
Vovk V, Gammerman A, Shafer G. Algorithmic learning in a random world. Berlin: Springer; 2005.
Google Scholar
Balasubramanian V, Ho SS, Vovk V. Conformal prediction for reliable machine learning: theory, adaptations and applications. Oxford: Newnes; 2014.
Google Scholar
Gibbs I, Candes E. Adaptive conformal inference under distribution shift. Adv Neural Inf Process Syst. 2021;34:1660–72.
Google Scholar
Zaffran M, Féron O, Goude Y, Josse J, Dieuleveut A. Adaptive conformal predictions for time series. In: International Conference on Machine Learning. PMLR. 2022; 25834-66.
Hyndman RJ, Khandakar Y. forecast: Forecasting functions for time series and linear models. 2020. https://cran.r-project.org/web/packages/forecast/index.html. Accessed 25 May 2024.
Pfaff B. vars: VAR Modelling. 2008. https://cran.r-project.org/web/packages/vars/index.html. Accessed 25 May 2024.
Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: machine learning in python. J Mach Learn Res. 2011;12:2825–30.
Google Scholar
Taylor S, Letham B. Prophet: forecasting at scale. 2018. https://facebook.github.io/prophet/. Accessed 25 May 2024.
El Morr C, Ozdemir D, Asdaah Y, Saab A, El-Lahib Y, Sokhn ES. AI-based epidemic and pandemic early warning systems: a systematic scoping review. Health Inform J. 2024;30(3):14604582241275844.
Article Google Scholar
Kraemer MU, Bisanzio D, Reiner R, Zakar R, Hawkins JB, Freifeld CC, et al. Inferences about spatiotemporal variation in dengue virus transmission are sensitive to assumptions about human mobility: a case study using geolocated tweets from Lahore, Pakistan. EPJ Data Sci. 2018;7(1):1–17.
Article Google Scholar
Moraga P. Geospatial health data: modeling and visualization with R-INLA and shiny. Boca Raton: Chapman & Hall/CRC Biostatistics Series; 2019.
Book Google Scholar
Moraga P. Spatial statistics for data science: theory and practice with R. Boca Raton: Chapman & Hall/CRC Data Science Series; 2023.
Book Google Scholar
Chen X, Moraga P. Forecasting dengue across Brazil with LSTM neural networks and SHAP-driven lagged climate and spatial effects. BMC Public Health. 2025;25(1):973. https://doiorg.publicaciones.saludcastillayleon.es/10.1186/1471-2458-9-395.
Article PubMed PubMed Central Google Scholar
Chen X, Moraga P. Dengue forecasting and outbreak detection in Brazil using LSTM: integrating human mobility and climate factors. medRxiv. 2025. https://doiorg.publicaciones.saludcastillayleon.es/10.1101/2025.03.02.25323168.
Ohi AQ, Mridha M, Monowar MM, Hamid MA. Exploring optimal control of epidemic spread using reinforcement learning. Sci Rep. 2020;10(1):22106.
Article PubMed PubMed Central Google Scholar
Libin PJ, Moonens A, Verstraeten T, Perez-Sanjines F, Hens N, Lemey P, et al. Deep reinforcement learning for large-scale epidemic control. In: Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track: European Conference, ECML PKDD 2020, Ghent, Belgium, September 14–18, 2020, Proceedings, Part V. Springer. 2021; 155-70.
Moraga P. SpatialEpiApp: a Shiny web application for the analysis of spatial and spatio-temporal disease data. Sp Spatio-temporal Epidemiol. 2017;23:47–57.
Article Google Scholar

Download references

Acknowledgements

Not applicable.

Funding

This research received financial support from The Letten Prize (https://lettenprize.com/), with a personal award to Paula Moraga. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author information

Authors and Affiliations

Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900, Thuwal, Saudi Arabia
Xiang Chen & Paula Moraga

Authors

Xiang Chen
View author publications
You can also search for this author inPubMed Google Scholar
Paula Moraga
View author publications
You can also search for this author inPubMed Google Scholar

Contributions

PM and XC conceived and designed the study. XC collected and processed the data, and performed the experiments. All the figures and maps are drawn by XC. XC and PM were contributors in writing and revising the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xiang Chen.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no Competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

41182_2025_723_MOESM1_ESM.zip

Supplementary material 1. S1 Comparison of the predictions obtained by the best performing statistical and machine learning techniques for different time horizons. S2 Boxplots of the absolute errors (cases) obtained by the forecasting methods when using only cases. S3 Boxplots of the absolute percentage errors (%) obtained by the forecasting methods when using only cases. S4 Boxplots of the absolute errors (cases) obtained by the forecasting methods when including covariates. Ensemble refers to the best ensemble approach using covariates which is LSTM and SARIMAX. S5 Boxplots of the absolute percentage errors (%) obtained by the forecasting methods including covariates. Ensemble refers to the best ensemble approach using covariates which is LSTM and SARIMAX. S6 Real cases, predictions, and 95% uncertainty intervals computed with SARIMAX across various forecast horizons. S7 Real cases, predictions, and 95% uncertainty intervals computed with LSTM including covariates across various forecast horizons. S8 Computational time of each forecasting method.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Chen, X., Moraga, P. Assessing dengue forecasting methods: a comparative study of statistical models and machine learning techniques in Rio de Janeiro, Brazil. Trop Med Health 53, 52 (2025). https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s41182-025-00723-7

Download citation

Received: 27 December 2024
Accepted: 04 March 2025
Published: 10 April 2025
DOI: https://doiorg.publicaciones.saludcastillayleon.es/10.1186/s41182-025-00723-7

Assessing dengue forecasting methods: a comparative study of statistical models and machine learning techniques in Rio de Janeiro, Brazil

Abstract

Background

Methods

Results

Conclusions

Introduction

Dengue and climate data in Rio de Janeiro, Brazil

Methodology

Statistical models for dengue forecasting

Machine learning techniques for dengue forecasting

Ensemble approaches

Uncertainty intervals

Methods implementation

Performance evaluation metrics

Results

Performance of statistical models

Performance of machine learning techniques

Performance ensemble approaches

Uncertainty intervals

Computational efficiency

Discussion

Availability of data and materials

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Additional information

Publisher's Note

Supplementary Information

41182_2025_723_MOESM1_ESM.zip

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Tropical Medicine and Health

Contact us