When a forecast of the total value over a number of time periods ahead is required, forecasters are presented with two temporal aggregation (TA) approaches approaches to produce required forecasts: i) aggregated forecast (AF) or ii) aggregate data unisg non-overlapping temporal aggregation (AD). Often, the recommendation is to aggregate data to a frequency relevant to the decision the eventual forecasts will support and then produce the forecast. However, this might not be always the best choice and we argue that both AF and AD approaches may outperform each other in different situations. Moreover, there is a lack of evidence on what indicators may determine the superirity of each approach. We design and execute an empirical experiment framework to first explore the performance of these approaches using monthly time series of M4 competition dataset. We further turn the problem into a classification supervised learning and build a machine learnign algorithm to investigate the connection between time series features and the performance of temporal aggregation. This is the first study in time series forecasting that explores the association between time series features and temporal aggreagtion performance. Our findings suggest that neither AF or AD approaches perform accurately for each indivisual series. AF is shown to be singnigicantely better than AD for the monthly M4 time series, especially for longer horizons. We build several machine learning approaches using a set of extracted time series features as input to predict accurately whether AD or AF oshould be used. We find out Random Forest (RF) is the most accurate approach in correctly classifying the oucome examined both by staitical measures such as missclassification error, F-statistics, and area under the curve an a utility measure. The RF approach reveals that curvature, nonlinearity, seas_pacf, unitroot_up, mean, ARCHM.LM, Coifficient of Variation, stability, linearity and max_level_shif are among the most important features in driving the predictions of the model. Our findings indicate that the strength of trend, ARCH.LM, hurst, autocorrelation lag 1 and unitroot_pp and seas_pacf may favor AF approach, while lumpiness, entropy, no-linearity, curvature, stremgth of seasonality may increase the chance of AD performing better. We conclude the study by sumamrising the finding and present an agenda for further research.