With an explosion of data in nearly every domain, from the physical and environmental sciences to econometrics and even social science, we have entered a golden era of time-domain data science. In addition to ever-increasing volumes and varieties of time-series data, a plethora of techniques is available to analyse them. Unsurprisingly, machine learning approaches are very popular. In the environmental sciences (ranging from climate studies to space weather and solar physics), machine learning (ML) and deep learning (DL) approaches complement traditional data-assimilation approaches. In solar physics and astrophysics, multiwavelength time-series or lightcurves, along with observations of other messengers such as gravitational waves, provide a treasure trove of time-domain data for ML/DL modeling and forecasting. In this regard, deep learning algorithms such as recurrent neural networks (RNNs) should prove extremely powerful for forecasting, as they have in several other domains. Being agnostic of the underlying physics, such methods can only predict transient flares drawn from the distribution they were trained on (e.g. coloured-noise fluctuations) and not special, isolated episodes (e.g. a one-off outburst such as a major coronal mass ejection), potentially allowing the two to be distinguished. It is therefore vital to test the robustness of these techniques; here we attempt a small contribution by exploring the sensitivity of RNNs through some rudimentary experiments. We test the performance of a very successful class of RNNs, Long Short-Term Memory (LSTM) networks, on simulated lightcurves. As an example, we focus on univariate forecasting of time-series (or lightcurves) characteristic of certain variable astrophysical objects (e.g. active galactic nuclei, or AGN). Specifically, we explore the sensitivity of the training and test losses to key parameters of the LSTM network and to data characteristics, namely gaps and complexity, the latter measured by the number of Fourier components. We find that LSTM performance is typically better for pink- or flicker-noise sources. The key parameters governing performance are the LSTM batch size and the gap percentage of the lightcurves. A batch size of $10-30$ appears optimal, and the best training and test losses are obtained with under $10\%$ missing data, for both periodic and random gaps in pink noise. Performance is far worse for red noise, which compromises the detectability of transients. We thus show that generative models (time-series simulations) are excellent guides for the use of RNN-LSTMs in building forecasting systems.
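The generative model behind such simulated lightcurves can be as simple as a sum of Fourier components with power-law amplitudes, with periodic or random gaps masked out afterwards. Below is a minimal Python sketch of this kind of simulator; the function names (`simulate_lightcurve`, `add_gaps`), defaults, and the convention $\beta = 1$ for pink noise and $\beta = 2$ for red noise are illustrative assumptions, not the exact implementation used in the experiments.

```python
import numpy as np

def simulate_lightcurve(n_points=2048, n_components=100, beta=1.0, seed=0):
    """Sum sinusoids with amplitudes A(f) ~ f^(-beta/2), so the power
    spectrum scales as f^(-beta): beta=1 gives pink (flicker) noise,
    beta=2 gives red noise. n_components sets the complexity."""
    rng = np.random.default_rng(seed)
    t = np.arange(n_points)
    flux = np.zeros(n_points)
    freqs = np.arange(1, n_components + 1) / n_points  # cycles per sample
    for f in freqs:
        amp = f ** (-beta / 2.0)
        phase = rng.uniform(0.0, 2.0 * np.pi)
        flux += amp * np.sin(2.0 * np.pi * f * t + phase)
    return t, flux

def add_gaps(flux, gap_fraction=0.1, periodic=False, seed=0):
    """Mask a given fraction of points as NaN, either at regular
    intervals (periodic gaps) or at random positions (random gaps)."""
    rng = np.random.default_rng(seed)
    masked = flux.copy()
    n = len(flux)
    if periodic:
        period = int(1.0 / gap_fraction)
        masked[::period] = np.nan
    else:
        idx = rng.choice(n, size=int(gap_fraction * n), replace=False)
        masked[idx] = np.nan
    return masked
```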
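For the forecasting step, a minimal univariate LSTM of the kind tested here might look as follows, assuming a Keras/TensorFlow stack; the window length, layer width, and epoch count are illustrative, and the batch size of 20 simply reflects the $10-30$ range found to be optimal. The sketch reuses `simulate_lightcurve` and `add_gaps` from above and fills gaps by linear interpolation before windowing, one simple choice among several.

```python
import numpy as np
import tensorflow as tf

LOOKBACK = 20  # illustrative input window length

def make_windows(series, lookback=LOOKBACK):
    """Slide a fixed-length window over the series; each window is
    trained to predict the immediately following point (one-step
    univariate forecasting)."""
    X, y = [], []
    for i in range(len(series) - lookback):
        X.append(series[i:i + lookback])
        y.append(series[i + lookback])
    return np.asarray(X)[..., np.newaxis], np.asarray(y)

# Pink-noise lightcurve with 5% random gaps, filled by interpolation.
t, flux = simulate_lightcurve(n_points=2048, n_components=100, beta=1.0)
flux = add_gaps(flux, gap_fraction=0.05, periodic=False)
good = ~np.isnan(flux)
flux = np.interp(np.arange(len(flux)), np.flatnonzero(good), flux[good])

split = int(0.8 * len(flux))
X_train, y_train = make_windows(flux[:split])
X_test, y_test = make_windows(flux[split:])

model = tf.keras.Sequential([
    tf.keras.Input(shape=(LOOKBACK, 1)),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# batch_size is one of the key parameters whose sensitivity is studied.
history = model.fit(X_train, y_train, epochs=50, batch_size=20,
                    validation_data=(X_test, y_test), verbose=0)
print("final train/test MSE:",
      history.history["loss"][-1], history.history["val_loss"][-1])
```

Varying `batch_size` and `gap_fraction` over grids and recording the final training and validation losses reproduces, under these assumptions, the kind of sensitivity study summarised above.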