This afternoon we had two soft intro talks on Change Point Detection and Time Series Analysis (mainly in a discrete setting). To some extent, change point detection is a time series problem as well. Often, people are interested in detecting changes over time and making good predictions for the future. Also, when an abrupt change happens, a natural question to ask is: is the change intrinsic, or is it just an outlier? These are interesting areas I'd like to explore further, but here let me refresh my memory with what I've done with time series so far.
I remember my first taste of time series analysis as an undergrad: I started analysing time series just by looking at graphs. The teacher introduced all the features that time series data can possibly have by showing us different plots, and basically let us do an eyeball check first. Sounds silly, but even for experts this is always the first step. "Building a time series model is as much an art as it is a science." Later, during my masters, I chose to take a time series course again, as if I hadn't had enough as an undergrad. It turned out there was indeed much more to explore. The second time around, I strung together all the piecemeal knowledge I remembered from my undergrad course and looked at the subject from a more holistic perspective.
As the final project for my graduate time series course, I completed an analysis modelling Dow Jones Industrial Average data from 2009 to 2016. The purpose of the analysis was to extract the trend, seasonality, heteroscedasticity, and any other patterns in the data, and to make forecasts of future closing prices.
The plot below on the left shows that the daily closing prices of the Dow Jones Industrial Average increased in general over those seven years. However, there is no apparent pattern in the movement of the closing prices over time. These observations suggest that a first-order difference of the raw data might be appropriate to make the series stationary. Since there is no other discernible pattern in the data, I performed a first-order difference on the original series. The differenced data is shown below on the right:
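First-order differencing just replaces each observation with its change from the previous day, x_t - x_{t-1}. A minimal sketch with NumPy (the prices here are made up for illustration; the actual analysis used the Dow Jones closing prices):

```python
import numpy as np

# Hypothetical daily closing prices (stand-ins for the real Dow Jones data)
prices = np.array([100.0, 101.5, 101.2, 103.0, 104.1, 103.8])

# First-order difference: x_t - x_{t-1}, which removes a (roughly) linear trend
diffed = np.diff(prices)
print(diffed)  # note: one fewer observation than the original series
```

With pandas one would typically write `series.diff().dropna()` instead, which keeps the date index aligned.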
After differencing, it is always good to check whether the differenced data is in fact stationary (you can't always rely on your eyes). The way I did it was to run the Augmented Dickey-Fuller (ADF) test to see whether the transformed data is stationary. If it is, then look at the ACF and PACF plots to make a reasonable choice of order for an ARIMA model, or a SARIMA model if there is seasonality.
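To make the test concrete, here is a stripped-down Dickey-Fuller regression written with only NumPy: it regresses the differences on the lagged level, Δy_t = α + γ·y_{t-1} + ε_t, and reports the t-statistic on γ. A strongly negative statistic argues against a unit root, i.e. for stationarity. This is a sketch only, with no augmentation lags; in practice one would call `statsmodels.tsa.stattools.adfuller`, which also reports p-values and critical values.

```python
import numpy as np

def dickey_fuller_tstat(y):
    """t-statistic for gamma in: Delta y_t = alpha + gamma * y_{t-1} + e_t.

    A simplified Dickey-Fuller regression (no lagged-difference terms),
    fitted by ordinary least squares.
    """
    dy = np.diff(y)                      # Delta y_t
    ylag = y[:-1]                        # y_{t-1}
    X = np.column_stack([np.ones_like(ylag), ylag])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
stationary = rng.normal(size=500)               # white noise: no unit root
random_walk = np.cumsum(rng.normal(size=500))   # unit root by construction

# The 5% critical value for the constant-only case is roughly -2.86
print(dickey_fuller_tstat(stationary))   # strongly negative: reject unit root
print(dickey_fuller_tstat(random_walk))  # much closer to zero: cannot reject
```

The statistic does not follow a standard t-distribution under the null, which is why the ADF test has its own critical-value tables.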
Once an ARIMA model has been chosen, then comes the main business, which is making forecasts. There are multiple forecasting methods out there, such as ARIMA forecasting and Holt-Winters forecasting. What always baffles me is this: what is the use of a forecast, especially in a financial (e.g. stock market) setting, when all I want to know is whether the stock price will go up or down the next day, but the model hands me a predicted value with a prediction interval that covers both up and down?
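That frustration is easy to reproduce. Below is a toy stand-in for ARIMA forecasting (a full implementation would use statsmodels' ARIMA class): an AR(1) model fitted by least squares to simulated daily price changes, producing a one-step point forecast with a naive ~95% normal prediction interval. The data and parameters here are invented for illustration.

```python
import numpy as np

def ar1_forecast(y, z=1.96):
    """One-step AR(1) forecast with an approximate 95% prediction interval.

    Fits y_t = c + phi * y_{t-1} + e_t by least squares, then forecasts
    the next value and widens it by z residual standard deviations.
    """
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    (c, phi), *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    resid = y[1:] - (c + phi * y[:-1])
    sigma = resid.std(ddof=2)            # residual std, 2 params estimated
    point = c + phi * y[-1]              # one-step-ahead point forecast
    return point - z * sigma, point, point + z * sigma

rng = np.random.default_rng(1)
# Simulated differenced closing prices: a small upward drift buried in noise
returns = 0.05 + rng.normal(scale=1.0, size=300)

lo, point, hi = ar1_forecast(returns)
print(lo, point, hi)  # the interval typically straddles zero: "up" and "down"
```

When the noise dwarfs the drift, as it does for daily stock returns, the interval almost always contains zero, so the honest answer to "up or down tomorrow?" is "I don't know" dressed up in numbers.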