The data series examined in this chapter are the daily closing levels of the Dow-Jones Industrial Average (DJIA) and the daily closing stock prices of 34 companies listed in the DJIA in the period January 2, 1973 through June 29, 2001. Table 3.1 lists the data series. The companies in the DJIA are the largest and most important in their industries. Prices are corrected for dividends, capital changes and stock splits. As a proxy for the risk-free interest rate we use daily data on US 3-month certificates of deposit. Several studies found that technical trading rules show significant forecasting power in the era up to 1987 and no forecasting power anymore from then onwards. Therefore we split our data sample in two subperiods. Table 3.2 shows the summary statistics for the period 1973-2001 and tables 3.3 and 3.4 show the summary statistics for the two subperiods 1973-1986 and 1987-2001. Because the first 260 data points are used for initializing the technical trading strategies, the summary statistics are shown from January 1, 1974. In the tables the first and second column show the names of the data series examined and the number of available data points. The third column shows the mean yearly effective return in percentage/100 terms. The fourth through seventh columns show the mean, standard deviation, skewness and kurtosis of the logarithmic daily returns. The eighth column shows the t-ratio to test whether the mean logarithmic return is significantly different from zero. The ninth column shows the Sharpe ratio, that is, the extra return over the risk-free interest rate per extra point of risk, as measured by the standard deviation. The tenth column shows the largest cumulative loss, that is, the largest decline from a peak.
This loss is measured from peak to trough of the data series, in percentage/100 terms. The eleventh column shows the Ljung-Box (1978) Q-statistic testing whether the first 20 autocorrelations of the return series as a whole are significantly different from zero. The twelfth column shows the heteroskedasticity-adjusted Box-Pierce (1970) Q-statistic, as derived by Diebold (1986). The final column shows the Ljung-Box (1978) Q-statistic testing for autocorrelations in the squared returns. All data series, except Bethlehem Steel, show in the full sample period a positive mean yearly return, which is on average 11.5%. The return distributions are strongly leptokurtic and show signs of negative skewness, especially for the DJIA, Eastman Kodak and Procter & Gamble. The 34 separate stocks are riskier than the index, as shown by the standard deviation of the returns: on average it is 1.9% for the 34 stocks, while it is 1% for the DJIA. Thus it is clear that firm-specific risks are reduced by a diversified index. The Sharpe ratio is negative for 12 stocks, which means that these stocks were not able to beat a continuous risk-free investment. Table 3.1 shows that the largest decline of the DJIA is equal to 36% and took place in the period August 26, 1987 until October 19, 1987, which covers the crash of 1987. October 19, 1987 showed the biggest one-day percentage loss in the history of the DJIA and brought the index down by 22.61%. October 21, 1987 in its turn showed the largest one-day gain and brought the index up by 9.67%. However, the largest decline of each of the 34 separate stocks is larger, on average 61%. For only five stocks (Goodyear Tire, HP, Home Depot, IBM, Wal-Mart) the largest decline started around August 1987. As can be seen in the table, the increasing oil prices during the seventies, caused initially by the oil embargo of the Arab oil exporting countries against countries supporting Israel in “The Yom Kippur War” in 1973, had the largest impact on stock prices.
The doubling of oil prices led to a widespread recession and a general crisis of confidence. Bethlehem Steel did not perform very well during the entire 1973-2001 period and declined 97% during the largest part of its sample. AT&T declined 73% within two years, from February 4, 1999 until December 28, 2000, which covers the so-called burst of the internet and telecommunications bubble. If the summary statistics of the two subperiods in tables 3.3 and 3.4 are compared, then some substantial differences can be noticed. The mean yearly return of the DJIA in the first subperiod 1973-1986 is equal to 6.1%, while in the second subperiod 1987-2001 it is equal to 12.1%, almost twice as large. For almost all data series the standard deviation of the returns is higher in the second subperiod than in the first subperiod. The Sharpe ratio is negative for only 5 stocks in the subperiod 1987-2001, while it is negative for 22 stocks and the DJIA in the period 1973-1986, clearly indicating that buy-and-hold stock investments had a hard time beating a risk-free investment, particularly in the first subperiod.
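The summary statistics discussed above can be reproduced from a price series in a few lines. The following Python sketch (the function name and interface are our own; the chapter itself does not provide code) computes the moments of the logarithmic daily returns, the Sharpe ratio and the largest cumulative loss, i.e. the largest decline from a running peak to a trough:

```python
import numpy as np

def summary_stats(prices, rf_daily=0.0):
    """Sketch of the summary statistics in the spirit of tables 3.2-3.4.
    `rf_daily` is the mean daily risk-free rate (assumed constant here)."""
    prices = np.asarray(prices, dtype=float)
    r = np.diff(np.log(prices))                 # logarithmic daily returns
    mean, sd = r.mean(), r.std(ddof=1)
    skew = ((r - mean) ** 3).mean() / sd ** 3   # sample skewness
    kurt = ((r - mean) ** 4).mean() / sd ** 4   # sample kurtosis
    sharpe = (mean - rf_daily) / sd             # excess return per unit of risk
    # largest cumulative loss: biggest relative decline from a running peak
    peaks = np.maximum.accumulate(prices)
    max_loss = ((peaks - prices) / peaks).max()
    return mean, sd, skew, kurt, sharpe, max_loss
```

The running-peak construction makes the largest cumulative loss a single vectorized pass over the price series, rather than a search over all peak/trough pairs.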
The Ljung-Box (1978) Q-statistic, which tests whether the first q autocorrelations as a whole are significantly different from zero, is asymptotically χ²-distributed with q degrees of freedom. Typically the autocorrelations of the returns are small, with only a few lags being significant. It is noteworthy that for most data series the second-order autocorrelation is negative in all periods. The first-order autocorrelation is negative for only 3 data series in the period 1973-1986, while it is negative for 18 data series in the period 1987-2001. The Ljung-Box (1978) Q-statistics in tables 3.2, 3.3 and 3.4 reject, for all periods and for almost all data series, the null hypothesis that the first 20 autocorrelations of the returns as a whole are equal to zero. In the first subperiod this null is not rejected only for Boeing and HP, while in the second subperiod it is not rejected only for GM, HP, IBM and Walt Disney. Hence HP is the only stock which does not show significant autocorrelation in any period. When looking at the column with Diebold’s (1986) heteroskedasticity-consistent Box-Pierce (1970) Q-statistics, it appears that heteroskedasticity indeed affects the inferences about serial correlation in the returns. For the full sample period 1973-2001 and the two subperiods 1973-1986 and 1987-2001, for respectively 18, 9 and 19 data series the null hypothesis of no autocorrelation is not rejected by the adjusted Q-statistic, while it is rejected by the Ljung-Box (1978) Q-statistic. The autocorrelation functions of the squared returns show that for all data series and for all periods the autocorrelations are high and significant up to order 20. The Ljung-Box (1978) Q-statistics firmly reject the null of no autocorrelation in the squared returns. Hence all data series exhibit significant volatility clustering, that is, large (small) shocks are likely to be followed by large (small) shocks.
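The Ljung-Box statistic used throughout these tables has a simple closed form, Q = n(n+2) Σ_{k=1}^{q} ρ̂_k²/(n−k). A minimal Python sketch (not library code; it assumes q < n and uses the usual biased autocorrelation estimator) is:

```python
import numpy as np

def ljung_box(x, q=20):
    """Ljung-Box (1978) Q-statistic for the first q autocorrelations.
    Under the null of no autocorrelation, Q is asymptotically
    chi-squared with q degrees of freedom. Assumes q < len(x)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = (xc ** 2).sum()
    # biased sample autocorrelations at lags 1..q
    rho = np.array([(xc[k:] * xc[:-k]).sum() / denom for k in range(1, q + 1)])
    return n * (n + 2) * np.sum(rho ** 2 / (n - np.arange(1, q + 1)))
```

Applying the same function to the squared (demeaned) returns gives the test for volatility clustering reported in the final column of the tables; the 95% critical value of the χ² distribution with 20 degrees of freedom is about 31.4.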
We refer to section 2.3 for an overview of the technical trading rules applied in this chapter. In this thesis we mainly confine ourselves to objective trend-following technical trading techniques which can be implemented on a computer. In total we test in this chapter a set of 787 technical trading strategies. This set is divided in three different groups: moving-average rules (in total 425), trading range break-out (also called support-and-resistance) rules (in total 170) and filter rules (in total 192). These strategies are also described by Brock, Lakonishok and LeBaron (1992), Levich and Thomas (1993) and Sullivan, Timmermann and White (1999). We use the parameterizations of Sullivan et al. (1999) as a starting point to construct our sets of trading rules. The parameterizations are presented in Appendix B. If a signal is generated at the end of day t, we assume that the corresponding trading position at day t + 1 is executed against the price at the end of day t. Each trading strategy divides the data set of prices in three subsets. A buy (sell) period is defined as the period after a buy (sell) signal up to the next trading signal. A neutral period is defined as the period after a neutral signal up to the next buy or sell signal. The subsets consisting of buy, sell or neutral periods will be called the sets of buy, sell or neutral days.
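The simplest member of the moving-average family, the single crossover rule without the %-band, time delay, fixed holding period or stop-loss refinements, can be sketched as follows. The function below is illustrative only (its name and the ±1 signal encoding are our own conventions); the full rule set with refinements is parameterized as in Appendix B.

```python
import numpy as np

def ma_crossover_signals(prices, short=1, long=200):
    """Basic single crossover moving-average rule: hold a long position (+1)
    when the short-run MA is at or above the long-run MA, otherwise sell (-1).
    No %-band, delay, holding-period or stop-loss refinements."""
    p = np.asarray(prices, dtype=float)

    def ma(n):
        # simple moving average; result i covers p[i:i+n]
        return np.convolve(p, np.ones(n) / n, mode="valid")

    s, l = ma(short), ma(long)
    s = s[len(s) - len(l):]          # align both series on their end dates
    return np.where(s >= l, 1, -1)   # signal generated at the end of day t
```

With short=1 the short-run MA is the price itself, which is exactly the 2-day crossover rule discussed later in this chapter when long=2.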
On a buy signal a long position in the risky asset is held, while on a sell signal the position in the risky asset is sold and the proceeds are invested against the risk-free interest rate. If a technical trading rule has forecasting power, then it should beat the buy-and-hold strategy consistently and persistently: it should advise to buy when prices rise and it should advise to sell when prices fall. Therefore its performance, i.e. mean return or Sharpe ratio, will be compared to the buy-and-hold performance to examine whether the trading strategy generates valuable signals. The advantage of this procedure is that it circumvents the question whether it is possible to hold an actual short position in an asset. We define Pt as the price of the risky asset, It as the investment in the risky asset and St as the investment in the risk-free asset at the end of period t. The percentage/100 costs of initializing or liquidating a trading position are denoted by c. The real profit during a certain trading position, including the costs of initializing and liquidating the position, is determined as follows. If a long position is initialized at day t + 1, part of the costs, because of liquidating the risk-free position at the end of day t, is at the expense of the profit at day t, and part of the costs, because of initializing the long position at the beginning of day t + 1 against the price at the end of day t, is at the expense of the profit at day t + 1. In this chapter 0, 0.10, 0.25, 0.50, 0.75 and 1% costs per trade are implemented. This wide range of transaction costs captures a range of different trader types. For example, floor traders and large investors, such as mutual funds, can trade against relatively low transaction costs in the range of 0.10 to 0.25%. Home investors face higher costs in the range of 0.25 to 0.75%, depending on whether they trade through the internet, by telephone or through their personal account manager. Next, because of the bid-ask spread, extra costs over the transaction costs are faced.
By examining a wide range of 0 to 1% costs per trade, we believe that we can capture most of the cost possibilities faced in reality by most traders.
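To illustrate the effect of proportional costs c on a single round-trip trade: the chapter spreads the costs over the days on which the position is liquidated and initialized, but aggregated over the whole trade the net return of a long round trip can be sketched as below (the aggregation into one formula is our simplification, not the chapter's day-by-day bookkeeping):

```python
def net_trade_return(p_entry, p_exit, c=0.0025):
    """Net return of a round-trip long trade with proportional costs c
    (in percentage/100 terms, e.g. 0.0025 = 0.25%) paid once when the
    position is initialized and once when it is liquidated."""
    gross = p_exit / p_entry            # gross price ratio over the trade
    return gross * (1.0 - c) ** 2 - 1.0 # costs hit both legs of the trade
```

For a 10% gross gain, 1% costs per trade already reduce the net return to roughly 7.8%, which is why the best-selected strategies trade less frequently as costs rise.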
Data snooping is the danger that the performance of the best forecasting model found in a given data set is just the result of chance instead of the result of truly superior forecasting power. The search over many different models should be taken into account before making inferences on the forecasting power of the best model. It is widely acknowledged by empirical researchers that data snooping is a dangerous practice to be avoided. Building on the work of Diebold and Mariano (1995) and West (1996), White (2000) developed a simple and straightforward procedure for testing the null hypothesis that the best model encountered in a specification search has no predictive superiority over a given benchmark model. This procedure is called White’s Reality Check (RC) for data snooping. We briefly discuss the method hereafter. The performance of each technical trading strategy used in this chapter is compared to the benchmark of a buy-and-hold strategy. Predictions are made for M periods, indexed from J + 1 through T = J + M, where the first J data points are used to initialize the K technical trading strategies, so that each technical trading strategy starts generating signals at least at time t = J + 1. The performance of strategy k in excess of the buy-and-hold is defined as f_k. The null hypothesis that the best strategy is not superior to the benchmark of buy-and-hold is given by H0: max E(f_k) ≤ 0,
where the maximum is taken over the strategies k = 1, . . . , K.
where r^f is the mean risk-free interest rate and s.e.(.) is the standard error of the corresponding return series. The Sharpe ratio measures the excess return of a strategy over the risk-free interest rate per unit of risk of the strategy, as measured by the standard deviation. The higher the Sharpe ratio, the better the reward attained per unit of risk taken. The null hypothesis can be evaluated by applying the stationary bootstrap algorithm of Politis and Romano (1994). This algorithm resamples blocks with varying length from the original data series, where the block length follows the geometric distribution, to form a bootstrapped data series. The purpose of the stationary bootstrap is to capture and preserve any dependence in the original data series in the bootstrapped data series. The stationary bootstrap algorithm is used to generate B bootstrapped data series. Applying strategy k to the bootstrapped data series yields B bootstrapped values of f_k.
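The Politis-Romano resampling scheme described above can be sketched in a few lines: blocks of geometrically distributed length (mean 1/p) are copied from random starting points, wrapping around the end of the series. This is an illustrative sketch; the smoothing parameter p and the wrap-around convention follow Politis and Romano (1994), but the function interface is our own.

```python
import numpy as np

def stationary_bootstrap(x, p=0.1, rng=None):
    """One stationary-bootstrap resample of x (Politis and Romano, 1994).
    At each step, with probability p a new block starts at a uniformly
    drawn index; otherwise the current block is extended (with wrap-around).
    Block lengths are therefore geometric with mean 1/p."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x)
    n = len(x)
    out = np.empty(n, dtype=x.dtype)
    t = rng.integers(n)              # start of the first block
    for i in range(n):
        out[i] = x[t]
        if rng.random() < p:         # start a new block
            t = rng.integers(n)
        else:                        # continue the current block
            t = (t + 1) % n
    return out
```

Calling this B times on the return series, and re-evaluating each strategy on every resample, produces the B bootstrapped values of f_k used below.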
In the expression for the RC p-value, 1(.) is an indicator function that takes the value one if and only if the expression within brackets is true. White (2000) applies the Reality Check to a specification search directed toward forecasting the daily returns of the S&P 500 one day in advance in the period May 29, 1988 through May 31, 1994 (the period May 29, 1988 through June 3, 1991 is used as the initialization period). In the specification search, linear forecasting models that make use of technical indicators, such as momentum, local trend, relative strength indexes and moving averages, are applied to the data set. The mean squared prediction error and directional accuracy are used as prediction measures. White (2000) shows that the Reality Check does not reject the null hypothesis that the best technical indicator model cannot beat the buy-and-hold benchmark. However, if one looks at the p-value of the best strategy not corrected for the specification search, the so-called data-mined p-value, the null is marginally not rejected in the case of the mean squared prediction error criterion, and is rejected in the case of directional accuracy. Sullivan, Timmermann and White (1999, 2001) utilize the RC to evaluate simple technical trading strategies and calendar effects applied to the Dow-Jones Industrial Average (DJIA) in the period 1897-1996. As performance measures the mean return and the Sharpe ratio are chosen. The benchmark is the buy-and-hold strategy. Sullivan et al. (1999) find for both performance measures that the best technical trading rule has superior forecasting power over the buy-and-hold benchmark in the period 1897-1986 and for several subperiods, while accounting for the effects of data snooping. Thus it is found that the earlier results of Brock et al. (1992) survive the danger of data snooping. However, for the period 1986-1996 this result is not repeated.
The individual data-mined p-values still reject the null hypothesis, but the RC p-values do not reject the null hypothesis anymore. For the calendar effects (Sullivan et al., 2001) it is found that the individual data-mined p-values do reject the null hypothesis in the period 1897-1996, while the RC, which corrects for the search of the best model, does not reject the null hypothesis of no superior forecasting power of the best model over the buy-and-hold benchmark. Hence Sullivan et al. (1999, 2001) show that if one does not correct for data snooping, one can make wrong inferences about the significant forecasting power of the best model. Hansen (2001) identifies a similarity condition for asymptotic tests of composite hypotheses and shows that this condition is a necessary condition for a test to be unbiased. The similarity condition used is called “asymptotic similarity on the boundary of a null hypothesis” and Hansen (2001) shows that White’s RC does not satisfy this condition. This causes the RC to be a biased test, which yields inconsistent p-values. Further, the RC is sensitive to the inclusion of poor and irrelevant models, because the p-value can be increased by including poor models. The RC is therefore a subjective test.
Rejection of the null hypothesis can eventually always be prevented by including enough poor models. Also, the RC has unnecessarily low power, which can be driven to zero by the inclusion of “silly” models. Hansen (2001) concludes that the RC can misguide the researcher to believe that no real forecasting improvement is provided by a class of competing models, even though one of the models indeed is a superior forecasting model. Therefore Hansen (2001) applies, within the framework of White (2000), the similarity condition to derive a test for superior predictive ability (SPA), which reduces the influence of poor performing strategies in deriving the critical values. This test is unbiased and is more powerful than the RC. The null hypothesis tested is that none of the alternative models is superior to the benchmark model. The alternative hypothesis is that one or more of the alternative models are superior to the benchmark model. The SPA-test p-value is determined by comparing the test statistic (3.1) to the quantiles of V*_b = max_k { √P ( f̄*_{k,b} − g(f̄_k) ) }.
Equations (3.3) and (3.4) ensure that poor and irrelevant strategies cannot have a large impact on the SPA-test p-value, because (3.4) filters the strategy set for these kinds of strategies. Hansen (2001) uses the RC and the SPA-test to evaluate forecasting models applied to US annual inflation in the period 1952 through 2000. The forecasting models are linear regression models with fundamental variables, such as employment, inventory, interest, fuel and food prices, as the regressors. The benchmark model is a random walk and as performance measure the mean absolute deviation is chosen. Hansen (2001) shows that the null hypothesis is neither rejected by the SPA-test p-value nor by the RC p-value, but that there is a large difference in magnitude between both p-values, likely to be caused by the inclusion of poor models in the space of forecasting models.
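The core of White's RC p-value is simple once the bootstrapped performances are available: compare the best observed mean performance with the bootstrap distribution of the recentred maxima. The sketch below is our own minimal rendering of that comparison (the common √M scaling is omitted because it cancels in the comparison); the SPA-test additionally recentres with the threshold function g(.) in order to filter out poor strategies, which is not implemented here.

```python
import numpy as np

def reality_check_pvalue(f_bar, f_boot):
    """Sketch of White's (2000) Reality Check p-value.
    f_bar : length-K vector of mean performances of the K strategies
            in excess of the benchmark.
    f_boot: (B, K) matrix of the corresponding bootstrapped means."""
    f_bar = np.asarray(f_bar, dtype=float)
    f_boot = np.asarray(f_boot, dtype=float)
    v = f_bar.max()                          # best observed performance
    v_boot = (f_boot - f_bar).max(axis=1)    # recentred bootstrap maxima
    # p-value = (1/B) * sum of 1(v_boot_b > v), with 1(.) the indicator
    return float(np.mean(v_boot > v))
```

Because the maximum is taken over all K strategies inside the bootstrap loop, the search over the full strategy set is built into the null distribution, which is exactly the correction for data snooping.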
Technical trading rule performance In section 3.2 we have shown that in the subperiod 1973-1986 most stocks could not even beat a risk-free investment, while they boomed in the subperiod 1987-2001. However, the larger rewards came with greater risks. One may question whether technical trading strategies can persistently generate higher pay-offs than the buy-and-hold benchmark. In total we apply 787 objective computerized trend-following technical trading techniques with and without transaction costs to the DJIA and to the stocks listed in the DJIA. Tables 3.5 and 3.6 show for the full sample period, 1973:1-2001:6, for each data series some statistics of the best strategy selected by the mean return criterion, if 0% and 0.25% costs per trade are implemented. Column 2 shows the parameters of the best strategy. In the case of a moving-average (MA) strategy these parameters are “[short run MA, long run MA]” plus the refinement parameters “[%-band filter, time delay filter, fixed holding period, stop-loss]”. In the case of a trading range break-out, also called support-and-resistance (SR), strategy, the parameters are “[the number of days over which the local maximum and minimum is computed]” plus the same refinement parameters as for the moving averages. In the case of a filter (FR) strategy the parameters are “[the %-filter, time delay filter, fixed holding period]”. Columns 3 and 4 show the mean yearly return and the excess mean yearly return of the best-selected strategy over the buy-and-hold benchmark, while columns 5 and 6 show the Sharpe ratio and the excess Sharpe ratio of the best strategy over the buy-and-hold benchmark. Column 7 shows the maximum loss the best strategy generates. Columns 8, 9 and 10 show the number of trades, the percentage of profitable trades and the percentage of days profitable trades last.
Finally, the last column shows the standard deviation of the returns of the data series during profitable trades divided by the standard deviation of the returns of the data series during non-profitable trades. To summarize, table 3.7 shows for the full sample period, 1973:1-2001:6, and for the two subperiods, 1973:1-1986:12 and 1987:1-2001:6, for each data series examined, the mean yearly excess return over the buy-and-hold benchmark of the best strategy selected by the mean return criterion, after implementing 0, 0.10, 0.25 and 0.75% costs per trade. For transaction costs between 0 and 1% it is found for each data series that the excess return of the best strategy over the buy-and-hold is positive in almost all cases; the only exception is Caterpillar in the full sample period if 1% costs per trade are implemented.
Even for Bethlehem Steel, whose stock shows considerable losses in all periods, the best strategy generates not only a positive excess return, but also a positive normal return. By this we mean that the best strategy on its own did generate profits. This is important because excess returns can also be positive in the case when a non-profitable strategy loses less than the buy-and-hold benchmark. If transaction costs increase from 0 to 0.75% per trade, then it can be seen in the last row of table 3.7 that on average the excess return by which the best strategy beats the buy-and-hold benchmark decreases; for example from 19 to 5.34% for the full sample period. Further, the technical trading rules yield the best results in the first subperiod 1973-1986, the period during which the stocks performed the worst. On average, in the case of no transaction costs, the mean excess return in this period is equal to 33% yearly, almost twice as large as in the period 1987-2001, when it is equal to 17.3% yearly. In comparison, the DJIA advanced by 6.1% yearly in the 1973-1986 period, while it advanced by 12.1% yearly in the 1987-2001 period. Thus from these results we can conclude that in all sample periods technical trading rules are capable of beating a buy-and-hold benchmark, even after correction for transaction costs. From table 3.5 (full sample) it can be seen that in the case of zero transaction costs the best-selected strategies are mainly strategies which generate a lot of trading signals. Trading positions are held for only a few days. For example, the best strategy found for the DJIA is a single crossover moving-average strategy with no extra refinements, which generates a signal when the price series crosses a 2-day moving average. The mean yearly return of this strategy is 25%, which corresponds with a mean yearly excess return of 14.4%. The Sharpe ratio is equal to 0.0438 and the excess Sharpe ratio is equal to 0.0385.
The maximum loss of the strategy is 25.1%, while the maximum loss of buying and holding the DJIA is equal to 36.1%. The number of trades executed by following the strategy is very large, once every two days, but the percentage of profitable trades is also very large, namely 69.7%. These profitable trades span 80.8% of the total number of trading days. Although the trading rules show economic significance, they all go through periods of heavy losses, well above 50% for most stocks (table 3.1). Comparable results are found for the other data series and the two subperiods. If transaction costs are increased to 0.25% per trade, then table 3.6 shows that the best-selected strategies generate substantially fewer signals in comparison with the zero transaction costs case. Trading positions are now held for a longer time. For example, for the DJIA the best strategy generates a trade every 2 years and 4 months. Also the percentage of profitable trades and the percentage of days profitable trades last increase for most data series. Similar results are found in the two subperiods.
Dooley and Shafer (1983) notice for floating exchange rates that there is some relationship between variability in the returns, as measured by the standard deviation, and technical trading rule profits. They find that a large increase in the variability is associated with a dramatic increase in the profitability. If no transaction costs are implemented, then from table 3.5, last column, it can be seen that the standard deviations of the returns of the data series during profitable trades are higher than the standard deviations of the returns during non-profitable trades for almost all stocks, the exceptions being Exxon Mobil, Home Depot and Wal-Mart Stores. However, if 0.25% costs per trade are implemented, then for only 18 data series out of 35 the standard deviation ratio is larger than one. According to the EMH it is not possible to exploit a data set with past information to predict future price changes. The good performance of the technical trading rules could therefore be the reward for holding a risky asset needed to attract investors to bear the risk. Since the technical trading rule forecasts only depend on past price history, however, it seems unlikely that they should result in unusual risk-adjusted profits.
To examine whether the profits of technical trading are just the reward for bearing risk, the Sharpe-Lintner CAPM (3.5) is estimated, regressing the daily excess return of the best strategy applied to stock i on the daily excess return of the market portfolio: r_t^i − r_t^f = α + β (r_t^{DJIA} − r_t^f) + ε_t. Here r_t^{DJIA} is the return on day t of the price-weighted Dow-Jones Industrial Average, which represents the market portfolio, and r_t^f is the risk-free interest rate. The coefficient β measures the riskiness of the active technical trading strategy relative to the passive strategy of buying and holding the market portfolio. If β is not significantly different from one, then it is said that the strategy is as risky as buying and holding the market portfolio. If β > 1 (β < 1), then it is said that the strategy is more risky (less risky) than buying and holding the market portfolio and that it should therefore yield larger (smaller) returns. The coefficient α measures the excess return of the best strategy applied to stock i after correction for bearing risk. If it is not possible to beat a broad market portfolio after correction for risk, and hence technical trading rule profits are just the reward for bearing risk, then α should not be significantly different from zero. For the full sample period, table 3.8 shows for different transaction cost cases the estimation results, if for each data series the best strategy is selected by the mean return criterion. Estimation is done with Newey-West (1987) heteroskedasticity and autocorrelation consistent (HAC) standard errors. Table 3.9 summarizes the CAPM estimation results for all periods and all transaction cost cases by showing the number of data series for which significant estimates of α or β are found at the 10% significance level.
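The point estimates of α and β in this regression are plain OLS coefficients; only the standard errors require the Newey-West HAC correction. A minimal sketch of the point estimation (our own helper; it deliberately omits the HAC standard errors used for inference in the chapter) is:

```python
import numpy as np

def capm_alpha_beta(r_strategy, r_market, r_free):
    """OLS point estimates of the Sharpe-Lintner CAPM regression
        r_t - r_t^f = alpha + beta * (r_t^DJIA - r_t^f) + eps_t.
    Inference in the chapter additionally uses Newey-West (1987)
    HAC standard errors, which are not computed here."""
    y = np.asarray(r_strategy, float) - np.asarray(r_free, float)
    x = np.asarray(r_market, float) - np.asarray(r_free, float)
    X = np.column_stack([np.ones_like(x), x])    # intercept + market factor
    (alpha, beta), *_ = np.linalg.lstsq(X, y, rcond=None)
    return alpha, beta
```

A daily α of 5.39 basis points, as found for the DJIA below, compounds to roughly (1.000539)^252 − 1 ≈ 14% per year, which matches the order of magnitude reported in the text.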
Table 3.9: Summary: significance of CAPM estimates, mean return criterion. For all periods and for each transaction cost case, the table shows the number of data series for which significant estimates are found at the 10% significance level for the coefficients in the Sharpe-Lintner CAPM (3.5). Columns 1 and 2 show the number of data series for which the estimate of α is significantly negative and significantly positive, respectively. Columns 3 and 4 show the number of data series for which the estimate of β is significantly smaller and significantly larger than one, respectively. Column 5 shows the number of data series for which the estimate of α is significantly positive while the estimate of β is significantly smaller than one. Column 6 shows the number of data series for which the estimate of α is significantly positive while the estimate of β is significantly larger than one. Note that for the periods 1973-2001, 1973-1986 and 1987-2001, the number of data series analyzed is equal to 35, 30 and 35, respectively.
For example, for the best strategy applied to the DJIA in the case of zero transaction costs, the estimate of α is significantly positive at the 1% significance level and is equal to 5.39 basis points per day, that is approximately 13.6% per year. The estimate of β is significantly smaller than one at the 10% significance level, which indicates that although the strategy generates a higher reward than simply buying and holding the index, it is less risky. If transaction costs increase, then the estimate of α decreases to 1.91 basis points per day, 4.8% per year, in the case of 1% transaction costs, but it is still significantly positive. The estimate of β is significantly smaller than one for all transaction cost cases at the 10% significance level. As can further be seen in tables 3.8 and 3.9, if no transaction costs are implemented, then for the full sample period the estimate of α is significantly positive for 28 out of 34 stocks. For none of the data series is the estimate of α significantly negative. Thus, for only six stocks the estimate of α is not significantly different from zero. The estimate of α decreases as costs increase and becomes less significant for more data series.
In the 0.75 and 1% transaction cost cases, the estimate of α is significantly positive for only 7 and 8 data series out of 35, respectively. Further, the estimate of β is significantly smaller than one for 14 data series, if zero transaction costs are implemented. Only for three stocks is β significantly larger than one. Further, table 3.9 shows that for all periods and all transaction cost cases the estimate of α is never significantly negative, indicating that the best strategy never performs significantly worse than the buy-and-hold benchmark. Also for the two subperiods it is found that for more than half of the data series the estimate of α is significantly positive, if no transaction costs are implemented. Moreover, especially for the second subperiod, it is found that the estimate of β is significantly smaller than one for many data series, indicating that the best strategy is less risky than the market portfolio. From the findings until now we conclude that there are trend-following technical trading techniques which can be profitably exploited, even after correction for transaction costs, when applied to the DJIA and to the stocks listed in the DJIA in the period 1973-2001 and in the two subperiods 1973-1986 and 1987-2001. As transaction costs increase, the best strategies selected are those which trade less frequently. Furthermore, it becomes more difficult for more and more stocks to reject the null hypothesis that the profit of the best strategy is just the reward for bearing risk. However, for transaction costs up to 1% per trade it is found for a group of stocks that the best strategy, selected by the mean return criterion, can statistically significantly beat the buy-and-hold benchmark strategy. Moreover, for many data series it is found that the best strategy, although it does not necessarily beat the buy-and-hold, is less risky than the buy-and-hold strategy.
Data snooping The question remains open whether the findings in favour of technical trading for particular stocks are the result of chance or of real superior forecasting power. Therefore we apply White’s (2000) Reality Check and Hansen’s (2001) Superior Predictive Ability test. Because Hansen (2001) showed that the Reality Check is biased in the direction of one, p-values are computed for both tests to investigate whether these tests lead in some cases to different conclusions. If the best strategy is selected by the mean return criterion, then table 3.10 shows the nominal, RC and SPA-test p-values for the full sample period 1973-2001 in the case of 0 and 0.10% costs per trade, for the first subperiod 1973-1986 in the case of 0 and 0.25% costs per trade, and for the second subperiod 1987-2001 only in the case of 0% costs per trade. Table 3.11 summarizes the results for all periods and all transaction cost cases by showing the number of data series for which the corresponding p-value is smaller than 0.10.
Table 3.11: Summary: Testing for predictive ability, mean return criterion. For all periods and for each transaction cost case, the table shows the number of data series for which the nominal (pn), White’s (2000) Reality Check (pW) or Hansen’s (2001) Superior Predictive Ability test (pH) p-value is smaller than 0.10. Note that for the periods 1973-2001, 1973-1986 and 1987-2001, the number of data series analyzed is equal to 35, 30 and 35, respectively. The nominal p-value, also called the data-mined p-value, tests the null hypothesis that the best strategy is not superior to the buy-and-hold benchmark, but does not correct for data snooping. From the tables it can be seen that this null hypothesis is rejected for all periods and for all cost cases at the 10% significance level. However, for the full sample period, if we correct for data snooping, then we find, in the case of no transaction costs, that for all data series the null hypothesis that the best strategy is not superior to the benchmark after correcting for data snooping is not rejected by the RC. For 8 data series, though, the null hypothesis that none of the strategies is superior to the benchmark after correcting for data snooping is rejected by the SPA-test. Thus in 8 cases the two data snooping tests lead to different inferences about the predictive ability of technical trading in the 1973-2001 period. For these 8 cases the biased RC misguides by not rejecting the null, even though one of the technical trading strategies is indeed superior, as shown by the SPA-test. However, if we implement as little as 0.10% costs for the full sample period, then neither test rejects its null anymore for any data series. For the subperiod 1973-1986 we find that the SPA-test p-value rejects the null for 13 data series, while the RC p-value rejects the null for only 1 data series at the 10% significance level. However, if 0.25% costs are implemented, then neither test rejects its null for any data series.
For the second subperiod 1987-2001 we find that the two tests are in agreement. Even if no transaction costs are implemented, both tests fail to reject the null at the 10% significance level in almost all cases. Hence, we conclude that the best strategy, selected by the mean return criterion, is not capable of beating the buy-and-hold benchmark strategy, after a correction is made for transaction costs and data snooping.
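White’s Reality Check p-value described above can be sketched as follows. This is our own minimal illustration, not code from the thesis: White’s procedure is normally implemented with the stationary bootstrap of Politis and Romano (1994), whereas this sketch resamples days i.i.d. for brevity; `d` is an assumed precomputed matrix of daily performance differentials of each strategy over the buy-and-hold benchmark.

```python
import numpy as np

def reality_check_pvalue(d, n_boot=1000, seed=0):
    """Sketch of White's (2000) Reality Check p-value.

    d : (n, K) array of daily performance differentials of K strategies
        over the buy-and-hold benchmark.
    Uses a plain i.i.d. bootstrap for brevity; White's original procedure
    uses the stationary bootstrap of Politis and Romano (1994).
    """
    rng = np.random.default_rng(seed)
    n, K = d.shape
    dbar = d.mean(axis=0)
    v = np.sqrt(n) * dbar.max()              # test statistic: best strategy
    count = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)     # i.i.d. resample of time indices
        dstar = d[idx].mean(axis=0)
        vstar = np.sqrt(n) * (dstar - dbar).max()  # recentred maximum
        count += vstar >= v
    return count / n_boot
```

Hansen’s SPA-test modifies this scheme by studentizing the differentials and recentring in a way that discounts poor and irrelevant strategies, which is why its p-values can reject when the RC does not.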
Technical trading rule performance Similar to tables 3.5 and 3.6, table 3.12 shows, for the full sample period and for some data series, statistics of the best strategy selected by the Sharpe ratio criterion, if 0 or 0.25% costs per trade are implemented. Only the results for those data series are presented for which the best strategy selected by the Sharpe ratio criterion differs from the best strategy selected by the mean return criterion. To summarize, table 3.13 shows for all periods and for each data series the Sharpe ratio of the best strategy selected by the Sharpe ratio criterion, after implementing 0, 0.10, 0.25 or 0.75% costs per trade, in excess of the Sharpe ratio of the buy-and-hold benchmark. The Sharpe ratio of the best-selected strategy in excess of the Sharpe ratio of the buy-and-hold is positive in almost all cases; the only exceptions are Caterpillar in the full sample period and Wal-Mart Stores in the last subperiod, both in the case of 1% transaction costs. If transaction costs increase from 0 to 0.75%, then the last row of table 3.13 shows that for the full sample period the excess Sharpe ratio declines on average from 0.0258 to 0.0078. For the full sample period table 3.12 shows that the best strategies selected in the case of zero transaction costs are mainly strategies that generate a lot of signals. Trading positions are held for only a short period. Moreover, for most data series the best-selected strategy is the same as when the best strategy is selected by the mean return criterion. If costs are increased to 0.25% per trade, then the best-selected strategies generate fewer signals and trading positions are held for longer periods. Now for 14 data series the best-selected strategy differs from the one selected by the mean return criterion. For the two subperiods similar results are found.
However, the excess Sharpe ratios are higher in the period 1973-1986 than in the period 1987-2001. As for the mean return criterion, it is found that for each data series the best strategy, selected by the Sharpe ratio criterion, beats the buy-and-hold benchmark and that this strategy can profitably be exploited, even after correction for transaction costs. The results show that technical trading strategies were most profitable in the period 1973-1986, but profits are also made in the period 1987-2001. CAPM The estimation results of the Sharpe-Lintner CAPM in tables 3.14 and 3.15 for the Sharpe ratio selection criterion are similar to the estimation results in tables 3.8 and 3.9 for the mean return selection criterion. In the case of zero transaction costs, the estimate of α is significantly positive for most data series.
Table 3.15: Summary: significance CAPM estimates, Sharpe ratio criterion. For all periods
and for each transaction cost case, the table shows the number of data series for which significant estimates are found at the 10% significance level for the coefficients in the Sharpe-Lintner CAPM (3.5). Columns 1 and 2 show the number of data series for which the estimate of α is significantly negative and positive. Columns 3 and 4 show the number of data series for which the estimate of β is significantly smaller and larger than one. Column 5 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly smaller than one. Column 6 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly larger than one. Note that for the periods 1973-2001, 1973-1986 and 1987-2001, the number of data series analyzed is equal to 35, 30 and 35.
Data snooping If the best strategy is selected by the Sharpe ratio criterion, then table 3.16 shows the nominal, White’s RC and Hansen’s SPA-test p-values for all periods and different transaction cost cases. The results are shown for the full sample period 1973-2001 in the case of 0 and 0.10% costs per trade and for the two subperiods 1973-1986 and 1987-2001 in the case of 0 and 0.25% costs per trade. Table 3.17 summarizes the results for all periods and all transaction cost cases by showing the number of data series for which the corresponding p-value is smaller than 0.10. If the nominal p-value is used to test the null hypothesis that the best strategy is not superior to the buy-and-hold benchmark, then the null is rejected in all periods for most data series at the 5% significance level.
Table 3.17: Summary: Testing for predictive ability, Sharpe ratio criterion. For all periods and for each transaction cost case, the table shows the number of data series for which the nominal (pn), White’s (2000) Reality Check (pW) or Hansen’s (2001) Superior Predictive Ability test (pH) p-value is smaller than 0.10. Note that for the periods 1973-2001, 1973-1986 and 1987-2001, the number of data series analyzed is equal to 35, 30 and 35.
For the full sample period, if a correction is made for data snooping, then it is found, in the case of zero transaction costs, that for 4 data series the null hypothesis that the best strategy is not superior to the benchmark after correcting for data snooping is rejected by the RC at the 10% significance level. However, for 16 data series the null hypothesis that none of the strategies is superior to the benchmark after correcting for data snooping is rejected by the SPA-test. Thus for 12 data series the RC leads to wrong inferences about the forecasting power of the best-selected strategy. However, if we implement as little as 0.10% costs, then these contradictory results occur for only 3 data series (the null is rejected for none of the data series by the RC) and if we increase the costs further to 0.25%, then neither test rejects the null for any data series. In the first subperiod 1973-1986, if zero transaction costs are implemented, then the RC p-value rejects the null for 10 data series, while the SPA-test p-value rejects the null for 21 data series. For the second subperiod 1987-2001, if no transaction costs are implemented, then the results of both tests are more in agreement. The RC rejects the null for none of the data series, while the SPA-test rejects the null for 5 data series. If transaction costs are increased to 0.25%, then in both subperiods both tests fail to reject the null for almost all data series.
Only for Goodyear Tire, in the last subperiod, does the SPA-test reject the null, even in the 1% costs case. Hence, we conclude that the best strategy, selected by the Sharpe ratio criterion, is not capable of beating the benchmark of a buy-and-hold strategy, after a correction is made for transaction costs and data snooping.
Like most academic literature on technical analysis, we investigated the profitability and forecastability of technical trading rules in sample, instead of out of sample. White’s (2000) RC and Hansen’s (2001) SPA-test, as we applied them, are indeed in-sample test procedures: they test whether the best strategy in a certain trading period has significant forecasting power, after correction for the search for the best strategy in that specific trading period. However, whether a technical trading strategy applied to a financial time series in a certain period shows economically and statistically significant forecasting power does not say much about its future performance. If it shows forecasting power, then profits earned in the past do not necessarily imply that profits can also be made in the future. On the other hand, if the strategy does not show forecasting power, then it could be that during certain subperiods the strategy was actually performing very well due to some characteristics of the data, but the same strategy was losing during other subperiods, because the characteristics of the data changed. Therefore, inferences about the forecastability of technical analysis can only be made by testing whether strategies that performed well in the past also perform well in the future. In this section we test the forecasting power of our set of trend-following technical trading techniques by applying a recursive optimizing and testing procedure. For example, recursively at the beginning of each month we investigate which technical trading rule performed best in the preceding six months (training period) and we select this best strategy to generate trading signals during the coming month (testing period). Sullivan et al. (1999) also apply a recursive out-of-sample forecasting procedure. However, in their procedure, the strategy which performed best from t = 0 onwards is selected to make one-step-ahead forecasts.
We instead use a moving window, as in Lee and Mathur (1995), in which strategies are compared and the best strategy is selected to make forecasts for some period thereafter. Our approach is similar to the recursive modeling, estimation and forecasting approach of Pesaran and Timmermann (1995, 2000) and Marquering and Verbeek (2000). They use a collection of macro-economic variables as the information set to base trading decisions upon. A linear regression model, with a subset of the macro-economic variables as regressors and the excess return of the risky asset over the risk-free interest rate as dependent variable, is estimated recursively with ordinary least squares. The subset of macro-economic variables which yields the best fit to the excess returns is selected to make an out-of-sample forecast of the excess return for the next period. A position in the market is then chosen on the basis of this forecast, according to a certain trading strategy. They show that historical fundamental information can help in predicting excess returns. We will do essentially the same for technical trading strategies, using only past observations from the financial time series itself. We define the training period on day t to last from t − Tr until and including t − 1, where Tr is the length of the training period. The testing period lasts from t until and including t + Te − 1, where Te is the length of the testing period. For each of the 787 strategies the performance during the training period is computed. Then the best technical trading strategy is selected by the mean return or Sharpe ratio criterion and is applied in the testing period to generate trading signals. After the end of the testing period this procedure is repeated until the end of the data series is reached. For the training and testing periods we use 36 different parameterizations of [Tr, Te], which can be found in Appendix C. In the case of 0.25% transaction costs, tables 3.18 and 3.19 show for the DJIA and for each stock in the DJIA some statistics of the best recursive optimizing and testing procedure, if the best strategy in the training period is selected by the mean return and Sharpe ratio criterion respectively. Because the longest training period is five years, the results are computed for the period 1978:10-2001:6. Table 3.20A, B (i.e. table 3.20 panel A, panel B) summarizes the results for both selection criteria in the case of 0, 0.10, 0.25 and 0.50% costs per trade. In the second to last row of table 3.20A it can be seen that, if in the training period the best strategy is selected by the mean return criterion, then the excess return over the buy-and-hold of the best recursive optimizing and testing procedure is, on average, 12.3, 6.9, 2.8 and −1.2% yearly in the case of 0, 0.10, 0.25 and 0.50% costs per trade. Thus the excess returns decline sharply on average when implementing as little as 0.10% costs.
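The recursive optimizing and testing procedure can be sketched as follows, assuming a precomputed matrix of daily returns of all candidate strategies; the window lengths used here are placeholder values standing in for the 36 [Tr, Te] parameterizations of Appendix C.

```python
import numpy as np

def recursive_best_strategy(strategy_returns, tr=125, te=21):
    """Sketch of the recursive optimizing and testing procedure.

    strategy_returns : (n, K) array of daily returns of K candidate strategies.
    tr, te : lengths of the training and testing windows in days
             (placeholder values, not the thesis parameterizations).
    In each training window the strategy with the highest mean return is
    selected (mean return criterion) and applied out of sample during the
    following testing window. Returns the concatenated out-of-sample returns.
    """
    n, _ = strategy_returns.shape
    out = []
    t = tr
    while t < n:
        train = strategy_returns[t - tr:t]
        best = train.mean(axis=0).argmax()     # mean return criterion
        out.append(strategy_returns[t:t + te, best])
        t += te
    return np.concatenate(out)
```

Selecting by the Sharpe ratio criterion instead would only change the `argmax` line to rank the training-window strategies by mean excess return over standard deviation.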
If the Sharpe ratio criterion is used for selecting the best strategy during the training period, then the Sharpe ratio of the best recursive optimizing and testing procedure in excess of the Sharpe ratio of the buy-and-hold benchmark is on average 0.0145, 0.0077, 0.0031 and −0.0020 in the case of 0, 0.10, 0.25 and 0.50% costs per trade, also declining sharply when low costs are implemented (see the second to last row of table 3.20B). Thus in our recursive out-of-sample testing procedures small transaction costs cause forecastability to disappear. For comparison, the last row in table 3.20A, B shows the average over the results of the best strategies selected in sample by the mean return or Sharpe ratio criterion for each data series tabulated. As can be seen, the results of the best strategies selected in sample are clearly better than the results of the best recursive out-of-sample forecasting procedure. If the mean return selection criterion is used, then table 3.21A shows for the 0 and 0.10% transaction cost cases, for each data series, the estimation results of the Sharpe-Lintner CAPM (see equation 3.5), where the return of the best recursive optimizing and testing procedure in excess of the risk-free interest rate is regressed against a constant α and the return of the DJIA in excess of the risk-free interest rate. Estimation is done with Newey-West (1987) heteroskedasticity and autocorrelation consistent (HAC) standard errors. Table 3.22 summarizes the CAPM estimation results for all transaction cost cases by showing the number of data series for which significant estimates of α and β are found at the 10% significance level. In the case of zero transaction costs, for 12 data series out of 35 the estimate of α is significantly positive at the 10% significance level. This number decreases to 3 (1, 0) if 0.10% (0.25, 0.50%) costs per trade are implemented. Table 3.21B shows the results of the CAPM estimation for the case that the best strategy in the training period is selected by the Sharpe ratio criterion. Now, in the case of zero transaction costs, the estimate of α is significantly positive for 14 data series at the 10% significance level. If transaction costs increase to 0.10% (0.25, 0.50%), then the estimate of α is significantly positive for only 7 (6, 1) out of 35 data series. Hence, after correction for transaction costs and risk it can be concluded, independently of the selection criterion used, that the best recursive optimizing and testing procedure shows no statistically significant out-of-sample forecasting power.
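The CAPM regression with Newey-West HAC standard errors can be sketched directly in numpy. The lag truncation of 5 is our own assumption for illustration; the chapter does not state the lag length in this excerpt.

```python
import numpy as np

def capm_newey_west(excess_strategy, excess_market, lags=5):
    """Sketch of the Sharpe-Lintner CAPM regression:
    excess_strategy = alpha + beta * excess_market + error,
    estimated by OLS with Newey-West (1987) HAC standard errors
    (Bartlett kernel, lag truncation `lags` -- an assumed value).
    Returns ((alpha, beta), (se_alpha, se_beta))."""
    n = len(excess_strategy)
    X = np.column_stack([np.ones(n), excess_market])
    beta_hat = np.linalg.lstsq(X, excess_strategy, rcond=None)[0]
    u = excess_strategy - X @ beta_hat          # OLS residuals
    Xu = X * u[:, None]                         # score contributions
    S = Xu.T @ Xu / n                           # lag-0 term
    for l in range(1, lags + 1):
        w = 1 - l / (lags + 1)                  # Bartlett kernel weight
        G = Xu[l:].T @ Xu[:-l] / n
        S += w * (G + G.T)
    Q = X.T @ X / n
    V = np.linalg.inv(Q) @ S @ np.linalg.inv(Q) / n
    return beta_hat, np.sqrt(np.diag(V))
```

If the t-ratio of alpha exceeds the 10% critical value, the null that the strategy earns no excess return after correction for risk is rejected, which is the test summarized in table 3.22.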
Selection criterion: mean return
costs     α < 0   α > 0   β < 1   β > 1   α > 0 ∧ β < 1   α > 0 ∧ β > 1
0%          0      12      13       3           5               2
0.10%       0       3      12       5           2               1
0.25%       0       1       8       7           0               1
0.50%       1       0       7       7           0               0

Selection criterion: Sharpe ratio
costs     α < 0   α > 0   β < 1   β > 1   α > 0 ∧ β < 1   α > 0 ∧ β > 1
0%          0      14      15       4           7               1
0.10%       0       7      16       3           2               0
0.25%       1       6      14       3           2               1
0.50%       0       1      12       4           0               0
Table 3.22: Summary: significance CAPM estimates for best out-of-sample testing procedure. For each transaction cost case, the table shows the number of data series for which significant estimates are found at the 10% significance level for the coefficients in the Sharpe-Lintner CAPM. Columns 1 and 2 show the number of data series for which the estimate of α is significantly negative and positive. Columns 3 and 4 show the number of data series for which the estimate of β is significantly smaller and larger than one. Column 5 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly smaller than one. Column 6 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly larger than one.
In this chapter we apply a set of 787 objective computerized trend-following technical trading techniques to the Dow-Jones Industrial Average (DJIA) and to 34 stocks listed in the DJIA in the period January 1973 through June 2001. For each data series the best technical trading strategy is selected by the mean return or Sharpe ratio criterion. Because numerous research papers found that technical trading rules show some forecasting power in the era until 1987, but not in the period thereafter, we split our sample in two subperiods: 1973-1986 and 1987-2001. We find for all periods and for both selection criteria that for each data series a technical trading strategy can be selected that is capable of beating the buy-and-hold benchmark, even after correction for transaction costs. Although buy-and-hold stock investments had difficulty beating a continuous risk-free investment during the 1973-1986 subsample, the strongest results in favour of technical trading are found for this subperiod. For example, in the full sample period 1973-2001 it is found that the best strategy beats the buy-and-hold benchmark on average by 19, 10, 7.5, 6.1, 5.3 and 4.9% yearly in the case of 0, 0.10, 0.25, 0.50, 0.75 and 1% transaction costs, if the best strategy is selected by the mean return criterion. These are quite substantial numbers. The profits generated by the technical trading strategies could be the reward necessary to attract investors to bear the risk of holding the asset. To test this hypothesis we estimate Sharpe-Lintner CAPMs. For each data series the daily return of the best strategy in excess of the risk-free interest rate is regressed against a constant (α) and the daily return of the DJIA in excess of the risk-free interest rate. The coefficient of the latter regression term is called β and measures the riskiness of the strategy relative to buying and holding the market portfolio.
If technical trading rules do not generate excess profits after correction for risk, then α should not be significantly different from zero. If no transaction costs are implemented, then we find for both selection criteria that in all periods for most data series the estimate of α is significantly positive. This means that the best selected technical trading rules show forecasting power after a correction is made for risk. However, if costs are increased, we are less able to reject the null hypothesis that technical trading rule profits are the reward for bearing risk. But still, in numerous cases the estimate of α is significantly positive. An important question is whether the positive results found in favour of technical trading are due to chance or the fact that the best strategy has genuine superior forecasting power over the buy-and-hold benchmark. This is called the danger of data snooping. We apply White’s (2000) Reality Check (RC) and Hansen’s (2001) Superior Predictive Ability.
(SPA) test, to test the null hypothesis that the best strategy found in a specification search is not superior to the benchmark of buy-and-hold if a correction is made for data snooping. Hansen (2001) showed that the RC is sensitive to the inclusion of poor and irrelevant forecasting rules. Because we compute p-values for both tests, we can investigate whether both test procedures lead to contradictory inferences. If no transaction costs are implemented, then we find for the mean return and the Sharpe ratio criterion that the RC and the SPA-test in some cases lead to different conclusions, especially for the subperiod 1973-1986. The SPA-test finds in numerous cases that the best strategy does beat the buy-and-hold significantly after correction for data snooping and the inclusion of bad strategies. Thus the biased RC misguides the researcher in several cases by not rejecting the null. However, if as little as 0.25% costs per trade are implemented, then, for both selection criteria, for all sample periods and for all data series, both tests lead to the same conclusion: the best strategy is not capable of beating the buy-and-hold benchmark after a correction is made for the specification search that was used to find the best strategy. We therefore finally conclude that the good performance of trend-following technical trading techniques applied to the DJIA and to the individual stocks listed in the DJIA, especially in the 1973-1986 subperiod, is more the result of chance than of genuine forecasting power. Next we apply a recursive optimizing and testing method to test whether the best strategy found in a specification search during a training period also shows forecasting power during a testing period thereafter. For example, every month the best strategy of the last 6 months is selected to generate trading signals during the coming month. In total we examine 36 different training and testing period combinations.
In the case of no transaction costs, the best recursive optimizing and testing procedure yields on average an excess return over the buy-and-hold of 12.3% yearly, if the best strategy in the training period is selected by the mean return criterion. Thus the best strategy found in the past continues to generate good results in the future. However, if as little as 0.25% transaction costs are implemented, then the excess return decreases to 2.8%. Finally, estimation of Sharpe-Lintner CAPMs shows that, after correction for transaction costs and risk, the best recursive optimizing and testing procedure has no statistically significant forecasting power anymore. Hence, in short, after correcting for transaction costs, risk, data snooping and out-of-sample forecasting, we conclude that objective trend-following technical trading techniques applied to the DJIA and to the stocks listed in the DJIA in the period 1973-2001 are not genuinely superior to the buy-and-hold benchmark, as their performance suggested.
The data series examined in this chapter are the daily closing levels of the Amsterdam Stock Exchange Index (AEX-index) and the daily closing prices of all stocks listed in this index in the period January 3, 1983 through May 31, 2002. The AEX-index is a market-weighted average of the 25 most important stocks traded at the Amsterdam Stock Exchange. These stocks are chosen once a year and their selection is based on the value of trading turnover during the preceding year. At the moment of composition of the index the weights are restricted to be at most 10%. Table 4.1 shows a historical overview of when and which stocks entered or left the index and, in some cases, the reason why. For example, Algemene Bank Nederland (ABN) merged with AMRO Bank on August 27, 1990 and the new combination was listed under the new name ABN AMRO Bank. In total we evaluate a set of 50 stocks. All data series are corrected for dividends, capital changes and stock splits. As a proxy for the risk-free interest rate we use daily data on Dutch monthly interbank rates. Table 4.2 shows for each data series the sample period and the largest cumulative loss, that is the largest decline from a peak to a trough. Next, table 4.3 shows the summary statistics. Because the first 260 data points are used for initializing the technical trading strategies, the summary statistics are shown from January 1, 1984 onwards. The first and second column show the names of the data series examined and the number of available data points. The third column shows the mean yearly effective return in percentage/100 terms. The fourth through seventh column show the mean, standard deviation, skewness and kurtosis of the logarithmic daily return. The eighth column shows the t-ratio testing whether the mean logarithmic daily return is significantly different from zero. The ninth column shows the Sharpe ratio, that is the extra return over the risk-free interest rate per extra point of risk, as measured by the standard deviation.
The tenth column shows the largest cumulative loss of the stocks in percentage/100 terms. The eleventh column shows the Ljung-Box (1978) Q-statistic testing whether the first 20 autocorrelations of the return series as a whole are significantly different from zero. The twelfth column shows the heteroskedasticity-adjusted Box-Pierce (1970) Q-statistic, as derived by Diebold (1986). The final column shows the Ljung-Box (1978) Q-statistic testing for autocorrelations in the squared returns. The mean yearly effective return of the AEX-index during the 1983-2002 period is equal to 10.4% and the yearly standard deviation is approximately equal to 19%. For the AEX-index and 21 stocks the mean logarithmic return is significantly positive, as tested with the simple t-ratios, while for 5 stocks the mean yearly effective return is severely and significantly negative. For example, the business firm Ceteco and truck builder Daf went broke, while the communications and cable network related companies KPNQWest, UPC and Versatel recently stopped all payments due to their creditors. Of the other 4 stocks which show negative returns, plane builder Fokker went broke, software builder Baan was taken over by the British Invensys, while telecommunications firm KPN and temporary employment agency Vedior are nowadays struggling for survival. The return distribution is strongly leptokurtic for all data series, especially for Ceteco, Fokker, Getronics and Nedlloyd, and is negatively skewed for the AEX-index and 32 stocks. On an individual basis the stocks are more risky than the market-weighted AEX-index, as can be seen from the standard deviations and the largest cumulative loss numbers. Thus it is clear that firm-specific risks are reduced by a diversified index. The Sharpe ratio is negative for 12 stocks, which means that these stocks were not able to beat a risk-free investment. Among them are ABN, KLM and the earlier mentioned stocks. The largest cumulative loss of the AEX-index is equal to 47% and took place in the period August 12, 1987 through November 10, 1987. October 19, 1987 showed the biggest one-day percentage loss in the history of the AEX-index and brought the index down by 12%. November 11, 1987, in turn, showed the largest one-day gain and brought the index up by 11.8%. Of the 30 stocks for which we have data starting before the crash of 1987, only half showed their largest cumulative loss during the year 1987, and their deterioration started well before October 1987, indicating that stock prices had already been declining for a while before the crash actually happened. The financials, for example, lost approximately half of their value during the 1987 period. For the other stocks, for which we have data only after the crash of 1987, the periods of largest decline started ten years later, in 1997.
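The largest cumulative loss reported in these tables is the maximum drawdown of the (dividend- and split-corrected) price series. A minimal sketch of how it can be computed, as our own illustration:

```python
import numpy as np

def largest_cumulative_loss(prices):
    """Largest decline from a peak to a trough, as a fraction of the peak:
    the 'largest cumulative loss' statistic reported in the summary tables."""
    prices = np.asarray(prices, dtype=float)
    running_peak = np.maximum.accumulate(prices)   # highest level seen so far
    drawdowns = 1.0 - prices / running_peak        # decline from that peak
    return drawdowns.max()
```

For a price path that rises to 120, falls to 60 and later recovers, the function returns 0.5, i.e. a 50% largest cumulative loss.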
Baan, Ceteco, Getronics, KPN, KPNQWest, OCE, UPC and Versatel lost almost their total value within two years during the burst of the internet and telecommunications bubble. The summary statistics show no largest declines after the terrorist attack against the US on September 11, 2001. With hindsight, the overall picture is that financials, chemicals and foods produced the best results. We computed autocorrelation functions (ACFs) of the returns and tested significance with Bartlett (1946) standard errors and Diebold’s (1986) heteroskedasticity-consistent standard errors. Typically autocorrelations of the returns are small, with only a few lags being significant. Without correcting for heteroskedasticity we find for 36 of the 50 stocks a significant first order autocorrelation, while when corrected for heteroskedasticity we find for 24 stocks a significant first order autocorrelation at the 10% significance level.
[Footnote 1: At the moment of writing the stock exchanges were reaching new lows, which is not visible in these data until May 2002.] [Footnote 2: See section 3.2, page 99, for an explanation. Separate ACFs of the returns are computed for each data series, but not presented here to save space; the tables are available upon request from the author.]
No severe autocorrelation is found in the AEX-index. It is noteworthy that for most data series the second order autocorrelation is negative; in only 8 out of 51 cases is it positive. The first order autocorrelation is negative in 10 cases. The Ljung-Box (1978) Q-statistics on the returns in table 4.3 reject for almost all data series the null hypothesis that the first 20 autocorrelations of the returns as a whole are equal to zero. For only 10 data series the null is not rejected. When looking at Diebold’s (1986) heteroskedasticity-consistent Box-Pierce (1970) Q-statistics in the next column, it appears that heteroskedasticity indeed seriously affects the inferences about serial correlation in the returns. When a correction is made for heteroskedasticity, then for the AEX-index and 41 stocks the null of no autocorrelation is not rejected. The autocorrelation functions of the squared returns show that for all data series the autocorrelations are high and significant up to order 20. The Ljung-Box (1978) Q-statistics reject the null of no autocorrelation in the squared returns firmly, except for steel manufacturer Corus. Hence, almost all data series exhibit significant volatility clustering, that is, large (small) shocks are likely to be followed by large (small) shocks.
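The Ljung-Box Q-statistic used in these tables can be sketched as follows; Diebold’s (1986) heteroskedasticity adjustment is omitted for brevity. Applied to squared returns, the same statistic tests for the volatility clustering discussed above.

```python
import numpy as np

def ljung_box_q(x, m=20):
    """Ljung-Box (1978) Q-statistic on the first m autocorrelations of x.
    Under the null of no autocorrelation, Q is asymptotically chi-squared
    with m degrees of freedom. Apply to returns, or to squared returns to
    test for volatility clustering."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = np.sum(xc**2)
    q = 0.0
    for k in range(1, m + 1):
        rho_k = np.sum(xc[k:] * xc[:-k]) / denom   # lag-k autocorrelation
        q += rho_k**2 / (n - k)
    return n * (n + 2) * q
```

With m = 20, as in the tables, the null of no autocorrelation is rejected at the 10% level when Q exceeds the chi-squared(20) critical value of about 28.4.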
Technical trading rule performance In section 4.2 we have shown that almost no significant autocorrelation in the daily returns can be found after correction for heteroskedasticity. This implies that there is no linear dependence present in the data. One may thus question whether technical trading strategies can persistently beat the buy-and-hold benchmark. However, as noted by Alexander (1961), the dependence in price changes can be of such a complicated nonlinear form that standard linear statistical tools, such as serial correlations, may provide misleading measures of the degree of dependence in the data. Therefore he proposed to use nonlinear technical trading rules to test for dependence. In total we apply 787 objective computerized trend-following technical trading techniques with and without transaction costs to the AEX-index and to 50 stocks listed in the AEX-index (see sections 2.3 and 3.3 and Appendix B of Chapter 3 for the technical trading rule parameterizations). Tables 4.4 and 4.5 show for each data series some statistics of the best strategy selected by the mean return criterion, if 0% and 0.25% costs per trade are implemented. Column 2 shows the parameters of the best strategy. In the case of a moving-average (MA) strategy these parameters are “[short run MA, long run MA]” plus the refinement parameters “[%-filter, time delay filter, fixed holding period, stop-loss]”. In the case of a trading range break, also called support-and-resistance (SR), strategy, the parameters are “[the number of days over which the local maximum and minimum is computed]” plus the same refinement parameters as for the moving averages. In the case of a filter (FR) strategy the parameters are “[the %-filter, time delay filter, fixed holding period]”. Columns 3 and 4 show the mean yearly return and the mean yearly excess return of the best-selected strategy over the buy-and-hold benchmark, while columns 5 and 6 show the Sharpe ratio and the excess Sharpe ratio of the best-selected strategy over the buy-and-hold benchmark. Column 7 shows the maximum loss the best strategy generates. Columns 8, 9 and 10 show the number of trades, the percentage of profitable trades and the percentage of days that profitable trades last. Finally, the last column shows the standard deviation of the returns of the data series during profitable trades divided by the standard deviation of the returns of the data series during non-profitable trades. To summarize, for each data series examined, table 4.7A (i.e. table 4.7 panel A) shows the mean yearly excess return over the buy-and-hold benchmark of the best strategy selected by the mean return criterion, after implementing 0, 0.10, 0.25, 0.50, 0.75 and 1% costs per trade. This wide range of costs captures a range of different trader types. For example, floor traders and large investors, such as mutual funds, can trade against relatively low transaction costs in the range of 0.10 to 0.25%. Home investors face higher costs in the range of 0.25 to 0.75%, depending on whether they trade through the internet, by telephone or through their personal account manager. In addition, the bid-ask spread imposes extra costs on top of the transaction costs.
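As an illustration of the moving-average rule class just described, a minimal crossover rule can be sketched as follows. This is our own sketch: it implements only the basic crossover with an optional %-band filter, and omits the time delay filter, fixed holding period and stop-loss refinements.

```python
import numpy as np

def ma_crossover_signals(prices, short=1, long=25, band=0.0):
    """Sketch of a moving-average crossover rule: go long (+1) when the
    short-run MA exceeds the long-run MA by more than the %-band filter,
    otherwise take the opposite position (-1). The refinement parameters
    of the thesis (time delay, holding period, stop-loss) are omitted."""
    prices = np.asarray(prices, dtype=float)

    def sma(x, w):
        if w == 1:
            return x.copy()
        return np.convolve(x, np.ones(w) / w, mode="valid")

    n = len(prices) - long + 1          # dates on which both MAs exist
    s = sma(prices, short)[-n:]         # short-run MA, aligned
    l = sma(prices, long)[-n:]          # long-run MA, aligned
    return np.where(s > l * (1 + band), 1, -1)
```

With short = 1 the short-run MA is the price itself, so the rule signals long whenever the price is above its 25-day moving average, which is the basic form of the single crossover rule selected for the AEX-index.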
By examining a wide range of 0 to 1% costs per trade, we believe that we can capture most of the cost levels faced in practice by most traders. The results in table 4.7A are astonishing. As can be seen in the last row of the table, on average, the mean yearly excess return of the best strategy over the buy-and-hold benchmark is equal to 152% in the case of zero transaction costs, and it is still 124% in the case of 1% transaction costs. These incredibly good results are mainly caused by the communications and cable network firms KPNQWest, UPC and Versatel. However, if we exclude all stocks for which the best strategy generates a return of more than 100% yearly in excess of the buy-and-hold, then, on average, the yearly excess return of the best strategy is equal to 32% in the case of no transaction costs, declining to 15% if transaction costs increase to 1% per trade. Thus from these results we conclude that technical trading rules are capable of beating a buy-and-hold benchmark even after correction for transaction costs. These results are substantially better than when the same strategy set is applied to the DJIA and to stocks listed in the DJIA.
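The effect of proportional costs per trade on a strategy's return can be illustrated with a small sketch (a hypothetical helper, assuming costs are charged as a fraction of the position value whenever the position changes):

```python
import numpy as np

def net_total_return(prices, positions, cost=0.0025):
    """Compound the daily returns of following `positions`
    (1 = long, 0 = out of the market), charging a proportional
    `cost` whenever the position changes. Returns the total net
    return over the sample."""
    prices = np.asarray(prices, dtype=float)
    rets = prices[1:] / prices[:-1] - 1.0          # daily returns
    wealth = 1.0
    for t, r in enumerate(rets):
        wealth *= 1.0 + positions[t] * r           # earn return while in the market
        if t > 0 and positions[t] != positions[t - 1]:
            wealth *= 1.0 - cost                   # pay costs on a position change
    return wealth - 1.0
```

The excess return over the buy-and-hold benchmark is then simply the strategy's net return minus `prices[-1] / prices[0] - 1`.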
There, in the period 1987-2001, on average, the mean yearly excess return over the buy-and-hold benchmark declines from 17% to 7% if transaction costs are increased from 0% to 1% per trade (see section 3.6.1, page 106, and table 3.7, page 128). It is interesting to compare our results to Fama (1965) and Theil and Leenders (1965). Theil and Leenders (1965) found that the proportions of securities advancing and declining today on the Amsterdam Stock Exchange can help in predicting the proportions of securities advancing and declining tomorrow. Fama (1965), in contrast, found that this is not true for the New York Stock Exchange. In our study we find that this difference in forecastability between the two stock markets tends to persist into the 1980s and 1990s. From table 4.4 it can be seen that in the case of zero transaction costs the best-selected strategies are mainly strategies which generate a lot of signals. Trading positions are held for only a few days. With hindsight, the best strategy for the Fokker and UPC stocks was to never have bought them, earning the risk-free interest rate during the investment period. For the AEX-index, in contrast, the best strategy is a single crossover moving-average rule which generates a signal if the price series crosses a 25-day moving average and where the single refinement is a 10% stop-loss. The mean yearly return is equal to 25%, which corresponds with a mean yearly excess return of 13.2%. The Sharpe ratio is equal to 0.0454 and the excess Sharpe ratio is equal to 0.0307. These excess performance measures are considerably large. The maximum loss of the strategy is 43.9%, slightly less than the maximum loss of buying and holding the AEX-index, which is equal to 46.7% (table 4.2). On average the strategy generates a trade once every 12 days and 65.9% of the trades are profitable. These profitable trades span 85% of the total number of trading days.
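The two performance measures referred to here, the Sharpe ratio and the maximum loss (the largest decline from a peak), can be computed as in the following sketch. The thesis reports the Sharpe ratio on a daily basis, as the extra return over the risk-free rate per unit of standard deviation.

```python
import numpy as np

def sharpe_ratio(daily_returns, daily_rf=0.0):
    """Mean daily return in excess of the risk-free rate, divided by
    the standard deviation of the excess returns."""
    ex = np.asarray(daily_returns, dtype=float) - daily_rf
    return ex.mean() / ex.std()

def largest_cumulative_loss(prices):
    """Largest decline from a peak, as a fraction of that peak."""
    p = np.asarray(prices, dtype=float)
    running_peak = np.maximum.accumulate(p)        # highest level so far
    return ((running_peak - p) / running_peak).max()
```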
Although the technical trading rules show economic significance, they all go through periods of heavy losses, well above 50% for most stocks. If transaction costs are increased to 0.25%, then table 4.5 shows that the best-selected strategies generate substantially fewer signals in comparison with the zero transaction costs case. Trading positions are now held for a longer time. For example, for the AEX-index the best-selected strategy generates a trade every one and a half years. Also the percentage of profitable trades and the percentage of days that profitable trades last increase for most data series. This is most extreme for the AEX-index: the 13 trading signals of the best-selected strategy were all profitable.

CAPM

If no transaction costs are implemented, then from the last column in table 4.4 it can be seen that the standard deviations of the returns of the data series during profitable trades are higher than the standard deviations of the returns during non-profitable trades for the AEX-index and almost all stocks, except for Gist Brocades, Stork, TPG and Unilever. However, if 0.25% costs per trade are charged, then for 22 data series out of 51 the standard deviation ratio is larger than one. According to the efficient markets hypothesis it is not possible to exploit past information in a data set to predict future price changes. The excellent performance of the technical trading rules could therefore be the reward for holding a risky asset, needed to attract investors to bear the risk. Since the technical trading rule forecasts only depend on past price history, it seems unlikely that they should result in unusual risk-adjusted profits. To test this hypothesis we estimate Sharpe-Lintner capital asset pricing models (CAPMs).
The coefficient β measures the riskiness of the active technical trading strategy relative to the passive strategy of buying and holding the market portfolio. If β is not significantly different from one, then the strategy is said to be as risky as buying and holding the market portfolio. If β > 1 (β < 1), then the strategy is said to be more risky (less risky) than buying and holding the market portfolio and should therefore yield larger (smaller) returns. The coefficient α measures the excess return of the best strategy applied to stock i after correction for bearing risk. If it is not possible to beat a broad market portfolio after correction for risk, and hence technical trading rule profits are just the reward for bearing risk, then α should not be significantly different from zero. Table 4.8A shows for the 0 and 0.50% transaction costs cases the estimation results if for each data series the best strategy is selected by the mean return criterion. Estimation is done with Newey-West (1987) heteroskedasticity and autocorrelation consistent (HAC) standard errors. Table 4.10 summarizes the CAPM estimation results for all transaction cost cases by showing the number of data series for which significant estimates of α or β are found at the 10% significance level. For example, for the best strategy applied to the AEX-index in the case of zero transaction costs, the estimate of α is significantly positive at the 1% significance level and is equal to 5.27 basis points per day, that is approximately 13.3% per year. The estimate of β is significantly smaller than one at the 5% significance level, which indicates that although the strategy generates a higher reward than simply buying and holding the index, it is less risky. If transaction costs increase to 1%, then the estimate of α decreases.
We also estimated the Sharpe-Lintner CAPMs for the 0.10, 0.25, 0.75 and 1% transaction costs cases. The estimation results for the separate stocks are not presented here to save space.
Table 4.10: Summary: significance CAPM estimates, mean return criterion. For each transaction cost case, the table shows the number of data series for which significant estimates are found at the 10% significance level for the coefficients in the Sharpe-Lintner CAPM (4.1). Columns 1 and 2 show the number of data series for which the estimate of α is significantly negative and positive. Columns 3 and 4 show the number of data series for which the estimate of β is significantly smaller and larger than one. Column 5 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly smaller than one. Column 6 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly larger than one. Note that the number of data series analyzed is equal to 51 (50 stocks and the AEX-index).
With 1% costs per trade the estimate of α equals 3.16 basis points per day, approximately 8% per year, but is still significantly positive. The estimate of β, however, is no longer significantly smaller than one if as little as 0.10% costs per trade are charged. As can further be seen in the tables, if no transaction costs are implemented, then for most of the stocks the estimate of α is also significantly positive at the 10% significance level. Only for 2 stocks is the estimate of α significantly smaller than zero, while it is significantly positive for 36 stocks. Further, the estimate of β is significantly smaller than one for 36 stocks (Fokker and UPC excluded). Only for two stocks is β significantly larger than one. The estimate of α decreases as costs increase and becomes insignificant in more cases. However, in the 0.50% and 1% costs per trade cases, for example, the estimate of α is still significantly positive at the 10% significance level for respectively 31 and 24 data series out of 51. Notice that in a large number of cases the estimate of α is significantly positive while simultaneously the estimate of β is significantly smaller than one. This means that the best-selected strategy not only generated a statistically significant excess return over the buy-and-hold benchmark, but was also significantly less risky than the buy-and-hold benchmark. From the findings so far we conclude that there are trend-following technical trading techniques which can profitably be exploited, also after correction for transaction costs, when applied to the AEX-index and to stocks listed in the AEX-index in the period January 1983 through May 2002.
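The CAPM regression with Newey-West HAC standard errors can be sketched as follows. This is a minimal sketch, not the thesis's own code: the Bartlett-kernel lag length and all names are illustrative, and the inputs are the strategy and market returns in excess of the risk-free rate.

```python
import numpy as np

def capm_alpha_beta(strategy_ex, market_ex, lags=5):
    """OLS estimates of alpha and beta in
    r_strategy - r_f = alpha + beta * (r_market - r_f) + eps,
    with Newey-West (1987) HAC standard errors."""
    y = np.asarray(strategy_ex, dtype=float)
    x = np.asarray(market_ex, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y                  # [alpha, beta]
    u = y - X @ coef                          # residuals
    Xu = X * u[:, None]                       # moment contributions X_t * u_t
    S = Xu.T @ Xu                             # lag-0 term
    for j in range(1, lags + 1):
        w = 1.0 - j / (lags + 1.0)            # Bartlett kernel weight
        G = Xu[j:].T @ Xu[:-j]
        S += w * (G + G.T)                    # add weighted autocovariances
    cov = XtX_inv @ S @ XtX_inv               # sandwich covariance
    se = np.sqrt(np.diag(cov))
    return coef, se
```

The t-ratio for the hypothesis α = 0 is `coef[0] / se[0]`; for β = 1 it is `(coef[1] - 1) / se[1]`.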
In many cases the best strategy applied to a stock is also less risky, i.e. β < 1, than buying and holding the market portfolio. Hence we can reject the null hypothesis that the profits of technical trading are just the reward for bearing risk.

Data snooping

The question remains whether the findings in favour of technical trading for particular stocks are the result of chance or of real superior forecasting power. Therefore we apply White’s (2000) Reality Check (RC) and Hansen’s (2001) Superior Predictive Ability (SPA) test. Because Hansen (2001) showed that White’s RC p-value is biased in the direction of one, p-values are computed for both tests to investigate whether they lead to different inferences in some cases. For the 0 and 0.10% transaction costs cases, table 4.9A shows the nominal, White’s (2000) RC and Hansen’s (2001) SPA-test p-values, if the best strategy is selected by the mean return criterion. Calculations are also done for the 0.25, 0.50, 0.75 and 1% costs per trade cases, but these yield no remarkably different results compared with the 0.10% costs per trade case. Table 4.11 summarizes the results for all transaction cost cases by showing the number of data series for which the corresponding p-value is smaller than 0.10, that is, the number of data series for which the null hypothesis is rejected at the 10% significance level.
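The logic of the Reality Check can be sketched as follows. This is a simplified sketch: we use a plain iid bootstrap, whereas White (2000) uses the stationary bootstrap of Politis and Romano to preserve serial dependence, and Hansen's SPA-test additionally studentizes the statistic and recentres only the strategies that are not clearly poor.

```python
import numpy as np

def reality_check_pvalue(f, n_boot=1000, seed=0):
    """Bootstrap p-value for the null that no strategy beats the
    benchmark. `f` is a (T, K) array: f[t, k] is strategy k's
    performance relative to the benchmark on day t. The statistic is
    the best (maximum) average relative performance over the K
    strategies."""
    f = np.asarray(f, dtype=float)
    T, K = f.shape
    rng = np.random.default_rng(seed)
    fbar = f.mean(axis=0)
    stat = np.sqrt(T) * fbar.max()             # V = max_k sqrt(T) * mean(f_k)
    count = 0
    for _ in range(n_boot):
        idx = rng.integers(0, T, size=T)       # iid resample of dates
        # recentre the bootstrap means around the sample means
        boot_stat = np.sqrt(T) * (f[idx].mean(axis=0) - fbar).max()
        if boot_stat > stat:
            count += 1
    return count / n_boot                      # fraction of exceedances
```

A small p-value indicates that the best strategy's performance is too good to be explained by the search over K strategies alone.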
Technical trading rule performance

Similar to tables 4.4 and 4.5, table 4.6 shows for some data series some statistics of the best strategy selected by the Sharpe ratio criterion, if 0 or 0.25% costs per trade are implemented. Results are only presented for those data series for which the best strategy selected by the Sharpe ratio criterion differs from the best strategy selected by the mean return criterion. Further, table 4.7B shows for each data series the Sharpe ratio of the best strategy selected by the Sharpe ratio criterion, after implementing 0, 0.10, 0.25, 0.50, 0.75 and 1% transaction costs, in excess of the Sharpe ratio of the buy-and-hold benchmark. The Sharpe ratio of the best-selected strategy in excess of the Sharpe ratio of the buy-and-hold benchmark is positive in all cases. In the last row of table 4.7B it can be seen that the average excess Sharpe ratio declines from 0.0477 to 0.0311 if transaction costs increase from 0 to 1%. For the full sample period table 4.6 shows that the best strategies selected in the case of zero transaction costs are mainly strategies that generate a lot of signals. Trading positions are held for only a short period. Moreover, for all data series except 13, these best-selected strategies are the same as when the best strategies are selected by the mean return criterion. If transaction costs are increased to 0.25% per trade, then the best strategies generate fewer signals and trading positions are held for longer periods. In that case the best-selected strategy differs from the one selected by the mean return criterion for the AEX-index and 18 stocks. As for the mean return criterion, it is found that for each data series the best technical trading strategy selected by the Sharpe ratio criterion beats the buy-and-hold benchmark.
Moreover, this strategy can profitably be exploited, even after correction for transaction costs.

CAPM

The estimation results of the Sharpe-Lintner CAPM in tables 4.8B and 4.12 for the Sharpe ratio criterion are similar to the estimation results in tables 4.8A and 4.10 for the mean return criterion. If zero transaction costs are implemented, then for 39 out of 51 data series the estimate of α is significantly positive at the 10% significance level. This number decreases to 32 and 25 data series if transaction costs increase to 0.50 and 1% per trade. The estimates of β are in general significantly smaller than one. Thus, after correction for transaction costs and risk, for approximately half of the data series examined the best technical trading strategy selected by the Sharpe ratio criterion outperforms the strategy of buying and holding the market portfolio and is even less risky.
Table 4.12: Summary: significance CAPM estimates, Sharpe ratio criterion. For each transaction cost case, the table shows the number of data series for which significant estimates are found at the 10% significance level for the coefficients in the Sharpe-Lintner CAPM (4.1). Columns 1 and 2 show the number of data series for which the estimate of α is significantly negative and positive. Columns 3 and 4 show the number of data series for which the estimate of β is significantly smaller and larger than one. Column 5 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly smaller than one. Column 6 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly larger than one. Note that the number of data series analyzed is equal to 51 (50 stocks and the AEX-index).
Data snooping

For the 0 and 0.10% transaction costs cases, table 4.9B shows the nominal, White’s RC and Hansen’s SPA-test p-values, if the best strategy is selected by the Sharpe ratio criterion. Table 4.13 summarizes the results for all transaction cost cases by showing the number of data series for which the corresponding p-value is smaller than 0.10. The results for the Sharpe ratio selection criterion differ from those for the mean return selection criterion. If the nominal p-value is used to test the null hypothesis that the best strategy is not superior to the buy-and-hold benchmark, then the null is rejected for most data series at the 10% significance level for all cost cases. If a correction is made for data snooping, then in the no transaction costs case the null hypothesis that the best strategy is not superior to the benchmark is rejected by the RC for 10 data series, whereas the null hypothesis that none of the alternative strategies is superior to the buy-and-hold benchmark is rejected by the SPA-test for 30 data series. The two data snooping tests thus give contradictory results for 20 data series. Even if costs are charged, the SPA-test rejects the null in a large number of cases while the RC does not. If costs are increased to 0.10 and 1%, then for respectively 17 and 15 data series the null of no superior predictive ability is rejected by the SPA-test. Note that these results differ substantially from the mean return selection criterion, where in the cases of 0.10 and 1% transaction costs the null was rejected for respectively 2 and 1 data series. Hence, we conclude that the best strategy selected by the Sharpe ratio criterion is capable of beating the benchmark of a buy-and-hold strategy for approximately 30% of the stocks analyzed, after a correction is made for transaction costs and data snooping.
In section 3.7 we argued for applying a recursive out-of-sample forecasting approach to test whether technical trading rules have true out-of-sample forecasting power. For example, recursively at the beginning of each month it is investigated which technical trading rule performed best in the preceding six months (the training period), and this strategy is then used to generate trading signals during the coming month (the testing period). In this section we apply this recursive out-of-sample forecasting procedure to the data series examined in this chapter.
We define the training period on day t to last from t − T_r up to and including t − 1, where T_r is the length of the training period. The testing period lasts from t up to and including t + T_e − 1, where T_e is the length of the testing period. At the end of the training period the best strategy is selected by the mean return or Sharpe ratio criterion. Next, the selected technical trading strategy is applied in the testing period to generate trading signals. After the end of the testing period this procedure is repeated until the end of the data series is reached. For the training and testing periods we use 28 different parameterizations of [T_r, T_e], which can be found in Appendix B. Tables 4.14A and B show the results for both selection criteria in the case of 0, 0.10, 0.25, 0.50, 0.75 and 1% transaction costs. Because the longest training period is one year, the results are computed for the period 1984:12-2002:5. In the second to last row of table 4.14A it can be seen that, if in the training period the best strategy is selected by the mean return criterion, then the excess return over the buy-and-hold of the best recursive optimizing and testing procedure is, on average, 32.23, 26.45, 20.85, 15.05, 10.43 and 8.02% yearly in the case of 0, 0.10, 0.25, 0.50, 0.75 and 1% costs per trade. If transaction costs increase, the best recursive optimizing and testing procedure becomes less profitable. However, the excess returns are considerably large. If the Sharpe ratio criterion is used for selecting the best strategy during the training period, then the Sharpe ratio of the best recursive optimizing and testing procedure in excess of the Sharpe ratio of the buy-and-hold benchmark is on average 0.0377, 0.0306, 0.0213, 0.0128, 0.0082 and 0.0044 in the case of 0, 0.10, 0.25, 0.50, 0.75 and 1% costs per trade, also declining as transaction costs increase (see the second to last row of table 4.14B).
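The recursive optimizing-and-testing procedure described above can be sketched as follows, assuming we have precomputed, for each candidate strategy, the daily return it would have earned on every day. The function name and the mean return criterion hard-coded here are illustrative.

```python
import numpy as np

def recursive_out_of_sample(returns_by_strategy, Tr=125, Te=21):
    """At the start of each testing period, select the strategy with
    the highest mean return over the previous Tr days (the training
    period) and follow it for the next Te days (the testing period).
    `returns_by_strategy` is a (T, K) array of daily strategy returns.
    Returns the realized out-of-sample daily returns."""
    r = np.asarray(returns_by_strategy, dtype=float)
    T, K = r.shape
    out = []
    t = Tr
    while t < T:
        best = r[t - Tr : t].mean(axis=0).argmax()   # select in training period
        out.extend(r[t : t + Te, best])              # apply in testing period
        t += Te
    return np.array(out)
```

Under the Sharpe ratio criterion the `argmax` would instead be taken over mean excess return divided by standard deviation in the training window.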
For comparison, the last row in tables 4.14A and B shows the average over the results of the best strategies selected in sample by the mean return or Sharpe ratio criterion for each data series. As can be seen, the results of the best strategies selected in sample are clearly much better than the results of the best recursive out-of-sample forecasting procedure. Mainly for the network and telecommunications related companies the out-of-sample forecasting procedure performs much worse than the in-sample results. If the mean return selection criterion is used, then table 4.15A shows for the 0 and 0.50% transaction cost cases for each data series the estimation results of the Sharpe-Lintner CAPM (see equation 4.1), where the return of the best recursive optimizing and testing procedure in excess of the risk-free interest rate is regressed against a constant α and the return of the AEX-index in excess of the risk-free interest rate. Estimation is done with Newey-West (1987) heteroskedasticity and autocorrelation consistent (HAC) standard errors. Table 4.16 summarizes the CAPM estimation results for all transaction cost cases by showing the number of data series for which significant estimates of α and β are found at the 10% significance level. In the case of zero transaction costs, for 31 data series out of 51 the estimate of α is significantly positive at the 10% significance level. This number decreases to 21 (10, 4, 3, 2) if 0.10% (0.25, 0.50, 0.75, 1%) costs per trade are implemented. Table 4.15B shows the results of the CAPM estimation for the case that the best strategy in the training period is selected by the Sharpe ratio criterion. Now, in the case of zero transaction costs, for 33 data series the estimate of α is significantly positive at the 10% significance level. If transaction costs increase to 0.10% (0.25, 0.50, 0.75, 1%), then for 24 (11, 2, 2, 2) out of 51 data series the estimate of α is significantly positive. Hence, after correction for 1% transaction costs and risk it can be concluded, independently of the selection criterion used, that the best recursive optimizing and testing procedure shows no statistically significant out-of-sample forecasting power.
Selection criterion: mean return
costs     α < 0   α > 0   β < 1   β > 1   α>0 ∧ β<1   α>0 ∧ β>1
0%            1      31      35       2          25           0
0.10%         1      21      32       3          15           0
0.25%         1      10      34       4           8           0
0.50%         2       4      31       3           1           0
0.75%         3       3      29       4           1           1
1%            3       2      30       2           1           0

Selection criterion: Sharpe ratio
costs     α < 0   α > 0   β < 1   β > 1   α>0 ∧ β<1   α>0 ∧ β>1
0%            0      33      42       2          30           1
0.10%         0      24      39       1          21           0
0.25%         0      11      40       2          10           0
0.50%         0       2      36       2           1           0
0.75%         0       2      34       2           1           0
1%            0       2      35       2           1           0
Table 4.16: Summary: significance CAPM estimates for best out-of-sample testing procedure. For each transaction cost case, the table shows the number of data series for which significant estimates are found at the 10% significance level for the coefficients in the Sharpe-Lintner CAPM. Columns 1 and 2 show the number of data series for which the estimate of α is significantly negative and positive. Columns 3 and 4 show the number of data series for which the estimate of β is significantly smaller and larger than one. Column 5 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly smaller than one. Column 6 shows the number of data series for which the estimate of α is significantly positive as well as the estimate of β is significantly larger than one. Note that the number of data series analyzed is equal to 51 (50 stocks and the AEX-index).
In this chapter we applied a large set of objective trend-following technical trading techniques to the AEX-index and to stocks listed in the AEX-index, where for each data series the best technical trading strategy is selected by the mean return or Sharpe ratio criterion. The advantage of the Sharpe ratio selection criterion over the mean return selection criterion is that it selects the strategy with the highest return/risk payoff. Although 12 stocks could not even beat a continuous risk-free investment, we find for both selection criteria that for each data series a technical trading strategy can be selected that is capable of beating the buy-and-hold benchmark, even after correction for transaction costs. For example, if the best strategy is selected by the mean return criterion, then on average the best strategy beats the buy-and-hold benchmark by 152, 141, 135, 131, 127 and 124% yearly in the case of 0, 0.10, 0.25, 0.50, 0.75 and 1% transaction costs. However, these extremely high numbers are mainly caused by IT and telecommunications related companies. If we discard these companies from the calculations, then still, on average, the best strategy beats the buy-and-hold benchmark by 32, 22, 19, 17, 16 and 15% for the six different cost cases. These are quite substantial numbers. The profits generated by the technical trading strategies could be the reward necessary to attract investors to bear the risk of holding the asset. To test this hypothesis we estimate Sharpe-Lintner CAPMs. For each data series the daily return of the best strategy in excess of the risk-free interest rate is regressed against a constant (α) and the daily return of the market-weighted AEX-index in excess of the risk-free interest rate. The coefficient of the latter regression term is called β and measures the riskiness of the strategy relative to buying and holding the market portfolio. If technical trading rules do not generate excess profits after correction for risk, then α should not be significantly different from zero.
In the case of zero transaction costs it is found for the mean return and the Sharpe ratio criterion that for respectively 37 and 39 data series the estimate of α is significantly positive at the 10% significance level. Even if transaction costs are increased to 1% per trade, we find for half of the data series that the estimate of α is still significantly positive. Moreover, for many data series the estimate of β is simultaneously significantly smaller than one. Thus for both selection criteria we find for approximately half of the data series that in the presence of transaction costs the best technical trading strategies have forecasting power and even reduce risk. An important question is whether the positive results found in favour of technical trading are due to chance or to the fact that the best strategy has genuine superior forecasting power over the buy-and-hold benchmark. This is called the danger of data snooping. We apply White’s (2000) Reality Check (RC) and Hansen’s (2001) Superior Predictive Ability (SPA) test to test the null hypothesis that the best strategy found in a specification search is not superior to the buy-and-hold benchmark if a correction is made for data snooping. Hansen (2001) showed that White’s RC p-value is biased in the direction of one, caused by the inclusion of poor strategies. Because we compute p-values for both tests, we can investigate whether the two test procedures result in different inferences about the forecasting ability of technical trading. If zero transaction costs are implemented, then we find for the mean return selection criterion that the RC and the SPA-test in some cases lead to different conclusions. The SPA-test finds in numerous cases that the best strategy does beat the buy-and-hold significantly after correction for data snooping and the inclusion of bad strategies. Thus the biased RC misguides the researcher in several cases by not rejecting the null. However, if as little as 0.10% costs per trade are implemented, then both tests lead for almost all data series to the same conclusion: the best technical trading strategy selected by the mean return criterion is not capable of beating the buy-and-hold benchmark after correcting for the specification search that was used to find the best strategy. For the Sharpe ratio selection criterion, in contrast, we find totally different results. Now the SPA-test rejects its null for 30 data series in the case of zero transaction costs, while the RC rejects its null for only 10 data series. If transaction costs are increased even to 1% per trade, then for approximately one third of the stocks analyzed the SPA-test rejects the null of no superior predictive ability at the 10% significance level, while the RC rejects the null for only two data series. Thus for the Sharpe ratio selection criterion we find large differences between the two testing procedures, and the inclusion of poor performing strategies, for which the SPA-test corrects, can indeed influence the inferences about the predictive ability of technical trading rules. The results show that technical trading has forecasting power for a certain group of stocks listed in the AEX-index.
Furthermore, the best way to select technical trading strategies is on the basis of the Sharpe ratio criterion. However, the testing procedures so far are mainly in sample. Therefore we next apply a recursive optimizing and testing method to test whether the best strategy found in a specification search during a training period also shows forecasting power during a subsequent testing period. For example, every month the best strategy of the last 6 months is selected to generate trading signals during that month. In total we examine 28 different training and testing period combinations. In the case of zero transaction costs the best recursive optimizing and testing procedure yields on average an excess return over the buy-and-hold of 32.23% yearly, if the best strategy in the training period is selected by the mean return criterion. Thus the best strategy found in the past continues to generate good results in the future. If 0.50% (1%) transaction costs are implemented, then the excess return decreases to 15.05% (8.02%). These are quite substantial numbers.
After correction for risk, the best recursive optimizing and testing procedure has significant forecasting power for more than 40% of the data series examined. However, if transaction costs increase to 1%, then for almost all data series the best recursive optimizing and testing procedure has no statistically significant forecasting power anymore. Hence, in short, after correcting for sufficient transaction costs, risk, data snooping and out-of-sample forecasting, we conclude that objective trend-following technical trading techniques, applied to the AEX-index and to stocks listed in the AEX-index in the period 1983-2002, are not genuinely superior to the buy-and-hold benchmark, as their performance might suggest. Only for transaction costs below 0.10% is technical trading statistically profitable, if the best strategy is selected by the Sharpe ratio criterion.
Brock, Lakonishok and LeBaron (1992) found that technical trading rules show forecasting power when applied to the Dow-Jones Industrial Average (DJIA) in the period 1896-1986. Sullivan, Timmermann and White (1999) confirmed their results for the same period, after making a correction for data snooping. However, they noticed that the forecasting power seems to disappear in the period after 1986. Next, Bessembinder and Chan (1998) found that break-even transaction costs, that is, the costs at which trading rule profits disappear, are lower than real transaction costs in the period 1926-1991 and that the technical trading rules examined by Brock et al. (1992) are therefore not economically significant when applied to the DJIA. The trading rule set of Brock et al. (1992) has been applied to many other local main stock market indices; for example, to Asian stock markets by Bessembinder and Chan (1995), to the UK stock market by Hudson, Dempsey and Keasey (1996) and Mills (1997), to the Spanish stock market by Fernández-Rodríguez, Sosvilla-Rivero and Andrada-Félix (2001), to Latin-American stock markets by Ratner and Leal (1999) and to the Hong Kong stock market by Coutts and Cheung (2000). In this chapter we test whether objective computerized trend-following technical trading techniques can profitably be exploited, after correction for risk and transaction costs, when applied to the local main stock market indices of 50 countries and the MSCI World Index. Firstly, we take the perspective of a local trader: we apply the technical trading rules to the indices in local currency and compute the profits in local currency. However, these profits could be spurious if the local currencies weakened against other currencies. Therefore, secondly, we take the perspective of a US-based trader and calculate the profits that could be made by technical trading rules in US Dollars. For this second case we generate technical trading signals in two different ways: firstly by using the local main stock market index in local currency and secondly by using the local main stock market index recomputed in US Dollars. Observed technical trading rule profits could be the reward for risk. Therefore we test, by estimating a Sharpe-Lintner capital asset pricing model (CAPM), whether the best technical trading rule selected for each stock market index is also profitable after correction for risk. Both the local index and the MSCI World Index are used as the benchmark market portfolio in the CAPM estimation equation. If the technical trading rules show economically significant forecasting power after correction for risk and transaction costs, then it is further tested whether the best strategy found for each local main stock market index is indeed superior to the buy-and-hold benchmark after correction for data snooping. This chapter may therefore be seen as an empirical application of White’s (2000) Reality Check and Hansen’s (2001) test for superior predictive ability. Further, we test by recursively optimizing our technical trading rule set whether technical analysis shows true out-of-sample forecasting power. In section 5.2 we list the local main stock market indices examined in this chapter and show the summary statistics. We refer to sections 3.3, 3.4 and 3.5 for discussions of the set of technical trading rules applied, the computation of the performance measures and the problem of data snooping. Section 5.3 presents the empirical results of our study. In section 5.4 we test whether recursively optimizing and updating our technical trading rule set shows genuine out-of-sample forecasting ability. Finally, section 5.5 summarizes and concludes.
The data series examined in this chapter are the daily closing levels of local main stock market indices in Africa, the Americas, Asia, Europe, the Middle East and the Pacific in the period January 2, 1981 through June 28, 2002. Local main stock market indices are intended to give a representative picture of the local economy by including the most important or most traded stocks in the index. The MSCI (Morgan Stanley Capital International) World Index is a market capitalization index that is designed to measure global developed market equity performance; MSCI indices are the most widely used benchmarks by global portfolio managers. In this chapter we analyze in total 51 indices. Column 1 of table 5.1 shows for each country which local main stock market index is chosen. Further, for each country data is collected on the exchange rate against the US Dollar. As a proxy for the risk-free interest rate we use for most countries daily data on 1-month interbank interest rates when available, or otherwise rates on 1-month certificates of deposits. Table 5.1 shows the summary statistics of the stock market indices expressed in local currency, while table 5.2 shows the summary statistics of the stock market indices expressed in US Dollars. Hence from table 5.2 it can be seen whether the behavior of the exchange rates of the local currencies against the US Dollar alters the features of the local main stock market data. Because the first 260 data points are used for initializing the technical trading strategies, the summary statistics are shown from January 1, 1982 onwards. In the tables the first and second column show the names of the indices examined and the number of available data points. The third column shows the mean yearly effective return in percentage/100 terms. The fourth through seventh columns show the mean, standard deviation, skewness and kurtosis of the logarithmic daily returns. The eighth column shows the t-ratio testing whether the mean daily logarithmic return is significantly different from zero. The ninth column shows the Sharpe ratio, that is, the excess return over the risk-free interest rate per extra point of risk. The tenth column shows the largest cumulative loss, that is, the largest decline from a peak to a trough, of the indices in percentage/100 terms. The eleventh column shows for each stock market index the Ljung-Box (1978) Q-statistic testing whether the first 20 autocorrelations of the return series as a whole are significantly different from zero. The twelfth column shows the heteroskedasticity-adjusted Box-Pierce (1970) Q-statistic, as derived by Diebold (1986). The final column shows the Ljung-Box (1978) Q-statistic testing for autocorrelations in the squared returns.
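The main performance statistics in these columns can be computed directly from a daily price series. A minimal sketch, assuming a hypothetical price series and a constant daily risk-free rate (the actual thesis uses the interbank and certificate-of-deposit series described above):

```python
import numpy as np

def summary_stats(prices, rf_daily=0.0002, days_per_year=252):
    """Mean yearly effective return, Sharpe ratio and largest
    cumulative loss (maximum decline from a peak to a trough)
    of a daily closing-price series."""
    prices = np.asarray(prices, dtype=float)
    logret = np.diff(np.log(prices))                     # daily log returns
    # mean yearly effective return in fraction terms (0.08 = 8%)
    yearly = np.exp(logret.mean() * days_per_year) - 1.0
    # Sharpe ratio: mean excess return per unit of standard deviation
    sharpe = (logret.mean() - rf_daily) / logret.std()
    # largest cumulative loss: biggest drop below the running peak
    running_peak = np.maximum.accumulate(prices)
    drawdown = 1.0 - prices / running_peak
    return yearly, sharpe, drawdown.max()

prices = [100, 110, 121, 90, 99, 130]                    # toy price path
yearly, sharpe, max_dd = summary_stats(prices)           # max_dd = 1 - 90/121
```

The largest cumulative loss is reported here, as in the tables, in percentage/100 terms.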
The mean yearly effective return of the MSCI World Index during the 1982-2002 period is equal to 8.38% and the yearly standard deviation of its returns is approximately equal to 12%. Measured in local currency, 7 indices show a negative mean yearly effective return, although not significantly so. These are stock market indices mainly in Asia, Eastern Europe and Latin America. For 17 indices a significantly positive mean return is found, mainly for the West European and US indices, but also for the Egyptian CMA and the Israeli TA100 index. If measured in US Dollars, the number of indices showing a negative mean return more than doubles, increasing to 16, while the number of indices showing a significantly positive mean return declines from 17 to 10. Especially for the Asian and Latin American stock market indices the results in US Dollars are worse than in local currency. For example, in the Latin American stock markets the Brazilian Bovespa shows a considerable positive mean yearly return of 13.85% if measured in Brazilian Reals, while it shows a negative mean yearly return of −2.48% if measured in US Dollars. In the Asian stock markets it is remarkable that the results for the Chinese Shanghai Composite, the Hong Kong Hang Seng and the Singapore Straits Times are not affected by a recomputation in US Dollars, despite the Asian crisis. The separate indices are more risky than the MSCI World Index, as can be seen from the standard deviations and the largest cumulative loss numbers. Thus it is clear that country-specific risks are reduced by the broadly diversified world index. The return distribution is strongly leptokurtic for all indices and is negatively skewed for a majority of the indices. Thus large negative shocks occur more frequently than large positive shocks. The local interest rates are used for computing the Sharpe ratio (i.e. the excess return over the risk-free interest rate per extra point of risk, as measured by the standard deviation) in local currency, while the rates on 1-month US certificates of deposits are used for computing the Sharpe ratio in US Dollars. The Sharpe ratio is negative for 23 indices, whether expressed in local currency or in US Dollars, indicating that these indices were not able to beat a continuous risk-free investment. Only the European and US stock market indices, as well as the Egyptian and Israeli stock market indices, were able to generate a positive excess return over the risk-free interest rate. For more than half of the indices the largest cumulative loss is larger than 50%, whether expressed in local currency or US Dollars. For example, during the Argentine economic crisis the Merval lost 77% of its value in local currency and 91% of its value in US Dollars. The Russian Moscow Times lost 94% of its value in US Dollars in a short period of approximately one year between August 1997 and October 1998. The largest decline of the MSCI World Index is equal to 39% and occurred in the period March 27, 2000 through September 21, 2001. Of the 14 indices for which we have data preceding the year 1987, only for 4 indices, namely the DJIA, the NYSE Composite, the Australian ASX and the Dutch AEX, did the largest cumulative loss, when measured in local currency, occur preceding and during the crash of 1987.
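The gap between the local-currency and US Dollar results comes from compounding the index return with the return of the local currency against the Dollar. A small sketch of that identity (the numbers below are illustrative, chosen only to mimic the direction of the Bovespa example; they are not the thesis's exchange-rate data):

```python
def usd_return(local_return, fx_return):
    """Combine a local-currency index return with the exchange-rate
    return of the local currency against the US Dollar (fx_return is
    the fractional change in the USD value of one unit of local
    currency): (1 + r_usd) = (1 + r_local) * (1 + r_fx)."""
    return (1.0 + local_return) * (1.0 + fx_return) - 1.0

# A 13.85% local-currency gain combined with a hypothetical 14.3%
# currency depreciation turns into a loss in US Dollar terms.
r = usd_return(0.1385, -0.143)   # about -2.4%
```

This is why a strongly depreciating currency can flip a positive local-currency mean return to a negative Dollar return, as for several Latin American indices above.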
If measured in US Dollars, only the largest decline of the Dutch AEX changes; it took place in the period January 4, 2000 through September 21, 2001. Remarkably, for most indices the largest decline started well before the terrorist attack against the US on September 11, 2001, but stopped only 10 days after it (at the moment of writing, the stock markets are reaching new lows). With hindsight, the overall picture is that the European and US stock markets performed best, but the Egyptian and Israeli stock markets also show remarkably good results. We computed autocorrelation functions (ACFs) of the returns, and significance is tested with Bartlett (1946) standard errors and Diebold's (1986) heteroskedasticity-consistent standard errors (see section 3.2, page 99, for an explanation; separate ACFs of the returns are computed for each stock market index, but are not presented here to save space and are available upon request from the author). Typically the autocorrelations of the returns are small, with only a few lags being significant. Without correcting for heteroskedasticity we find for 35 of the 51 indices a significant first-order autocorrelation, both in local and US currency, while when corrected for heteroskedasticity we find for 30 (23) indices measured in local (US) currency a significant first-order autocorrelation at the 10% significance level. It is noteworthy that for more than half of the indices the second-order autocorrelation is negative. In contrast, the first-order autocorrelation is negative for only 5 (10) indices in local (US) currency. The Ljung-Box (1978) Q-statistics in tables 5.1 and 5.2 reject for almost all indices the null hypothesis that the first 20 autocorrelations of the returns as a whole are equal to zero. For only 3 (5) indices the null is not rejected in the local (US) currency case, see for example New Zealand's NZSE30 and the Finnish HEX. When looking at the column with Diebold's (1986) heteroskedasticity-consistent Box-Pierce (1970) Q-statistics, it appears that heteroskedasticity indeed seriously affects the inferences about serial correlation in the returns. Now for 26 (34) indices the null of no autocorrelation is not rejected in the local (US) currency case. The autocorrelation functions of the squared returns show that for all indices the autocorrelations are high and significant up to order 20. The Ljung-Box (1978) statistics firmly reject the null of no autocorrelation in the squared returns, except for the Venezuela Industrial if expressed in US Dollars. Hence almost all indices exhibit significant volatility clustering, that is, large (small) shocks are likely to be followed by large (small) shocks.
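The Ljung-Box test used throughout these tables sums the squared sample autocorrelations with a finite-sample weight; under the null of no autocorrelation the statistic is asymptotically chi-squared with as many degrees of freedom as lags tested. A minimal sketch on synthetic data (the heteroskedasticity-consistent Diebold adjustment, which scales each squared autocorrelation by a variance-inflation factor, is omitted here):

```python
import numpy as np

def ljung_box_q(returns, max_lag=20):
    """Ljung-Box (1978) Q-statistic on the first `max_lag`
    autocorrelations: Q = n(n+2) * sum_k rho_k^2 / (n - k)."""
    x = np.asarray(returns, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.sum(x * x)
    q = 0.0
    for k in range(1, max_lag + 1):
        rho_k = np.sum(x[k:] * x[:-k]) / denom   # lag-k sample autocorrelation
        q += rho_k * rho_k / (n - k)
    return n * (n + 2) * q

rng = np.random.default_rng(1)
iid = rng.normal(size=500)          # no serial correlation under the null
walk = np.cumsum(iid)               # strongly autocorrelated level series
q_iid, q_walk = ljung_box_q(iid), ljung_box_q(walk)
# the 5% chi-squared critical value with 20 degrees of freedom is about 31.4
```

Applying the statistic to squared (demeaned) returns, as in the final column of the tables, tests for the volatility clustering described above.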
5.3 Technical trading rule performance

In section 5.2 we showed that almost half of the local main stock market indices could not even beat a continuous risk-free investment. Further, we showed that for half of the indices no significant autocorrelation in the daily returns can be found after correction for heteroskedasticity. This implies that there is no linear dependence present in the data. One may thus question whether technical trading strategies can persistently beat the buy-and-hold benchmark. However, as noted by Alexander (1961), the dependence in price changes can be of such a complicated nonlinear form that standard linear statistical tools, such as serial correlations, may provide misleading measures of the degree of dependence in the data. Therefore he proposed to use nonlinear technical trading rules to test for dependence. If technical trading rules can capture dependence which they can profitably trade upon, the question remains whether the profits disappear after implementing.
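Trend-following rules of the kind applied in this chapter include, among others, moving-average crossover rules (the full rule set is described in section 3.3). A minimal sketch of such a rule; the window lengths are illustrative choices, not the thesis's parameterization:

```python
import numpy as np

def ma_crossover_signals(prices, short=5, long=20):
    """Return +1 (long) / -1 (short) positions: +1 whenever the
    short-run moving average lies above the long-run moving average,
    -1 otherwise. Window lengths are hypothetical examples."""
    p = np.asarray(prices, dtype=float)

    def sma(x, w):
        # simple moving average via convolution with a flat kernel
        return np.convolve(x, np.ones(w) / w, mode="valid")

    ma_s = sma(p, short)[long - short:]   # align both averages on the
    ma_l = sma(p, long)                   # same window end dates
    return np.where(ma_s > ma_l, 1, -1)

# On a steadily rising series the short average exceeds the long
# average throughout, so the rule stays long.
sig = ma_crossover_signals(np.arange(1.0, 61.0))
```

Such a rule is nonlinear in past prices, which is exactly the kind of dependence Alexander (1961) argued serial correlations may fail to detect.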