Union Investment

Working paper: Information Ratio, an indicator for funds and their managers?

    How Informative is the Information Ratio
    for Evaluating Mutual Fund Managers?

    Thomas Bossert*, Roland Füss, Philipp Rindler, and Christoph Schneider§

    Working Paper
    December 2009

    ____________________

    Corresponding author: Union Investment Chair of Asset Management, European Business School (EBS), International University Schloss Reichartshausen, D-65375 Oestrich-Winkel, Germany, Phone: +49 (0)6723 991-213, Fax +49 (0)6723 991-216, email: roland.fuess@ebs.edu.

    * Union Investment Institutional GmbH, D-60329 Frankfurt, Germany, Phone: +49 (0)69 2567-3251, email: thomas.bossert@union-investment.de.

    Union Investment Chair of Asset Management, European Business School (EBS), International University Schloss Reichartshausen, D-65375 Oestrich-Winkel, Germany, Phone: +49 (0)6723 991-263, email: philipp.rindler@ebs.edu.

    § Morgan Stanley, Investment Banking Division, D-60311 Frankfurt, Germany, email: christoph.schneider@morganstanley.com.


    Abstract

This paper examines whether the information ratio (IR) is a useful and reliable performance measure for evaluating mutual fund managers. Based on a dataset of nearly 10,000 mutual funds for the period January 1998 to December 2008, the empirical results show that the IR varies over time and across fund categories. We find that representing the true volatility of the return-generating process within a calendar year requires data with a higher frequency than monthly. The reference index or basket used as the benchmark should cover a large proportion of the investment universe of the respective fund. As the IR induces managers to hug the benchmark, it should be supplemented, e.g., with the active share to control for the activity level in the portfolio. Finally, in order to separate lucky managers from skilled ones, the long-term track record plays an important role, as luck is generally not persistent over time.

    JEL-Classification: G11

    Keywords: Performance measurement; Information Ratio (IR); mutual fund managers; benchmark; information coefficient.

    1 Introduction

    A commonly used performance measure is the Information Ratio (IR) introduced by Treynor and Black (1973). It is the ratio of the portfolio’s excess return over a specified benchmark to the volatility of that excess return. Closely connected is the fundamental law of active portfolio management by Grinold (1989), which relates the skills of a fund manager to the IR. This framework gives insights on how to use the IR to construct active portfolios within predefined risk limits. For investors to apply the ratio to a specific portfolio choice problem, guidelines are required that enable them to identify superior funds. Grinold and Kahn (2000) state that top quartile managers have IRs of at least 0.5 while exceptional managers achieve values above 1.0. These numbers are stated without qualification and are meant to hold irrespective of asset class, country, or time period. To the best of our knowledge, the characteristics of the IR across different asset classes and countries have not yet been extensively studied. Hence, this paper addresses the question whether the IR is a useful and reliable performance ratio. In particular, we focus on empirically observable quartile ranges for various asset classes and countries that can be used by investors as guidelines to distinguish good funds from bad ones.

    Based on return data of nearly 10,000 mutual funds for the period January 1998 to December 2008, the empirical results show that static breakpoints can be highly misleading and that an approach focused on the individual asset class is necessary. Moreover, it is shown that the quality and reliability of the IR depend on certain estimation choices. Firstly, the choice of the benchmark strongly affects the ratio. Ideally, this benchmark should cover a large proportion of the respective investment universe. Secondly, data frequency should be as high as possible, as monthly data do not accurately represent the volatility of returns. Thirdly, non-normally distributed fund returns can substantially affect the use of the IR. Finally, in order to separate lucky managers from skilled ones, the long-term track record plays an important role.

    The remainder of the paper is organized as follows: The next section discusses the IR as well as its role within active portfolio management. Section 3 presents the data set and explains the choice of funds and benchmarks used. The empirical results are presented in Section 4. The section starts by testing the IR for stability over time and across different fund categories and continues by discussing the robustness of the IR against the selection of different benchmarks and data frequencies. Finally, in order to separate lucky from skilled managers the persistency of IRs over time is discussed. Section 5 concludes.

    2 The Information Ratio

    Treynor (1965) defines two characteristics of a “good” performance measure. First, it should provide the same value for the same performance irrespective of market conditions. Second, it needs to incorporate the preferences and risk aversion of investors. Similarly, Hübner (2007) states that there are two factors that determine the quality of a performance measure: stability and precision. A stable measure is robust under different asset pricing models and does not vary in terms of its classification over time. Precision means that it should be able to provide the “true” ranking of funds based on the investor’s preferences.

    Introduced by Treynor and Black (1973), the IR is defined as

    $$ IR = \frac{R_P - R_B}{\sigma_{ER}} = \frac{ER}{\sigma_{ER}}, \tag{1} $$

    where $R_P$ is the return of the portfolio, $R_B$ is the return of the benchmark, $ER = R_P - R_B$ is the excess return, and $\sigma_{ER}$ is the volatility of the excess return. Its rationale is very closely related to the investor’s utility function as shown by Jacobs and Levy (1996). They explain that investors of active funds are not risk-averse in the common sense but rather regret-averse. Regret-aversion means that they generally accept the risk of a passive investment in this asset class but – depending on the excess returns of the active fund – regret their decision to invest in an active fund. In a similar spirit, Grinold and Kahn (2000) explain that investors select among different opportunities based on their personal preferences which for actively managed funds “point toward high residual return and low residual risk” (p. 5). Thus, by using the IR investors are able to limit the fund universe based on their personal risk preferences.
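
    The definition in (1) translates directly into a few lines of code. The following Python sketch is an illustration only: the function name and the sample returns are hypothetical, and the annualization (scaling the per-period ratio by the square root of the number of periods per year) is a common convention assumed here rather than taken from the paper.

```python
import math
import statistics


def information_ratio(fund_returns, benchmark_returns, periods_per_year=52):
    """Annualized IR from per-period simple returns (weekly by default).

    Assumption: the per-period ratio of mean excess return to its sample
    standard deviation is scaled by sqrt(periods_per_year).
    """
    if len(fund_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    excess = [rp - rb for rp, rb in zip(fund_returns, benchmark_returns)]
    mean_excess = statistics.fmean(excess)
    tracking_error = statistics.stdev(excess)  # sample standard deviation
    if tracking_error == 0:
        raise ValueError("tracking error is zero; the IR is undefined")
    return mean_excess / tracking_error * math.sqrt(periods_per_year)


# Hypothetical weekly returns, for illustration only.
fund = [0.012, -0.004, 0.007, 0.001, -0.010, 0.015]
bench = [0.010, -0.005, 0.006, 0.002, -0.011, 0.012]
ir = information_ratio(fund, bench)
print(round(ir, 2))
```

    With so few observations the estimate is dominated by noise; the sample is only meant to show the mechanics.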

    Grinold (1989) identifies two factors that lead to high IRs. The first factor is the skill of the manager to correctly predict the residual returns in his investment universe. It is called the Information Coefficient (IC) and measures the correlation between the actual alpha and the forecasted alpha. The second factor describes the number of independent investment decisions taken per year and is called breadth. The fundamental law of active management illustrates the relationship between IR, IC, and breadth:

    $$ IR = IC \cdot \sqrt{\text{Breadth}} \tag{2} $$

    For the purpose of this study, the crucial point about the representation in (2) is that correct forecasting of residual returns should be a key skill of any active portfolio manager. Depending on the number of independent bets a manager takes, different skill levels are required in order to achieve a “good” or “very good” IR (Wander, 2003).
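
    To make the trade-off between skill and breadth concrete, the fundamental law can be inverted to give the skill level needed for a target IR. The Python sketch below is a hedged illustration; the function names and the breadth values are assumptions chosen for the example.

```python
import math


def ir_from_skill(ic, breadth):
    """Fundamental law of active management: IR = IC * sqrt(breadth)."""
    return ic * math.sqrt(breadth)


def required_ic(target_ir, breadth):
    """Information coefficient needed to reach target_ir at a given breadth."""
    if breadth <= 0:
        raise ValueError("breadth must be positive")
    return target_ir / math.sqrt(breadth)


# Skill needed for an IR of 0.5 at different (hypothetical) numbers of
# independent bets per year.
for breadth in (1, 12, 52, 250):
    print(breadth, round(required_ic(0.5, breadth), 3))
```

    The required IC falls with the square root of breadth, which is why a manager with many independent bets needs far less forecasting skill per bet than one with few.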

    Grinold and Kahn (2000) define IR levels on the basis of cost-adjusted fund performance. Thus, a top quartile portfolio manager has an IR of 0.5 and an exceptional manager should achieve a value of 1.0 or above. According to their study, this classification should hold for all asset classes and time horizons with only slight deviations. Jacobs and Levy (1996) also found an IR of 0.5 or above to be “very good” without restrictions to asset classes. Goodwin (1998), on the other hand, analyzed the distribution of IR for samples of funds with different investment universes and found significantly different results across fund categories. This approach seems to be more plausible than the findings of the two other studies and therefore different ranges of IRs are expected in the empirical analysis when evaluating funds that invest in different asset classes and countries.

    3 Description of Data

    3.1 Fund Data

    The initial sample includes all actively managed, open-end funds listed for sale in Germany, the UK, and the US by Reuters 3000 Xtra as of February 2009. Closed-end funds were excluded since investors cannot freely enter or exit these funds. Other funds such as REITs or hedge funds were excluded as their particular characteristics demand specific performance measures that are not within the scope of this paper (see for example Ackermann et al. 1999 or Below and Stansell 2003). We focus separately on equity, fixed-income, and money-market funds as these have very different risk-return characteristics. Since we aim at analyzing and characterizing performance measures of distinct asset classes balanced funds have been excluded. In order to categorize the funds we use the Lipper Global Classification as it is used throughout Reuters 3000 Xtra.

    In the equity class, funds with a focus on Europe, Germany, UK, or US were selected as these are the major equity markets. Additionally, a distinction has been made between large and small cap funds. Due to the limited number of small cap equity funds in Germany, this category had to be eliminated. In the fixed-income class, corporate investment grade bond funds have been selected that focus on the British Pound, the Euro, and the US Dollar. These are the three major currencies for corporate bond issuance according to the Reuters 3000 Xtra system. Finally, the same three major currencies have been used to select relevant funds in the money market class. The final list of funds to be analyzed consists of 9,632 funds.1

    The time frame of the study was from January 1, 1998 to December 31, 2008. We retrieved weekly return data, the launch years, as well as the base currencies for the funds from Thomson Financial DataStream. We correct for erroneous data entries by excluding funds with extreme information ratios above 20 or below -20. Also, funds with launch dates after January 1st, 2007 were excluded. If a fund was launched in the second-half of the year the launch year was set to the next year to ensure that a sufficient number of data points per calendar year is available to calculate reliable test statistics. For funds quoted in a currency other than the corresponding benchmark currency the return data was converted with the appropriate exchange rate. Additionally, daily and monthly return data in the given timeframe have been retrieved for large cap US equity funds in order to analyze the influences of data frequency.
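
    The screening rules described above can be sketched in code. The following Python fragment is a minimal illustration, not the authors’ implementation: the function names are hypothetical, “second half of the year” is assumed to mean a launch date on or after July 1, and the IR screen is assumed to apply to every yearly IR of a fund.

```python
from datetime import date

LAUNCH_CUTOFF = date(2007, 1, 1)  # funds launched after this date are dropped
MAX_ABS_IR = 20.0                 # IRs beyond +/-20 are treated as data errors


def effective_launch_year(launch):
    """Shift launches in the second half of a year to the following year.

    Assumption: the second half of the year starts on July 1.
    """
    return launch.year + 1 if launch.month >= 7 else launch.year


def keep_fund(launch, yearly_irs):
    """Return True if the fund passes the launch-date and extreme-IR screens."""
    if launch > LAUNCH_CUTOFF:
        return False
    return all(abs(ir) <= MAX_ABS_IR for ir in yearly_irs)


# Hypothetical fund, for illustration only.
print(effective_launch_year(date(2003, 9, 15)))
print(keep_fund(date(2003, 9, 15), [0.4, -1.2, 25.0]))
```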

    Since Reuters 3000 Xtra and Thomson Financial DataStream only list funds that are currently available on the market, the data is subject to survivorship bias, especially for the years prior to 2007 as only those funds that survived are contained in the dataset. It can be hypothesized that the estimated performance measures may be biased upward. The extent of and possible corrections for the survivorship bias will be analyzed in Section 4.4.

    The calculation of the IR requires a market benchmark against which the fund is compared. Usually fund managers define their benchmark in the prospectus. However, in light of the large number of funds and corresponding benchmarks within the same fund category, it was not possible to calculate the performance measure with the benchmark that is specified by each fund manager. Therefore, a general benchmark for each fund category has been used. Initially, this might seem inequitable, but one might reason that it is fair to judge each fund within a certain investment universe against the same benchmark. We are, however, aware that this introduces a bias into the analysis. Fund managers who actually manage their fund against the chosen benchmark will tend to exhibit lower tracking errors and therefore a higher IR than managers who manage against another benchmark. The latter have to bear the tracking error versus their true benchmark plus the tracking error between the true and the chosen benchmark. Exhibit A2 in the Appendix provides an overview of the benchmarks that have been assigned to the different fund classes.

    3.2 Descriptive Statistics

    Exhibit 1 outlines descriptive statistics for each fund category as well as the average excess return over the respective benchmark. All numbers are annualized for better comparability.

      - Exhibit 1 about here -

    In terms of risk/return relations, money market and corporate bond funds behave as expected whereas the numbers for the equity segment are uncharacteristic. The poor performance of equities is mainly due to the impact of the financial crisis in 2008: Gains stemming from 2003 to 2007 in the US equity market were completely erased in 2008. In terms of performance as measured by alpha it is clear that, on average, managers were not able to beat the benchmark in almost all asset classes and fund categories over the 11-year period after costs. It is also worth noting that the money market segment shows strong skewness and leptokurtosis. The effects of non-normally distributed returns on performance measures will be further analyzed in Section 4.4.

    4 Is the Information Ratio a Reliable Performance Measure?

    4.1 The Distribution of the Information Ratio

    To analyze whether the distribution of IRs is stable over time and across different fund categories, the ratios for each year and asset class were ranked and divided into four quartiles. A Wilcoxon signed-rank test and, optionally, a Student’s t-test are used to test the yearly values against the overall average for statistically significant differences. All results are presented in annualized form for better readability and comparability. Annualization follows method 1 in Goodwin (1998), which uses arithmetic mean returns.2
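
    The quartile ranking can be illustrated with a short Python sketch. The IR values are hypothetical, the mapping of quartiles to the labels “very good”, “good”, “below average”, and “poor” follows the wording used later in the paper, and the interpolation rule for the cut points (the `inclusive` method of the standard library) is an assumption.

```python
import statistics

# Labels ordered from the bottom quartile to the top quartile.
LABELS = ("poor", "below average", "good", "very good")


def quartile_thresholds(irs):
    """Return the three cut points (lower quartile, median, upper quartile)."""
    return statistics.quantiles(irs, n=4, method="inclusive")


def classify(ir, thresholds):
    """Map an IR to a quartile label given the three cut points."""
    rank = sum(ir > t for t in thresholds)  # number of cut points exceeded
    return LABELS[rank]


# Hypothetical IRs of eight funds in one category and year.
irs_by_fund = {"A": 0.62, "B": -0.35, "C": 0.10, "D": -0.05,
               "E": 0.31, "F": -0.80, "G": 0.02, "H": 0.45}
cuts = quartile_thresholds(list(irs_by_fund.values()))
for name, value in irs_by_fund.items():
    print(name, value, classify(value, cuts))
```

    Because the cut points are recomputed from each sample, the same IR can receive a different label in another year or fund category.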

      - Exhibit 2 about here -

    Exhibit 2 presents the threshold values of the four quartiles, which are averages over the 11-year horizon of the dataset. It is worth noting that the IRs show very different patterns for each fund category, not only in terms of values but also in terms of ranges. A Corporate Bond fund with a positive IR can usually be classified as “very good” whereas an Equity Europe fund with the same IR would only be average. Additionally, the value range for a “good” Equity Europe fund is far narrower than that for “good” Money Market EUR funds. Still, within the asset classes the values and ranges seem to be similar. While further testing has to be done to confirm these results, it becomes clear that general statements about the IR such as those of Grinold and Kahn (2000) do not seem to be applicable to all asset classes and years, as the threshold values vary considerably over time. Detailed information about the threshold values and their development over time for the top quartile is shown in Exhibit 3.

      - Exhibit 3 about here -

    The strong fluctuations of the IRs lead to the question whether the variations are statistically significant. In order to test for this difference, the median IR of the top half of all Equity US funds has been calculated for each of the 11 years, i.e. the threshold value between the first 25% and the second 25% of the funds. This threshold value is tested each year for statistically significant difference from the average threshold value reported in Exhibit 2. The results are outlined in Exhibit 4 with the threshold values shown in the first data row and z-statistics shown in the second row. The Wilcoxon signed-rank test has been used as the IRs are not normally distributed according to the Lilliefors test and are assumed to be dependent on each other (see Hollander and Wolfe, 1973).
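
    As a rough illustration of how such a z-statistic can be obtained, the Python sketch below computes a one-sample Wilcoxon signed-rank statistic with the normal approximation. It is a sketch under stated assumptions: the function name is hypothetical, the exact sample the paper feeds into the test is not specified here, and no tie correction is applied to the variance. In practice a library routine such as `scipy.stats.wilcoxon` would normally be used.

```python
import math


def wilcoxon_signed_rank_z(values, reference):
    """z-statistic of a one-sample Wilcoxon signed-rank test (normal approx.).

    Tests whether `values` are symmetrically distributed around `reference`.
    Observations equal to the reference are dropped; tied absolute
    differences receive average ranks (no tie correction of the variance).
    """
    diffs = [v - reference for v in values if v != reference]
    n = len(diffs)
    if n == 0:
        raise ValueError("all observations equal the reference value")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean_w) / sd_w


# Hypothetical yearly IRs tested against a hypothetical long-run average.
sample = [0.35, 0.41, 0.28, 0.52, 0.47, 0.39, 0.44, 0.31, 0.50, 0.36]
print(round(wilcoxon_signed_rank_z(sample, 0.20), 2))
```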

      - Exhibit 4 about here -

    The results in Exhibit 4 show that the threshold values differ significantly from the 11-year average in every single year, as the z-statistics confirm. This is also highlighted by the spread in threshold values, which range from -0.39 in 1998 to 0.71 in 2002. Thus, a fund evaluated on the yearly threshold could be categorized as “below average” while on the overall average value it is “very good”. To conclude, IR thresholds have to be calculated anew on a year-to-year basis in order to be reliable. As the relevant thresholds can only be calculated ex post, it is not possible to use IRs in an annual target setting process for the fund manager, although long-term IRs might be applicable in the context of a multi-year planning process.

    In the next step, IRs across different fund categories are investigated. The focus is again on US funds as it seems more likely to find similar IRs when looking at several asset classes within one country than across different countries. The procedure is exactly the same as in the previous test on Equity US funds; results are presented in Exhibit 5.

      - Exhibit 5 about here -

    Similar to the results presented in Exhibit 4, Exhibit 5 shows that all threshold values are significantly different from their averages. This statement is valid for 1998 as well as for 2008, so it can be considered rather robust. Therefore, IRs not only change over time but also differ between fund categories. Thus, general statements about fixed threshold values like those of Grinold and Kahn (2000) or Jacobs and Levy (1996) cannot be confirmed. The results of this part of the empirical study are similar to the results of Goodwin (1998), with the addition that IRs also change over time. Exhibit 6 uses box-and-whiskers plots to graphically illustrate the different distributions of IRs over time for Equity US funds.

      - Exhibit 6 about here -

    4.2 The Art of Selecting the Benchmark

    In fund management companies, the selection of a benchmark is usually the result of intense negotiations between the fund manager and the investors, as the benchmark has a major impact on the alpha of the fund. Depending on style and country focus, one benchmark might be more favorable to the fund manager than another (Goodwin 1998; Grinold and Kahn 2000). Therefore, it is necessary to analyze the sensitivity of the IR toward the selected benchmark. Lehmann and Modest (1987) show that benchmark selection has a very strong influence on the resulting alphas as well as their volatility. While the S&P 500 has been used throughout this paper in connection with Equity US funds, two additional indices, the price-weighted Dow Jones Industrial Average (DJIA) and the market-weighted Russell 1000 Index, will be used to compare the resulting IRs. Exhibit 7 presents the threshold IRs for different benchmarks using the same procedure as in the previous section.

      - Exhibit 7 about here -

    It can be seen that the IRs based on the S&P 500 and the Russell 1000 are closely related while the IRs based on the DJIA behave differently and are far more volatile. It seems that the DJIA does not cover the investment universe of the Equity US funds very well, possibly because this index is based on merely 30 stocks. The difference of the threshold values is again tested for significance using the Wilcoxon signed-rank test. Exhibit 8 shows the result of testing the ratios from the Russell 1000 and DJIA against those of the S&P 500. All are significantly different from those based on the S&P 500 at the 5% level of significance. These results are in line with Goodwin (1998), who also found that the selection of the benchmark has a strong influence on the resulting IRs.

      - Exhibit 8 about here -

    The results are confirmed when looking at the rankings based on the three different IRs as illustrated in both scatter plots of Exhibit 9. While there are noticeable differences between IRs based on the DJIA and those based on the S&P 500, the changes in rankings when using the Russell 1000 versus the S&P 500 are quite small. The selection of an appropriate benchmark is therefore an important step in any performance analysis. One can, however, conclude that benchmark indices that cover a large part of the investment universe of the specific fund category are superior to indices that are only based on a few securities and certain industry sectors. Finally, the best way to judge the real risk-adjusted value added of a fund manager is to take into account the actual benchmark he is working against as well as the true benchmarks of the other managers within his peer group. An alternative might be to use the peer group’s average as a general benchmark. This might also lead to more stability in the annual IR thresholds.

    Just as benchmark selection is crucial, investment restrictions are also quite important. The so-called Transfer Coefficient (TC) measures the correlation of the manager’s forecasts with the actually implemented portfolio. While a manager without constraints will end up with a TC of 1, the constrained manager can only achieve a lower result. A typical long-only fund may achieve a TC between 0.2 and 0.4, which significantly impacts the fund’s IR, as the manager is not able to fully transfer his skills into actual investment decisions. For a constrained fund with a TC of 0.5, the manager has to double his skill (IC) or quadruple the breadth in order to achieve the same IR as an unconstrained manager of an otherwise identical fund (Wander, 2003).
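
    The arithmetic behind this statement can be made explicit. The following is a sketch that assumes the commonly used extension of the fundamental law in (2), in which the transfer coefficient enters as a multiplicative factor; $BR$ (breadth) and $IR^{*}$ (the unconstrained IR) are notation introduced here for illustration.

```latex
\begin{align*}
IR &= TC \cdot IC \cdot \sqrt{BR} \\
\text{unconstrained } (TC = 1):\quad IR^{*} &= IC \cdot \sqrt{BR} \\
\text{constrained } (TC = 0.5):\quad IR &= 0.5 \cdot IC \cdot \sqrt{BR} = 0.5 \cdot IR^{*} \\
\text{double the skill:}\quad 0.5 \cdot (2\,IC) \cdot \sqrt{BR} &= IR^{*} \\
\text{quadruple the breadth:}\quad 0.5 \cdot IC \cdot \sqrt{4\,BR} &= 0.5 \cdot IC \cdot 2\sqrt{BR} = IR^{*}
\end{align*}
```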

      - Exhibit 9 about here -

    4.3 Does Data Frequency Matter?

    Ané and Labidi (2004) have shown that the return interval has a significant impact on the distribution of returns. While monthly and quarterly returns come close to a normal distribution, weekly and especially daily data usually show strong leptokurtosis. Furthermore, the annualized standard deviation varies with frequency. Additionally, other research has shown that the data frequency influences correlations as well (Handa et al., 1989). The question is whether data frequency also impacts the IR and in particular the threshold values. If the data frequency has a significant influence, the goal is to describe these differences and to provide guidance for selecting the appropriate return interval. Thus, we calculate annualized IRs for Equity US funds using daily, weekly, and monthly fund returns. Using the ranking methodology explained above, fund rankings based on the three different IRs have been created. Results for the year 1999 are shown in Exhibit 10. While the rankings of IRs based on daily and weekly data do not differ significantly, a switch from weekly to monthly data changes the ranking substantially. It can be concluded that monthly data are inappropriate for calculating reliable performance measures. Furthermore, monthly data only allow for 12 data points per year, which is insufficient to estimate the standard deviation.
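
    The effect of the return interval can be explored with a small experiment. The Python sketch below is an illustration on simulated data, not a reproduction of the paper’s results: the return-generating parameters, the block lengths (5 trading days per week, 21 per month), and the square-root annualization are all assumptions.

```python
import math
import random
import statistics


def compound(returns, block):
    """Compound simple returns into non-overlapping blocks of `block` periods."""
    out = []
    for start in range(0, len(returns) - block + 1, block):
        growth = 1.0
        for r in returns[start:start + block]:
            growth *= 1.0 + r
        out.append(growth - 1.0)
    return out


def annualized_ir(fund, bench, periods_per_year):
    """Mean excess return over its sample volatility, scaled to one year."""
    excess = [f - b for f, b in zip(fund, bench)]
    return (statistics.fmean(excess) / statistics.stdev(excess)
            * math.sqrt(periods_per_year))


# Simulated daily returns for one year (hypothetical parameters).
random.seed(0)
days = 252
bench_daily = [random.gauss(0.0003, 0.01) for _ in range(days)]
fund_daily = [b + random.gauss(0.0001, 0.003) for b in bench_daily]

for label, block in (("daily", 1), ("weekly", 5), ("monthly", 21)):
    fund_block = compound(fund_daily, block)
    bench_block = compound(bench_daily, block)
    print(label, len(fund_block),
          round(annualized_ir(fund_block, bench_block, days / block), 2))
```

    The monthly estimate rests on only 12 observations, so its tracking error, and with it the IR, is estimated far less precisely than at the daily or weekly frequency.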

      - Exhibit 10 about here -

    4.4 Other Influences on Performance Measures

    Although many studies have documented that returns are generally non-normal many popular performance measures are still based on mean-variance analysis. Therefore, non-normal returns lead to biased results (Benson et al., 2008). Additionally, Kraus and Litzenberger (1976) found that positively skewed returns are actually favorable for investors. Referring back to Exhibit 1 which presents descriptive statistics of fund returns for each category, it is striking that money market funds in all currencies produce strongly skewed and leptokurtic returns. Comparing the threshold values in Exhibit 2, the values for “top” money market funds are uncharacteristically high. Therefore, it can be concluded that due to the special return distribution characteristics of money market funds common performance measures are not applicable to these funds. Other performance measures like the Omega Measure of Keating and Shadwick (2002) could take the deviation from normality into account.

    The other factor to bear in mind is that the dataset is subject to survivorship bias because common data providers only list funds that are currently available on the market. Therefore, the estimated performance measures may be biased upwards. Brown et al. (1992) found that the survivorship bias can be so strong that it erroneously leads to the conclusion that the performance of mutual funds is predictable. This finding disappears when the sample is corrected for survivorship bias. In terms of quantification of the survivorship bias for Equity US funds, Grinblatt and Titman (1989) found a bias of 0.1% to 0.3% per year. Brown et al. (1995) estimate the bias to be between 0.2% and 0.8% per year, while Elton et al. (1996) present an average bias of 0.71% to 0.77% annually. Although it is very likely that survivorship bias is present, our dataset did not allow us to quantify its magnitude. As our study is similar in setup to the previously cited ones, we conjecture that the bias in our results is of a similar order of magnitude.

    Two other factors that influence our results are costs and asynchronous pricing. In order to estimate the real risk-adjusted value added by a fund manager, one would want to compare his results net of fees with those of other fund managers. However, another dataset would be necessary to compute those figures. In any case, the ultimate investor in a fund might only be interested in the final result and not so much in whether the results were due to skill or the cost structure of the product.

    Asynchronous pricing introduces an upward bias into the tracking error. A fund that perfectly tracks its benchmark, but whose NAV is not calculated with the same security prices and especially foreign exchange rates as the benchmark, will inevitably show a tracking error different from zero. Again, quantification of this bias requires a data set which draws heavily on internal valuation information of a fund management company and is in scarce supply.

    While eliminating survivorship bias may lead to lower real average IRs, costs and asynchronous pricing depress measured IRs. Taking these two factors into account would lead to somewhat higher average IRs.

    4.5 Performance Persistence: Outperformance by Luck or Skill?

    Finally, the question has to be raised whether a single ratio based on one year of data is the only dimension to be used in order to evaluate a manager’s performance. Maybe it was just luck that the manager achieved a very good IR in a certain year. How can lucky portfolio managers be separated from skilled ones? The manager’s track record can be an adequate tool as the probability for good performance by skill increases when the manager is able to position his fund among the top 25% two or even three years in a row (Bollen and Busse, 2005). However, due care should be used when trying to predict future returns of a fund based on its past returns. Horst and Verbeek (2000) show that some studies which claim the existence of performance persistence are subject to spurious and biased results. Kahn and Rudd (1995) and Carhart (1997) also analyzed the persistence of equity mutual funds and did not find a significant relationship between past and future performance. Similar results can be shown with the dataset of this study as follows. Equity US funds with a launch year of 1998 or before are categorized into quartiles based on their 1998 IR. Subsequently, this ranking is not changed anymore. For each quartile, the average IRs are calculated for each year. The results are shown in Exhibit 11. It can be seen that the top quartile funds of 1998 actually have the lowest average IR after two years’ time. The chart creates the image of a mean reverting process and shows that on average good performance does not persist.
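
    The procedure described above, fixing the quartile membership in a base year and tracking each group’s average IR afterwards, can be sketched as follows. The Python code is a minimal illustration with hypothetical fund names and IR values; the function names are not taken from the paper.

```python
import statistics


def quartile_groups(base_irs):
    """Split fund names into four groups by base-year IR, best group first."""
    ranked = sorted(base_irs, key=base_irs.get, reverse=True)
    n = len(ranked)
    return [ranked[q * n // 4:(q + 1) * n // 4] for q in range(4)]


def average_ir_by_group(groups, irs_by_year):
    """Average IR of each fixed group in every year."""
    return {
        year: [statistics.fmean([irs[f] for f in group]) for group in groups]
        for year, irs in irs_by_year.items()
    }


# Hypothetical IRs of eight funds in the base year and one later year.
irs_by_year = {
    1998: {"A": 0.9, "B": 0.6, "C": 0.3, "D": 0.1,
           "E": -0.1, "F": -0.3, "G": -0.6, "H": -0.9},
    1999: {"A": 0.1, "B": -0.2, "C": 0.4, "D": 0.0,
           "E": 0.2, "F": -0.1, "G": 0.3, "H": -0.4},
}
groups = quartile_groups(irs_by_year[1998])
averages = average_ir_by_group(groups, irs_by_year)
for year, values in averages.items():
    print(year, [round(v, 2) for v in values])
```

    If performance persisted, the ordering of the group averages in the base year would be preserved in later years; a collapse toward a common level is what a mean-reverting process would produce.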

      - Exhibit 11 about here -

    Based on the fact that good performance on average does not persist, it can be concluded that lucky managers without skill will not be able to stay among the best funds for multiple years in a row. A second dimension, the track record, is therefore proposed when evaluating the performance of fund managers. In order to visualize the luck versus skill effect, the performance of funds from selected categories with a launch year of 1998 or before that survived until 2008 has been tracked over the entire 11-year period. For every fund, we counted the number of years in which it ranked among the top 25% of all funds3 within this 11-year period. Summarized results are presented in Exhibit 12 and are as expected.

      - Exhibit 12 about here -

    It can be seen that 95.5% of all Equity US funds and 93.4% of all Equity Small Cap US funds are at least once among the top 25% of the funds during the 11-year period. Hence, if a fund survives an 11-year period it is very likely that it will be in the top quartile in some years – just by luck. It should be noted that these results could partly be caused by changes in the fund management.4 Taking the results presented in Exhibit 12 one step further, we calculate how many funds are able to remain within the top quartile for two or three years in a row within a three-year period based on their IR. Funds able to stay among the top 25% of all funds of their category for three years in a row can be considered extraordinary according to Exhibit 13 as, on average, only 2.76% of all funds manage to achieve such a result. The percentage of all funds that are able to stay within the top quartile two or three years in a row has been calculated for rolling three-year periods, and the results are fairly stable and consistent.

      - Exhibit 13 about here -

    Exhibit 13 has to be interpreted as follows. In the first row, covering the years 2006 to 2008, 0.93% of all Equity US funds were able to stay among the top 25% of the funds in all three years. 2.33% of all Equity US funds were able to stay among the top 25% in the years 2008 and 2007, and 21.73% were among the top 25% only in 2008. These three values add up to 25% with only minor rounding differences. Additionally, the same calculation was performed using the top 50%, that is, funds above the median. It can be concluded that the top 50% requirement is achieved too easily and is, therefore, not an appropriate measure.5 Based on these results, performance persistence (the track record of a manager) is another important factor of performance measurement in the separation of luck from skill. Therefore, investors should look for a consistent series of good performance measures as opposed to isolated occurrences of good performance over a longer timeframe.
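
    The breakdown of the top quartile into streak lengths can be sketched in a few lines. The Python code below is an illustration with hypothetical fund names; it assumes that streaks are counted backwards from the most recent year, so that the categories partition that year’s top quartile and the shares add up to 25%.

```python
def streak_shares(top_by_year, all_funds):
    """Share of all funds by length of their current top-quartile streak.

    top_by_year: list of sets of fund names in the top quartile,
                 most recent year first.
    Returns {streak_length: share of all funds}.
    """
    counts = {k: 0 for k in range(1, len(top_by_year) + 1)}
    for fund in top_by_year[0]:
        streak = 0
        for top in top_by_year:
            if fund not in top:
                break
            streak += 1
        counts[streak] += 1
    n = len(all_funds)
    return {k: c / n for k, c in counts.items()}


# Hypothetical universe of eight funds; two funds form each top quartile.
all_funds = ["A", "B", "C", "D", "E", "F", "G", "H"]
top_by_year = [{"A", "B"},   # most recent year
               {"A", "C"},   # one year earlier
               {"A", "D"}]   # two years earlier
shares = streak_shares(top_by_year, all_funds)
print(shares)
```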

    4.6 Agency problems

    If one imagines a portfolio manager who takes just one active investment decision per year and whose correlation between forecasted and actual returns is 0.1, this manager will achieve an IR of $0.1 \cdot \sqrt{1} = 0.1$. The empirical part of this paper shows that an IR of 0.1 for an Equity US fund would in most years be rated “good” and in some years even “very good”, although the manager did practically nothing. The IR can therefore incentivize strategies that are unfavorable to investors. It seems that performance measures that use the tracking error as the risk measure need a second dimension that observes the active weights of the fund. For example, Cremers and Petajisto (2009) proposed the Active Share measure, which is easy to calculate and quantifies the active holdings of a mutual fund in relation to the corresponding benchmark.
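
    Active Share is half the sum of the absolute differences between the portfolio weights and the benchmark weights. The Python sketch below illustrates the calculation; the holdings are hypothetical and the function name is chosen for this example.

```python
def active_share(fund_weights, benchmark_weights):
    """Half the sum of absolute weight differences over all securities."""
    names = set(fund_weights) | set(benchmark_weights)
    return 0.5 * sum(
        abs(fund_weights.get(name, 0.0) - benchmark_weights.get(name, 0.0))
        for name in names
    )


# Hypothetical weights, for illustration only.
benchmark = {"A": 0.50, "B": 0.30, "C": 0.20}
closet_indexer = {"A": 0.48, "B": 0.31, "C": 0.21}
stock_picker = {"A": 0.10, "D": 0.90}

print(round(active_share(closet_indexer, benchmark), 2))
print(round(active_share(stock_picker, benchmark), 2))
```

    A fund that hugs the benchmark shows an Active Share near zero even if its IR looks respectable, which is the second dimension the text calls for.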

    5 Conclusion

    5.1 Practical Implications

    The aim of this study was to evaluate whether the IR is a useful and reliable measure of the performance of mutual fund managers. Based on empirical evidence, it can be concluded that the IR is in fact reliable and useful but with certain limitations. Overall, the analysis revealed that two dimensions are important in order to judge the performance of a manager in a particular year: the performance in that particular year and the track record of the fund over the previous three years. The former can be used to establish a ranking of funds, which is then adjusted upward or downward by the latter. In order to transform IRs into a grading system, a categorization into quartiles has been introduced, which defines thresholds between quality levels of funds. It has been shown that IRs vary over time and also across different fund categories, so that it is necessary to calculate the threshold values anew for every calendar year. This makes the IR a rather difficult choice for setting targets for portfolio managers, as they learn how good they had to be only at the end of the year and not in advance.
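
    A minimal sketch of such a grading system is given below. The grade labels follow Exhibit 2; the function names and the sample IRs are hypothetical, and the quartile cut points use the default interpolation of Python's `statistics.quantiles`, which need not match the procedure used for the exhibits.

```python
from statistics import quantiles

GRADES = ("very good", "good", "below avg.", "poor")


def quartile_thresholds(irs: list) -> tuple:
    """Return the IR cut points as (upper quartile, median, lower quartile)."""
    q1, q2, q3 = quantiles(irs, n=4)
    return q3, q2, q1


def grade(ir: float, thresholds: tuple) -> str:
    """Map an IR to a grade using the thresholds of its category and year."""
    for label, cut in zip(GRADES, thresholds):
        if ir > cut:
            return label
    return GRADES[-1]


if __name__ == "__main__":
    # Hypothetical IRs of one fund category in two different years.
    irs_by_year = {
        2007: [-0.6, -0.2, 0.1, 0.3, 0.5, 0.8, 1.2],
        2008: [-1.4, -1.0, -0.7, -0.4, -0.1, 0.05, 0.3],
    }
    for year, irs in irs_by_year.items():
        print(year, grade(0.1, quartile_thresholds(irs)))
```

    With these made-up numbers, the same IR of 0.1 is graded differently in the two years, which is the reason the thresholds have to be recomputed for every calendar year.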

    Four factors influence the quality of the IR: benchmark selection, data frequency, non-normality of fund returns, and survivorship bias inherent in the sample that is used to estimate the threshold values. With respect to the benchmark it is recommended to select an index that covers a large part of the respective market. The data frequency should be high, i.e. daily or weekly. Returns should also be tested for normality as this influences the quality of the performance measures greatly. A quantification of the survivorship bias within the IR is difficult and still unclear and, therefore, left for further research. It should be noted that the proposed framework is only valid for funds with symmetric return profiles.
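
    For readers who want to reproduce the basic quantity, the sketch below computes an ex-post IR from weekly returns. It assumes one common convention (arithmetic mean of active returns annualized by 52, sample standard deviation of active returns annualized by the square root of 52); the exact calculation method used for the exhibits is not restated in this section, so this should be read as an illustration only. The function name is hypothetical.

```python
import math
from statistics import mean, stdev

WEEKS_PER_YEAR = 52


def ex_post_information_ratio(fund_returns, benchmark_returns,
                              periods_per_year=WEEKS_PER_YEAR):
    """Annualized mean active return divided by annualized tracking error."""
    if len(fund_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    # Active return: fund return minus benchmark return in each period.
    active = [f - b for f, b in zip(fund_returns, benchmark_returns)]
    annual_active_return = mean(active) * periods_per_year
    tracking_error = stdev(active) * math.sqrt(periods_per_year)
    return annual_active_return / tracking_error


if __name__ == "__main__":
    fund = [0.010, 0.020, -0.010, 0.030]        # hypothetical weekly returns
    benchmark = [0.005, 0.015, -0.005, 0.020]
    print(round(ex_post_information_ratio(fund, benchmark), 2))
```

    Switching `periods_per_year` to 12 or 252 adapts the same sketch to monthly or daily data, which makes it easy to compare rankings across data frequencies.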

    Exhibit 14 is an example of a performance evaluation framework based on the IRs calculated using the dataset of the empirical study. It is valid for funds of the selected categories in 2008 and helps to estimate the first performance dimension, the performance within the particular year to be evaluated. No differentiation has been made between funds belonging to the third or fourth quartile, that is “below average” or “poor” funds, as their IRs are mostly negative and, therefore, unreliable.

      - Exhibit 14 about here -

    5.2 Further Research

    While the results are able to answer many of the research questions, they also open up new issues. First, the returns are not corrected for fees so that the performance is biased. The performance should be (and in practice is) measured using returns net of fees. In fact, a significant part of the total fees cannot be influenced by the portfolio managers, e.g. fund audit or custody fees. Second, the sample is dominated by US funds simply because of the data providers used. Third, many funds are subject to style drifts which generally make returns harder to compare (Chan et al. 2002). Although very broad fund categories were selected, it would be interesting to test for biases caused by style drift. Fourth, the sample is subject to survivorship bias. The survivorship bias can be up to 0.8% per year, which distorts the performance measures calculated based on these returns (Brown et al. 1995). Fifth, asynchronous pricing might lead to tracking error estimates with an upward bias and therefore to IRs which are lower than the real IRs.

    Beyond these limitations of the dataset, the analyses suggest ideas for additional research. First, it would be interesting to compare the results based on a generic benchmark with results that are calculated with fund-specific benchmarks determined by the portfolio manager. Alternatively, use of the peer group average as benchmark might lead to more stability in the wildly fluctuating IRs. Second, it is suggested to analyze IRs of funds with more specific style definitions, such as “US value stocks” or “European bank stocks”. However, the number of funds investing in such specialized sectors is rather small, which will affect the significance of the results. Third, the effect of the Transfer Coefficient (TC) on the manager’s active performance should be analyzed in more detail. Fund managers face certain investment restrictions that prevent the allocation of funds to the best possible portfolio. These restrictions will negatively affect the IR although they are not influenced by the manager. According to Wander (2003), mutual funds can face TCs of 0.5 or even lower, and, therefore, the manager would have to double his performance in order to be as good as an unconstrained portfolio manager. Future research could develop and empirically analyze ways to modify performance measures so that the impact of investment restrictions is neutralized across different funds.
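
    The "double his performance" statement can be made concrete with a small sketch. It assumes the generalized form of the fundamental law, IR = TC · IC · sqrt(BR), in which the transfer coefficient scales the unconstrained IR linearly; the function names and input values are hypothetical.

```python
import math


def constrained_ir(ic: float, breadth: float, tc: float) -> float:
    """IR under investment restrictions: TC scales the unconstrained IR."""
    return tc * ic * math.sqrt(breadth)


def required_skill_multiple(tc: float) -> float:
    """Factor by which IC must rise to match an unconstrained manager (TC = 1)."""
    if not 0.0 < tc <= 1.0:
        raise ValueError("transfer coefficient must lie in (0, 1]")
    return 1.0 / tc


if __name__ == "__main__":
    unconstrained = constrained_ir(ic=0.1, breadth=100, tc=1.0)
    restricted = constrained_ir(ic=0.1, breadth=100, tc=0.5)
    print(unconstrained, restricted, required_skill_multiple(0.5))
```

    With a TC of 0.5, the restricted manager needs twice the forecasting skill to reach the IR of an otherwise identical unconstrained manager, which is the sense in which restrictions penalize the IR without reflecting on the manager.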

    References

    Ackermann, C., McEnally, R., and Ravenscraft, D. (1999). The Performance of Hedge Funds: Risk, Return and Incentives. Journal of Finance, 54(3), 833-874.

    Ané, T., and Labidi, C. (2004). Return Interval, Dependence Structure, and Multivariate Normality. Journal of Economics and Finance, 28(3), 285-299.

    Below, S. D., and Stansell, S. R. (2003). Do the Individual Moments of REIT Return Distributions Affect Institutional Ownership Patterns? Journal of Asset Management, 4(2), 77-95.

    Benson, K., Gray, P., Kalotay, E., and Qiu, J. (2008). Portfolio Construction and Performance Measurement when Returns are Non-Normal. Australian Journal of Management, 32(3), 445-461.

    Bollen, N. P. B., and Busse, J. A. (2005). Short-Term Persistence in Mutual Fund Performance. Review of Financial Studies, 18(2), 569-597.

    Brown, S. J., Goetzmann, W., Ibbotson, R. G., and Ross, S. A. (1992). Survivorship Bias in Performance Studies. Review of Financial Studies, 5(4), 553-580.

    Brown, S. J., and Goetzmann, W. N. (1995). Performance Persistence. Journal of Finance, 50(2), 679-698.

    Brown, S. J., Goetzmann, W. N., and Ross, S. A. (1995). Survival. Journal of Finance, 50(3), 853-873.

    Carhart, M. M. (1997). On Persistence in Mutual Fund Performance. Journal of Finance, 52(1), 57-82.

    Chan, L. K. C., Chen, H.-L., and Lakonishok, J. (2002). On Mutual Fund Investment Styles. Review of Financial Studies, 15(5), 1407-1437.

    Cremers, M., and Petajisto, A. (2009). How Active is Your Fund Manager? A New Measure That Predicts Performance. Working Paper, Yale School of Management, New Haven.

    Elton, E. J., Gruber, M. J., and Blake, C. R. (1996). Survivorship Bias and Mutual Fund Performance. Review of Financial Studies, 9(4), 1097-1120.

    Goodwin, T. H. (1998). The Information Ratio. Financial Analysts Journal, 54(4), 34-43.

    Grinblatt, M., and Titman, S. (1989). Mutual Fund Performance: An Analysis of Quarterly Portfolio Holdings. Journal of Business, 62(3), 393-416.

    Grinold, R. C. (1989). The Fundamental Law of Active Management. Journal of Portfolio Management, 15(3), 30-37.

    Grinold, R. C., and Kahn, R. N. (2000). Active Portfolio Management: A Quantitative Approach for Providing Superior Returns and Controlling Risk. 2nd ed., New York, NY: McGraw-Hill.

    Handa, P., Kothari, S. P., and Wasley, C. (1989). The Relation Between the Return Interval and Betas: Implications for the Size Effect. Journal of Financial Economics, 23(1), 79-100.

    Hollander, M., and Wolfe, D. A. (1973). Nonparametric Statistical Methods. Hoboken, NJ: John Wiley & Sons, Inc.

    Horst, J. t., and Verbeek, M. (2000). Estimating Short-Run Persistence in Mutual Fund Performance. Review of Economics and Statistics, 82(4), 646-655.

    Hübner, G. (2007). How Do Performance Measures Perform? Journal of Portfolio Management, 33(4), 64-74.

    Jacobs, B. I., and Levy, K. N. (1996). Residual Risk: How Much is Too Much? Journal of Portfolio Management, 21(3), 10-16.

    Kahn, R. N., and Rudd, A. (1995). Does Historical Performance Predict Future Performance? Financial Analysts Journal, 51(6), 43-52.

    Keating, C., and Shadwick, W. F. (2002). Omega: A Universal Performance Measure. Journal of Performance Measurement, 6(3), 59-84.

    Kraus, A., and Litzenberger, R. H. (1976). Skewness Preference and the Valuation of Risk Assets. Journal of Finance, 31(4), 1085-1100.

    Lehmann, B. N., and Modest, D. M. (1987). Mutual Fund Performance Evaluation: A Comparison of Benchmarks and Benchmark Comparisons. Journal of Finance, 42(2), 233-265.

    Lilliefors, H. W. (1967). On the Kolmogorov-Smirnov Test for Normality with Mean and Variance Unknown. Journal of the American Statistical Association, 62(318), 399-402.

    Treynor, J. L. (1961). Toward a Theory of Market Value of Risky Assets. Working Paper. Subsequently published in R. A. Korajczyk (1999). Asset Pricing and Portfolio Performance: Models, Strategy and Performance Metrics. London: Risk Books.

    Treynor, J. L. (1965). How to Rate Management of Investment Funds. Harvard Business Review, 43(1), 63-75.

    Treynor, J. L., and Black, F. (1973). How to Use Security Analysis to Improve Portfolio Selection. Journal of Business, 46(1), 66-86.

    Wander, B. H. (2003). What it Takes to Beat a Benchmark. Journal of Investing, 12(3), 37-42.

    Appendix

    Exhibit A1: Sample Size of the Fund Dataset Grouped by Fund Classification

                         Number of Funds in the Dataset in Year
Fund Classification           1998     2000     2002     2004     2005     2006  2007/08
Equity Europe                  127      214      363      553      689      813      895
Equity Germany                  54       57       65       70       73       80       84
Equity UK                      189      267      370      514      570      658      681
Equity US                      970    1,341    2,117    2,832    3,203    3,648    3,953
Equity Small Cap Europe         31       64       98      132      152      184      202
Equity Small Cap UK             51       67       83      109      111      127      132
Equity Small Cap US            529      775    1,237    1,653    1,842    2,057    2,184
Corporate Bonds EUR              0        0       49      129      151      171      185
Corporate Bonds GBP             50       86      124      167      187      211      222
Corporate Bonds USD             88      108      158      203      211      231      237
Money Market EUR                 0        0      164      223      243      283      300
Money Market GBP                36       53       79       94       99      112      118
Money Market USD               202      230      320      396      410      433      439

    Source: Aggregation based on Reuters 3000 Xtra and Thomson Financial DataStream.

    Exhibit A2: Overview of Benchmark Indices

Fund Classification      Benchmark Name                  DataStream Ticker
Equity Europe            MSCI Europe                     MSEROP
Equity Germany           DAX                             DAXINDX
Equity UK                FTSE 100                        FTSE100
Equity US                S&P 500                         S&PCOMP
Equity Small Cap Europe  MSCI Europe                     MSEROP
Equity Small Cap UK      FTSE All Share                  FTSEALLSH
Equity Small Cap US      S&P 600 Small Cap               S&P600I
Corporate Bonds EUR      iBoxx Liquid EUR Corporates     IBELCAL
Corporate Bonds GBP      iBoxx Liquid GBP Corporates     IB£CSAL
Corporate Bonds USD      Merrill Lynch Corporate Master  MLCORPM
Money Market EUR         EUR Interbank 3M Offered Rate   BBEUR3M
Money Market GBP         GBP Interbank 3M Offered Rate   BBGBP3M
Money Market USD         USD Interbank 3M Offered Rate   BBUSD3M

    Source: Thomson Financial DataStream.

    Exhibit 1: Descriptive Statistics of Fund Returns

Fund Classification        Avg. Ann. Return  Avg. Ann. Std.dev.  Skewness  Excess Kurtosis  Avg. Ann. Excess Return
Equity Europe                        -0.72%              17.73%    -0.539            2.885                   -1.71%
Equity Germany                        0.18%              23.42%    -0.418            3.484                   -0.60%
Equity UK                             1.97%              15.30%    -0.722            3.167                    0.68%
Equity US                            -2.57%              18.23%    -1.092            9.734                   -3.22%
Equity Small Cap Europe               1.51%              19.25%    -0.986            2.816                    2.50%
Equity Small Cap UK                   4.09%              14.02%    -1.223            3.020                    2.27%
Equity Small Cap US                  -2.54%              21.68%    -1.151            9.119                   -6.45%
Corporate Bonds EUR                   2.38%               2.88%    -0.666            3.914                   -1.20%
Corporate Bonds GBP                   3.65%               4.39%    -0.572            2.662                   -1.12%
Corporate Bonds USD                   3.10%               4.22%    -0.551            1.710                   -1.58%
Money Market EUR                      2.11%               0.30%    -3.557           20.300                   -0.25%
Money Market GBP                      4.97%               0.45%     4.193           26.245                    0.72%
Money Market USD                      1.97%               3.12%     1.379           33.221                   -0.93%

    Calculations are based on weekly data for the period January 1998 to December 2008, annualized.

    Exhibit 2: Information Ratios of Different Fund Categories

Fund Classification      IR 1st 25%        IR 2nd 25%        IR 3rd 25%        IR 4th 25%
                         “very good”       “good”            “below avg.”      “poor”
Equity Europe            > 0.40            0.40 to 0.04      0.03 to -0.36     < -0.36
Equity Germany           > 0.07            0.07 to -0.11     -0.12 to -0.37    < -0.37
Equity UK                > 0.32            0.32 to -0.01     -0.02 to -0.30    < -0.30
Equity US                > 0.28            0.28 to -0.40     -0.41 to -1.01    < -1.01
Equity Small Cap Europe  > 0.80            0.80 to 0.40      0.29 to -0.09     < -0.09
Equity Small Cap UK      > 0.59            0.59 to 0.22      0.21 to -0.12     < -0.12
Equity Small Cap US      > 0.08            0.08 to -0.60     -0.61 to -1.18    < -1.18
Corporate Bonds EUR      > -0.24           -0.24 to -0.76    -0.77 to -1.30    < -1.30
Corporate Bonds GBP      > 0.03            0.03 to -0.46     -0.47 to -0.95    < -0.95
Corporate Bonds USD      > 0.03            0.03 to -0.58     -0.59 to -1.29    < -1.29
Money Market EUR         > 4.30            4.30 to 1.36      1.35 to -0.39     < -0.39
Money Market GBP         > 4.30            4.30 to 0.31      0.30 to -1.50     < -1.50
Money Market USD         > 2.46            2.46 to 0.39      0.38 to -1.29     < -1.29

    Exhibit 3: Information Ratio – Threshold Values for 1st Quartile Funds (very good)

Fund Classification           1998     1999     2000     2001     2002     2003     2004     2005     2006     2007     2008
Equity Europe               > 0.23   > 1.30   > 0.38   > 0.15   > 0.28  > -0.12   > 0.46   > 0.91   > 0.81  > -0.09   > 0.08
Equity Germany              > 0.02  > -0.16   > 0.44   > 0.21   > 0.35   > 0.12  > -0.14  > -0.02   > 0.08  > -0.22   > 0.11
Equity UK                  > -0.26   > 0.79   > 0.62   > 0.24   > 0.15   > 0.65   > 0.44   > 0.31   > 0.58  > -0.23   > 0.18
Equity US                  > -0.39   > 0.36   > 0.66   > 0.51   > 0.71   > 0.36   > 0.18   > 0.55  > -0.38   > 0.44   > 0.08
Equity Small Cap Europe     > 0.04   > 2.40   > 0.60  > -0.21   > 0.50   > 1.40   > 1.70   > 1.60   > 1.60  > -0.28  > -0.49
Equity Small Cap UK        > -0.92   > 2.50   > 0.67   > 0.19   > 0.25   > 1.30   > 1.40   > 0.47   > 1.50  > -0.68  > -0.18
Equity Small Cap US         > 0.65   > 1.50  > -0.26  > -0.21   > 0.06   > 0.44  > -0.74  > -0.05  > -0.49   > 0.49  > -0.52
Corporate Bonds EUR            N/A      N/A      N/A      N/A   > 0.08  > -0.30  > -0.95  > -0.32   > 0.26   > 0.63  > -1.10
Corporate Bonds GBP         > 0.56  > -0.04  > -0.64  > -0.50  > -0.19  > -0.17   > 0.07  > -0.15  > -0.08   > 0.30   > 1.20
Corporate Bonds USD        > -0.37   > 0.26   > 0.46  > -0.64  > -0.28  > -0.01  > -0.26  > -0.08   > 0.00   > 0.49   > 0.71
Money Market EUR               N/A      N/A      N/A      N/A   > 4.60   > 7.70   > 7.40   > 7.80   > 4.20   > 0.51   > 0.04
Money Market GBP            > 0.33   > 1.10   > 1.20   > 4.60   > 5.90   > 5.60   > 3.70  > 10.00   > 7.90   > 3.80   > 3.20
Money Market USD            > 1.80   > 1.50   > 1.40   > 5.90   > 3.00   > 5.20   > 2.70   > 0.98   > 2.10   > 1.30   > 1.20

    Exhibit 4: Test Statistics for the Difference of Threshold Values of Equity US Funds

    * indicates values significantly different from average at the 5% significance level. All test statistics for the Lilliefors test for normality are significant at the 5% level. The test is a generalization of the Kolmogorov-Smirnov (KS) test. While the KS test requires the specification of the population mean and variance, the Lilliefors test is capable of testing samples with incompletely specified distribution characteristics for normality (Lilliefors 1967).

Wilcoxon Signed-Rank Test on Differences in Mean
               1998    1999    2000    2001    2002    2003    2004    2005    2006    2007    2008    Avg.
Threshold    -0.39*   0.36*   0.66*   0.51*   0.71*   0.36*   0.18*   0.55*  -0.38*   0.44*   0.08*    0.28
z-score       -16.5    -3.0   -15.4   -12.5   -19.8    -9.0    -3.2   -20.7   -34.3   -15.4    -4.0       -

    Exhibit 5: Test Statistics for the Difference of Threshold Values of US Funds

* indicates values significantly different from average at the 5% significance level; analogous to the previous test, the IRs are not normally distributed according to the Lilliefors test, and all test statistics are significant at the 5% level; the second row shows the z-scores of the Wilcoxon signed-rank test.

 

          Year      Equity  Small Cap Equity  Fixed-Income  Money Market  Average
Wilcoxon  1998      -0.39*             0.65*        -0.37*         1.80*     0.42
z-score             -17.53             -6.39         -5.51         -3.82        -
Wilcoxon  2008       0.08*            -0.52*         0.71*         1.20*     0.37
z-score             -10.10            -26.66         -5.17         -4.04        -

    Exhibit 6: Box Plots of Equity US Fund Information Ratios

    Exhibit 7: The Effect of Benchmark Selection on the Information Ratio

    Exhibit 8: z-Statistics for Significant Difference of the Information Ratios

    * indicates values significantly different from average at the 5% significance level; analogous to the previous test, the IRs are not normally distributed according to the Lilliefors test, and all test statistics are significant at the 5% level.

z-values for…     1998    1999    2000    2001    2002    2003    2004    2005    2006    2007    2008
Dow Jones       -18.1*   -9.6*   -9.3*  -22.0*  -26.6*  -26.4*  -30.9*  -32.9*   -7.5*   -8.6*  -25.3*
Russell 1000     -9.4*   -4.2*   -3.5*  -14.2*   -8.5*  -17.1*  -25.0*  -12.7*  -31.9*  -21.1*  -33.5*

    Exhibit 9: Ranking Differences caused by Different Benchmarks

    Exhibit 10: Comparison of Rankings Based on Different Data Frequencies

    Exhibit 11: Performance Persistence of Equity US Funds

    Exhibit 12: Number of Top 25% Ranks over Lifetime

Fund Classification         0       1       2       3       4       5  6 or more
Equity US                4.5%   12.8%   21.4%   24.9%   17.7%   12.4%       6.3%
Equity Small Cap US      6.6%   11.0%   21.7%   24.4%   18.3%    9.3%       8.7%

    Exhibit 13: Performance Persistence of Equity US Funds Over Time

                 TOP 25% … years in a row   TOP 50% … years in a row
Period           1 Year  2 Years  3 Years   1 Year  2 Years  3 Years
2008 to 2006     21.73%    2.33%    0.93%   25.73%   11.58%   12.67%
2007 to 2005     20.41%    2.88%    1.72%   28.22%    7.81%   13.94%
2006 to 2004     18.77%    3.04%    3.18%   26.48%    5.51%   17.99%
2005 to 2003     16.76%    4.07%    4.15%   20.87%    9.32%   19.81%
2004 to 2002     14.80%    7.57%    2.65%   18.20%   14.23%   17.54%
2003 to 2001     19.76%    2.33%    2.87%   25.49%    6.51%   17.97%
2002 to 2000     12.10%    5.15%    7.77%   13.22%    6.87%   29.87%
2001 to 1999     11.11%   13.17%    0.72%   10.66%   29.66%    9.68%
2000 to 1998     23.35%    0.83%    0.83%   34.09%    5.79%   10.12%
Mean             17.64%    4.60%    2.76%   22.55%   10.81%   16.62%

    Exhibit 14: Framework for Performance Evaluation – Year 2008

1 A complete overview of the fund types analyzed in this study can be seen in Exhibit A1 in the Appendix.

2 The reported results were not sensitive to using methods 2 to 4 and are omitted for brevity.

3 Using the Information Ratio as ranking criterion.

4 Due to limited data availability, it was not possible to correct the sample for changes in the fund management.

5 The results are available from the authors upon request.
