US 7539636 B2

Abstract

A method for creating a peer group database includes a step of collecting security transaction data for a preselected period of time, for a plurality of investment institutions. The transaction data includes identity of securities being traded, transaction order sizes, execution prices and execution times. The transaction data is grouped into a plurality of orders. A plurality of cost benchmarks are calculated for each of the orders. Transaction costs are estimated for each investment institution relative to the cost benchmarks. The data is stored.
Claims (54)

1. A computer-implemented method for creating a database, said method comprising:
at one or more computers, collecting security transaction data for a preselected period of time, for a plurality of institutional investors, said transaction data including identity of securities being traded, transaction order sizes, execution prices and execution times;
grouping said transaction data into groups of orders, wherein each group of orders consists of a plurality of orders each associated with a common category from a plurality of common categories;
calculating a plurality of cost benchmarks for each group of orders;
estimating transaction costs for each institutional investor from said transaction data relative to each of said calculated cost benchmarks for each category of said plurality of common categories; and
storing said data for said calculated benchmarks and said estimated transaction costs;
wherein the grouping of transaction data into groups of orders includes combining discrete transaction data which form an order into each order.
2. The method as recited in
3. The method as recited in
X_{i} = α_{i} + β_{i} f(S) + γ_{i} g(M) + ε_{i}, for percentiles i = 25, 40, 50, 60 or 75, and each percentile i is assumed to depend linearly on functions f and g of size (S) and momentum (M), respectively, and (α_{i}, β_{i}, γ_{i}) are regression parameters.
4. The method as recited in
5. The method as recited in
6. The method as recited in
a closing price C_{T−1} of the security on a day prior to the day of the execution of the corresponding order;
a volume-weighted average price VWAP across all trades for the security during the day of execution of the corresponding order;
a closing price C_{T+1} of the security on the first day after the day of execution of the corresponding order;
a closing price C_{T+20} of the security on the 20th day after the day of execution of the corresponding order;
an open price O_{T} of the security on the day of execution of the corresponding order; and
a prevailing midquote M_{T} of the security prior to the execution time of the corresponding order; and
wherein each of said plurality of benchmarks is calculated for each security for each order.
7. The method recited in
8. The method recited in
9. The method as recited in
X_{i} = α_{i} + β_{i} f(S) + γ_{i} g(M) + ε_{i}, for percentiles i = 25, 40, 50, 60 or 75, and each percentile i is assumed to depend linearly on functions f and g of size (S) and momentum (M), respectively, and (α_{i}, β_{i}, γ_{i}) are regression parameters; and
wherein transaction costs are regressed for each of at least one cost factor.
10. The method as recited in
11. The method as recited in
12. The method as recited in
13. The method as recited in
14. A computer-implemented method for ranking security transaction cost performance relative to transaction costs of other institutional investors, said method comprising steps of:
at one or more computers, collecting security transaction data for a preselected period of time, for a plurality of investment institutions, said transaction data including identity of securities being traded, transaction order sizes, execution prices, momentum and execution times;
grouping said transaction data into groups of orders, wherein each group of orders consists of a plurality of orders associated with a common category from a plurality of common categories;
calculating a plurality of cost benchmarks for each group of orders;
estimating transaction costs for each investment institution relative to each of said calculated cost benchmarks for each category of said plurality of common categories; and
ranking a first investment institution of said plurality of investment institutions against said plurality of investment institutions based on said estimated transaction costs for said plurality of investment institutions for at least one of said common categories;
wherein the grouping of transaction data into groups of orders includes combining discrete transaction data which form an order into each order.
15. The method as recited in
16. The method as recited in
X_{i} = α_{i} + β_{i} f(S) + γ_{i} g(M) + ε_{i}, for percentiles i = 25, 40, 50, 60 or 75, and each percentile i is assumed to depend linearly on functions f and g of size (S) and momentum (M), respectively, and (α_{i}, β_{i}, γ_{i}) are regression parameters.
17. The method as recited in
18. The method as recited in
19. The method as recited in
a closing price C_{T−1} of the security on a day prior to the day of the execution of the corresponding order;
a volume-weighted average price VWAP across all trades for the security during the day of execution of the corresponding order;
a closing price C_{T+1} of the security on the first day after the day of execution of the corresponding order;
a closing price C_{T+20} of the security on the 20th day after the day of execution of the corresponding order;
an open price O_{T} of the security on the day of execution of the corresponding order; and
a prevailing midquote M_{T} of the security prior to the execution time of the corresponding order; and
wherein each of said plurality of benchmarks is calculated for each security for each order.
20. The method recited in
21. The method recited in
22. The method as recited in
X_{i} = α_{i} + β_{i} f(S) + γ_{i} g(M) + ε_{i}, for percentiles i = 25, 40, 50, 60 or 75, and each percentile i is assumed to depend linearly on functions f and g of size (S) and momentum (M), respectively, and (α_{i}, β_{i}, γ_{i}) are regression parameters; and
wherein transaction costs are regressed for each of at least one cost factor.
23. The method as recited in
24. The method as recited in
25. The method as recited in
26. The method as recited in
27. A system for ranking security transaction cost performance relative to transaction costs for a plurality of institutional investors, said system comprising:
processing means for collecting security transaction data for a preselected period of time, for a plurality of institutional investors, said transaction data including identity of securities being traded, transaction order sizes, execution prices, momentum and execution times; grouping said transaction data into groups of orders, wherein each group of orders consists of a plurality of orders associated with a common category from a plurality of common categories; calculating a plurality of cost benchmarks for each group of orders; estimating transaction costs for each institutional investor from said transaction data relative to each of said calculated cost benchmarks for each category of said plurality of common categories; and ranking a first investment institution of said plurality of investment institutions against said plurality of investment institutions, based on said estimated transaction costs, for at least one of said common categories; and
storing means for receiving data from said processing means, storing said data, and making data available to said processing means;
wherein grouping of transaction data into groups of orders includes combining discrete transaction data which form an order into each order.
28. The system according to
29. The system according to
X_{i} = α_{i} + β_{i} f(S) + γ_{i} g(M) + ε_{i}, for percentiles i = 25, 40, 50, 60 or 75, and each percentile i is assumed to depend linearly on functions f and g of size (S) and momentum (M), respectively, and (α_{i}, β_{i}, γ_{i}) are regression parameters.
30. The system according to
31. The system according to
32. The system according to
a closing price C_{T−1} of the security on a day prior to the day of the execution of the corresponding order;
a volume-weighted average price VWAP across all trades for the security during the day of execution of the corresponding order;
a closing price C_{T+1} of the security on the first day after the day of execution of the corresponding order;
a closing price C_{T+20} of the security on the 20th day after the day of execution of the corresponding order;
an open price O_{T} of the security on the day of execution of the corresponding order; and
a prevailing midquote M_{T} of the security prior to the execution time of the corresponding order; and
wherein each of said plurality of benchmarks is calculated for each security for each order.
33. The system according to
34. The system according to
35. The system according to
X_{i} = α_{i} + β_{i} f(S) + γ_{i} g(M) + ε_{i}, for percentiles i = 25, 40, 50, 60 or 75, and each percentile i is assumed to depend linearly on functions f and g of size (S) and momentum (M), respectively, and (α_{i}, β_{i}, γ_{i}) are regression parameters; and
wherein transaction costs are regressed for each of at least one cost factor.
36. The system according to
37. The system according to
38. The system according to
39. The system according to
40. A system for ranking security transaction cost performance relative to transaction costs for a plurality of institutional investors, said system comprising:
a processing unit coupled with a network and configured to collect security transaction data for a pre-selected period of time, for a plurality of institutional investors, said transaction data including identity of securities being traded, transaction order sizes, execution prices, momentum and execution times, to group said transaction data into groups of orders, wherein each group of orders consists of a plurality of orders associated with a common category from a plurality of common categories, to calculate a plurality of cost benchmarks for each group of orders, to estimate transaction costs for each order from said transaction data relative to each of said calculated cost benchmarks for each category of said plurality of common categories, and to store said data for said calculated benchmarks and said estimated transaction costs in a database; and
a database unit coupled with said processing unit and configured to communicate with said processing unit, store data, and make data available to said processing unit;
wherein grouping transaction data into groups of orders includes combining discrete transaction data which form an order into each order.
41. The system according to
42. The system according to
X_{i} = α_{i} + β_{i} f(S) + γ_{i} g(M) + ε_{i}, for percentiles i = 25, 40, 50, 60 or 75, and each percentile i is assumed to depend linearly on functions f and g of size (S) and momentum (M), respectively, and (α_{i}, β_{i}, γ_{i}) are regression parameters.
43. The system according to
44. The system according to
45. The system according to
a closing price C_{T−1} of the security on a day prior to the day of the execution of the corresponding order;
a volume-weighted average price VWAP across all trades for the security during the day of execution of the corresponding order;
a closing price C_{T+1} of the security on the first day after the day of execution of the corresponding order;
a closing price C_{T+20} of the security on the 20th day after the day of execution of the corresponding order;
an open price O_{T} of the security on the day of execution of the corresponding order; and
a prevailing midquote M_{T} of the security prior to the execution time of the corresponding order; and
wherein each of said plurality of benchmarks is calculated for each security for each order.
46. The system according to
47. The system according to
48. The system according to
49. The system according to
50. The system according to
51. The system according to
52. The method of
53. The method of
54. The method of
Description

This application is based on and claims priority to provisional patent application No. 60/464,962, filed on Apr. 24, 2003, the entire contents of which are incorporated herein by reference.

The performance of an investment is strongly related to the execution costs associated with the investment. With trading securities, transaction costs are often large enough to substantially reduce, or even eliminate, the return of an investment strategy. Achieving the most efficient order execution is therefore a top priority for investment management firms around the globe. Moreover, the recent demand by some legislators and fund shareholder advocates for greater disclosure of commissions and other trading costs makes their importance even more pronounced (see, for example, Teitelbaum [14]). Understanding the determinants of transaction costs, and measuring and estimating them, is therefore imperative. For further discussion see, for example, Domowitz, Glen and Madhavan [5] and Schwartz and Steil [13].

Traditionally, there have been two different approaches to estimating trading costs. The first approach is purely analytical and emphasizes mathematical/statistical models to forecast transaction costs. Typically, these models are based on theoretical factors/determinants of transaction costs and take into account, for instance, trade size and side, stock-specific characteristics (e.g., market cap, average daily trading volume, price, volatility, spread, bid/ask size, etc.), market and stock-specific momentum, trading strategy, and the type of the order (market, limit, cross, etc.). The modeling focuses primarily on price impact and, sometimes, opportunity cost. For example, Chan and Lakonishok [4] report that institutional trading impact and trading cost are related to firm capitalization, relative decision size, identity of the management firm behind the trade and the degree of demand for immediacy.
Keim and Madhavan [9] focus on institutional style and its impact on trading costs. They show that trading costs increase with trading difficulty and depend on factors such as investment style, order submission strategy and exchange listing. Breen, Hodrick and Korajczyk [2] define price impact as the relative change in a firm's stock price associated with its observed net trading volume, and they study the relation between this measure of price impact and a set of predetermined firm characteristics. Typically, some of these factors are then selected and implemented in mathematical or econometric models that provide transaction cost estimates depending on trade characteristics and investment style. ITG ACE™ (Agency Cost Estimator), described in [7], is an example of an econometric/mathematical model based on such theoretical determinants. It measures execution costs using the implementation shortfall approach discussed in Perold [12]. See also [15] and [16] for other examples of this type of model.

While the first approach implicitly assumes that past execution costs do not entirely reflect future costs, the second approach is specifically based on this principle. In the second approach, the focus is exclusively on the analysis of actual execution data, and the resulting estimates are used primarily for post-trade analysis. Typically, executions are subdivided into segments called peer groups, and simple average estimates of transaction costs are then built for each segment. Taking empirical averages, however, can cause problems. For example, cells with an insufficient amount of data may yield inaccurate and inconsistent estimates because of just a few outliers.

The present invention incorporates ideas from both approaches above to provide an improved method for estimating transaction costs. According to the present invention, a method is provided for estimating transaction costs for financial transactions, preferably equity trades.
Estimates are built using historical execution data, which is split into different peer groups. However, instead of calculating simple average estimates, a more sophisticated methodology is applied to the historical execution data to produce more robust and consistent forecasts.

According to an embodiment of the present invention, a method is provided for creating a peer group database, which includes a step of collecting security transaction data for a preselected period of time, for a plurality of investment institutions. The transaction data includes identity of securities being traded, transaction order sizes, execution prices and execution times. The transaction data is grouped into a plurality of orders. A plurality of cost benchmarks are calculated for each of the orders. Transaction costs are estimated for each investment institution relative to the cost benchmarks. The data is stored.

With these and other objects, advantages and features of the invention that may become hereinafter apparent, the nature of the invention may be more clearly understood by reference to the following detailed description of the invention, the appended claims, and the drawings attached hereto.

According to another embodiment of the present invention, a method is provided for ranking a first institutional investor's security transaction cost performance relative to the transaction costs of other institutional investors. The method includes a step of collecting security transaction data for a preselected period of time, for a plurality of investment institutions. The transaction data includes identity of securities being traded, transaction order sizes, execution prices, momentum and execution times. The transaction data is grouped into a plurality of orders. A plurality of cost benchmarks are calculated for each of the orders. Transaction costs are estimated for each investment institution relative to the cost benchmarks.
The first institutional investor is ranked against the plurality of investment institutions for at least one of a number of factors.

According to another embodiment of the present invention, a system is provided for ranking a first institutional investor's security transaction cost performance relative to the transaction costs of other institutional investors. The system includes a processing means for collecting security transaction data for a preselected period of time, for a plurality of investment institutions, the transaction data including identity of securities being traded, transaction order sizes, execution prices, momentum and execution times, and for grouping said transaction data into a plurality of orders. The processing means calculates a plurality of cost benchmarks for each of the plurality of orders, estimates transaction costs for each investment institution relative to the cost benchmarks, and ranks the first institutional investor against the plurality of investment institutions for at least one of a number of factors. The system also includes a storing means for receiving data from the processing means, storing said data, and making data available to the processing means.

According to another embodiment of the present invention, a system is provided for ranking a first institutional investor's security transaction cost performance relative to the transaction costs of other institutional investors. The system includes a processing unit and a database unit. The processing unit is coupled with a network and configured to collect security transaction data for a pre-selected period of time, for a plurality of investment institutions. The transaction data includes identity of securities being traded, transaction order sizes, execution prices, momentum and execution times.
The processing unit is also configured to group the transaction data into a plurality of orders, to calculate a plurality of cost benchmarks for each of said plurality of orders, to estimate transaction costs for each order relative to the cost benchmarks, and to store the data in a database. The database unit is coupled with the processing unit and configured to communicate with the processing unit, store data and make data available to the processing unit.

The invention will be described in detail with reference to the following drawings, in which like features are represented by common reference numbers and in which:

The present invention provides a novel system and method for estimating financial transaction costs associated with trading securities, and for comparing institutional performance among peer institutions. Transactional data from various peer institutions is collected and analyzed on a periodic basis to create comprehensive data relating to transactions, orders and executions. The data can be manipulated and presented to a peer institution so that it can benchmark its performance against its competitors. Costs are measured by comparing the costs of a trade or order by an institution to one or more benchmarks, and then comparing costs between institutions for similar stocks under similar situations. The present invention will help institutional investors manage their trading costs more efficiently by ranking the performance of investors relative to other peer group participants. The present invention will also stimulate institutional investors to enhance their analytical environment using the most efficient trading execution tools (e.g., POSIT®, TriAct™, ITG SmartServers™, etc.) as well as advanced trading analytical products (e.g., TCA, ITG Opt™, ITG ACE™, ResRisk™, etc.).

For the purpose of describing the present invention, orders are block orders of securities requiring the buying or selling of one thousand or more shares of at least one security.
The present invention includes systems and methods for providing security transaction costs. The methodology is described first, followed by exemplary embodiments of systems for implementing it. One skilled in the art will readily comprehend that the invention is not limited to the embodiments described herein, nor is it limited to specific programming techniques, software or hardware.

A framework with two different clusterization approaches is provided: single executions and orders. Trades submitted by the same institution with the same order identifier, side and stock are assumed to belong to the same order. To build the cost estimates, the transaction cost of each trade or order/trading decision is estimated against a number of benchmarks. Though the true costs to an institutional trader may include costs such as commission costs, the administrative costs of working an order, as well as the opportunity costs of missed trades, the present invention focuses primarily on costs represented by price impact. This price impact can be explained as the deviation of the executed price from an unperturbed price that would prevail had the trade not occurred. The following benchmarks can be used for estimating transaction costs:
Benchmark C_{T−1} is described more fully in Perold [12]. Benchmark V_{T} is described in detail by Berkowitz, Logue and Noser [1]. The benchmark M_{T} is probably the purest form of unperturbed price that one could choose, as opposed to C_{T−1}, for example, because it does not depend on other trades that occur between the close and the time of execution. All three benchmarks (C_{T−1}, V_{T} and M_{T}) are widely used in practice, both for cost measurement and for trading performance evaluation, and will be understood by one of ordinary skill in the art. Although the VWAP benchmark is widely used, it is generally not considered appropriate for evaluating large order executions, because it can be "gamed" by avoiding trading late in the day if prices appear to be worse than the VWAP price. See, for instance, Madhavan [11] for more details and Lert [10] for an analysis of the differences between various cost measurement methods.

Transaction costs can be calculated in basis points according to the formula:

[(P̂ − P_{b})/P_{b}] · δ · 10,000 (Eq. 1);

where P̂ is the actual execution price, P_{b} is the benchmark price, and δ is set to 1 or −1 in the case of a sell or buy order, respectively. Positive trading costs indicate outperformance, meaning that the trading decision resulted in profit.

To compare the transaction costs of one peer institution against those of other peer institutions under similar circumstances, cost estimates for the median and other percentiles for each comparison framework are built into a database, or other storage means, called a Peer Group Database (PGD). A graphical user interface is preferably provided to allow users to view relative peer performance by both traditional measures and trade characteristics.
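The order-grouping rule and the cost formula of Eq. (1) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the record field names (`institution`, `order_id`, `side`, `symbol`) are assumptions about how execution data might be represented.

```python
from collections import defaultdict

def group_into_orders(executions):
    """Group execution records into orders: trades submitted by the
    same institution with the same order identifier, side and stock
    are assumed to belong to the same order (field names are
    illustrative)."""
    orders = defaultdict(list)
    for ex in executions:
        key = (ex["institution"], ex["order_id"], ex["side"], ex["symbol"])
        orders[key].append(ex)
    return dict(orders)

def cost_bps(exec_price, benchmark_price, side):
    """Transaction cost in basis points per Eq. (1): delta is 1 for
    a sell and -1 for a buy, so a positive cost indicates
    outperformance."""
    delta = 1.0 if side == "sell" else -1.0
    return (exec_price - benchmark_price) / benchmark_price * delta * 10_000
```

For example, a buy executed at 99 against a benchmark of 100 yields a positive (outperforming) cost of 100 basis points.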
More precisely, trading costs of executions/orders can be grouped by a number of market and stock-specific cost factors, such as type, market capitalization, side, market, size (represented by a percentage of average daily trading volume), and short-term momentum. These factors define scenarios. The preferred values and ranges of the exemplary cost factors are presented in

The six cost factors listed above have a significant impact on transaction costs, but numerous other factors are contemplated for use, for instance, broker type (alternate broker, full-service broker, research broker, etc.), order type (market, limit, cross), daily volatility, and the inverse of the dollar price of a stock (see, e.g., Werner [17] or Chakravarty, Panchapagesan and Wood [3]). It is important to note that adding too many factors to the PGD may have some disadvantages. For example, the resulting product could become more complicated; most importantly, if the amount of transaction data does not increase dramatically, the accuracy of the estimates will deteriorate as the number of observations for each segment becomes insufficient.

Referring to

The factor Market Capitalization classifies stocks into three market capitalization groups. For executions, Market Capitalization is always based on the closing stock price C_{T−1} of the day prior to execution. For orders, Market Capitalization is based on the closing stock price C_{T−1} of the day prior to the trading decision. The threshold for Small cap stocks is 1.5 billion dollars; the threshold for Mid cap stocks is 10 billion dollars. The factor Side comprises two categories: Buy and Sell. Preferably, no distinction is made between normal sells and short sells. For U.S. applications, the factor Market subdivides stocks into two categories: Listed and over-the-counter (OTC) stocks. However, for other international applications, the Market factor can be subdivided into any number of categories.
The Size factor captures the (total) trade size of an execution (order). Size is measured relative to the average daily share volume (ADV), which is defined as the median daily dollar volume of the latest twenty-one trading days divided by the closing stock price of the day prior to execution, for executions, and of the day prior to the trading decision, for orders. The factor short-term Momentum is measured over the last two days prior to execution. Momentum measures the price evolution of a stock within the last two trading days as a fraction of absolute price changes. Specifically,

The categories of each factor are preferably restricted to be used with other categories as follows: the Type categories Value and Growth can be selected only with the factors Market Capitalization and Side, and the Type category Micro cap can be selected only with the factor Side. For scenarios that do not use the factors Size and Momentum, empirical distributions are natural estimates for peer cost distributions. However, this is not true for the other cases. Cost estimates should be consistent and close to each other for close values of size and momentum; in other words, the ranks of realized costs for two very similar scenarios should not differ very much. The present invention provides robust and consistent peer cost estimates for any choice of the factors Market Capitalization, Side, Market, Size and Momentum. Since Type is used in the present invention only in conjunction with the factors Market Capitalization and/or Side, it is omitted here for simplicity. However, one having ordinary skill in the art will understand that the factor Type can easily be incorporated in the methodology of the present invention. The methodology of the present invention provides estimates of cost percentiles for any values of Size and Momentum from [0,∞) and [−1,1], respectively.
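The ADV-relative Size measure described above can be sketched as follows; a minimal illustration, with function and argument names as assumptions (the patent does not prescribe an implementation).

```python
import statistics

def adv_shares(daily_dollar_volumes, prior_close):
    """Average daily share volume (ADV): the median daily dollar
    volume of the latest twenty-one trading days divided by the
    closing price of the prior day (of execution or trading
    decision, depending on the clusterization)."""
    return statistics.median(daily_dollar_volumes[-21:]) / prior_close

def size_vs_adv(order_shares, daily_dollar_volumes, prior_close):
    """(Total) trade size as a fraction of ADV; multiply by 100 to
    express it as a percentage of average daily trading volume."""
    return order_shares / adv_shares(daily_dollar_volumes, prior_close)
```

For instance, with 21 days of $1,000,000 daily dollar volume and a prior close of $10, ADV is 100,000 shares, so a 50,000-share order has Size 0.5 (50% of ADV).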
Therefore, the methodology provides much more flexibility than is actually needed when the values of Size and Momentum are subdivided into different groups, and it can be applied even if the choice of ranges for Size and Momentum differs from the ones shown above.

The present invention is described next by way of example. The estimation methodology is based on US execution data from January 2002 to December 2002 submitted by users of TCA. In this sample, the institutional trades represented 91 firms. All institutions together accounted for 14.6 million trades, 82.7 billion shares and 2,067 billion dollars in total value. The trades were clusterized into 6.4 million orders; an average order consisted of 2.3 executions.

Analysis of Average Realized Transactions

For benchmark V_{T}, it is observed that most of the average costs are concentrated around zero for all categories that have been studied. The highest absolute value of average costs is 17 b.p. Average costs for Growth and Value stocks are close, while costs for Micro cap stocks are significantly negative. As with the previous benchmarks, average trading costs appear to be inversely related to market capitalization, and OTC stocks appear to have higher average costs than Listed stocks. However, in contrast to the pre-trade benchmarks, there is little difference between average costs for buys and sells (at least for the dollar-weighted averages), which is likely due to the fact that, by construction, the VWAP benchmark is set for the day and is not affected by price movement within each day. Average cost behavior for the Size and Momentum factors under V_{T} is similar to the case of the pre-trade benchmarks.

The post-trade benchmarks C_{T+1} and C_{T+20} yield quite different results. Benchmark C_{T+20} provides average costs that fluctuate substantially; for example, both dollar- and equally weighted average costs have opposite signs for the same categories in some cases.
Basically, the benchmark C_{T+20} does not seem to provide any meaningful measure of price impact. Benchmark C_{T+1} provides average costs that show the reverse of the behavior of the pre-trade benchmark C_{T−1}. Costs overall are mostly positive, which indicates that, on average, peer institutions perform strongly with respect to this benchmark. Micro cap stocks have the highest positive costs, and executions of OTC stocks outperform those of Listed stocks.

The analysis shows that the average realized transaction costs of the exemplary data set are in line with empirical results presented by other researchers (see, for instance, Chakravarty, Panchapagesan and Wood [3]). The results strongly confirm that measuring costs with respect to different benchmarks affects performance evaluation significantly. In light of this fact, it is a challenge to build a methodology that can be applied efficiently to all of the benchmarks discussed above.

Peer cost percentiles can be estimated for all benchmarks, clusterization types and possible choices of scenarios, assuming that at least one of the factors Size and Momentum has been selected. More precisely, the main result is to derive estimates of cost percentiles:
where y = (y1, y2, y3, y4, y5) are arbitrary values for the factors Market Capitalization, Side, Market, Size and Momentum, i ∈ [0,100], and costs are measured relative to one of the six benchmarks discussed above.

Before estimating X_{i} in Eq. (3), note that, first, while the factors Market Capitalization, Side and Market are discrete, Size and Momentum can take any values from [0,∞) and [−1,1], respectively. Consequently, Eq. (3) consists of an infinite number of functions and, thus, an infinite number of estimates have to be derived. Second, a purely empirical approach might not be practical in all cases. Subdividing the factors Size and Momentum into different groups and computing the empirical distribution for each scenario may lead to inconsistency and instability. As a result, the performance of costs realized from two very similar scenarios may be ranked very differently, which may be confusing for users. Third, it is preferred to have a methodology that provides robust estimates and that works for both clusterization types and all six benchmarks C_{T−1}, V_{T}, C_{T+1}, C_{T+20}, O_{T} and M_{T}. This requirement is important since the various benchmarks (for instance, V_{T} and C_{T−1}) have very different properties.

In provisional application No. 60/464,962, an ordinary least squares (OLS) method is described for providing estimates. The present invention does not focus on the mean or median only, but also provides estimates for the 25th, 40th, 60th and 75th cost percentiles in addition to the median. Instead of regressing all the cost percentiles in the comparison framework directly on the (total) trade size and momentum values, the present invention subdivides the comparison framework into different groups depending on the Momentum and Size of the executions (orders). Then, for each group, the 25th, 40th, 50th (median), 60th and 75th cost percentiles are determined, as well as the equally weighted average values of momentum and (total) trade size.
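The subdivision-and-percentile step just described can be sketched as follows. The bin edges, record fields and helper names are illustrative assumptions; the patent specifies the outputs (per-group percentiles and equally weighted mean size/momentum), not this code.

```python
import math

def percentile(data, i):
    """Linear-interpolation percentile, i in [0, 100]."""
    s = sorted(data)
    k = (len(s) - 1) * i / 100
    lo, hi = math.floor(k), math.ceil(k)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

def bucket(x, edges):
    """Index of the half-open bin [edges[k], edges[k+1]) containing x."""
    for k in range(len(edges) - 1):
        if edges[k] <= x < edges[k + 1]:
            return k
    return len(edges) - 2  # clamp to the last bin

def build_groups(records, size_edges, mom_edges):
    """Subdivide cost observations into groups by Size and Momentum
    bins, then compute, per group, the 25th/40th/50th/60th/75th cost
    percentiles and the equally weighted mean size and momentum."""
    by_key = {}
    for r in records:
        key = (bucket(r["size"], size_edges), bucket(r["momentum"], mom_edges))
        by_key.setdefault(key, []).append(r)
    out = {}
    for key, rs in by_key.items():
        costs = [r["cost"] for r in rs]
        out[key] = {
            "percentiles": {i: percentile(costs, i) for i in (25, 40, 50, 60, 75)},
            "mean_size": sum(r["size"] for r in rs) / len(rs),
            "mean_momentum": sum(r["momentum"] for r in rs) / len(rs),
        }
    return out
```

Each group's percentiles and mean (size, momentum) pair then serve as one observation in the percentile regressions described below.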
Similar to the simple OLS approach, and based on the research conducted, all five percentiles are assumed to depend linearly on functions f and g of size (S) and momentum (M), respectively; specifically,

X_{i}=α_{i}+β_{i} f(S)+γ_{i} g(M)+ε_{i}, for percentiles i=25, 40, 50, 60 and 75,   (4)

where (α_{i}, β_{i}, γ_{i}) are regression parameters.
Moreover, based on empirical research, it is assumed that f is positive and monotonically increasing with f(0)=0, and that g is either
A possible choice for f is f(x)=x^{μ}, for x>0 and some μ>0. In order to obtain a rough estimate of the whole peer cost distribution of a scenario, the percentiles between 25 and 75 can be computed by linear interpolation. Since transaction cost distributions are heavy-tailed, percentiles below 25 and above 75 are derived assuming Pareto-type distributions. Different regression estimation techniques can be chosen to estimate the regression parameters (α_{i}, β_{i}, γ_{i}) in Eq. (4) by regressing the cost percentiles (i) on the average values of momentum and size. Groups without a sufficient number of observations are preferably excluded from the regression in order to reduce noise as much as possible and ensure stability of the estimates. The present invention focuses on the following three regression techniques: (a) ordinary least squares (OLS), (b) weighted least squares (WLS) with respect to OLS residuals (WLS1), and (c) WLS with respect to the number of observations in each subdivision (WLS2). The WLS1 approach is an enhancement of the OLS approach and comprises two steps: first, an OLS regression is conducted and the residuals of the regression are determined; second, the parameters are reestimated by weighting the observations with the inverses of their squared residuals. In order to avoid abnormal weighting, the inverses of the squared residuals are truncated by the value
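The weighted regression variants and the percentile interpolation can be sketched as follows. This is a hedged illustration in Python, not the patent's implementation: the function names, the residual floor, and the default truncation value `weight_cap` are assumptions standing in for unspecified details.

```python
import numpy as np

def _design(avg_sizes, avg_moms, f, g):
    # design matrix [1, f(S), g(M)] corresponding to Eq. (4)
    return np.column_stack([np.ones(len(avg_sizes)),
                            f(np.asarray(avg_sizes, dtype=float)),
                            g(np.asarray(avg_moms, dtype=float))])

def wls1_fit(pctl_vals, avg_sizes, avg_moms,
             f=lambda x: x, g=lambda x: x, weight_cap=1e6):
    """WLS1 sketch: OLS first, then re-estimation with weights equal to
    the truncated inverses of the squared OLS residuals."""
    X = _design(avg_sizes, avg_moms, f, g)
    y = np.asarray(pctl_vals, dtype=float)
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta_ols
    w = np.minimum(1.0 / np.maximum(resid ** 2, 1e-12), weight_cap)
    sw = np.sqrt(w)                       # weighted LS via row scaling
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta                           # (alpha_i, beta_i, gamma_i)

def wls2_fit(pctl_vals, avg_sizes, avg_moms, counts,
             f=lambda x: x, g=lambda x: x):
    """WLS2 sketch: weight each (Size, Momentum) group by its number of
    observations instead of by the OLS residuals."""
    X = _design(avg_sizes, avg_moms, f, g)
    y = np.asarray(pctl_vals, dtype=float)
    sw = np.sqrt(np.asarray(counts, dtype=float))
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

def interp_percentile(p, anchors):
    """Linear interpolation between the five regression-based anchor
    percentiles, valid for 25 <= p <= 75."""
    keys = sorted(anchors)
    return float(np.interp(p, keys, [anchors[k] for k in keys]))
```

One such regression is run independently for each of the five percentiles i, for each benchmark and clusterization type.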
Estimates become more robust due to the weighting. Moreover, based on research, squared residuals are generally highest for large groups with large trade and order sizes. Weighting by the residuals increases the importance of cost percentiles for groups with smaller sizes. This is desirable since executions (and orders) with small (total) trade sizes are in the majority, as pointed out above. Method WLS2 weights the importance of each group in a different way. Instead of weighting by the OLS residuals, WLS2 takes into consideration the amount of observed data in each subdivision and thus weights by the number of observations in each group. The problem with this method is that the number of observations might vary dramatically from group to group according to the data. The approach might yield reasonable results for some scenarios (usually for small trade sizes and momentum values close to zero) but provide poor estimates overall. The present invention has the advantage that it provides more information about the whole peer cost distribution. Moreover, it filters out outliers in a natural way by taking medians (and other percentiles) in each group. However, it should be noted that there is no theoretical justification for how to subdivide the groups optimally, and regressing percentiles on the average size and momentum is only an approximation.
Regression Constraints
Special attention should be paid to the fact that, without assuming any constraints on the regression parameters α_{i}, β_{i} and γ_{i}, i=25, 40, 50, 60 and 75, it could occur that for some pair (S, M),
To avoid such situations, constraints have to be assigned to the regression parameters. The constraints depend on the choice of benchmark and of the function g. Accordingly, there are three restrictions for each scenario, benchmark and clusterization type. The first constraint requires that condition (5) should not hold for (S, M)=(0,0) in any case. In other words, it is assumed that α_{i}≧α_{j} for i>j. The second restriction takes into consideration that the dispersion of costs should increase or decrease as size increases, depending on the benchmark and clusterization type. Precisely, for i>j,
The last constraint depends on the choice of the function g and on the type of benchmark. Typically, it is a technical condition on the parameters γ_{i} that ensures that (5) does not occur. Finally, if any of these constraints is violated, the regression parameters (α_{i}, β_{i}, γ_{i}) are adjusted relative to the median parameters (α_{50}, β_{50}, γ_{50}). This approach guarantees that the medians, as the most important percentile estimates, have no regression constraints and thus remain unaffected by possible adjustments.
Selection of f and g
For each benchmark and clusterization type, several functions f and g were chosen in regression Eq. (4). The linear functions f(x)=x and g(x)=x provide good results for all benchmarks except M_{T}. Performance was measured via the average value of R^{2} for the regressions and the number of adjustments that had to be applied due to the regression constraints. The average R^{2} over all possible scenarios was around 0.55 for the test set, and parameters had to be adjusted in approximately 30% of cases. The methodology had the best performance for benchmark C_{T+20} and executions, with average R^{2}=0.62, and the worst performance for benchmark M_{T} and executions, with average R^{2}=0.45. The good performance for C_{T+20} is assumed to have the following explanation. As already mentioned above, benchmark C_{T+20} is just a measure of general price movement and noise in the 20-day period. From this point of view, empirical cost percentiles for C_{T+20} might depend very little on the underlying trades or orders, and thus the dependence on the momentum and size values of the stocks traded will be weak as well. As a consequence, β_{i} and γ_{i} in Eq. (4) can be set to 0, so that Eq. (4) reduces to X_{i}=α_{i}+ε_{i}.
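The patent does not spell out the exact adjustment rule, only that violating parameters are adjusted relative to the median parameters (α_{50}, β_{50}, γ_{50}). One hedged way to realize the first constraint, α_{i}≧α_{j} for i>j, while leaving the median untouched is a clamp toward the median intercept; this is an illustrative assumption, and the size and momentum constraints (which depend on the benchmark) are omitted:

```python
def adjust_alphas(params):
    """Hedged sketch of the constraint adjustment.  `params` maps each
    percentile i in (25, 40, 50, 60, 75) to its (alpha, beta, gamma).
    Enforces alpha_25 <= alpha_40 <= alpha_50 <= alpha_60 <= alpha_75
    by clamping toward the median, which stays fixed."""
    alphas = {i: params[i][0] for i in (25, 40, 50, 60, 75)}
    prev = alphas[50]
    for i in (60, 75):                 # percentiles above the median
        alphas[i] = max(alphas[i], prev)
        prev = alphas[i]
    prev = alphas[50]
    for i in (40, 25):                 # percentiles below the median
        alphas[i] = min(alphas[i], prev)
        prev = alphas[i]
    return {i: (alphas[i], params[i][1], params[i][2]) for i in params}
```

By construction, the median intercept α_{50} is never changed, consistent with the text's guarantee that the medians remain unaffected by possible adjustments.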
The poor performance of M_{T} can be explained by the completely different behavior of its cost percentiles. The prevailing midquote benchmark is probably the purest benchmark for mimicking the unperturbed price. For small trade sizes, execution prices are naturally bounded by the bid and ask quotes of a stock, and thus, by definition, costs with respect to the prevailing midquotes are bounded as well. As a result, all five cost percentiles must lie very close to each other, which, unfortunately, results in violation of the regression constraints. Through empirical studies, it was determined that the functions
where
in regression Eq. (4) for benchmark M_{T} yield the most satisfactory results. The function f_{2} transforms sizes of less than 2% of ADV into even smaller values. The transformation has the desired effect that percentile cost estimates for small trade sizes do not differ significantly. f_{1} and g model the overall non-linear behavior of X_{i} in the variables S and M, respectively.
Modeling the Tails of Peer Cost Distributions
It is well known that empirical cost distributions are generally asymmetric and heavy-tailed. The asymmetry has been incorporated in the two-step methodology of the present invention by using five independent regression equations for the estimation of the 25th, 40th, 50th, 60th and 75th percentiles. The heavy tails of the peer cost distributions can be modeled by Pareto distributions, which are commonly used in extreme value theory (see, e.g., Embrechts, Klueppelberg and Mikosch [6]). The modeling of the left tail of a peer cost distribution, represented by the distribution function F, is described here; the right tail can be modeled in a similar way. Assuming Pareto-type tail behavior, the left tail of F is modeled as
Condition (i) follows directly from the definition of X_{25} and Eq. (9), condition (ii) guarantees that the peer cost distribution function F is smooth at X_{25}, and condition (iii) assumes that all peer cost distributions must have virtually finite ranges. The function selected in (9) is not assumed ever to equal 0, but condition (iii) makes costs below −10,000 basis points practically impossible. Conditions (i), (ii) and (iii) define the left tail of the distribution function F uniquely, and the percentiles X_{1}, . . . , X_{24} can be derived. Since actual transaction costs are extremely noisy and heavy-tailed, a robust method to build peer group cost distributions is required. The present invention provides a methodology that estimates peer cost percentiles for six different benchmarks, two different clusterization types and all possible choices of scenarios. In the present invention, trading costs can be grouped by the factors Type, Market Capitalization, Side, Market, Size and Short-term Momentum. While the first four factors have discrete values as input, it may be assumed that the factors Size and Momentum can take any values in [0,∞) and [−1,1], respectively. The two-step approach provides smooth and robust estimates for all scenarios corresponding to any values of the numerical factors Size and Momentum. If Size and Momentum are subdivided into discrete groups S_{1}, . . . , S_{m} and M_{1}, . . . , M_{n}; m, n≧1, respectively, the procedure for estimating peer cost distributions remains similar to the continuous case. For any partition (S_{j}, M_{k}), 1≦j≦m and 1≦k≦n, compute the average Size and Momentum (S, M) for the partition and determine the five percentiles X_{25}, . . . , X_{75} by inserting (S, M) into Eq. (4). All other percentile computations are identical to the continuous case. The present invention filters out outliers in a natural way.
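Equation (9) itself is not reproduced in this record, but conditions (i) through (iii) suggest one possible Pareto-type left tail. The following sketch assumes the specific form F(x)=0.25·((x−floor)/(X_{25}−floor))^{a} on [floor, X_{25}] with floor=−10,000 basis points; both this functional form and the choice of matching the interpolated slope at X_{25} are assumptions standing in for the patent's Eq. (9), not a reconstruction of it.

```python
def left_tail_percentiles(x25, x40, floor=-10_000.0):
    """Hedged left-tail model.  Satisfies condition (i): F(x25) = 0.25;
    condition (iii): costs below `floor` are impossible; and a version of
    condition (ii): the density is continuous at x25 with the linearly
    interpolated body between the 25th and 40th percentile anchors.
    Returns the derived percentiles X_1, ..., X_24."""
    slope = (0.40 - 0.25) / (x40 - x25)   # body density just above x25
    a = slope * (x25 - floor) / 0.25      # exponent from smoothness at x25
    # invert F(x) = p/100 for p = 1, ..., 24
    return {p: floor + (x25 - floor) * (p / 25.0) ** (1.0 / a)
            for p in range(1, 25)}
```

The derived percentiles increase monotonically from just above the floor up to X_{25}, giving a heavy but bounded left tail.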
Moreover, in contrast to a simple OLS regression, the two-step approach yields percentile estimates for the whole peer cost distribution. There is, however, no theoretical justification for how to optimally subdivide the Momentum and Size groups in the first step of the methodology, and regressing percentiles on the average Size and Momentum is only an approximation. To measure the performance of the two-step approach for an arbitrary scenario y=(y_{1}, y_{2}, y_{3}, y_{4}, y_{5}) for Market Capitalization, Side, Market, Size and Momentum, one can compare the theoretical distributions with the corresponding empirical peer cost distributions (for y_{4} and y_{5} one can choose intervals [y_{4}−Δy_{4}, y_{4}+Δy_{4}] and [y_{5}−Δy_{5}, y_{5}+Δy_{5}]). Comparing the theoretical with the empirical distributions provides an idea of how well the methodology works. Empirical studies performed by the present inventors have shown that in most cases the estimated peer cost distributions are very close to the actual distributions. Percentile estimates of scenarios with very flat distributions appear to be less reliable. In particular, peer cost estimates for benchmark C_{T+20} might differ significantly from the empirical peer cost characteristics. The presented charts can be viewed as a representative sample for assessing the performance of the two-step approach. The methodology provides consistent cost percentile estimates for the selection of benchmarks, clusterization types and scenarios. By construction, the estimates of the median are the most accurate, while the percentiles for the tails are based on modeling assumptions and, therefore, can potentially differ from the actual percentiles. One could suggest estimating more percentiles in Eq. (4). However, increasing the number of percentiles that are estimated by a regression equation has a significant drawback: the more regressions one adds to Eq. (4), the more adjustments and estimation errors can occur.
It is believed that the current method provides the most accurate percentile estimates around the center of the distribution, as well as good percentile estimates overall. The following documents were referenced throughout the present disclosure by author and [number]. The entire contents of each of the following publications are incorporated herein by reference:
One skilled in the art will understand that the above methodology may be implemented in any number of ways. For example, referring to the drawing figures, tools can be used to collect trade data. For example, ITG markets a product called TCA™ (transaction cost analysis), which can collect and analyze transaction data. This tool may be used to collect transaction data and download the data to PGD database 104. The benchmarks may be calculated in real-time as the transactional data is collected, or the data can be processed later in batch. The data may be separated or organized according to cost factors, such as Size, Type, etc. Periodically, such as once a month or once a week, the two-step statistical analysis described above is performed on the transaction data to generate cost estimates for each institution for each scenario. First, the data is grouped according to size and momentum; second, each percentile (i) is regressed and interpolated using the techniques described above. The data can be presented to a user in any number of ways. Accordingly, processor unit 102 may be appropriately outfitted with software and hardware to perform the processes described above, and configured to communicate with database 104 as necessary. One skilled in the art will understand that the system may be programmed using a number of conventional programming techniques and may be implemented in a number of configurations, including centralized or distributed architectures. Peer investment institutions may access the PGD via a client interface; an exemplary display is shown in the drawing figures. Thus, the present invention has been fully described with reference to the drawing figures. Although the invention has been described based upon these preferred embodiments, it would be apparent to those skilled in the art that certain modifications, variations, and alternative constructions could be made, while remaining within the spirit and scope of the invention.
In order to determine the metes and bounds of the invention, therefore, reference should be made to the appended claims.