Investment Performance Outlier Testing

Sean P. Gilligan, CFA, CPA, CIPM
Managing Partner
January 29, 2020

For any firm that aggregates portfolios of the same strategy into a composite, or otherwise groups portfolios by mandate, how do you know that each portfolio truly follows that strategy? The answer is outlier testing.

Why Utilize Composites?

The GIPS standards require firms managing separate accounts to construct composites, which aggregate all discretionary portfolios of the same strategy. However, even for firms that are not GIPS compliant, the use of composites is considered best practice when reporting investment performance to prospective clients. Composites offer a more complete picture than presenting performance of a model or “representative portfolio” – which usually leaves prospects wondering whether the information is truly representative or if the portfolio presented was “cherry picked.”

When creating and maintaining composites, firms must ensure that portfolios are included in the correct composite for the right time period – the period for which you had full discretion to implement the composite strategy for that portfolio. This is achieved by following a clearly documented set of policies and procedures for composite inclusion and exclusion. However, what happens when changes are made to a portfolio and those changes are not communicated to the person maintaining the composite?

In an ideal world, information in your firm would flow perfectly so that the person maintaining your composites knows exactly what is happening with the firm’s clients. In reality, client requests commonly result in small or temporary changes to the portfolio (e.g., halt trading, raise cash) that are not formally documented in the client’s investment guidelines or investment policy statement.

Without formal documentation of these changes, information may not flow down to the manager of your composites. While these minor or temporary changes may not affect the client’s long-term objectives, they may cause the portfolio to deviate from the strategy, requiring (at least temporary) removal from its composite. When these restricted portfolios are left in the composite, they often become performance outliers and create “noise” in the composite results. This “noise” prevents the composite from providing a meaningful representation of the portfolio manager’s ability to implement the strategy. This will also interfere with your prospective clients’ ability to analyze and interpret your performance results.

Why test for performance outliers?

Testing for performance outliers prior to finalizing and publishing performance results can help your firm remove this “noise” and can prevent costly errors in performance presentations. Firms that lack adequate composite construction policies and controls to ensure the policies are consistently followed often end up with errors in their composite presentations. In fact, it is very likely that errors in your performance exist. It is rare for us at Longs Peak to conduct an outlier analysis where no issues are found. Outlier testing should be completed quarterly and, at a minimum, before any related verification or performance examination.

Many firms, especially those that are GIPS compliant, rely on their verifier to catch errors in their composites. We do not recommend this and suggest firms perform testing internally (or with the help of a performance consultant like Longs Peak) because:

  1. Verifiers only test a sample and will likely not catch all of your issues.
  2. Verification may happen months after the performance has been published. When errors are found, it may require redistribution of presentations with disclosures regarding prior performance errors.
  3. When verifiers find errors, they generally increase their sample size as well as their assessment of engagement risk. These two things lead to more time spent on the verification and a potential increase in your verification fee.

Even if not GIPS compliant, when firms use composites, regulators may test to ensure the composites are a meaningful representation of the strategy. In addition to improving accuracy, testing for performance outliers can help your firm’s composites meet the standards expected by regulators.

How can performance outliers be identified?

Testing for performance outliers involves reviewing the performance of portfolios within the same composite or strategy to test if they are performing similarly. This testing allows you to flag any portfolios that may be performing differently so you can evaluate if their inclusion in the composite is appropriate.

For example, if your firm has a Large Cap Growth composite, testing performance outliers would involve compiling the return data for all of your Large Cap Growth portfolios, identifying which portfolios performed materially differently from their peers, researching why they performed differently, and then taking the appropriate action if an issue is discovered. This may sound like a daunting task, but it doesn’t have to be. Let us walk you through this in more detail.

Some firms simply look at the absolute difference between each portfolio’s monthly return and the monthly return of the composite. While this may be straightforward, relying only on the absolute difference to determine outliers does not take into consideration the size of the return and the normal distribution of portfolio returns in the composite. For example, if you set a threshold to look at all portfolios that deviate from the composite return by 50 bps, the result for a composite with low dispersion and a total return of 2% would be very different from the result for a composite with higher dispersion and a total return of 20%.

In the outlier analysis Longs Peak conducts for clients, we use standard deviation in conjunction with a comparison of the absolute differences to identify the outlier portfolios that require review. Utilizing standard deviation allows us to identify portfolios that are truly outside the normal distribution of returns for each period. For example, reviewing all portfolios that are more than 3 standard deviations from the composite mean will provide the portfolios outside the normal distribution of returns for that period, regardless of the size of the return or the level of dispersion in that composite.
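To make the idea concrete, here is a minimal sketch of a single-month outlier screen. It is not Longs Peak’s actual methodology: the function name `flag_outliers`, the equal-weighted mean of portfolio returns, the use of the population standard deviation, and the rule that a portfolio must breach both the standard deviation threshold and the absolute difference threshold are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def flag_outliers(returns, z_threshold=3.0, abs_threshold=0.005):
    """Flag portfolios far from the mean return for one month.

    returns: dict of portfolio id -> monthly return as a decimal (0.02 = 2%).
    z_threshold: number of standard deviations that triggers a review.
    abs_threshold: minimum absolute difference (0.005 = 50 bps), an assumed
        second condition so tiny differences in tight composites are ignored.
    """
    values = list(returns.values())
    mu = mean(values)       # equal-weighted mean (assumption)
    sigma = pstdev(values)  # population standard deviation (assumption)
    flagged = {}
    for pid, r in returns.items():
        diff = r - mu
        z = diff / sigma if sigma > 0 else 0.0
        if abs(z) > z_threshold and abs(diff) > abs_threshold:
            flagged[pid] = round(z, 2)
    return flagged


# Hypothetical data: 19 portfolios at 2% and one at 8%.
monthly = {f"P{i}": 0.02 for i in range(1, 20)}
monthly["P20"] = 0.08
print(flag_outliers(monthly))
```

One practical caveat with this kind of screen: in a very small composite no portfolio can ever sit more than a few standard deviations from the mean, so a lower threshold or a purely absolute test may be more useful there.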

What to consider when reviewing outlier performance

The severity of the outlier

The larger the outlier, the more likely it is that the portfolio has an issue that would require it to be removed from the composite. We typically start by looking at the most extreme outliers first. Generally, we look at portfolios with performance periods flagged with +/-3 standard deviations from the mean return for the period. By addressing these first (including removing them if it is determined they do not belong in the composite), we are able to re-run the outlier test to assess what outliers exist without these extreme cases disrupting the analysis.

Once these extreme outliers are addressed, we move on to review the portfolios that are +/-2 standard deviations and even +/-1.5 standard deviations, if needed. We keep reviewing accounts with returns closer and closer to the composite’s mean return until we are consistently confirming that the portfolios do in fact belong in the composite and errors are not being found.
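The review order described above can be sketched as a loop that works from the most extreme tier down and re-runs the test after each removal. This is an illustrative sketch only: `tiered_review` and the `belongs` callback are hypothetical names, and the callback merely stands in for the manual research a person would do on each flagged portfolio.

```python
from statistics import mean, pstdev


def z_scores(returns):
    """Standard deviations from the mean for each portfolio in one period."""
    values = list(returns.values())
    if len(values) < 2:
        return {pid: 0.0 for pid in returns}
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return {pid: 0.0 for pid in returns}
    return {pid: (r - mu) / sigma for pid, r in returns.items()}


def tiered_review(returns, belongs, tiers=(3.0, 2.0, 1.5)):
    """Review outliers tier by tier, re-running after every removal.

    belongs: hypothetical callback; returns True when research confirms the
        portfolio should stay in the composite.
    """
    remaining = dict(returns)
    removed = []
    confirmed = set()  # portfolios already researched and kept
    for tier in tiers:
        while True:
            scores = z_scores(remaining)
            to_review = [p for p, z in scores.items()
                         if abs(z) > tier and p not in confirmed]
            if not to_review:
                break  # nothing new at this tier; move to the next one
            for pid in to_review:
                if belongs(pid):
                    confirmed.add(pid)
                else:
                    removed.append(pid)
                    del remaining[pid]
    return remaining, removed


# Hypothetical data: X is an extreme outlier that hides a milder one, Y.
data = {f"P{i}": 0.02 for i in range(1, 19)}
data["Y"] = 0.026
data["X"] = 0.08
kept, dropped = tiered_review(data, belongs=lambda pid: pid != "X")
print(dropped, "Y" in kept)
```

In this made-up data set, Y is well inside the distribution while X is present and only stands out once X has been removed, which is the reason for re-running the test after addressing the extreme cases.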

Each firm will be different in how much they need to drill down to get to a point of comfort that no more errors exist. If your composite is managed strictly to a model, the outliers will be very clear and easy to identify. If each portfolio you manage is customized, more research is often needed to determine if the outlier performance is simply a result of the portfolio’s customization or if the portfolio was included in the wrong composite.

How often the portfolio is an outlier

Longs Peak’s performance outlier reports show a portfolio’s performance, the number of standard deviations it is from the mean each month, and the number of months the portfolio was an outlier throughout its history in that composite. Our reports also show whether there was a cash flow during that period or not. The following are examples of outlier frequencies we evaluate:

Infrequent: If you see that a portfolio is only an outlier for one month and that month had a large cash flow, then you will know that the portfolio is likely only an outlier for that period because of the cash flow and, often, no further research is required.

Frequent: If you can see that the portfolio is an outlier for most of the months under review, then you will know that there is likely an issue with this portfolio.

As of a specific date: If you can see that the portfolio was not an outlier historically, but became a frequent outlier from a certain month forward, this may indicate that a restriction was added or that the strategy changed as of that period. The portfolio may then need to be reclassified to the appropriate composite or flagged as non-discretionary.
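As a rough illustration, the three frequency patterns above could be labeled programmatically from a portfolio’s monthly outlier flags. The function name, the 50% share used for “most of the months,” and the three clean months required before a pattern counts as “as of a specific date” are assumptions chosen for this sketch, not thresholds from our reports.

```python
def classify_outlier_pattern(months, frequent_share=0.5, min_clean_history=3):
    """Label a portfolio's outlier history within one composite.

    months: chronological list of (is_outlier, had_large_cash_flow) tuples.
    frequent_share and min_clean_history are assumed cutoffs.
    """
    flags = [is_outlier for is_outlier, _ in months]
    n_outlier = sum(flags)
    if n_outlier == 0:
        return "not an outlier"
    if n_outlier == 1:
        idx = flags.index(True)
        if months[idx][1]:
            return "infrequent, cash flow"
        return "infrequent, needs review"
    first = flags.index(True)
    tail = flags[first:]
    # Clean history followed by persistent outlier months.
    if first >= min_clean_history and sum(tail) / len(tail) > frequent_share:
        return f"as of month {first + 1}"
    if n_outlier / len(flags) > frequent_share:
        return "frequent"
    return "infrequent, needs review"


clean = (False, False)
print(classify_outlier_pattern([clean] * 5 + [(True, True)] + [clean] * 6))
print(classify_outlier_pattern([(True, False)] * 10 + [clean] * 2))
print(classify_outlier_pattern([clean] * 6 + [(True, False)] * 6))
```

The labels only prioritize research; each flagged portfolio still needs a person to determine the root cause.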

The most common causes of outlier performance and how to address performance outliers

Common causes of outlier performance:

  • Data issues – When outliers are extreme, it is likely that there is an issue with the data. Examples include a pricing issue that caused a material jump in performance or a late dividend hitting a portfolio that is closing and had most of its assets already transferred out. These issues are often easily addressed, depending on the circumstance of each case.
  • Cash flows – If a portfolio is only an outlier for one month and during that month the portfolio experienced a large cash flow, this is likely the reason for the outlier performance. If the portfolio had high cash for a period of time around the cash flow and the market moved during that period, this portfolio likely would perform differently than its fully invested peers. Nothing needs to be done in this scenario since the outlier performance is explained and there is no indication that the portfolio is invested incorrectly or grouped with the wrong portfolios.
  • Legacy positions or other client restrictions – If your clients hold legacy positions that you are restricted from selling or have other similar restrictions, this will likely cause these portfolios to perform differently when compared to their unrestricted peers. Depending on your composite construction rules, unless immaterial, these portfolios likely need to be excluded from the composite. With these portfolios removed, other outliers may appear that were not as noticeable when the restricted portfolios were included. It is important to refer to your firm’s composite construction policies, which should outline clear parameters for when restricted portfolios should be included/excluded in composites.
  • Portfolio categorized incorrectly – A portfolio may appear as an outlier because it was placed in the wrong composite. This often happens if a portfolio’s composite changed and it was not removed from its prior composite. If this is the case, the portfolio must be removed (after the change) and added to the new composite based on the timing outlined in your firm’s composite construction policies.
  • Portfolio managed incorrectly – Performance outlier analysis may help identify a portfolio that is managed to the wrong strategy. For example, it is possible that the portfolio is grouped with the correct portfolios, but the wrong strategy was implemented in the portfolio. This is one of the most important errors that performance outlier testing can identify because it means that the client is actually not having their money managed to the strategy for which your firm was hired. In this case, the portfolio would need to be rebalanced to the correct strategy. Likely, a review of the history would need to be conducted as well to ensure the client was not disadvantaged by the error.
  • High dispersion between portfolio managers – Especially when more than one portfolio manager is implementing the same composite at your firm, material differences may exist in the way they each manage the strategy. Outlier performers may be due to differences in the portfolio managers’ discretionary management. If the composite is being sold as one cohesive product, it is important to identify where the portfolio managers deviate and determine if they can work more closely together to avoid high dispersion or if the strategy should actually be run as two different products.

When researching outlier performance, keep in mind that, on its own, a portfolio’s performance deviating from its peers is not a valid reason to remove the portfolio from its composite. You need to determine the root cause of the deviation and remove the portfolio from its composite only if the root cause was client-driven. If the deviation was caused by tactical, discretionary moves made by the portfolio manager, the portfolio must remain in the composite as its performance is still a representation of the portfolio manager’s implementation of the strategy.

Ready to implement performance outlier testing at your firm?

While it is best practice to create a flow of information that will allow portfolios to proactively be included/excluded in the correct composite at the appropriate time, testing for performance outliers acts as a back-up plan to catch anything that was missed.

If analyzing your composite data to identify performance outliers is not something you have the resources to do internally, Longs Peak is available to help. Longs Peak offers both consulting and reporting services that can assist your firm with outlier analysis. Outlier analysis should be conducted at least quarterly to help ensure your firm is managing your portfolios consistently and is reporting strategy or composite performance that is meaningful and accurate. Please contact us to discuss how we can help implement this practice for your firm.

Questions? 

If you have questions about investment performance, composite construction, or the GIPS standards, we would love to talk to you. Longs Peak’s professionals have extensive experience helping firms with all of their investment performance needs. Please feel free to email Sean Gilligan directly at sean@longspeakadvisory.com.

From Compliance to Growth: How the GIPS® Standards Help Investment Firms Unlock New Opportunities

For many investment managers, the first barrier to growth isn’t performance—it’s proof.
When platforms, consultants, and institutional investors evaluate new strategies, they’re not just asking how well you perform; they’re asking how you measure and present those results.

That’s where the GIPS® standards come in.

More and more investment platforms and allocators now require firms to comply with the GIPS standards before they’ll even review a strategy. For firms seeking to expand their reach—whether through model delivery, SMAs, or institutional channels—GIPS compliance has become a passport to opportunity.

The Opportunity Behind Compliance

Becoming compliant with the GIPS standards is about more than checking a box. It’s about building credibility and transparency in a way that resonates with today’s due diligence standards.

When a firm claims compliance with the GIPS standards, it demonstrates that its performance is calculated and presented according to globally recognized ethical principles—ensuring full disclosure and fair representation. This helps level the playing field for managers of all sizes, giving them a chance to compete where it matters most: on results and consistency.

In short, GIPS compliance doesn’t just make your reporting more accurate—it makes your firm more credible and discoverable.

Turning Complexity Into Clarity

While the benefits are clear, the process can feel overwhelming. Between defining the firm, creating composites, documenting policies and procedures, and maintaining data accuracy—many teams struggle to find the time or expertise to get it right.

That’s where Longs Peak comes in.

We specialize in simplifying the process. Our team helps firms navigate every step—from initial readiness and composite construction to quarterly maintenance and ongoing training—so that compliance becomes a seamless part of operations rather than a burden on them.

As one of our clients put it, “Longs Peak helps us navigate GIPS compliance with ease. They spare us from the time and effort needed to interpret what the requirements mean and let us focus on implementation.”

Real Firms, Real Impact

We’ve seen firsthand how GIPS compliance can transform firms’ growth trajectories.

Take Genter Capital Management, for example. As David Klatt, CFA and his team prepared to expand into model delivery platforms, managing composites in accordance with the GIPS standards became increasingly complex. With Longs Peak’s customized composite maintenance system in place, Genter gained the confidence and operational efficiency they needed to access new platforms and relationships—many of which require firms to be GIPS compliant as a baseline.

Or consider Integris Wealth Management. After years of wanting to formalize their composite reporting, they finally made it happen with our support. As Jenna Reynolds from Integris shared:

“When I joined Integris over seven years ago, we knew we wanted to build out our composite reporting, but the complexity of the process felt overwhelming. Since partnering with Longs Peak in 2022, they’ve been instrumental in driving the project to completion. Our ongoing collaboration continues to be both productive and enjoyable.”

These are just two examples of what happens when compliance meets clarity—firms gain time back, confidence grows, and new business doors open.

Why It Matters—Compliance as a Strategic Advantage

At Longs Peak, we believe compliance with the GIPS standards isn’t a cost—it’s an investment.

By aligning your firm’s performance reporting with the GIPS standards, you gain:

  • Access to platforms and institutions that require GIPS compliant firms.
  • Credibility and trust in an increasingly competitive landscape.
  • Operational efficiency through consistent data and documented processes.
  • Scalability to support multiple strategies and distribution channels.

Simply put: compliance fuels confidence—and confidence drives growth.

Simplifying the Complex

At Longs Peak, we’ve helped over 250 firms and asset owners transform how they calculate, present, and communicate their investment performance. Our goal is simple: make compliance with the GIPS standards practical, transparent, and aligned with your firm’s growth goals.

Because when compliance works efficiently, it doesn’t slow your business down—it helps it reach further.

Ready to turn compliance into a growth advantage?

Let’s talk about how we can help your firm simplify the complex.

📧 hello@longspeakadvisory.com
🌐 www.longspeakadvisory.com

Performance reporting has two common pitfalls: it’s backward-looking, and it often stops at raw returns. A quarterly report might show whether a portfolio beat its benchmark, but it doesn’t always show why or whether the results are sustainable. By layering in risk-adjusted performance measures—and using them in a structured feedback loop—firms can move beyond reporting history to actively improving the future.

Why a Feedback Loop Matters

Clients, boards, and oversight committees want more than historical returns. They want to know whether:

  • performance was delivered consistently,
  • risk was managed responsibly, and
  • the process driving results is repeatable.

A feedback loop helps firms:

  • define expectations up front instead of rationalizing results after the fact,
  • monitor performance relative to objective appraisal measures,
  • diagnose whether results are consistent with the manager’s stated mandate, and
  • adjust course in real time so tomorrow’s outcomes improve.

With the right discipline, performance reporting shifts from a record of the past to a tool for shaping the future.

Step 1: Define the Measures in Advance

A useful feedback loop begins with clear definitions of success. Just as businesses set key performance indicators (KPIs) before evaluating outcomes, portfolio managers should define their performance and risk statistics in advance, along with expectations for how those measures should look if the strategy is working as intended.

One way to make this tangible is by creating a Performance Scorecard. The scorecard sets out pre-determined goals with specific targets for the chosen measures. At the end of the performance period, the manager completes the scorecard by comparing actual outcomes against those targets. This creates a clear, documented record of where the strategy succeeded and where it fell short.

Some of the most effective appraisal measures to include on a scorecard are:

  • Jensen’s Alpha: Did the manager generate returns beyond what would be expected for the level of market risk (beta) taken?
  • Sharpe Ratio: Were returns earned efficiently relative to volatility?
  • Max Drawdown: If the strategy claims downside protection, did the worst loss align with that promise?
  • Up- and Down-Market Capture Ratios: Did the strategy deliver the participation levels in up and down markets that were expected?

By setting these expectations up front in a scorecard, firms create a benchmark for accountability. After the performance period, results can be compared to those preset goals, and any shortfalls can be dissected to understand why they occurred.
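For readers who want to see how these measures are computed, here is a minimal sketch using periodic (for example, monthly) returns expressed as decimals. The function names are illustrative, nothing is annualized, and the capture ratio uses one common convention (compounded portfolio return over the benchmark’s up or down periods divided by the compounded benchmark return over the same periods); other conventions exist.

```python
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free):
    """Mean excess return over the sample standard deviation of excess returns."""
    excess = [r - rf for r, rf in zip(returns, risk_free)]
    return mean(excess) / stdev(excess)


def max_drawdown(returns):
    """Worst peak-to-trough decline of the compounded series (negative number)."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1 + r
        peak = max(peak, wealth)
        worst = min(worst, wealth / peak - 1)
    return worst


def compound(returns):
    """Geometrically link periodic returns into one cumulative return."""
    growth = 1.0
    for r in returns:
        growth *= 1 + r
    return growth - 1


def capture_ratio(returns, benchmark, up=True):
    """Up-market (or down-market) capture; returns None if no such periods."""
    pairs = [(r, b) for r, b in zip(returns, benchmark)
             if (b > 0 if up else b < 0)]
    if not pairs:
        return None
    return compound([r for r, _ in pairs]) / compound([b for _, b in pairs])


def jensens_alpha(returns, benchmark, risk_free):
    """Return (alpha, beta) from excess returns; alpha is per period."""
    ex_p = [r - rf for r, rf in zip(returns, risk_free)]
    ex_b = [b - rf for b, rf in zip(benchmark, risk_free)]
    mean_p, mean_b = mean(ex_p), mean(ex_b)
    cov = sum((p - mean_p) * (b - mean_b) for p, b in zip(ex_p, ex_b))
    var = sum((b - mean_b) ** 2 for b in ex_b)
    beta = cov / var
    return mean_p - beta * mean_b, beta


# Hypothetical monthly data for illustration only.
bench = [0.03, -0.02, 0.04, -0.01]
port = [0.035, -0.01, 0.03, -0.005]
rf = [0.001] * 4
print(sharpe_ratio(port, rf), max_drawdown(port))
print(capture_ratio(port, bench, up=True), capture_ratio(port, bench, up=False))
print(jensens_alpha(port, bench, rf))
```

Whatever conventions you choose (periodicity, annualization, risk-free proxy), document them alongside the scorecard targets so actual results are measured the same way the goals were set.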

Step 2: Create Accountability Through Reflection

This structured comparison between expected vs. actual results is the heart of the feedback loop.

If the Sharpe Ratio is lower than expected, was excess risk taken unintentionally? If the Downside Capture Ratio is higher than promised, did the strategy really offer the protection it claimed?

The key is not just to measure, but to reflect. Managers should ask:

  • Were deviations intentional or unintentional?
  • Were they the result of security selection, risk underestimation, or process drift?
  • Do changes need to be made to avoid repeating the same shortfall next period?

The scorecard provides a simple framework for this reflection, turning appraisal statistics into active learning tools rather than static reporting figures.

Step 3: Monitor, Diagnose, Adjust

With preset measures in place, the loop becomes an ongoing process:

  1. Review results against the expectations that were defined in advance.
  2. Flag deviations using alpha, Sharpe, drawdown, and capture ratios.
  3. Discuss root causes: intentional, structural, or concerning.
  4. Refine the investment process to avoid repeating the same shortcomings.

This approach ensures that managers don’t just record results—they use them to refine their craft. The scorecard becomes the record of this process, creating continuity over multiple periods.
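The review and flag steps can be sketched as a simple comparison of actual results against preset targets. The structure below is hypothetical: the measure names, the target values, and the `at_least` / `at_most` convention are assumptions for illustration, not a prescribed scorecard format.

```python
def review_scorecard(targets, actuals):
    """Compare actual results with targets that were set in advance.

    targets: measure -> (direction, target), where direction is "at_least"
        or "at_most" (an assumed convention for this sketch).
    actuals: measure -> actual value for the period.
    """
    results = {}
    for measure, (direction, target) in targets.items():
        actual = actuals[measure]
        if direction == "at_least":
            met = actual >= target
        elif direction == "at_most":
            met = actual <= target
        else:
            raise ValueError(f"unknown direction: {direction}")
        results[measure] = {"target": target, "actual": actual, "met": met}
    return results


# Hypothetical targets and results.
targets = {
    "jensens_alpha": ("at_least", 0.01),
    "sharpe_ratio": ("at_least", 0.80),
    "max_drawdown": ("at_least", -0.15),   # loss no worse than 15%
    "down_capture": ("at_most", 0.80),
}
actuals = {
    "jensens_alpha": 0.012,
    "sharpe_ratio": 0.65,
    "max_drawdown": -0.12,
    "down_capture": 0.90,
}
for measure, row in review_scorecard(targets, actuals).items():
    status = "met" if row["met"] else "REVIEW"
    print(f"{measure}: target {row['target']}, actual {row['actual']} -> {status}")
```

The flagged rows are only the starting point for the discussion of root causes; the code cannot tell an intentional deviation from process drift.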

Step 4: Apply the Feedback Loop Broadly

When applied consistently, appraisal measures—and the scorecards built around them—support more than internal evaluation. They can be used for:

  • Manager oversight: Boards and trustees see whether results matched stated goals.
  • Incentive design: Bonus structures tied to pre-defined risk-adjusted outcomes.
  • Governance and compliance: Demonstrating accountability with clear, documented processes.

How Longs Peak Can Help

At Longs Peak, we help firms move beyond static reporting by building feedback loops rooted in performance appraisal. We:

  • Define meaningful performance and risk measures tailored to each strategy.
  • Help managers set pre-determined expectations for those measures and build them into a scorecard.
  • Calculate and interpret statistics such as alpha, Sharpe, drawdowns, and capture ratios.
  • Facilitate reflection sessions so results are compared to goals and lessons are turned into process improvements.
  • Provide governance support to ensure documentation and accountability.

The result is a sustainable process that keeps strategies aligned, disciplined, and credible.

Closing Thought

Markets will always fluctuate. But firms that treat performance as a feedback loop, not a static report, build resilience, discipline, and trust.

A well-structured scorecard ensures that performance data isn’t just about yesterday’s story. When used as feedback, it becomes a roadmap for tomorrow.

Need help creating a Performance Scorecard? Reach out if you want us to help you create more accountability today!

When you're responsible for overseeing the performance of an endowment or public pension fund, one of the most critical tools at your disposal is the benchmark. But not just any benchmark: a meaningful one, designed with intention and aligned with your Investment Policy Statement (IPS). Benchmarks aren’t just numbers to report alongside returns; they represent the performance your total fund should have delivered if your strategic targets were passively implemented.

And yet, many asset owners still find themselves working with benchmarks that don’t quite match their objectives: either too generic, too simplified, or misaligned with how the total fund is structured. Let’s walk through how to build more effective benchmarks that reflect your IPS and support better performance oversight.

Start with the Policy: Your IPS Should Guide Benchmark Construction

Your IPS is more than a governance document—it is the road map that sets strategic asset allocation targets for the fund. Whether you're allocating 50% to public equity or 15% to private equity, each target signals an intentional risk/return decision. Your benchmark should be built to evaluate how well each segment of the total fund performed.

The key is to assign a benchmark to each asset class and sub-asset class listed in your IPS. This allows for layered performance analysis—at the individual sub-asset class level (such as large cap public equity), at the broader asset class level (like total public equity), and ultimately rolled up at the Total Fund level. When benchmarks reflect the same weights and structure as the strategic targets in your IPS, you can assess how tactical shifts in weights and active management within each segment are adding or detracting value.

Use Trusted Public Indexes for Liquid Assets

For traditional, liquid assets—like public equities and fixed income—benchmarking is straightforward. Widely recognized indexes like the S&P 500, MSCI ACWI, or Bloomberg U.S. Aggregate Bond Index are generally appropriate and provide a reasonable passive alternative against which to measure active strategies managed using a similar pool of investments as the index.

These benchmarks are also calculated using time-weighted returns (TWR), which strip out the impact of cash flows—ideal for evaluating manager skill. When each component of your total fund has a TWR-based benchmark, they can all be rolled up into a total fund benchmark with consistency and clarity.

Think Beyond the Index for Private Markets

Where benchmarking gets tricky is in illiquid asset classes like private equity, real estate, or private credit. These don’t have public market indexes since they are private market investments, so you need a proxy that still supports a fair evaluation.

Some organizations use a peer group as the benchmark, but another approach is to use an annualized public market index plus a premium. For example, you might use the 7-year annualized return of the Russell 2000 (lagged by 3 months) plus a 3% premium to account for illiquidity and risk.

Using the 7-year average rather than the current period return removes the public market volatility for the period that may not be as relevant for the private market comparison. The 3-month lag is used if your private asset valuations are updated when received rather than posted back to the valuation date. The purpose of the 3% premium (or whatever you decide is appropriate) is to account for the excess return you expect to receive from private investments above public markets to make the liquidity risk worthwhile.
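As a rough sketch of the arithmetic, the hurdle can be computed from a series of monthly index returns. The function name and parameter defaults below are illustrative, and the sketch assumes the premium is simply added to the annualized return; it does not address how you convert the annual hurdle into the periodic benchmark return you report.

```python
def lagged_annualized_plus_premium(monthly_returns, years=7, lag_months=3,
                                   premium=0.03):
    """Annualized index return over a lagged window, plus a fixed premium.

    monthly_returns: chronological monthly index returns as decimals, ending
        with the current reporting month.
    lag_months: most recent months to skip (3 reflects the valuation lag).
    """
    window = years * 12
    end = len(monthly_returns) - lag_months
    start = end - window
    if start < 0:
        raise ValueError("not enough history for the requested window and lag")
    growth = 1.0
    for r in monthly_returns[start:end]:
        growth *= 1 + r
    annualized = growth ** (1 / years) - 1
    return annualized + premium


# Hypothetical index history: 8% a year for 7 years, then three volatile
# months that the 3-month lag excludes from the window.
monthly_8pct = 1.08 ** (1 / 12) - 1
history = [monthly_8pct] * 84 + [0.10, -0.12, 0.07]
print(round(lagged_annualized_plus_premium(history), 4))
```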

By building in this hurdle, you create a reasonable, transparent benchmark that enables your board to ask: Is our private markets portfolio delivering enough excess return to justify the added risk and reduced liquidity?

Roll It All Up: Aggregated Benchmarks for Total Fund Oversight

Once you have individual benchmarks for each segment of the total fund, the next step is to aggregate them—using the strategic asset allocation weights from your IPS—to form a custom blended total fund benchmark.

This approach provides several advantages:

  • You can evaluate performance at both the micro (asset class) and macro (total fund) level.
  • You gain insight into where active management is adding value—and where it isn’t.
  • You ensure alignment between your strategic policy decisions and how performance is being measured.

For example, if your IPS targets 50% to public equities split among large-, mid-, and small-cap stocks, you can create a blended equity benchmark that reflects those sub-asset class allocations, and then roll it up into your total fund benchmark. Rebalancing of the blends should match the rebalancing frequency of the total fund.
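A minimal sketch of the roll-up, assuming the blend is rebalanced to the IPS target weights at the start of every period and that period returns are geometrically linked. The weights, component names, and returns shown are made up for illustration.

```python
def blended_benchmark_return(weights, period_returns):
    """Cumulative return of a blend rebalanced to target weights each period.

    weights: component -> strategic target weight (must sum to 1).
    period_returns: chronological list of dicts, component -> benchmark
        return for that period (decimal).
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("target weights must sum to 1")
    growth = 1.0
    for period in period_returns:
        blended = sum(w * period[name] for name, w in weights.items())
        growth *= 1 + blended
    return growth - 1


# Hypothetical IPS targets and two periods of component benchmark returns.
ips_weights = {"public_equity": 0.50, "fixed_income": 0.35, "private_equity": 0.15}
periods = [
    {"public_equity": 0.040, "fixed_income": 0.010, "private_equity": 0.025},
    {"public_equity": -0.020, "fixed_income": 0.005, "private_equity": 0.025},
]
print(round(blended_benchmark_return(ips_weights, periods), 6))
```

The same function can build a sub-asset class blend first (for example, large-, mid-, and small-cap weights within public equity), whose periodic result then feeds the total fund blend. If your total fund rebalances less often than the return periods, the weights would need to drift between rebalance dates, which this sketch does not model.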

What If There's No Market Benchmark?

In some cases, especially for highly customized or opportunistic strategies like hedge funds, there simply may not be a meaningful market index to use as a benchmark. In these cases, it is important to consider what hurdle would indicate success for this segment of the total fund. Examples of what some asset owners use include:

  • CPI + Premium – a simple inflation-based hurdle
  • Absolute return targets – such as a flat 7% annually
  • Total Fund return for the asset class – not helpful for evaluating the performance of this segment, but still useful for aggregation to create the total fund benchmark

While these aren’t perfect, they still serve an important function: they allow performance to be rolled into a total fund benchmark, even if the asset class itself is difficult to benchmark directly.

The Bottom Line: Better Benchmarks, Better Oversight

For public pension boards and endowment committees, benchmarks are essential for effective fiduciary oversight. A well-designed benchmark framework:

  • Reflects your strategic intent
  • Provides fair, consistent measurement of manager performance
  • Supports clear communication with stakeholders

At Longs Peak Advisory Services, we’ve worked with asset owners around the globe to develop custom benchmarking frameworks that align with their policies and support meaningful performance evaluation. If you’re unsure whether your current benchmarks are doing your IPS justice, we’re here to help you refine them.

Want to dig deeper? Let’s talk about how to tailor a benchmark framework that’s right for your total fund—and your fiduciary responsibilities. Reach out to us today.