TradeAudit Report #TA-2026-0047 - Trend-Following Strategy v1

Validation overview

82

Audit score

This layout works best as a fast orientation layer. It helps the reader see where confidence is strongest and why a strong report can still lead to a paper-trading-first recommendation.

Statistical edge

88%

Walk-forward stability

79%

Parameter robustness

74%

Market-condition survival

83%

Implementation readiness

58%

Decision summary

Current Recommendation

Paper-trade next

This means the validation is strong enough to progress, but the next smart step is a structured paper-trading phase instead of moving straight to live capital.

What This Strategy Is

This example currently behaves as a selective trend-following breakout model rather than an all-weather system that should be expected to stay equally credible across all market conditions.

Where It Fits Best

The strongest fit is in orderly higher-timeframe trend conditions where continuation logic stays clean. It is not framed as a strategy that must remain active in every market phase.

Configuration Choice

One shared profile can still be a reasonable starting point, but we highlight when separate market-specific profiles would be safer than pretending one setup fits every target market with equal credibility.

Immediate Next Move

Keep the configuration fixed and move to a structured paper-trading phase before any live capital is considered. Validation strength does not remove the need for implementation discipline.

Why this market

Benchmark-first starting point

BTC is used here as a benchmark-first starting point because it is one of the most liquid and decision-relevant crypto markets. The goal is not to showcase a random symbol. The goal is to show how we start where decision quality is strongest, then widen only if the strategy earns broader validation.

Validation path chosen

Single benchmark first

This sample shows the first validation step on a benchmark market before widening to more assets. The point is to establish whether the strategy has credible edge in a decision-relevant market before claiming broader portability.

Why not live yet

Historical strength still needs implementation discipline

This example is intentionally framed as paper-trade next because validation strength and live readiness are different questions. Signal handling, execution discipline, and forward confidence still matter after a strong historical result.

Benchmark family

Crypto benchmark

This sample uses a benchmark-first crypto market and a single-benchmark-first validation path before any broader expansion claim is made.

Performance summary - full period 2017-2026

Net return

+3842%

Full period

Profit factor

2.92

Threshold >=1.5

Max drawdown

-44.30%

Threshold <=35.00%

Total trades

38

4.2 per year

Win rate

63.20%

24W / 14L

Avg win

+31.40%

Per trade

Avg loss

-12.10%

R:R 2.60

Sharpe ratio

0.84

Risk-adjusted

Time in market

34%

Avg 47 days/trade

Avg trade dur.

47

Calendar days

Buy & Hold

+6100%

BTC passive benchmark

Walk-forward

5 / 7

Splits passed

How to read large returns

Return here is measured relative to starting capital, so it can exceed 100%. A 100% return means capital doubled. A 300% return means it became four times the starting capital.
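
As a quick check on the arithmetic above, the conversion from a percentage return to a capital multiple can be sketched as follows. The function name is hypothetical and only illustrates the rule "multiple = 1 + return / 100"; the sample values are the ones quoted in this report.

```python
def return_pct_to_multiple(return_pct: float) -> float:
    """Convert a percentage return on starting capital into an ending-capital multiple."""
    return 1.0 + return_pct / 100.0


# 100% doubles capital, 300% quadruples it; 3842 and 6100 are the
# strategy and buy & hold figures from the performance summary.
for pct in (100, 300, 3842, 6100):
    print(f"{pct:+}% -> {return_pct_to_multiple(pct):.2f}x starting capital")
```

Read this way, the +3842% strategy return corresponds to roughly 39 times starting capital, and the +6100% buy & hold figure to 62 times.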

Equity curve - strategy vs buy & hold (2017-2026, indexed to 100)

Underwater equity (drawdown from peak, 2017-2026)

Walk-forward validation - 7 time splits

Profit factor per walk-forward split (threshold: 1.5)

Individual trade P&L - 38 trades - 63.20% win rate, avg winner +31.40% vs avg loser -12.10%

Individual trade P&L distribution (38 trades)

Period	Return	Profit Factor	Max DD	Trades	Sharpe	Verdict
2017-2018	+187%	3.41	-31.2%	6	1.12	Pass
2018-2019	+241%	4.12	-28.4%	5	1.34	Pass
2019-2020	+156%	2.89	-19.7%	6	0.98	Pass
2020-2021	+612%	5.33	-22.1%	7	2.41	Pass
2021-2022	+44%	1.61	-38.9%	5	0.51	Caution
2022-2023	-31%	0.71	-44.3%	5	-0.62	Fail
2023-2026	+389%	3.84	-24.6%	4	1.81	Pass

Walk-forward finding

5 of 7 splits pass (71.4%). The 2022-2023 split fails with PF 0.71, a -44.30% drawdown, and a negative Sharpe ratio. The strategy has a genuine edge in trending conditions but lacks a mechanism to avoid sustained bearish market phases.
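
The pass count can be reproduced from the split table with a small sketch. The verdict rule below is an assumption, not a documented part of the audit: it marks a split Fail when PF is under 1.5, Caution when PF clears 1.5 but drawdown exceeds 35%, and Pass otherwise. That rule happens to match the Verdict column above, but the audit may weigh other factors as well.

```python
PF_MIN = 1.5   # profit factor threshold used throughout this report
DD_MAX = 35.0  # max drawdown threshold, in percent

# (period, profit factor, max drawdown in percent) copied from the split table
splits = [
    ("2017-2018", 3.41, 31.2),
    ("2018-2019", 4.12, 28.4),
    ("2019-2020", 2.89, 19.7),
    ("2020-2021", 5.33, 22.1),
    ("2021-2022", 1.61, 38.9),
    ("2022-2023", 0.71, 44.3),
    ("2023-2026", 3.84, 24.6),
]


def verdict(pf: float, max_dd_pct: float) -> str:
    """Assumed rule: PF below threshold fails; a drawdown breach alone is a caution."""
    if pf < PF_MIN:
        return "Fail"
    if max_dd_pct > DD_MAX:
        return "Caution"
    return "Pass"


verdicts = {period: verdict(pf, dd) for period, pf, dd in splits}
passed = sum(v == "Pass" for v in verdicts.values())
pass_rate = passed / len(splits)
print(f"{passed} of {len(splits)} splits pass ({pass_rate:.1%})")
```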

Parameter sensitivity - +/-20% variation

Parameter	Base	-20%	+20%	PF range	Stability
ATR period	10	8 -> PF 2.71	12 -> PF 3.04	2.71-3.04	Stable
Multiplier	3.0	2.4 -> PF 2.38	3.6 -> PF 3.11	2.38-3.11	Stable

Sensitivity finding - positive

Both parameters are stable across +/-20% variation. No overfitting detected through parameter sensitivity.

Bear market stress tests - 7 hostile events

2018 Crypto Winter

Jan -> Dec 2018

Survived

4 trades - -30.8% - BTC -83%

2020 COVID Crash

Feb -> Apr 2020

Profitable

1 trade - +17.4% - caught recovery

2021 May Crash

May -> Jul 2021

Avoided

0 trades - no exposure

2021 H2 Chop

Nov 2021 -> Jan 2022

Avoided

0 trades - no exposure

2022 LUNA Collapse

May -> Jun 2022

Avoided

0 trades - no exposure

2022 FTX Collapse

Nov 2022

Avoided

0 trades - no exposure

2022 Full Bear Year

Jan -> Dec 2022

Failed

5 trades - -44.30% - all losses

Bear market finding - critical

Failed the 2022 full bear year: -44.30% drawdown, 5 consecutive losses. The trend-following signal generates bullish flips during relief rallies within macro downtrends. Add a 200-day EMA filter to block entries when macro trend is bearish.

Cross-timeframe robustness

Timeframe	Return	PF	Max DD	Trades/yr	Sharpe	Verdict
1D (submitted)	+3842%	2.92	-44.3%	4.2	0.84	Caution
4H	+2104%	2.41	-51.2%	14.8	0.67	Fail
1H	+487%	1.63	-68.4%	42.1	0.31	Fail

Statistical significance - Monte Carlo (1,000 simulations)

Observed return

+3842%

Simulation median

+3791%

p-value

0.47

Significance

Not significant

Monte Carlo finding

With only 38 trades over the full period, the Monte Carlo p-value of 0.47 means 47% of randomly shuffled sequences match or beat the observed return, far above the 0.05 level needed for statistical significance. The full-period metrics point to an edge (PF 2.92, 63.20% win rate), but the low trade count limits statistical confirmation. This is a known limitation of low-frequency trend strategies - not a disqualifying finding, but a risk to note.

Recommendations

1 - Add 200-day EMA macro filter (critical)

Add: only enter long trades when close is above the 200-period EMA. This single change would have blocked all 5 losing trades in the 2022 bear year, converting a -44.30% period to flat. Estimated improvement: max DD drops to ~25-28%, walk-forward pass rate increases to 7/7. Re-submit for re-audit at €49.
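
A minimal sketch of such a filter, in plain Python, is shown below. It assumes the standard EMA smoothing factor 2 / (period + 1) and seeds the EMA with the simple average of the first `period` closes; the seeding choice and the function names are illustrative assumptions, and the strategy's own platform may compute the EMA slightly differently.

```python
from typing import List, Optional, Sequence


def ema(closes: Sequence[float], period: int) -> List[Optional[float]]:
    """EMA per bar; None until `period` closes are available (SMA-seeded)."""
    if period <= 0:
        raise ValueError("period must be positive")
    out: List[Optional[float]] = [None] * len(closes)
    if len(closes) < period:
        return out
    alpha = 2.0 / (period + 1)
    value = sum(closes[:period]) / period  # seed with a simple average
    out[period - 1] = value
    for i in range(period, len(closes)):
        value = alpha * closes[i] + (1 - alpha) * value
        out[i] = value
    return out


def allow_long(closes: Sequence[float], period: int = 200) -> List[bool]:
    """True on bars where the close is above the EMA; False while the EMA is undefined."""
    return [e is not None and c > e for c, e in zip(closes, ema(closes, period))]


# Toy series with a short period so the behaviour is visible by hand.
closes = [10.0, 11.0, 12.0, 13.0, 9.0, 8.0]
print(allow_long(closes, period=3))
```

In the toy series the gate opens once price is above its average and closes again when price drops below it, which is the behaviour the recommendation relies on to block entries during relief rallies in a macro downtrend.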

2 - ATR-based position sizing (recommended)

Scale position size inversely to current ATR. Smaller sizing during high-volatility market phases reduces portfolio-level drawdown without changing signal quality. Estimated improvement on 2018 period: DD from -30.8% to ~18-22%.
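
One common way to express inverse-ATR sizing is to fix the fraction of equity risked per trade and divide by a volatility-based stop distance. The sketch below assumes that approach; `risk_fraction` and `atr_multiple` are hypothetical parameters chosen for illustration, not values taken from the audited strategy.

```python
def position_size(equity: float, risk_fraction: float, atr: float,
                  atr_multiple: float = 3.0) -> float:
    """Units to hold so an adverse move of atr_multiple * ATR costs about
    risk_fraction of equity. Size falls as ATR rises."""
    if atr <= 0:
        raise ValueError("atr must be positive")
    return (equity * risk_fraction) / (atr * atr_multiple)


# Same account and risk budget, two volatility regimes (illustrative numbers).
calm = position_size(equity=10_000, risk_fraction=0.01, atr=500)
stressed = position_size(equity=10_000, risk_fraction=0.01, atr=1_000)
print(f"calm: {calm:.4f} units, stressed: {stressed:.4f} units")
```

Doubling ATR halves the position, so exposure shrinks automatically in high-volatility phases while the entry and exit signals stay untouched.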

3 - Do not deploy in current form

The strategy failed the 2022 bear year, and its -44.30% drawdown breaches the deployment threshold of <=35.00%. Apply the 200 EMA macro filter and re-submit. Expected verdict after fix: PASS.

Strategy interpretation

Strategy Behavior

This example strategy currently behaves more like a selective trend-following breakout model than a broad all-weather system. The edge appears when market structure is orderly enough for continuation logic to remain credible.

Cross-Market Fit

The same core logic can remain viable across several assets without behaving identically on all of them. We use this section to explain when one market looks cleaner than another and why future refinements should be tested market by market instead of assumed universal.

Configuration Recommendation

When the evidence suggests that one shared configuration is no longer the most honest answer across all target markets, the report can recommend separate market-specific profiles rather than forcing one setup to serve every asset with equal credibility.

Understanding your validation results

Each test in this report answers a different question about your strategy. This guide explains what each test measures, what the numbers mean, and how to use the results to improve your strategy before deploying real capital.

01 Performance Summary - Foundation

What it measures: The overall performance of your strategy across the full historical dataset - return, drawdown, profit factor, win rate, trade count, and time in market.

Why it matters: This is the starting point. If the full-period numbers look strong but the other tests don't confirm them, the strategy is likely curve-fitted to history. If the full-period numbers are weak, no further validation is needed.

How to interpret

Profit Factor >= 1.5 - the minimum threshold for a tradeable edge. PF of 2.0+ is strong. PF of 3.0+ is excellent.
Max Drawdown <= 35% - the maximum loss from peak to trough. Above 35% means most traders would abandon the strategy mid-drawdown.
Total Trades >= 8 - fewer than 8 trades means the results aren't statistically meaningful.
Time in market - how long you're actually exposed. Lower is better if returns are similar - less time at risk.
Sharpe ratio >= 1.0 - risk-adjusted return. Below 0.5 means returns don't justify the volatility taken.
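
The core metrics above can be sketched in a few lines of Python. This is an illustration under stated assumptions: profit factor is computed from per-trade percentage returns rather than currency P&L, drawdown is measured on closed-trade equity only (intra-trade swings are ignored), and the trade list is made-up toy data.

```python
from typing import Sequence


def profit_factor(trade_returns: Sequence[float]) -> float:
    """Gross profit divided by gross loss; infinite if there are no losing trades."""
    gross_profit = sum(r for r in trade_returns if r > 0)
    gross_loss = -sum(r for r in trade_returns if r < 0)
    return float("inf") if gross_loss == 0 else gross_profit / gross_loss


def win_rate(trade_returns: Sequence[float]) -> float:
    """Share of trades that closed with a positive return."""
    return sum(1 for r in trade_returns if r > 0) / len(trade_returns)


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a negative fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = min(worst, value / peak - 1.0)
    return worst


# Hypothetical trade returns (fractions), compounded into an equity curve.
trades = [0.30, -0.10, 0.20, -0.05, 0.15]
equity = [100.0]
for r in trades:
    equity.append(equity[-1] * (1 + r))

print(f"PF {profit_factor(trades):.2f}, win rate {win_rate(trades):.0%}, "
      f"max DD {max_drawdown(equity):.1%}")
```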

02 Walk-Forward Validation - Most important test

What it measures: Whether your strategy works on data it has never seen before. The full dataset is divided into 7 sequential time windows. The strategy is tested on each window independently - not on the full period at once.

Why it matters: A strategy that only looks profitable because it was optimised on historical data will fail walk-forward validation. It cannot fake results on unseen data. This is the closest simulation to what will happen in live trading.

How to interpret

5+ of 7 splits passing means the strategy has genuine, repeatable edge across different market conditions.
A failing split tells you exactly which market condition breaks the strategy - use that to identify what fix is needed.
All 7 passing is the gold standard. Combined with other tests, this is what makes a strategy PASS-ready.
Consecutive failing splits (e.g. 2022 and 2023 both fail) suggest market-condition dependency - the strategy only works in certain market types.

03 Parameter Sensitivity (+/-20%) - Overfitting check

What it measures: What happens to performance when each parameter is varied by +/-20% from its default value - one at a time. For example, if your ATR period is 10, we test 8 and 12.

Why it matters: A strategy that only works at exactly one specific parameter value was probably curve-fitted to that value. A robust strategy maintains its edge when parameters are slightly adjusted - because the underlying logic is sound, not the exact numbers.

How to interpret

STABLE - the profit factor stays above 1.5 across all variants. The edge is robust to parameter choice.
REJECT - one or more variants drop below 1.5, or below 1.0 (a net-losing result). The strategy is fragile at this parameter.
A "banana shape" PF curve (one peak, drops sharply either side) is the classic overfitting signature - avoid deploying.
If all parameters are STABLE, you have good evidence the strategy edge comes from the logic, not lucky parameter selection.
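
The STABLE / REJECT rule can be sketched against the sensitivity table from this report. The helper names are hypothetical, and the check below only covers the PF threshold; it does not try to detect the "banana shape" described above, which would need more than two variants per parameter.

```python
from typing import Dict, Tuple

PF_MIN = 1.5  # profit factor threshold used throughout this report


def variants(base: float, pct: float = 0.20) -> Tuple[float, float]:
    """Return the -20% and +20% variants of a base parameter value."""
    return round(base * (1 - pct), 2), round(base * (1 + pct), 2)


def stability(pf_by_variant: Dict[str, float]) -> str:
    """STABLE if every tested variant keeps PF at or above the threshold."""
    return "STABLE" if min(pf_by_variant.values()) >= PF_MIN else "REJECT"


# PF values from the sensitivity table; the base PF is the full-period 2.92.
sensitivity = {
    "ATR period": {"-20%": 2.71, "base": 2.92, "+20%": 3.04},
    "Multiplier": {"-20%": 2.38, "base": 2.92, "+20%": 3.11},
}

for name, pf_values in sensitivity.items():
    print(name, stability(pf_values))
```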

04 Bear Market Stress Tests - Survival check

What it measures: How your strategy performs during the 7 most hostile named market events in crypto history - 2018 crash, COVID, LUNA collapse, FTX fraud, and more. Each period is tested in isolation.

Why it matters: A strategy that only shows strong results because of the 2020-2021 bull run is dangerous to deploy. Most bear markets and crashes are preceded by exactly the type of signal a trend strategy generates. This test determines if your strategy survives when conditions turn hostile.

How to interpret

AVOIDED - 0 trades. Best case. The strategy correctly identified a non-tradeable environment and stayed flat.
PROFITABLE - took trades and made money despite hostile conditions. Exceptional.
SURVIVED - took trades with drawdown under 35%. Acceptable.
FAILED - drawdown exceeded 35%, or multiple consecutive losses. This event is the primary risk to live deployment.
If FAILED: identify which condition triggered entries during the crash and add a filter to block it (typically a macro trend filter).
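
The four labels can be expressed as a simple decision rule. The sketch below is a simplification under stated assumptions: it treats the single percentage reported per event as both the result and the drawdown, and it leaves out the "multiple consecutive losses" criterion. The event data is copied from the stress-test section of this report.

```python
DD_MAX = 35.0  # drawdown threshold in percent


def classify_event(trades: int, result_pct: float) -> str:
    """Assumed rule: no trades -> AVOIDED; gain -> PROFITABLE;
    loss under the threshold -> SURVIVED; otherwise FAILED."""
    if trades == 0:
        return "AVOIDED"
    if result_pct > 0:
        return "PROFITABLE"
    if abs(result_pct) < DD_MAX:
        return "SURVIVED"
    return "FAILED"


# (event, trades taken, result in percent)
events = [
    ("2018 Crypto Winter", 4, -30.8),
    ("2020 COVID Crash", 1, 17.4),
    ("2021 May Crash", 0, 0.0),
    ("2021 H2 Chop", 0, 0.0),
    ("2022 LUNA Collapse", 0, 0.0),
    ("2022 FTX Collapse", 0, 0.0),
    ("2022 Full Bear Year", 5, -44.3),
]

labels = [classify_event(trades, result) for _, trades, result in events]
for (name, _, _), label in zip(events, labels):
    print(f"{name}: {label}")
```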

05 Monte Carlo - Statistical Significance - Confidence check

What it measures: Whether your strategy's results could have happened by chance. We run 1,000 simulations where your exact trades are shuffled into random order. If most random sequences produce similar returns, the edge may be luck rather than skill.

Why it matters: With only 10-20 trades, it's possible to look profitable purely by luck. A strategy with 100+ trades and a p-value below 0.05 shows statistically significant evidence of an edge. A strategy with 12 trades and p-value of 0.40 may simply have been lucky on a few large winners.

How to interpret

p-value < 0.05 - statistically significant. Less than 5% of random sequences beat your strategy. Strong confidence.
p-value 0.05-0.20 - borderline. Some confidence, but not conclusive. Monitor live carefully.
p-value > 0.20 - not statistically significant. The results may be luck. Needs more trades before deployment.
The fix: more trades give the test more statistical power, which tends to lower the p-value when the edge is real. Run on more assets, use a shorter timeframe, or extend the test period.
Note: low-frequency strategies (1D, swing) will almost always show borderline p-values due to low trade count. This is expected - weight this finding accordingly.
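
A simulation of this kind can be sketched as below. One caveat when reproducing it: under full compounding, reordering the same set of trades leaves the final return unchanged, so a pure order shuffle cannot produce a spread of final returns. The sketch therefore assumes resampling trades with replacement (a bootstrap) as a stand-in; this is an assumption, not necessarily the procedure used for this audit, and the trade list is made-up toy data.

```python
import math
import random
from typing import Sequence


def compounded_return(trade_returns: Sequence[float]) -> float:
    """Total return from compounding each trade's fractional return in sequence."""
    return math.prod(1.0 + r for r in trade_returns) - 1.0


def bootstrap_p_value(trade_returns: Sequence[float], n_sims: int = 1000,
                      seed: int = 0) -> float:
    """Share of resampled trade sequences that match or beat the observed return."""
    rng = random.Random(seed)
    observed = compounded_return(trade_returns)
    n = len(trade_returns)
    hits = 0
    for _ in range(n_sims):
        sample = rng.choices(trade_returns, k=n)  # draw n trades with replacement
        if compounded_return(sample) >= observed:
            hits += 1
    return hits / n_sims


# Hypothetical per-trade returns (fractions), not the audited strategy's trades.
trades = [0.30, -0.10, 0.20, -0.05, 0.15, -0.08, 0.25, -0.12]

# Reordering alone does not change the compounded result.
shuffled = trades[:]
random.Random(1).shuffle(shuffled)
print(math.isclose(compounded_return(trades), compounded_return(shuffled)))

print(f"share of simulations matching or beating observed: {bootstrap_p_value(trades):.2f}")
```

With few trades the resampled returns spread widely around the observed figure, which is why low trade counts make this check hard to pass.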

06 Cross-Timeframe Robustness - Robustness check

What it measures: Whether the same strategy logic produces positive results on other timeframes (4H, 1H, 15m). A genuinely robust edge should be visible on at least 2 timeframes.

Why it matters: If a strategy only works on exactly one timeframe (e.g. 1D) but fails on 4H and 1H, the edge may be an artefact of that specific data granularity rather than a real market phenomenon.

How to interpret

Passes 2+ timeframes - strong indicator of genuine, durable edge.
Only passes 1 timeframe - not disqualifying for daily strategies (1D is structurally different from intraday), but note the limitation.
Fails all other timeframes - treat as a warning. Deploy on the tested timeframe only, with extra caution.
Intraday strategies should pass at least 2 intraday TFs (e.g. 1H strategy should also work on 4H) to be considered robust.

Trend-Following Strategy v1 - BTC Market Index - Daily (1D)

Strategy has real edge but fails 2022 bear market stress test
