This is a practical roadmap to guide you when developing a system: https://www.prorealcode.com/wp-content/uploads/2016/09/System_Development_Process_MindMap.pdf
The number one reason historically optimized systems fail is over-optimization (curve fitting). Techniques to avoid this include simple out-of-sample (OOS) testing and walk-forward testing. In the first, you optimize your strategy on one half of the data, then test it on the other, unseen half. A viable strategy will produce decent results on both halves.
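A minimal sketch of the out-of-sample split, assuming your backtester can be pointed at an arbitrary slice of price data (the `prices` list and 50/50 ratio here are just illustrative):

```python
# Hypothetical sketch: split a price series into an in-sample half
# (used for optimization) and an out-of-sample half (used only for
# the final verification run). Names and ratio are assumptions.

def split_in_out_of_sample(prices, ratio=0.5):
    """Return (in_sample, out_of_sample) portions of a price series."""
    cut = int(len(prices) * ratio)
    return prices[:cut], prices[cut:]

prices = list(range(100))  # stand-in for real price data
in_sample, out_of_sample = split_in_out_of_sample(prices)
# Optimize parameters on `in_sample` only, then run the frozen
# strategy once on `out_of_sample` and compare the two results.
```

Walk-forward testing repeats this idea over rolling windows: optimize on window N, verify on window N+1, slide forward, and aggregate the verification results.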
There is a library full of strategies. They all show historic gains, but they vary in reliability. The really viable strategies are not going to be shared with anyone, because they are too valuable. If you want to know why YOUR systems fail, you would need to post one of them so we can find out where you can improve.
If you use, for example, four different indicators in a single system, that may be too many. Focus on using one or two indicators as filters, and make the entry depend on some price action event. You may also not be accounting for the spread and overnight fees correctly in your tests. For exits, try a stop loss that is not fixed but set by the average true range (ATR).
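A minimal sketch of an ATR-based stop, assuming bars are `(high, low, close)` tuples and using a simple average of true ranges (the 14-bar period and 2× multiplier are common defaults, not requirements):

```python
# Hypothetical sketch: place a stop loss a multiple of the Average
# True Range away from the entry price, so the stop widens in
# volatile markets and tightens in quiet ones.

def true_range(high, low, prev_close):
    """Largest of the bar's range and the gaps from the prior close."""
    return max(high - low, abs(high - prev_close), abs(low - prev_close))

def atr(bars, period=14):
    """Simple average of the last `period` true ranges.

    `bars` is a list of (high, low, close) tuples, oldest first.
    """
    trs = [true_range(h, l, bars[i][2])  # bars[i] is the previous bar
           for i, (h, l, c) in enumerate(bars[1:])]
    window = trs[-period:]
    return sum(window) / len(window)

def atr_stop(entry_price, bars, multiplier=2.0, long=True):
    """Stop level `multiplier` ATRs below (long) or above (short) entry."""
    distance = multiplier * atr(bars)
    return entry_price - distance if long else entry_price + distance
```

The same `atr_stop` call can be re-evaluated on each new bar to produce a trailing stop instead of a static one.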
In addition to that, what time frames do you use? It is much easier to develop successful systems on higher time frames. On the other hand, the higher you go, the more fundamental events come into play. Are you in the sweet spot in between?
When you describe your tests as having ‘decent gains’, do you look only at the gains (which do not matter much as long as they are positive), or also at the maximum drawdown, number of trades, etc.? How much drawdown in % is acceptable to you, and how many simulated trades is your minimum before you trust a system’s performance? I aim for a drawdown of less than 10 % and a minimum of 500 trades.
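Maximum drawdown is simple to compute from an equity curve; a minimal sketch (the example curve is made up):

```python
# Hypothetical sketch: worst peak-to-trough decline, in percent,
# over an equity curve (list of account values over time).

def max_drawdown_pct(equity):
    """Largest percentage drop from any running peak to a later value."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak * 100)
    return worst

equity = [100, 120, 90, 130, 110]
# peak 120 down to 90 is the worst decline: a 25 % drawdown
```

Judging a system by this figure (and by the number of trades behind it) is far more robust than judging it by the final gain alone.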
Do your systems trade both short and long positions? If not, the results may be skewed by a bullish or bearish trend in the underlying instrument over the test period.