![]() ![]() ![]() They can approximate any reasonable function to perfection. Neural networks are more flexible than tree-based methods. A prominent example is the M4 competition where the winning algorithm heavily relies on neural networks. Especially because of their capacity to flex themselves in order to capture complex underlying data generating process. That said, neural networks are fashionable nevertheless. This lack of robustness makes it tedious to promise the same out-of-sample performance as the one observed for the validation set. Neural networks are also quite sensitive to feature-scaling, there is a large amount of hyperparameters to tweak, and the tweaking of many hyperparamers substantially alters expected forecasting performance. There is no way to sugar-coat it, neural networks are way much more sensitive (less robust) to hyperparameters than tree-based methods. Mature in the sense that proper default values for hyperparameters are established, and also that they are easy to configure.Īnother second point is something you may already know: “The difficulty of tuning these models makes published results difficult to reproduce and extend, and makes even the original investigation of such methods more of an art than a science.” (quoted from Algorithms for Hyper-Parameter Optimization). The authors make the point, and I agree, that one important difference between tree-based methods and neural network is the available at-the-ready software tree-based methods enjoy more mature publicly available software. The winning algorithm in the M5 competition was a random forest variant. “Our central observation is that tree-based methods can effectively and robustly be used as blackbox learners” Paper 2 (by the way it’s open access so you can download the published version) is basically some shared thoughts from experienced practitioners with regards to the M5 competition – forecasting 42,840 time series. Of course neural networks are no slouches and do appear in the top 10 performers in this exercise. Still the takeaway stands: random forest is likely to do very well. ![]() Sure, on some data (such as data 41 and 70, see the downward spikes) it performs much worse when compared to the best classifier for that data. For almost all data the accuracy achieved by the random forest algorithm is fairly close to the best, or it is itself the best. The red is the accuracy obtained by the random forest algorithm (parallel version). The blue line is the accuracy obtained by the best classifier. “the best of which are implemented in R and accessed via caret.”īelow is Figure 2 (right) from the paper which I find relevant to show here: The x-axis has the different data on it (so 121 data sets). “The classifiers most likely to be the bests are the random forest versions” Paper 1 compares the performance of 17 algorithm families – each family with numerous variants totaling 179 classifiers, across 121 data sets. Here are couple of excerpts from the two papers. I was pleasantly surprised to observe the synergy between the two papers, given they are published almost a decade apart. Paper 1: Do we Need Hundreds of Classifiers to Solve Real World.Over Christmas I read the following two papers long-resting in my “to-read” list: Those references help form an opinion with regards to when one should use neural networks and when tree-based methods are preferable, if you don’t have time to implement both (which is usually the case). Because both neural networks and tree-based methods are able to capture non-linearity in the data, it’s not easy to choose between them.Two academic references lauding the powerful performance of tree-based methods.These are ultra flexible algorithms with impressive forecasting performance even (and especially) in highly complex real-life environments. Another machine learning community darling is the deep learning method, particularly neural networks. They are easy to use and provide good forecasting performance off the cuff more or less. Tree-based methods like decision trees and their powerful random forest extensions are one of the most widely used machine learning algorithms. ![]()
0 Comments
Leave a Reply. |