Nikolay Nikolaev, Goldsmiths College, University of London

Hitoshi Iba, The University of Tokyo

Polynomial Neural Networks

PNN

Polynomial Neural Networks

Polynomial neural networks (PNN) are multilayer perceptrons of neuron-like units which produce high-order multivariate polynomial mappings. These are tree-structured hierarchical cascades of first-order and second-order activation polynomials in the nodes, and input variables passed from the leaves. The activation polynomial outcomes are fed forward to their parent nodes, where partial polynomial models are made. This PNN topology follows the construction of the multilayer GMDH [Ivakhnenko,A.G., 1979] and allows to produce high-order multivariate polynomials by composing tractable activation polynomials in the hidden network nodes.

From a theoretical point of view, polynomial neural networks are attractive for nonlinear modelling because they are:

-universal approximators with which one may approximate any continuous function mapping to an arbitrary precision;

-mathematically tractable since they assume manipulations, like decompositions and reformulations, which make them flexible for adaptive structural identification;

- statistically tractable as they can be directly analysed with the available statistical tools.

From a practical point of view, polynomial neural networks are:

-computationally efficient, for they allow parameter estimation by well known algorithms like least squares;

-open-box models, which are easy to understand and interpret.

The process of learning PNN from data is decomposed in two separate steps: 1) finding the network structure by global population-based search using inductive genetic programming (IGP), and 2) further adjustment of the polynomial weights by retraining using high-order backpropagation algorithms. These two steps make a coherent and integrated methodology for polynomial identification that enables to identify accurate, parsimonious, and predictive models.

Experimental results show that PNN exhibit superior accuracy than many global models such as: statistical learning networks, multilayer perceptrons, and recurrent neural networks on multivariate non-linear regression, time-series forecasting and classification problems [Nikolaev,N. and Iba,H., 2002].

PNN Models

Polynomial Neural Network Models

The PNN models inherit the format of discrete Volterra series and possess universal approximation abilities. Their approximation properties can be explained using the generalized Stone-Weierstrass theorem and the Kolmogorov-Lorentz superposition theorem.

There have been developed several groups of PNN models:

-high-order multivariate polynomials: including block polynomials and horizontally expanded polynomials;

-orthogonal polynomials: including polynomials of orthogonal terms, Chebishev polynomials;

-trigonometric polynomials: using harmonics with non-multiple frequencies;

-rational polynomials: including polynomial fractions and sigmoidal power series;

-local basis polynomials: including radial-basis polynomials and piecewise polynomials;

-fuzzy polynomials: using various membership functions;

-dynamic polynomials: including time-lagged, NARMA polynomials and recurrent PNN.

BP

Backpropagation Re-training of PNN

The PNN weights during evolutionary IGP search for the best network model are estimated rapidly by ordinary least squares fitting starting from the lowest layer with input leaves moving up to the tree root output node. Additional weight improvement is performed by further training using high-order backpropagation (BP) techniques especially derived for networks with polynomial activation functions. There are developed several re-training algorithms for PNN [Nikolaev,N. and Iba,H., 2002]: batch BP, incremental BP, second-order BP algorithms using the Newton's method and the Levenberg-Marquardt method, temporal BP algorithms like Real-Time Backpropagation Through Time and Epochwise Backpropagation Through Time.

The generalization performance when learning PNN is controled by methods for overfitting avoidance, such as: global regularization, local regularization, pruning by weight subset selection, and second-order pruning methods like optimal brain damage and optimal brain surgeon.

References

Ivakhnenko,A.G. (1971). Polynomial Theory of Complex Systems. IEEE Trans. on Systems, Man, and Cybernetics. 1 (4): 364-378.

Nikolaev,N. and Iba,H. (2003). Learning Polynomial Feedforward Networks by Genetic Programming and Backpropagation. IEEE Trans. on Neural Networks, vol.14, N:2, pp.337-350.


Relevant Learning Network Sites:


n.nikolaev@gold.ac.uk