US RE37488 E1 Abstract A heuristic processor incorporates a digital arithmetic unit arranged to compute the squared norm of each member of a training data set with respect to each member of a set of centers, and to transform the squared norms in accordance with a nonlinear function to produce training
φ vectors. A systolic array arranged for QR decomposition and least mean squares processing forms combinations of the elements of each φ vector to provide a fit to corresponding training answers. The form of combination is then employed with like-transformed test data to provide estimates of unknown results. The processor is applicable to provide estimated results for problems which are nonlinear and for which explicit mathematical formalisms are unknown. Claims (33) 1. An heuristic processor comprised of:
(1) non-linear transforming means for producing a respective training
φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a non-linear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) processing means for combining training
φ vector elements in a manner producing a training fit to a set of training answers, and (3) means for generating result estimate values, each of said estimate values consisting of a combination of the elements of a respective
φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit, wherein the transforming means is a digital arithmetic unit for computing differences between training data vector elements and corresponding center vector elements, for summing the squares of such differences associated with each data vector-center vector pair, for converting each sum to a value in accordance with the non-linear transformation, and for providing a respective training
φ vector element, wherein the processing means is a systolic array of processing cells for implementing a rotation algorithm to provide QR decomposition of a Φ matrix of φ vector rows and least squares fitting to the training answer set, the algorithm involving computation and application of rotation parameters and storage of updated decomposition matrix elements by the processing cells, and wherein the systolic array has a first row of processing cells arranged to receive φ vectors extended by training answers, each first row cell being arranged for input of a respective element of each extended vector. 2. A processor according to claim
1 wherein the processing cells are boundary and internal cells connected to form rows and columns of the systolic array and: (1) each row begins with a boundary cell and continues with internal cells which diminish in number down the array by one per row,
(2) the first array row contains a number of boundary and internal cells equal to the number of elements in an extended vector,
(3) the columns comprise a first column containing a boundary cell only, subsequent columns containing a respective boundary cell surmounted by numbers of internal cells increasing from one by one per column, and at least one outer column of internal cells arranged to receive training answer input,
(4) the boundary and internal cells are arranged to compute rotation parameters from input values and apply them to input values respectively, and to store respective updated decomposition matrix elements for use in such computation, and
(5) the cells have row and column nearest neighbour connections providing for rotation parameters to pass along rows and rotated values to pass down columns.
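Claims 1 and 2 describe a systolic array performing QR decomposition by a rotation algorithm: boundary cells compute Givens rotation parameters, internal cells apply them along the row, and each cell stores an updated element of the decomposition matrix. The following is a minimal software sketch of that recursive least-squares update, not the patented hardware; the function names and the pure-Python layout are illustrative choices only:

```python
import math

def qrd_rls_update(R, z, phi, y):
    """Fold one training row (phi, y) into the upper-triangular factor R and
    the rotated right-hand side z using Givens rotations, mirroring what the
    boundary and internal cells of the systolic array compute."""
    p = len(phi)
    phi = list(phi)
    for k in range(p):
        a, b = R[k][k], phi[k]
        r = math.hypot(a, b)
        if r == 0.0:
            continue
        c, s = a / r, b / r
        # boundary cell: compute rotation parameters, store updated diagonal
        R[k][k] = r
        phi[k] = 0.0
        # internal cells: apply the rotation along the row
        for j in range(k + 1, p):
            R[k][j], phi[j] = c * R[k][j] + s * phi[j], -s * R[k][j] + c * phi[j]
        # rotate the training-answer column carried with the extended vector
        z[k], y = c * z[k] + s * y, -s * z[k] + c * y
    return y  # residual component after all rotations

def back_substitute(R, z):
    """Solve R w = z for the least-squares weight vector w."""
    p = len(z)
    w = [0.0] * p
    for i in reversed(range(p)):
        s = z[i] - sum(R[i][j] * w[j] for j in range(i + 1, p))
        w[i] = s / R[i][i]
    return w
```

Feeding each φ vector extended by its training answer through `qrd_rls_update` corresponds to the first-row input of the array; back-substitution then recovers the weights defining the training fit.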
3. A processor according to claim
2, further including a multiplier cell (M) for multiplying cumulatively rotated values output from an outer column of internal cells by cumulatively multiplied and relatively delayed parameters generated by boundary cells, in appropriate form for computing least squares residuals arising between combined elements of training data φ vectors and their respective training answers. 4. A processor according to claim
1, wherein the means for generating result estimate values includes means for switching the systolic array to a test mode of operation in which decomposition matrix element update and training answer input are suppressed. 5. An heuristic processor comprised of:
(1) non-linear transforming means for producing a respective training
φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a non-linear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) processing means for combining training
φ vector elements in a manner producing a training fit to a set of training answers, and (3) means for generating result estimate values, each of said estimate values consisting of a combination of the elements of a respective
φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit, wherein the heuristic processor consists at least partly of processing devices linked by connecting means incorporating clocked latches for data storage and propagation. 6. An heuristic processor comprised of:
(1) non-linear transforming means for producing a respective training
φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a non-linear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) processing means for combining training
φ vector elements in a manner producing a training fit to a set of training answers, said processing means consisting at least partly of programmed transputers interconnected by single-bit data links for performing calculation operations in parallel with one another, and (3) means for generating result estimate values, each of said estimate values consisting of a combination of the elements of a respective
φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 7. An heuristic processor comprised of:
(1) non-linear transforming means for producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a non-linear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) an electronically addressable memory incorporated in the transforming means, the memory “receiving” addresses in fixed point arithmetic format and “providing” output in floating point arithmetic format in the course of producing each said training
φ vector in floating point format, (3) processing means for combining training
φ vector elements in a manner producing a training fit to a set of training answers, and (4) means for generating result estimate values, each of said estimate values consisting of a combination of the elements of a respective
φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 8. An heuristic processor comprised of:
(1) non-linear transforming means for producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a non-linear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) processing means for combining training
φ vector elements in a manner producing a training fit to a set of training answers, (3) means for generating result estimate values, each of said estimate values consisting of a combination of the elements of a respective
φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit, wherein the non-linear transforming means, the processing means and the means for generating result estimate values are interlinked by multibit buses and single-bit lines for data transmission purposes. 9. An heuristic processor comprised of:
(1) non-linear transforming means for producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a non-linear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) an electronically addressable memory incorporated in the transforming means, the memory being for “receiving” addresses in fixed point arithmetic format and “providing” output in floating point arithmetic format in the course of producing each said training
φ vector in floating point format, said output in each case being a non-linear transformation of the respective address value, (3) processing means for combining training
φ vector elements in a manner producing a training fit to a set of training answers, and (4) means for generating result estimate values, each of said estimate values consisting of a combination of the elements of a respective
φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 10. An heuristic processor comprised of:
(1) a non-linear transformation device producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) a combining processor combining training φ vector elements in a manner producing a training fit to a set of training answers, and (3) a result estimate value generator generating estimate values, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit, wherein the heuristic processor consists at least partly of processing devices linked by connectors incorporating clocked latches for data storage and propagation. 11. An heuristic processor comprised of:
a non-linear transformation device producing a respective training -φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from a respective center set member, electronically addressable memories incorporated in the transformation device, the memories “receiving” addresses in fixed point arithmetic format and “providing” output in floating point arithmetic format in the course of producing said elements of training
φ vectors in floating point format, and a combining processor combining training
φ vector elements in a manner producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective
φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 12. An heuristic processor comprised of:
(1) a non-linear transformation device producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) a combining processor combining training φ vector elements in a manner producing a training fit to a set of training answers, and (3) a result estimate value generator generating estimate values, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit, wherein the transformation device, the combining processor and the result estimate value generator are interlinked by multibit buses and single-bit lines for data transmission purposes. 13. An heuristic processor comprised of:
a non-linear transformation device producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from a respective center set member, electronically addressable memories incorporated in the transformation device, the memories “receiving” addresses in fixed point arithmetic format and “providing” output in floating point arithmetic format in the course of producing said elements of training φ vectors in floating point format, said output in each case being a nonlinear transformation of the respective address value, and a combining processor combining training φ vector elements in a manner producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 14. An heuristic processor comprised of:
a non-linear transformation device producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, a processor which weights and combines training φ vector elements and produces a training fit to a set of training answers, and a result estimate value generator generating estimate values and producing a respective test φ vector from each member of a set of test data, each test data set member having a displacement from each of said centers, where a norm of said test data set member displacement is calculable from each test data set member displacement and each element of a test φ vector consisting of said nonlinear transformation of said norm of said test data set member displacement, each of said estimate values consisting of a combination of the elements of a respective test φ vector, each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 15. A processor according to claim
14, wherein the transformation device computes differences between training data vector elements and corresponding center elements, sums the squares of such differences associated with each center-data vector pair, converts each sum to a value in accordance with the non-linear transformation and provides a respective training φ vector element. 16. An heuristic processor comprised of:
a non-linear transformation device producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, a processor which weights and combines training φ vector elements and produces a training fit to a set of training answers, and a result estimate value generator generating estimate values and producing a respective test φ vector from each member of a set of test data, each test data set member having a displacement from each of said centers, where a norm of said test data set member displacement is calculable from each test data set member displacement and each element of a test φ vector consisting of said nonlinear transformation of said norm of said test data set member displacement, each of said estimate values consisting of a combination of the elements of a respective test φ vector and each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit, wherein said processor comprises programmed processing devices for performing calculation operations in parallel with one another. 17. An heuristic processor comprised of:
a non-linear transformation device producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, a processor which weights and combines training φ vector elements and produces a training fit to a set of training answers, wherein said processor comprises a digital electronic processor for performing calculations in floating point arithmetic, and a result estimate value generator generating estimate values and producing a respective test φ vector from each member of a set of test data, each test data set member having a displacement from each of said centers, where a norm of said test data set member displacement is calculable from each test data set member displacement and each element of a test φ vector consisting of said nonlinear transformation of said norm of said test data set member displacement, each of said estimate values consisting of a combination of the elements of a respective test φ vector and each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 18. An heuristic processor comprised of:
a non-linear transformation device producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, a processor which weights and combines training φ vector elements and produces a training fit to a set of training answers, and a result estimate value generator generating estimate values and producing a respective test φ vector from each member of a set of test data, each test data set member having a displacement from each of said centers, where a norm of said test data set member displacement is calculable from each test data set member displacement and each element of a test φ vector consisting of said nonlinear transformation of said norm of said test data set member displacement, each of said estimate values consisting of a combination of the elements of a respective test φ vector and each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 19. An heuristic processor comprised of:
a non-linear transformation device producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, a processor which weights and combines training φ vector elements and produces a training fit to a set of training answers, and a result estimate value generator generating estimate values and producing a respective test φ vector from each member of a set of test data, each test data set member having a displacement from each of said centers, where a norm of said test data set member displacement is calculable from each test data set member displacement and each element of a test φ vector consisting of said nonlinear transformation of said norm of said test data set member displacement, each of said estimate values consisting of a combination of the elements of a respective test φ vector and each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit, wherein the transformation device and the processor incorporate digital electronic signal processing devices controlled by clock signals. 20. An heuristic processor comprised of:
a non-linear transformation device producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, a processor which weights and combines training φ vector elements and produces a training fit to a set of training answers and comprises digital electronic signal processing devices for storing processing results for output after a delay, and a result estimate value generator generating estimate values and producing a respective test φ vector from each member of a set of test data, each test data set member having a displacement from each of said centers, where a norm of said test data set member displacement is calculable from each test data set member displacement and each element of a test φ vector consisting of said nonlinear transformation of said norm of said test data set member displacement, each of said estimate values consisting of a combination of the elements of a respective test φ vector and each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 21. A method of training an heuristic processor, wherein the heuristic processor consists at least partly of processing devices linked by connectors incorporating clocked latches for data storage and propagation, said method comprising the steps of:
(1) producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from a respective center set member, and (2) combining training φ vector elements in a manner producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 22. A method of training an heuristic processor, said method comprising the steps of:
(1) producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from a respective center set member, said non-linear transformation being implemented with the aid of electronically addressable memories responsive to an input address in fixed point arithmetic format by providing output of a φ vector element as a transformation of that address in floating point format, and (2) combining training φ vector elements in a manner producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 23. A method of training an heuristic processor, said processor including a non-linear transformation device, a combining processor and a result estimate value generator interlinked by multibit buses and single-bit lines for data transmission purposes, said method comprising the steps of: (1) producing, in said non-linear transformation device, a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from a respective center set member, and (2) combining, in said combining processor, training φ vector elements in a manner producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 24. A method of training an heuristic processor, said method comprising the steps of:
producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from a respective center set member, said non-linear transformation being implemented with the aid of memory means which, when supplied with an input address in fixed point arithmetic format, provides output of an element of each said training φ vector as a transformation of that address in floating point format, and combining training φ vector elements in a manner producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 25. A method of training an heuristic processor, said method comprising the steps of:
producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, said nonlinear transformation being implemented with the aid of memory means which, when supplied with an input address in fixed point arithmetic format, provides output of an element of each said training φ vector as a transformation of that address in floating point format, and weighting and combining training φ vector elements and producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, each combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 26. A method of training an heuristic processor, according to claim
25, wherein said first producing step includes the steps of: computing differences between training data vector elements and corresponding center elements;
summing the squares of such differences associated with each center-data vector pair; converting each sum to a value in accordance with the non-linear transformation and providing a respective training
φ vector element. 27. A method of training an heuristic processor, wherein said processor comprises programmed processing devices for performing calculation operations in parallel with one another, said method comprising the steps of:
producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, and weighting and combining training φ vector elements and producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 28. A method of training an heuristic processor, wherein said processor comprises a digital electronic processor for performing calculations in floating point arithmetic, said method comprising the steps of:
producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, and weighting and combining training φ vector elements and producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, and each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit. 29. A method of training an heuristic processor, said method comprising the steps of:
producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, and weighting and combining training φ vector elements and producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, and each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit.30. A method of training an heuristic processor, said processor including a non-
linear transformation device and said processor and transformation device incorporate digital electronic signal processing devices controlled by clock signals, said method comprising the steps of: producing, in said non-linear transformation device, a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, and weighting and combining training φ vector elements and producing a training fit to a set of training answers in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, and each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit.31. A method of training an heuristic processor, said method comprising the steps of:
producing a respective training φ vector from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each displacement, and each element of a φ vector consisting of a nonlinear transformation of the norm of the displacement of the associated training data set member from which said training φ vector is produced, and weighting and combining training
φ vector elements and producing a training fit to a set of training answers in a digital electronic signal processing device for storing processing results for output after a delay in a form suitable for enabling result estimate values to be generated, each of said estimate values consisting of a combination of the elements of a respective φ vector produced from test data, and each said combination being at least equivalent to a summation of vector elements weighted in accordance with the training fit.32. A method of estimating results using an electronic processing device, the device including a means for the non-
linear transformation of data, for combining elements of transformed data, and for weighting data, said method comprising arranging said electronic device to execute the steps of: (1) producing training φ vectors from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a non-linear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) combining training φ vector elements in a manner producing a training fit to a set of training answers, and (3) generating result estimate values, each of said estimate values comprising a combination of weighted elements of a respective φ vector produced from test data, said weighting in accordance with the training fit.33. A method of estimating results using first and second electronic processing devices, said first electronic processing device including a means for the non-linear transformation of data and for combining elements of transformed data, and said second electronic processing device including means for producing weighted combinations of vector elements, said method comprising arranging said first electronic processing device to execute the steps of: (1) producing training φ vectors from each member of a training data set on the basis of a set of centers, each training data set member having a displacement from each of said centers, where a norm of the displacement is calculable from each of said displacements, and each element of a φ vector consisting of a non-linear transformation of the norm of the displacement of the associated training data set member from a respective center set member, (2) combining training φ vector elements in a manner producing a training fit to a set of training answers, and said second electronic processing device generating result estimate values, each of said estimate values comprising a combination of weighted elements of a respective
φ vector produced from test data, said weighting in accordance with the training fit produced by said first electronic processing device.Description 1. Field of the Invention This invention relates to an heuristic processor, i.e. a digital processor designed to estimate unknown results by an empirical self-learning approach based on knowledge of prior results. 2. Discussion of Prior Art Heuristic digital processors are not known per se in the prior art although there has been considerable interest in the field for many years. Such a processor is required to address problems for which no explicit mathematical formalism exists to permit emulation by an array of digital arithmetic circuits. A typical problem is the recognition of human speech, where it is required to deduce an implied message from speech which is subject to distortion by noise and the personal characteristics of the speaker. In such a problem, it will be known that a particular set of sound sequences will correspond to a set of messages, but the mathematical relationship between any sound sequence and the related message will be unknown. Under these circumstances, there is no direct method of discerning an unknown message from a new sound sequence. The approach to solving problems lacking known mathematical formalisms has in the past involved use of a general purpose computer programmed in accordance with a self-learning algorithm. One form of algorithm is the so-called linear perceptron model. This model employs what may be referred to as training information from which the computer “learns”, and on the basis of which it subsequently predicts. The information comprises “training data” sets and “training answer” sets to which the training data sets respectively correspond in accordance with the unknown transformation. The linear perceptron model involves forming differently weighted linear combinations of the training data values in a set to form an output result set.
The result set is then compared with the corresponding training answer set to produce error values. The model can be envisaged as a layer of input nodes broadcasting data via varying strength (weighted) connections to a layer of summing output nodes. The model incorporates an algorithm to operate on the error values and provide corrected weighting parameters which (it is hoped) reduce the error values. This procedure is carried out for each of the training data and corresponding training answer sets, after which the error values should become small, indicating convergence. At this point data for which there are no known answers are input to the computer, which generates predicted results on the basis of the weighting scheme it has built up during the training procedure. It can be shown mathematically that this approach is valid and yields convergent results for problems where the unknown transformation is linear. The approach is described in Chapter 8 of “Parallel Distributed Processing Vol. 1: Foundations”, pages 318-322, D. E. Rumelhart, J. L. McClelland, MIT Press 1986. For problems involving unknown nonlinear transformations, the linear perceptron model produces results which are quite wrong. A convenient test for such a model is the EX-OR problem, i.e. that of producing an output map of a logical exclusive-OR function. The linear perceptron model has been shown to be entirely inappropriate for the EX-OR problem because the latter is known to be nonlinear. In general, nonlinear problems are considerably more important than linear problems. In an attempt to treat nonlinear problems, the linear perceptron model has been modified to introduce non-linear transformations and at least one additional layer of nodes referred to as a hidden layer. This provides the nonlinear multilayer perceptron model.
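Before turning to that multilayer variant, the EX-OR failure of the linear model is easy to reproduce. The sketch below (the training loop, learning rate and epoch count are illustrative, not taken from the patent) applies the error-driven weight-correction procedure described above to the exclusive-OR map:

```python
# Delta-rule training of a linear model on the EX-OR map.
XOR_IN = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
XOR_OUT = [0.0, 1.0, 1.0, 0.0]

def train_linear(lr=0.1, epochs=2000):
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), t in zip(XOR_IN, XOR_OUT):
            y = w1 * x1 + w2 * x2 + b   # weighted linear combination
            err = t - y                 # error value vs training answer
            w1 += lr * err * x1         # corrected weighting parameters
            w2 += lr * err * x2
            b += lr * err
    return w1, w2, b

w1, w2, b = train_linear()
mse = sum((t - (w1 * x1 + w2 * x2 + b)) ** 2
          for (x1, x2), t in zip(XOR_IN, XOR_OUT)) / 4.0
```

However the weights are iterated, the mean squared error cannot fall below 0.25 (attained by zero weights and a constant output of 0.5), because no weighted linear combination of the inputs reproduces the exclusive-OR function.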
It may be considered as a layer of input nodes broadcasting data via varying strength (weighted) connections to a layer of internal or “hidden” summing nodes, the hidden nodes in turn broadcasting their sums to a layer of output nodes via varying strength connections once more. (More complex versions may incorporate a plurality of successive hidden layers.) Nonlinear transformations may be performed at any one or more layers. A typical transformation involves computing the hyperbolic tangent of the input to a layer. Apart from these one or more transformations, the procedure is similar to the linear equivalent. Errors between training results and training answers are employed to recompute weighting factors applied to inputs to the hidden and output layers of the perceptron. The disadvantages of the nonlinear perceptron approach are that there is no guarantee that convergence is obtainable, nor that, where convergence is obtainable, it will occur in a reasonable length of computer time. The computer programme may well converge on a false minimum remote from a realistic solution to the weight determination problem. Moreover, convergence takes an unpredictable length of computer time, anything from minutes to many hours. It may be necessary to pass many thousands of training data sets through the computer model. It is an object of the invention to provide an heuristic processor. The present invention provides an heuristic processor including: (1) transforming means arranged to produce a respective training (2) processing means arranged to combine training (3) means for generating result estimate values each consisting of a combination of the elements of a respective The invention provides the advantage that it constitutes a processing device capable of providing estimated results for nonlinear problems. In a preferred embodiment, the processing means is arranged to carry out least squares fitting to training answers.
In this form, it produces convergence to the best result available having regard to the choice of nonlinear transformation and set of centres. The processing means preferably comprises a network of processing cells; the cells are connected to form rows and columns and have functions appropriate to carry out QR decomposition of a The processing means may include switching means for switching between a training mode of operation and a test mode. The switching means provides means for generating result estimate values. In the training mode, boundary and internal cells respectively generate and apply rotation parameters and update their stored elements as aforesaid. In the test mode, stored element update is suppressed, and training data The transforming means may comprise a digital arithmetic unit arranged to subtract training data vector elements from each of a series of corresponding centre vector elements, to square and add the resulting differences to provide sums arising from each data vector-centre vector pair; and to transform the sums in accordance with a nonlinear function to provide Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which: FIG. 1 is a block diagram of an heuristic processor of the invention; FIG. 2 provides processing functions for cells of the FIG. 1 processor; FIG. 3 is a more detailed block diagram of a digital arithmetic unit of the FIG. 1 processor; FIG. 4 is a simplified schematic drawing of the FIG. 1 processor illustrating throughput timing; FIG. 5 is a schematic drawing of an extended version of a heuristic processor of the invention; FIGS. 6, FIG. 9 illustrates a processor for use with weighting data obtained in a FIG. 5 device. Referring to FIG.
1, there is shown an heuristic processor of the invention indicated generally by The processor The third inputs P The first row arithmetic units P Similarly, the second row arithmetic units have first and second inputs P The centre and data latches CL The second row arithmetic unit outputs P The look-up tables LUT The processor Each of the latch chains YL For ease of subsequent reference, elements previously defined, other than inputs YI The AND gates A The QR processor The boundary cells B The boundary, internal and multiplier cells have differing references and outlines to indicate differing processing functions. The latter are illustrated in FIG. The boundary cells B
Having computed its respective r̄′, each boundary cell calculates a sine-like rotation parameter s̄ from
It then outputs s̄ and φ, the latter now designated φ̄, and, on the next clock cycle, these pass horizontally to the right to the respective neighbouring internal cell in the same row. The cell also outputs a stored value δ′ as δ′ diagonally below right, and replaces δ′ in store by a new value in accordance with
Equation (1.3) is equivalent to delaying output of δ′ by one additional clock cycle. The cell also replaces its stored value r̄ by r̄′. If the right hand side of equation (1.2) or (1.3) produces division by zero, the left hand side is set to zero. The first row boundary cell B Internal cells in the second to fourth columns of the QR processor The processing functions of the internal cells are as follows:
In other words, each internal cell computes a vertical output φ′ or y′ by subtracting the product of its stored element k or u (originally zero) with a left hand input φ̄ from its vertical input φ or y. It then updates its stored element k or u by substituting the sum of its previous stored element with the product of its vertical output and its second left hand input s̄. These operations occur every data clock cycle. First row internal cells I The multiplier cell M The transputers employed in the QR and LSM programs Referring now also to FIG. 3, the structure of each of the processing cells P is shown in more detail. The first and second sixteen-bit inputs P The inverter array The latch array The overall mode of operation of the processor To initialise other parts of the processor The next phase of operation of the processor
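Equations (1.1) to (1.3) are not reproduced in this text, so the exact cell recurrences cannot be restated here. As a point of reference, the standard Givens-rotation systolic cells on which QR decomposition arrays of this kind are based may be sketched as follows; names are illustrative, and the (c, s) parameter form below is the textbook one rather than the patent's s̄, φ̄ variant:

```python
import math

def boundary_cell(r, x):
    """Givens boundary cell: absorb input x into stored element r and
    emit rotation parameters (c, s). On division by zero the rotation
    parameters are zeroed out, as the text prescribes."""
    r_new = math.sqrt(r * r + x * x)
    if r_new == 0.0:
        return 0.0, 1.0, 0.0            # r', c, s
    return r_new, r / r_new, x / r_new

def internal_cell(k, x, c, s):
    """Givens internal cell: rotate vertical input x against stored k."""
    x_out = c * x - s * k
    k_new = s * x + c * k
    return k_new, x_out

# Triangularise the 2x2 matrix [[3, 4], [4, 3]] row by row.
r11 = r12 = r22 = 0.0
for x1, x2 in [(3.0, 4.0), (4.0, 3.0)]:
    r11, c, s = boundary_cell(r11, x1)   # first-column boundary cell
    r12, x2p = internal_cell(r12, x2, c, s)
    r22, _, _ = boundary_cell(r22, x2p)  # second-column boundary cell
# now r11 ≈ 5.0, r12 ≈ 4.8, r22 ≈ 1.4, giving R with R^T R = A^T A
```

Feeding successive data rows through a triangular array of such cells accumulates the R factor of the QR decomposition, which is what the stored elements build up during the training mode.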
where ∥ . . . ∥ represents the Euclidean norm. (The invention is, however, not restricted to use of the Euclidean norm, provided that the quantity employed is equivalent to a distance.) The value D
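The distance-and-transform computation just described can be sketched as follows, assuming the Euclidean norm and the exp(−D²) transformation mentioned later in the description (function and variable names are illustrative):

```python
import math

def phi_vector(x, centres):
    """Transform one data vector into a phi vector: for each centre,
    sum the squared element differences (squared Euclidean norm D^2),
    then apply the nonlinear transformation exp(-D^2)."""
    out = []
    for c in centres:
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        out.append(math.exp(-d2))
    return out

phi = phi_vector([0.0, 0.0], [[0.0, 0.0], [1.0, 0.0]])
# phi[0] is 1.0 (zero distance to the first centre); phi[1] is exp(-1)
```

Each training or test data vector thus yields one φ vector with as many elements as there are centres.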
On the fourteenth to sixteenth data clock cycles, computations similar to those described above involving
and
This procedure continues as successive training data vectors To summarise, data clock cycles fifteen to nineteen correspond to input of φ The QR/LSM processor GB 2,151,378B and U.S. Pat. No. 4,727,503 referred to above prove in detail that input of successive temporally skewed vectors
where the symbol T indicates the transpose of a column vector The vector Each value e has a minimum value. In effect, the implicit weight vector The training mode of operation is carried out until the Nth training data vector
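The net effect of the QR decomposition and least mean squares processing is a least-squares fit of weighted φ vector elements to the training answers. The following sketch reproduces that end result for the EX-OR problem, with Gaussian φ vectors, the four training points used as centres, and plain Gaussian elimination standing in for the systolic array (an illustrative reconstruction, not the patent's circuit):

```python
import math

def phi_row(x, centres):
    # exp(-D^2) of the squared Euclidean distance to each centre
    return [math.exp(-sum((a - b) ** 2 for a, b in zip(x, c))) for c in centres]

def solve(A, y):
    """Gaussian elimination with partial pivoting on a small dense system."""
    n = len(A)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for j in range(i, n + 1):
                M[r][j] -= f * M[i][j]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        w[i] = (M[i][n] - sum(M[i][j] * w[j] for j in range(i + 1, n))) / M[i][i]
    return w

data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
answers = [0.0, 1.0, 1.0, 0.0]
centres = data                       # centres chosen at the training points
Phi = [phi_row(x, centres) for x in data]
w = solve(Phi, answers)              # the training fit (weight vector)
estimates = [sum(wi * pi for wi, pi in zip(w, phi_row(x, centres)))
             for x in data]
```

With the centres at the training points, Φ is square and well conditioned, so the fit is exact: the nonlinear transformation solves the EX-OR problem that defeated the linear model.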
The weight vector After input of φ On the data clock cycle following input of On the data clock cycle after boundary cell B In consequence of update suppression, each vector
On subsequent data clock cycles E
Equation (10) may be rewritten as Equations (10) and (11) show that E The processor Since the processor Referring now to Table 1, there are shown the validity and status output signals and the output signal at Q
In practice the processor In the course of the training mode, the QR/LSM processor The processor The processor Referring now to FIGS. 5 to The J/K/L processor There are L inputs YI Status and validity inputs SSI and SVI are connected via J latches to the Φ processor The processor The QR and LSM processors are expanded to K by K and K by L arrays respectively. The first boundary cell B The latch arrays The FIG. 5 processor The equivalent of equations (10) and (11) for the J/K/L processor of FIG. 5 are as follows: Equations (12) and (13) demonstrate that the weight vector As has been said, the QR/LSM processor Explicit extraction of the weight leads to a further embodiment of the invention illustrated in FIG. The processor The foregoing description has shown how the processor of the invention is trained to produce a nonlinear transformation of a training data set is as nearly as possible equal to −y In other circumstances, it is an advantage to employ a processor of the invention which is switchable between training and test modes because it allows retraining; ie it is possible to revert back to a training mode after a test sequence and input further training data. The effect of the original training procedure may be removed by initialising the processor with zero inputs as previously described. Its effect may alternatively be retained and merely augmented by input by further training data. This has a potential disadvantage in that each successive training data vector may have progressively less effect. For example, after say one thousand training data vectors have been input, the boundary cell stored element {overscore (r)} may be very little changed by updating with addition of the one thousand and first δ
where, during the test phase, β=1, and during the training phase, 0&lt;β&lt;1. Normally, β will be very close to unity during training. Its effect is to make stored values r̄, k and u reduce slightly each clock cycle; ie they decay with time. Elements k and u are affected indirectly via the relationship between r̄′ and s̄, and s̄ and k′. The foregoing examples of the invention employed a nonlinear transformation of the Euclidean distance D (a real quantity≧0) to exp(−D More generally, it is sufficient (but not necessary) for the chosen nonlinear transformation to involve a function which is continuous, monotonic and non-singular. However, functions such as fractal functions not possessing all these properties may also be suitable. Suitability of a function of transformation is testable as previously described by the use of test data with which known answers not employed in training are compared. The QR/LSM processor
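The role of β can be illustrated with the usual exponentially-weighted form of the boundary cell update, r̄′ = √(β²r̄² + φ²); since equations (1.1) to (1.3) are not reproduced here, the exact placement of β is an assumption and the sketch below is only indicative:

```python
import math

def update(r, phi, beta):
    # Exponentially weighted boundary-cell update: the stored "energy"
    # is scaled by beta**2 before the new input phi is absorbed.
    return math.sqrt(beta * beta * r * r + phi * phi)

r_decay, r_grow = 0.0, 0.0
for _ in range(200):
    r_decay = update(r_decay, 1.0, 0.9)   # 0 < beta < 1: old data forgotten
    r_grow = update(r_grow, 1.0, 1.0)     # beta = 1: every input kept forever
```

With 0&lt;β&lt;1 the stored element settles at a finite value (about 2.29 for β=0.9 and unit inputs), so later training vectors retain their influence; with β=1 it grows without bound and each new training vector has progressively less effect, exactly the difficulty described above.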