Advised by
Dr. T. L. Stewart &
Dr. G. Dempsey
E E 452
Senior Project Final Report
12 May 1997
A useful control tool is one that can adapt to variances in a complex system. One system that would benefit from such adaptive control is an inverted robot arm, the exact nature of which is not known or can be changed (length, mass, mass distribution, disturbances). This project is the implementation of an ADALINE (Adaptive Linear Neuron) neural network learning control system on a Motorola DSP56002 Digital Signal Processor to stabilize the robot arm to a vertical position. The control network is trained with the adjustment signals used to manually stabilize the robot arm and the resulting robot arm angles. Once trained, the network will stabilize the robot arm without manual input.
Many systems exist whose characteristics are difficult to mathematically model, making the design of an adequate controller a computationally intensive task. A controller that does not depend on exact characteristics but can adapt to differences in the system could eliminate the need for an exact model. An example of this type of system is the motor-driven inverted robot arm [Fig. 1]: gear-induced time delay, wind and other such disturbances, and mass-distribution differences combine to produce a complex model, needing an even more complex controller. A human, however, can easily manipulate a manual control to keep the robot arm upright. Given that a human cannot always be on hand to balance the robot arm, it would be advantageous to have a controller which can mimic this ease of control.
Figure 2: System Block Diagrams for
Learning and Controlling Modes
The project can be categorized into two modes of functionality: learning and controlling. In learning mode [Fig. 2, Top], a joystick will be used to manually control the robot arm through a gain and protection circuit which keeps the current and arm-angles from exceeding safety limits and brings the control signal up to the power needed to drive the motor. The motor then adjusts the arm angle, which is detected using a sensor in the motor/arm assembly. The angle-signal and the joystick signal are used by the neural network to learn how to control the arm. (Learning will be more fully described in the next section.) In controlling mode [Fig. 2, Bottom], the joystick is removed and the neural network is connected to produce the control signal to manipulate the robot arm.
A neural network is a parallel, distributed information processing structure composed of basic processing elements known as neurons [1]. Artificial neural networks (ANNs) are networks consisting of man-made models of biological neurons, implemented in either hardware or software. In neurons, input signals are carried across synapses which weight the individual signals, and into the cell body which combines the weighted signals; the new signal is output through an axon which can, in turn, connect to one or more synapses. In the single-neuron ADALINE model used for this project, the combination of input signals is accomplished through a linear weighted sum of the input signals; however, other non-linear combinations are possible. For the time-varying inverted robot arm, the ADALINE input signals are a series of time-delayed robot-arm angles. The signal flow just described is seen visually in Figure 3.
Figure 3: Controlling Mode ADALINE
Neuron
Here, the synaptic weights remain constant, and the neural network maps a given set of inputs to a specific output using those constant weights; this is the configuration used in the inverted robot arm application to actually stabilize the arm. However, before the neural network will perform as desired, the weights must be set through a process known as supervised learning.
The ANN does not magically "know" how to perform a given task; it must
learn how. In highly supervised learning, the neuron is presented with
not only the set of inputs as described above, but also a signal
representing the desired output of the ANN -- in the case of this
project, the desired signal comes from a joystick used to manually
control the robot arm [Fig. 4]. The actual ANN
output is subtracted from the desired output to produce an error signal
which is used to adjust the weights through the equation
Wnew = Wold + 2*mu*delta*X, (1)
where W is the set of weights, mu is the learning rate (a constant
which influences how quickly the weights will change), delta is the
error signal, and X is the set of time-delayed arm-angle inputs
(starting with the most recent, and extending back in time as far as
the number of weights allows) [2]. This ADALINE
learning rule is a Least-Mean-Squared (LMS) algorithm, wherein the
squared-error (delta^2) averaged over the entire input set is
minimized.
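As a rough illustration of equation (1), the following Python sketch applies the update rule repeatedly to a single fixed input pattern. The three-weight neuron, the input values, and the learning rate are illustrative assumptions, not values from this project; the function names are hypothetical.

```python
def adaline_output(weights, inputs):
    """Linear weighted sum of the time-delayed inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def lms_update(weights, inputs, desired, mu):
    """Apply Wnew = Wold + 2*mu*delta*X once; return (new_weights, delta)."""
    delta = desired - adaline_output(weights, inputs)
    new_weights = [w + 2.0 * mu * delta * x for w, x in zip(weights, inputs)]
    return new_weights, delta


# Illustrative values only (assumed, not taken from the report).
weights = [0.0, 0.0, 0.0]
inputs = [1.0, 0.5, -0.5]    # most recent sample first
desired = 1.0

errors = []
for _ in range(20):
    weights, delta = lms_update(weights, inputs, desired, mu=0.1)
    errors.append(abs(delta))
```

For a single repeated pattern, each update scales the error by (1 - 2*mu*|X|^2), so with these assumed numbers the error shrinks by a factor of 0.7 per step.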
To correctly implement the LMS algorithm on the Motorola DSP chip, it first needs to be simulated with a tool like MATLAB to make sure that the algorithm is fully understood and works correctly. It is difficult to have a truly manual signal input to MATLAB to teach the algorithm. This can be overcome by simulating a traditional controller in MATLAB's Simulink to act as the manual signal [Fig. 5]. This controller is derived from an experimentally-determined two-pole model of the robot arm [3]. A disturbance signal comprising impulses and noise is combined with the robot-arm angle. The disturbed arm angle [Fig. 6, Upper Left] is passed through the traditional controller, the output of which controls the robot arm plant model and is saved as the generated manual signal [Fig. 6, Upper Right].
Figure 5: Generating "Manual" signal
for MATLAB simulation
The learning process in MATLAB begins with a random set of thirty weights with values between -1.0 and 1.0. The arm-angle inputs and generated manual signal are presented to the ANN in learning mode, and the output [Fig. 6, Lower Right] and error signals are computed. All the squared-errors are summed over one complete presentation of the generated manual signal (one "epoch," in neural network terminology); this Sum-Squared-Error (SSE) is compared every epoch against an error goal to determine when the network has sufficiently trained. The LMS algorithm can theoretically find the absolute minimum error. However, in practice, this could take an arbitrarily large amount of time, so an error goal was chosen based on an average error-per-data-point of 0.1 for the control signal. During the learning process, the SSE [Fig. 6, Lower Left] dropped approximately exponentially with the number of epochs; after 57 epochs, it had reached the goal.
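The epoch-based training procedure described above can be sketched in Python as follows. This is not the MATLAB code used in the project: the four-weight network, the random input signal, and the hypothetical `teacher` filter standing in for the generated manual signal are all assumptions chosen to keep the example small. The SSE goal is taken here as (number of samples) * 0.1^2, which is one possible reading of "an average error-per-data-point of 0.1".

```python
import random


def train_adaline(angles, desired, n_weights, mu, sse_goal, max_epochs, seed=0):
    """Train until the per-epoch sum-squared-error drops below sse_goal."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1.0, 1.0) for _ in range(n_weights)]  # random start
    sse_history = []
    for _ in range(max_epochs):
        delay_line = [0.0] * n_weights
        sse = 0.0
        for angle, target in zip(angles, desired):
            delay_line = [angle] + delay_line[:-1]       # newest sample first
            output = sum(w * x for w, x in zip(weights, delay_line))
            delta = target - output
            sse += delta * delta
            weights = [w + 2.0 * mu * delta * x
                       for w, x in zip(weights, delay_line)]
        sse_history.append(sse)
        if sse < sse_goal:                               # checked once per epoch
            break
    return weights, sse_history


# Assumed training data: random "arm angles" and a hypothetical teacher filter.
rng = random.Random(1)
angles = [rng.uniform(-1.0, 1.0) for _ in range(200)]
teacher = [0.5, -0.3, 0.2, 0.1]
desired = []
line = [0.0] * len(teacher)
for a in angles:
    line = [a] + line[:-1]
    desired.append(sum(t * x for t, x in zip(teacher, line)))

sse_goal = len(angles) * 0.1 ** 2
weights, sse_history = train_adaline(angles, desired, 4, 0.05, sse_goal, 50)
```

The list `sse_history` plays the role of the SSE curve in Figure 6: one value per epoch, with training stopping at the first epoch whose SSE is below the goal.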
Figure 6: Neural Network Training
Data: Disturbance, Desired Output, SSE during training, and the final
trained-ANN output in response to the disturbance
To simulate controlling mode, the trained neural network is inserted into the loop instead of the traditional controller [Fig. 7]. When the disturbance used for training is presented to the ANN in controlling mode, the network does indeed bring the arm angle back toward 0 [Fig. 8, Left]. Specifically, when the disturbance [Fig. 8, Upper Left] impulses upwards, the arm begins to move in that direction [Fig. 8, Lower Left], and the ANN generates a control signal [Fig. 8, Middle Left] in the opposite direction to compensate; the neuron continues to counter the disturbance as much as it can. Due to the consistent random noise, the arm will never completely stabilize. Furthermore, the network performs similarly for a disturbance with different impulses (magnitude, duration, and timing), showing that the network has trained and can generalize [Fig. 8, Right].
Figure 7: Simulating the trained
controlling-mode ADALINE in MATLAB
Figure 8: ANN Testing: Training Data
(Left) & new Testing Data (Right)
Two primary methods of implementing a neural net exist: designing hardware to act as the neurons and synapses, and simulating the neurons in software. The goal of the project is to design the control tool to be adaptive; the use of hardware-neurons makes it much more difficult to automatically change the synaptic weights, limiting the adaptability of the system. Therefore, software implementation was chosen. Specifically, the software is being designed for the Motorola DSP56002 digital signal processor.
The DSP instruction set facilitates neural net coding, because the weighted-sum nature of a neuron is also the heart of DSP: multiply two numbers, add the product to a running accumulator, and shift to the next two numbers. On the DSP56002, the full multiply-add-move takes only one instruction cycle, allowing for fast processing of the inputs. The weighted sum of time-delayed inputs in the trained, single-neuron ADALINE for this project [Fig. 3] is functionally equivalent to a digital filter, specifically a finite impulse response (FIR) filter, since only the inputs affect the ANN output.
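The FIR equivalence can be seen in a short Python sketch, given here under the assumption of a three-weight neuron with arbitrary weights (the project uses thirty). Each loop iteration in `mac_weighted_sum` corresponds to one multiply-accumulate; the function names are hypothetical.

```python
def mac_weighted_sum(weights, delay_line):
    """Accumulate weight * input products, one multiply-accumulate per weight."""
    acc = 0.0
    for w, x in zip(weights, delay_line):
        acc += w * x
    return acc


def run_neuron(weights, samples):
    """Feed samples through a delay line and compute the neuron output."""
    delay_line = [0.0] * len(weights)
    outputs = []
    for s in samples:
        delay_line = [s] + delay_line[:-1]   # shift in the newest sample
        outputs.append(mac_weighted_sum(weights, delay_line))
    return outputs


weights = [0.5, -0.25, 0.125]                # assumed example weights
impulse = [1.0, 0.0, 0.0, 0.0, 0.0]
response = run_neuron(weights, impulse)
```

The response to a unit impulse reproduces the weights and then falls to zero, which is the defining property of a finite impulse response.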
The Motorola DSP56002 EVM board comes with the converters necessary to input two channels of data, process the information, and output it to both a headphone jack and a speaker jack (see Fig. 9) [4]. By default, only the microphone inputs are connected, but the board was modified to allow for line input as well. (The difference: the mic input is capacitively coupled, not allowing a DC offset.) Selection of the input type is handled through software initialization. Through example code that comes with the DSP development kit, it was discovered that, with no DSP processing, an input-to-output ratio of 4:1 occurs when measured at the speaker jack; the headphone jack adds a DC offset of 2.5 V, and has a 3:1 input-to-output ratio. Also, there exists internal DC output blocking in the D/A that cannot be circumvented.
Figure 9: I/O Flowchart for EVM
Board
To test the understanding of the DSP code and functionality, and to set
up the code to implement the neuron's weighted sum, a sample FIR filter
was designed and implemented. A notch filter which will cut out
frequencies near 1/6th the sampling frequency (which, for a 48 kHz
clock, yields an 8000 Hz notch) will have zeros at z = 0.9*exp(+/-j*pi/3) [Fig. 10]. Or, in difference equation form (which is
the implementation used in the DSP code), the filter is
y(n) = x(n) - 0.9*x(n-1) + 0.81*x(n-2). (2)
However, the fractional data range is from -1.0 to 0.99999988
(800000h - 7FFFFFh), so the coefficient of 1.0 on x(n) cannot be
represented directly. Therefore, the coefficients were divided by 2,
then the weighted sum was multiplied by 2. That is,
y(n) = 2*[0.5*x(n) - 0.45*x(n-1) + 0.405*x(n-2)]. (3)
Theoretically, this will yield the "Ideal Filter" response in Figure 11. The actual DSP filter output (through the
headphone jack) is shown in the same figure. The scaling factors
previously mentioned combine for a factor of 1.5 difference between
the ideal and actual filters; further variation occurs because of a
sharp built-in filter beginning at 20 kHz that drops the gain to 0.1 by
25 kHz. This is understandable given the 48 kHz sampling frequency of
the chip. (Thus, the highest allowable frequency would be 24 kHz to
avoid aliasing.)
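The notch filter arithmetic can be checked numerically. The Python sketch below is an illustration, not the DSP code: it finds the roots of z^2 - 0.9z + 0.81 (the zeros of the difference equation above), evaluates the ideal magnitude response at a few frequencies, and confirms that the halved-coefficient form followed by a gain of 2 matches the direct form. The 1 kHz test tone and the helper names are assumptions.

```python
import cmath
import math

FS = 48000.0                                # sampling frequency in Hz
COEFFS = [1.0, -0.9, 0.81]                  # direct-form coefficients
HALF_COEFFS = [c / 2.0 for c in COEFFS]     # scaled to fit in [-1.0, 1.0)


def fir(coeffs, samples, gain=1.0):
    """Direct-form FIR: gain * sum(coeffs[k] * x[n-k])."""
    delay_line = [0.0] * len(coeffs)
    out = []
    for s in samples:
        delay_line = [s] + delay_line[:-1]
        out.append(gain * sum(c * x for c, x in zip(coeffs, delay_line)))
    return out


def magnitude(coeffs, freq_hz, fs=FS):
    """Magnitude of the FIR frequency response at freq_hz."""
    w = 2.0 * math.pi * freq_hz / fs
    return abs(sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(coeffs)))


# Roots of z^2 - 0.9 z + 0.81 = 0 via the quadratic formula.
disc = cmath.sqrt(COEFFS[1] ** 2 - 4.0 * COEFFS[0] * COEFFS[2])
zeros = [(-COEFFS[1] + disc) / 2.0, (-COEFFS[1] - disc) / 2.0]

# Assumed 1 kHz test tone, used only to compare the two implementations.
test_signal = [math.sin(2.0 * math.pi * 1000.0 * n / FS) for n in range(64)]
direct = fir(COEFFS, test_signal)
scaled = fir(HALF_COEFFS, test_signal, gain=2.0)
```

Both roots have magnitude 0.9 and angle +/-pi/3, and the ideal gain at 8000 Hz (about 0.16) is well below the gain at DC (0.91) and at 24 kHz (2.71).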
Figure 10: Pole/Zero z-plane with
zeros at z = 0.9 * exp(+/-j*pi/3)
Figure 11: Notch Filter Frequency
Response: Ideal, Scaled, and DSP-Implemented
A method is needed to switch the neural net between learning and controlling modes. Because the two channels of the A-D converter are already used for training signal and robot arm angle, an analog input could not be implemented elegantly. However, the DSP has three interrupts (IRQA, IRQB, NMI) which are easily accessible via jumper pins (J17) on the DSP's EVM board. To activate an interrupt, the pin on J17 labeled with the desired interrupt needs only to be briefly grounded.
Software handling of interrupts is accomplished through a vector table. When the processor detects an interrupt, it immediately executes the command in the appropriate location of the vector table, then returns to the previous program location and continues running normally. In general, the command either calls an interrupt handling routine or, if the interrupt handling can be executed in a one-word instruction, executes that one instruction. Because of the return-to-previous action, a method had to be developed to allow the change effected by the interrupt to not interfere with the program, wherever the program happened to be when the interrupt was received. To this end, a loop was created which repeatedly calls one of three subroutines, depending on the value of a pointer. When one of the three interrupts is processed, the pointer will be changed, and control will be returned to the current subroutine. The subroutine will finish processing, then when the loop makes the next subroutine call, it will access the new routine rather than the original.
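The pointer-based dispatch can be modeled in Python as a sketch. The assignment of IRQA, IRQB, and NMI to particular modes in `VECTOR_TABLE` is a hypothetical choice for illustration (the text above does not state which interrupt selects which mode), and the `pending` dictionary simply simulates interrupts arriving while a subroutine is running.

```python
def make_routine(name):
    """Build a subroutine that may be 'interrupted' partway through."""
    def routine(state, step):
        state["log"].append(name + ":begin")
        irq = state["pending"].pop(step, None)   # interrupt arrives mid-routine
        if irq is not None:
            state["mode"] = VECTOR_TABLE[irq]    # handler only changes the pointer
        state["log"].append(name + ":end")       # interrupted routine still finishes
    return routine


sleep = make_routine("sleep")
learn = make_routine("learn")
control = make_routine("control")

# Hypothetical interrupt-to-mode assignment, for illustration only.
VECTOR_TABLE = {"IRQA": learn, "IRQB": control, "NMI": sleep}


def main_loop(state, iterations):
    """Repeatedly call whichever subroutine the pointer currently selects."""
    for step in range(iterations):
        state["mode"](state, step)


state = {"mode": sleep, "pending": {1: "IRQA", 3: "IRQB"}, "log": []}
main_loop(state, 5)
calls = [entry for entry in state["log"] if entry.endswith(":begin")]
```

Because the handler touches only the pointer, the mode change takes effect at the next call from the loop, never in the middle of a subroutine.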
Filtering and interrupt handling have now been explored. All that remains is to combine them into a single neural net system with both learning and controlling modes. To accomplish this, the previously-described learning algorithm used in MATLAB simulation must be converted into DSP code. In pseudo-DSP code, the algorithm becomes:
learn:
    move INPUT, A                     ; A = scaled arm-angle sample
    move DESIRED, B                   ; B = scaled joystick (desired) sample
    jsr filter                        ; A = ANN output (weighted sum of inputs)
    sub ERROR = DESIRED - A           ; error = desired output - ANN output
    mac SSE += ERROR*ERROR
    sub EFC -= 1
    if EFC == 0                       ; EFC = Epoch Fraction Counter:
                                      ;   EFC/EFCMAX gives what fraction of an
                                      ;   epoch remains; when 0, the epoch is over
        if SSE < SSEMAX
            nop                       ; For the future, light an LED to
                                      ;   indicate training is done
        endif
        move EFCMAX, EFC              ; reset fraction-counter and
        move 0, SSE                   ;   SSE for next epoch
    endif
weight_update:
    do 30 times, until enddo          ; update weights
        multiply TMP = MU2*ERROR      ; MU2 = 2*mu, and is
                                      ;   set in the data setup
        move Wi, A
        mac A += Xi*TMP
        move A, Wi
        update pointers for next i
    enddo
Converted to actual DSP code, this yields the learning subroutine in Appendix A. As can be seen in the pseudo-code, provision has been made for calculating the SSE of an epoch. However, unlike MATLAB wherein one epoch can easily be defined as one presentation of the complete training waveform, there is no easy way to define an epoch when given continuous, aperiodic training data, other than defining it as the presentation of a fixed number of data samples. To this end, the counter variable EFC is used to count down from the maximum number of data samples to 0, thus indicating how many more samples are left in one epoch. When the epoch is complete, the SSE is compared against an error goal, and provision is made for future code to exit learning mode if the error has reached the goal. After this test, the EFC and SSE are reset to prepare for the next epoch. Modifications of the weights are handled via the subsection WEIGHT_UPDATE, which loops through each of the 30 weights and modifies each. (Because the number of epochs is not visible to the user, training will be indicated via the approximate time of training rather than the number of epochs to train.)
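For readers more comfortable with a high-level language, the per-sample learning step and the fixed-length epoch bookkeeping might look like the following Python sketch. It mirrors the pseudo-code, not the Appendix A assembly; the class name, the initial first-weight value of 0.5, the epoch length of 8 samples, and the small sine test input are assumptions. MU2 = 0.1 corresponds to mu = 0.05.

```python
import math


class StreamingLearner:
    """Sample-by-sample LMS learner with a fixed-length 'epoch' counter."""

    def __init__(self, n_weights, mu2, efc_max, sse_max, first_weight=0.5):
        self.weights = [first_weight] + [0.0] * (n_weights - 1)
        self.delay_line = [0.0] * n_weights
        self.mu2 = mu2                # MU2 = 2*mu
        self.efc_max = efc_max        # samples per epoch
        self.efc = efc_max            # samples remaining in the current epoch
        self.sse_max = sse_max        # error goal
        self.sse = 0.0
        self.epochs_done = 0
        self.goal_reached = False

    def step(self, angle, desired):
        self.delay_line = [angle] + self.delay_line[:-1]
        output = sum(w * x for w, x in zip(self.weights, self.delay_line))
        error = desired - output
        self.sse += error * error
        self.efc -= 1
        if self.efc == 0:             # epoch over: test goal, then reset
            if self.sse < self.sse_max:
                self.goal_reached = True
            self.efc = self.efc_max
            self.sse = 0.0
            self.epochs_done += 1
        tmp = self.mu2 * error        # weight update for all weights
        self.weights = [w + x * tmp for w, x in zip(self.weights, self.delay_line)]
        return output


learner = StreamingLearner(n_weights=30, mu2=0.1, efc_max=8, sse_max=0.01)
for n in range(20):
    sample = 0.03 * math.sin(2.0 * math.pi * n / 8.0)   # assumed small test tone
    learner.step(sample, sample)                        # autoassociation
```

After 20 samples with 8 samples per epoch, two epochs have completed and four samples remain in the third.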
To get reasonable weight and error magnitudes, it was decided to reduce the input values by a factor of 32. (The simulated weights had a maximum of ~28, so the next highest power of 2 was chosen as the divisor.) This reduced value will be used for internal processing (error computation and training), then outputs will be scaled back up, but only by 16 to avoid internal data saturation [Figure 12].
To choose a sampling frequency, the effects of that frequency must be considered. With the default 48 kHz sampling frequency, the low frequencies of the joystick control signal and robot arm angle are insignificant. Furthermore, two seconds of training will effect nearly 100,000 changes for each weight. Therefore, the DSP's minimum sampling frequency of 8000 Hz was chosen, to improve the significance of the low frequencies and to make the fewest weight changes per second.
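The numbers behind this choice can be worked out in a few lines of Python. The helper names are hypothetical, and reading "significance of the low frequencies" in terms of how much time the 30-sample delay line spans is an interpretation offered here, not a statement from the original analysis.

```python
def updates_per_weight(fs_hz, seconds):
    """Each incoming sample triggers one update of every weight."""
    return int(fs_hz * seconds)


def delay_line_span_ms(n_weights, fs_hz):
    """Time window covered by the time-delayed inputs, in milliseconds."""
    return 1000.0 * n_weights / fs_hz


updates_48k = updates_per_weight(48000, 2)   # 96000, i.e. "nearly 100,000"
updates_8k = updates_per_weight(8000, 2)     # 16000
span_48k = delay_line_span_ms(30, 48000)     # about 0.625 ms
span_8k = delay_line_span_ms(30, 8000)       # about 3.75 ms
```

At 8000 Hz the thirty inputs cover six times as much time as at 48 kHz, so slow joystick and arm-angle movements change more across the input window.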
Figure 12: I/O Scaling &
Interaction with Training
The overall flow of the DSP code is shown in Figure 13. To begin the program, the weights, pointers, and interrupts are initialized. Ideally, the weights should begin randomized to help initial training, but because of difficulties in randomizing on a DSP chip, it was decided to set all but the first weight to zero, with the first weight starting at an arbitrary value. The software then jumps to the appropriate subroutine, as determined by the subroutine pointer (which is modified via the interrupts). If in sleeping mode, the program will return immediately to call the next subroutine. In controlling mode, the arm angle is retrieved and scaled, then passed through the neuron filtering subroutine, then scaled again and sent out as the control signal, and the program returns to call the next subroutine. In learning mode, both the arm angle and joystick signals are retrieved and scaled, then the arm angle is filtered, scaled, and output to give an indication of how training is progressing; the filtered but un-scaled arm angle is also subtracted from the joystick signal to find the error, and the weights are updated using that error, and the program then calls the next subroutine.
As discussed previously, to connect a signal with DC levels to the DSP,
the line inputs need to be used. To prepare the software to receive
from the line inputs, the 4th bit (IS -- Input Select) of Data Time
Slot 7 needs to be set to 0. For microphone inputs, the data constant
(as derived from the evaluation kit CODEC.ASM notation) TONE_INPUT
needs to be set to
TONE_INPUT EQU MIC_IN_SELECT+(15*MONITOR_ATTN),
whereas for line inputs, it needs to be set to
TONE_INPUT EQU LIN_IN_SELECT+(15*MONITOR_ATTN),
where the *_IN_SELECT constants have the appropriate bit set to either
1 or 0.
The motor arm control signal and angle sensor and the joystick are all set up to have values centered at 0 V, but the DSP requires input signals in the range of 2.1 +/- 1.2 V [4]; therefore, interface circuitry is required. The offset is needed to correctly bias the signal for the LINE inputs; the 1.2 V limit will keep the inputs from saturating. The joystick was powered by the 5 V, and its output was sent to the robot arm and through a gain-and-level shifting circuit to the DSP [Fig. 14]. The signal flow is as follows: 5 V -> -1.2 V -> 1.2 V -> -1.2 V - CMOUT -> 1.2 V + CMOUT, where CMOUT is the center voltage the DSP needs. Since the DSP generates a level-shifted and inverted (as noted before) control signal, a similar level-shift-and-inverting-gain circuit was used, but with one fewer inversion, resulting in a correct-polarity signal. Also, the arm-angle-to-DSP circuit has the same configuration as the joystick-to-DSP circuit.
Figure 14: Joystick, Arm Angle,
and ANN Output Interface Circuitry
Multiple tests of the network's ability to learn were carried out throughout the project. A simple test is to train the neuron to duplicate (autoassociate to) an input sinusoid, with the motor arm assembly completely removed from the system. To this end, a signal generator producing a 1 V, 1 kHz sine wave was connected to both inputs (input data and desired value). After two to three minutes of learning (with mu=0.05, as seen in Appendix A), the neuron was switched to controlling mode, and the output was indeed a sine wave similar to that of the input. (Specifically, there was a phase difference -- due to time delay -- and the 3:1 input/output ratio previously noted.)
To verify that the ANN will train to a wider range of frequencies, the function generator was set to a 100 - 1000 Hz frequency sweep. The network output matched at all those frequencies during training after only one minute of training. In controlling mode, various frequencies in the trained range were tested, and the output did match the input. Changing the input during controlling mode to a triangle wave, the network produced a slightly triangularized sinusoid, showing its adaptability. The network was also trained to a square wave, which it mimicked quite well, considering the high frequencies involved. However, when training back to a sine wave after the weights were trained to a square wave, the training time was quite long (on the order of ten minutes).
Periodic signals trained well; however, aperiodic training needed to be verified as well. The network was thus trained to autoassociate to a joystick signal. After two to three minutes of training, the ANN output signal was virtually identical to the input during training. When switched to controlling mode, it continued to similarly mimic the input signal. However, when the joystick input was held at a steady DC voltage, the output would drop to 0 V, which indicated the presence of an output capacitor. As a final test of learning (before training to an actual control signal), the network was trained to a sinusoidal output with a triangular input (using function generators). The network correctly associated this input-output pair.
Finally, the robot arm assembly was connected to the ANN and the joystick in controlling mode as shown in Figure 2. The neural network was trained given a semi-random joystick control, maneuvering the robot arm near 0 for approximately two minutes. In controlling mode, for small disturbances, control signals were appropriate in direction, though not always strong enough to overcome the disturbance. Specifically, when given a manual pulse (hitting the robot arm quickly) in either direction, the ANN was able to stabilize the robot arm back to 0 [Fig. 15].
Figure 15: Manual Disturbances to
the Robot Arm
To obtain repeatable disturbances, a signal generator was added to the motor control signal, and a range of pulse widths and magnitudes were tested. At and above a pulse width of 0.167 seconds (a half-cycle pulse of a 3 Hz square wave), the ANN can stabilize the arm with disturbance magnitudes up to 4.5 V. (With pulse frequencies below 3 Hz -- that is, slow disturbances -- the arm would not be consistently stabilized, because the output capacitor previously mentioned would not allow a DC control signal. However, through the network's performance at higher frequencies, full confidence is established that the ANN would stabilize the robot arm even with slow disturbances without the capacitive blocking.) Stabilization of generated disturbances in both directions, with a 3 Hz and 3 V half-cycle pulse, is shown in Figure 16.
Figure 16: Generated 3 Hz, 3 V
disturbances to the Robot Arm
To test generalization, a screwdriver was taped to the end of the robot arm, thus unbalancing the system; using the same weights as the previous tests used, the network was still able to stabilize the robot-arm-and-screwdriver system, indicating the robustness of the controller.
The goal of this project was to show that an ADALINE neural network implemented on a DSP chip would be able to control an inverted robot arm to stabilize the arm to a vertical position. For most disturbances, the neural network does just that: it reacts to an arm angle variation from 0 and sends the appropriate control signal to stabilize the robot arm. It functions for manual and generated disturbance pulses, and for modifications to the robot arm system, thus showing the adaptability of the neural network control method.
[1] Gary L. Dempsey. Lecture notes from "EE410/691 - Special Topics: Artificial Neural Networks," Bradley University Department of Electrical & Computer Engineering & Technology, Lectures 1-3, 13-16, Spring 1997.
[2] Derrick H. Nguyen and Bernard Widrow. "Neural Networks for Self-Learning Control Systems," IEEE Control Systems Magazine, pp. 18-23, April 1990.
[3] Gary L. Dempsey. Personal interactions with Peter C. Jones and Scott D. Tepavich, November 1996 - February 1997.
[4] Motorola. DSP56002 EVM Evaluation Module Kit manuals, data sheets, and example code.