INS: implement a new strategy for fast sampling, take II

olliw42 · November 28, 2016, 9:27am

this I write as “my personal” sum-up of a discussion on digital filtering in the pull request #5247, “INS: implement a new strategy for fast sampling” (https://github.com/ArduPilot/ardupilot/pull/5247). Unfortunately, some misleading statements were made, by all sides, and as the non-expert I am I couldn’t stop sorting out things until I reached a conclusion which at least satisfied myself. I’m not expecting that writing it up would make any difference, past experience wouldn’t suggest that, but I believe that the web would not be what it is if folks wouldn’t write up their thoughts, useful or not. Since the PR got closed, I put it here.

ArduPilot’s new filter strategy consists of reading the gyro and acc data at 8 kHz or 4 kHz, respectively, to filter them using the “common” 1st order low-pass law yn += alpha*( xn - yn-1 ), to average them (which is essentially equal to a CIC filter), and to finally decimate them down to 1 kHz sample rate. The topic attracted me because I’m working on a better filtering in “my” STorM32 NT Imu modules since quite some time, and came up with a similar strategy, with some differences though, which spurred questions. The design of a filter is obviously a trade-off between several design requirements, and what is considered “best” obviously can depend a lot on the particular application, i.e., the “best” filter for ArduPilot and STorM32 NT may not be equal. Some goals, I assume, are however shared, especially achieving better vibration immunity and/or anti-aliasing.

I. LPF Filter

I questioned the use of the LPF yn += alpha*( xn - yn-1 ), which is derived by backwards differencing from the continuous 1st order LPF filter, and suggested the use of yn += -alpha * yn-1 + 1/2 * alpha * ( xn + xn-1 ), which is obtained by using the bilinear substitution (also known as Tustin’s approximation as priseborough pointed out). I was arguing that the bilinear filter has a zero amplitude (infinite attenuation) while the backwards differencing filter does not, which should be good for anti-aliasing. lthall presented however a plot in which the response of the implemented LPF (=backwards differencing filter) was drawn as blue line, which showed a zero at the Nyquist frequency of 2 kHz. priseborough pointed out some advantages/disadvantages of the bilinear filter, with the disadvantages (i) It produces a relative degree of zero adding a mult and an add operation, (ii) The frequency of the poles is increasingly distorted as the Nyquist frequency is reached, but that this can be avoided by pre-warping. He concluded that there may be some benefit in terms of noise attenuation around the Nyquist frequency. I’ll argue now that the presented blue line is incorrect, that the disadvantages of the bilinear filter are nil, and that the attenuation benefit is substantial.

The difference law yn += alpha*( xn - yn-1 ) corresponds to the transfer function H(z) = alpha / (1 - (1-alpha) z^-1)), from which the frequency response is obtained by inserting exp(j w dt) and taking the magnitude. I prefer working with “real” frequencies, and to denote the time steps by dt, thus j w dt. The sample frequency is f_samplerate = 1/dt. The amplitude is found as |H(w)| = sqrt{ alpha^2 /( 1+(1-alpha)^2 - 2(1-alpha)cos(wdt) )}, which obviously does not have any zero (and also no pole).

In contrast, the bilinear filter, or difference law yn += -alpha * yn-1 + 1/2 * alpha * ( xn + xn-1 ), which corresponds to H(z) = 1/2 alpha*(1 + z^-1) / (1 - (1-alpha) z^-1)), yields the amplitude response |H(w)| = sqrt{ 1/2 alpha^2 (1+cos(wdt)) / ( 1+(1-alpha)^2 - 2(1-alpha)cos(wdt) )}. This has a zero at w dt = pi or the Nyquist frequency f_N = 1/2 f_samplerate.

Thus, the shown blue curve is incorrect, i.e., does not represent the implemented LPF. It shows zero gain at 2 kHz, as expected for the bilinear filter. One thus could speculate that the blue line was in fact calculated for a bilinear filter, maybe because canned biquad functions were used, but without the calculation details that’s speculation.

As regards the computational complexity, the bilinear filter formally requires one additional multiplication, as pointed out by priseborough, which I rebutted, somewhat misleadingly. In fact, when using floating point arithmetic (as ArduPilot does), one additional multiplication is indeed needed, to account for the 1/2 in the bilinear filter formula. If a mult is substantially more costly than a add, one could use yn += -alpha/2 *( yn-1 + yn-1 + xn + xn-1 ). However, when using integer arithmetic (as I do) this is just a shift, or one asm command, and thus extremely cheap, cheaper than an addition. Anyway, I conclude, in integer arithmetic the additional cost of the bilinear filter is cheap, just one addition and one shift, while in floating point arithmetic it is one add and one mult or two adds.

The bilinear filter warps the frequency axis with concomitant phase distortions, which are most notable near the Nyquist frequency, as pointed out by priseborough. However, I think that this is not an issue. For LTI systems causality enforces that any attenuation is accompanied by a phase shift (Kramers-Kronig). Thus, zero gain at the Nyquist frequency can’t come “for free”, and that will be true in general terms for any filter with strong attenuation. So, it’s IMHO a bit of a no-brainer since the underlying question is how much attenuation is desired; a filter designer can play a bit with how to distribute that along the frequency axis, but that’s it. The suggested pre-warping technique is just a method for determining the filter coefficients, but does not affect the filter behavior in any way. So, these points by themselves should not derail the bilinear filter.

The bilinear filter’s zero gain at the Nyquist frequency is in my opinion however very much desired in terms of anti-aliasing. It ensures strong attenuation for frequencies close to the Nyquist frequency, which are the most crucial ones to filter out as they would alias to low frequencies (see e.g. tridge’s plots from the 21 Nov 2016 DEV call), and affect the controller behavior most negatively. At the same time the bilinear filter has a slightly better phase performance at the cut-off frequency, and is not really very expensive computational wise. To me it is thus clear that the advantages of the bilinear filter outweight its disadvantages by far, and thus is my choice.

II. CIC/Averaging for the acc data

In addition to the LPF the acc data are averaged, which corresponds to H(z) = 1/M (1 - z^-M)/(1 - z^-1). Thus |H(w)| = 1/M sqrt{ (1 - cos(wdtM))/(1 - cos(wdt)) }, which has zeros at w dt M = 2 pi n, with n <> M. For the acc sample rate of 4 Khz the employed averaging width M = 4 leads to zeros at 1 kHz, 2 kHz, and 3 kHz, exactly as shown in the red curve in lthall’s plot in the mentioned pull request. The zero at 2 kHz or the Nyquist frequency is again desirable for the same reason discussed for the bilinear filter. However, for the purpose of filtering, the zero at the cut-off frequency of 1 kHz is not desirable. This zero is, so to say, the CIC filter’s way of producing attenuation above the cutoff frequency, but comes along with significant distortions in the amplitude and phase behavior. In terms of vibration attenuation and anti-aliasing there is no good reason to put a zero with its negative consequences at the cutoff frequency, except maybe that averaging is a quite cheap operation. Also, since the acc data enters any AHRS calculation and/or controller only through some (implicit or explicit) additional filtering, the “ideal” step-response behavior of an averaging filter is no argument in favor of it either. Thus, using an averaging is less than ideal for the purpose of filtering the acc data. I conclude that a biquad filter, which also produces a zero at the Nyquist frequency but is more well-behaved in the transmission region, should be a significantly better choice, and only slightly more costly.

III. CIC/Averaging for the gyro data

The gyro data is only averaged (and decimated). In contrast to the case of the acc data, averaging might be the “best” choice here for two reasons. (1) Noise and anti-aliasing is much less of a problem than for the acc data. This is simply so because the acc measures acceleration and the gyro velocity (1/s^2 vs 1/s), and high-frequency components are thus generically amplified by a (relative) factor w in the acc data, which is consistent with experimental experience. Thus, filtering is not as serious for the gyro data as for the acc data, and less than ideal choices may be acceptable. (2) Averaging the gyro data can be considered as a multi-rate approach to the AHRS calculations. I.e., averaging the gyro data essentially corresponds to integrating the attitude equation of motion at high frequency, with neglecting the non-community of rotations or conning corrections. I thus conclude that for the gyro data an averager is a good choice, and it’s my pick.

IV. Performance

ArduPilot does the above filtering in floating point arithmetic. However, both the acc and gyro data come in as 16-bit integers, and using 32 bit integer arithmetic suggests itself. In fact, the above filters are easily implemented in 32 bit integer. The filter coefficients can even be set to inverse integer values without much loss in generality, preventing discretization noise and such artefacts. For instance, with alpha = 1/4 the cut-off frequency of a bilinear LPF filter for the acc data would be 182 Hz, very close to the currently used 188 Hz. I’ve never worked with a F4, and thus don’t know how effective its FPU is in comparison to an integer arithmetic implementation. Anyway, one easily could change to an implementation where the high-speed filtering is done with integers, and conversion to floating point is done (only) when decimating to 1 kHz. The bottom line is that there is not really a performance concern with the high-frequency filters, even with the more elaborate schemes discussed in the above. On a F1 at 76 Mhz they take far less than 10 us, 5 us or so. On a F4 we are talking about few us at most.

V. Delay/Latency

I also questioned ArduCopter’s practice to decimate to 1 kHz, where the 1 kHz is based on the MPU’s internal oscillator which is slightly out of track with the processor frequency, and which is incommensurate to the 400 Hz update frequency of the copter controller loop. The result are latencies of up to 1 ms, and 0.5 ms on average, which are substantial. The suggestion was to use a moving average and to draw the latest data point, reducing the maximum latency to 125 us. priseborough convincingly argued that the implemented scheme gives more consistent results in the AHRS estimation, which I presume is more important for autonomous vehicles. However, it is appropriate to state that ArduPilot has an intrinsic disadvantage as regards control in comparison to fully synchronous schemes or a “draw the latest” scheme. In my application, the latter yields much better performance, and is thus my choice.

VI. Vibration isolation

A corollary to the argument (1) in III. seems appropriate. The flight controllers or IMU units I’m aware of have both the gyro and acc on the same “board”, and vibration isolation, for reducing the vibration especially in the acc data, is done by isolating this “board” carrying both the gyro and acc. This is not optimal. In fact, argument (1) suggests to mount the gyro directly onto the copter, as it is typically done in e.g. racers, and to only vibration isolate the acc sensor. For the gyro this should often be very acceptable vibration-wise, but has the benefit of a most accurate sampling of fast copter movements and thus potentially better (grispier) control. The acc data enters only via the data fusion anyhow, and thus high levels of mechanical and digital filtering wouldn’t deteriorate control response. This separation also would make mechanical vibration isolation a much easier task. That line of thoughts I find the logical (next) step for further improvement. For e.g. Pixhawk2 this would suggest that the “main” IMU is not using data from one physical IMU chip, but e.g. the gyro data from an IMU chip on the main board, and the acc data from the IMU chip on the internally damped part. The acc of the main board IMU would then be only for backup, and the gyro of the internally damped (and temperature controlled) IMU for e.g. improving gyro bias estimation.

Very long-winded text, I know, but that’s what had been in my mind. LOL. Have fun.