RME Tech Info - Low Latency Background: Monitoring, ZLM and ASIO

Low Latency Background
Monitoring, ZLM and ASIO

Introduction

The dream of a complete sound studio inside the PC is becoming reality. Thanks to ever faster CPUs, hard drives that keep getting faster and larger, and remarkably capable applications, complete audio and video productions can be handled entirely within the computer. In the past it was unthinkable to use a computer not only as a MIDI sequencer but also as a multitrack tape recorder and a mixing console at the same time, without special dedicated DSP hardware.

One of the problems that now arises is latency: the unavoidable delay between an audio signal being fed to the computer and its output. This delay inevitably leads to monitoring problems, because no musician can play in time with other instruments if he hears himself or the band a couple of hundred milliseconds late.

Monitoring

Long before real-time synthesizers, audio computers faced monitoring as their first challenge. Every tape recorder routes the input signal of a record track to the output when recording starts. This recording-dependent routing not only guarantees that a musician can hear himself without delay, it also controls the monitoring automatically. As long as no recording is going on, the recorded signal is played back from tape. As soon as a recording starts, however, the input signal is routed through and can be heard at the output.

What had been everyday practice in a normal studio was not available on PCs and Macs at first. Software developers didn't even think about controlling the monitoring via the audio software. To do so, the software has to read in, process and output the audio data. With standard Windows drivers (MME), delays of more than 400 ms are typical, i.e. the monitor signal is about half a second late - just fine as a delay effect, but that's it.

The block diagram shows the typical signal flow between audio hardware and software. A digital input signal passes a special receiver, an analog signal passes an ADC (Analog-to-Digital Converter). Because of oversampling and digital filtering, modern AD converters introduce a delay of around 44 samples, which is 1 millisecond at 44.1 kHz. Before being transferred to the PCI bus, the data is buffered once more. Depending on the hardware concept, this delay can be a few samples, but also more than a millisecond. Using MME, the software controls a couple of relatively large buffers; here we see 4 buffers of 8192 bytes each, translating to 186 ms (at 16 bit, see Low Latency Background: Buffers and Latency for details). Processing of the data is usually very fast, with less than a sample of delay. But this is only true for the functions of a simple mixer. Inserted effect plug-ins cause further delays, sometimes of several milliseconds or more.
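The buffer and converter figures above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function names are made up for illustration, and it assumes 16-bit stereo audio at 44.1 kHz (4 bytes per sample frame), which is the combination that reproduces the 186 ms quoted for 4 buffers of 8192 bytes.

```python
def buffer_latency_ms(num_buffers, buffer_bytes, sample_rate_hz=44100,
                      bits_per_sample=16, channels=2):
    """Delay represented by a set of audio buffers, in milliseconds.

    Assumption: interleaved PCM, so one sample frame occupies
    (bits_per_sample / 8) * channels bytes.
    """
    bytes_per_frame = (bits_per_sample // 8) * channels
    frames = num_buffers * buffer_bytes / bytes_per_frame
    return 1000.0 * frames / sample_rate_hz


def samples_to_ms(samples, sample_rate_hz=44100):
    """Convert a delay given in samples to milliseconds."""
    return 1000.0 * samples / sample_rate_hz


# Record side of the example: 4 MME buffers of 8192 bytes each.
record_ms = buffer_latency_ms(4, 8192)
# Converter delay of about 44 samples.
converter_ms = samples_to_ms(44)

print(f"record buffers: {record_ms:.1f} ms")   # about 186 ms
print(f"converter:      {converter_ms:.2f} ms")  # about 1 ms
```

The same function can be applied to any other buffer count, buffer size or sample format to see how strongly the buffer configuration dominates the converter delay.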

For playback, somewhat smaller buffers are sufficient most of the time, here 4 times 4096 bytes, corresponding to 93 ms. Depending on the hardware, the data to be output is buffered again and finally output in analog or digital form. While the digital transmitter / encoder usually doesn't cause any delay, the D-to-A Converter (DAC) adds another 44 samples or 1 millisecond. In total, for a complete pass from input to output the delay sums up to around 200 ms (in this example).

Monitoring in hardware, on the other hand, works virtually without delay: with a switch, the incoming data can be sent directly to the output. In this case the playback data is no longer available. This concept works quite well with simple applications like Wavelab, but not with full duplex applications. To allow immediate reaction, click-free punch-ins and punch-outs, and overall stable performance, almost all multitrack software keeps the playback and record processes running at the same time.

If empty tracks are played back, zeros are output. If no recording is going on, nothing is written to the hard drive. However, the audio hardware has to run all the time. Thus the input signal cannot be routed through from the actual recording point, because the hardware doesn't know whether a recording is going on. Cards without an internal mixer (mainly professional solutions) cannot provide any monitoring at all, because the playback tracks must have priority. Otherwise the hardware would no longer be full duplex.

While most software companies left (and still leave!) their customers alone with this problem, SEK'D has offered a solution for a couple of years. The function 'monitoring in hardware' sends a special command to the audio hardware with every punch-in, which sets the track concerned to route-through mode. This holds until the punch-out is reached and a switch-off command is sent. This technique is supported by all audio cards from the manufacturers Marian and RME. Just in case you wonder what should be so special about this feature - go ask Microsoft or Apple why they don't support something like this directly in their operating systems...
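The punch-in / punch-out switching described above can be pictured as a simple two-state switch per track. The sketch below is purely illustrative: the class and method names are hypothetical and do not correspond to any actual SEK'D, Marian or RME interface.

```python
class MonitoredTrack:
    """Hypothetical model of one track on a card with 'monitoring in hardware'."""

    def __init__(self):
        # False: the output carries playback data (the default, full duplex case).
        # True:  the input is routed straight through to the output.
        self.route_through = False

    def punch_in(self):
        """Command sent by the software when recording starts on this track."""
        self.route_through = True

    def punch_out(self):
        """Command sent by the software when recording stops on this track."""
        self.route_through = False

    def output(self, input_sample, playback_sample):
        """Sample that appears at the card's output for this track."""
        return input_sample if self.route_through else playback_sample


track = MonitoredTrack()
print(track.output(0.5, 0.1))  # playback data is heard
track.punch_in()
print(track.output(0.5, 0.1))  # input signal is heard, without buffer delay
track.punch_out()
print(track.output(0.5, 0.1))  # playback data again
```

The point of the model is that the switching decision comes from the software, which knows when a recording is going on, while the routing itself happens on the card.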

ZLM®...

The term 'Zero Latency Monitoring' was introduced in 1998 by RME with the DIGI96 series and stands for the technique described above (routing the input signal directly to the output on the audio card). Since then, the idea behind it has become one of the most important features of modern hard disk recording.

...and ADM

With ASIO Direct Monitoring (ADM, available since ASIO 2.0), Steinberg has not only introduced ZLM to ASIO, but also extended it substantially. Like ZLM, ADM allows monitoring the input signal via the hardware in real-time. Beyond that, ADM supports panorama, volume and routing, which however requires a mixer (i.e. DSP functionality) in the hardware. This makes it possible to mirror a routing made in a software mixer in the hardware in real-time, so that the difference in sound between playback and monitoring is very small. Overall, ADM is a substantial step towards 'mixer and tape recorder inside the computer'.

Enhanced ZLM®

Direct routing-through of digital data usually requires identical and synchronous sample rates at input and output. In other words: monitoring of an input signal is only possible in slave mode (or AutoSync). In fact, the first generation of DIGI96 cards had to be set to this clock mode to use ZLM. The exclusive Enhanced ZLM process from RME, found in all current RME cards, removes this restriction. It allows monitoring the input signal independently of clock mode and sample frequency, even with different data rates at input and output, and is available in both ZLM and ADM operation.

Low - Lower - Lowest Latency

With this, the dream of a complete recording studio in a PC without dedicated DSP hardware has come true (this is called native processing, because the CPU performs all calculations on its own). Unfortunately, there is one downside: Windows is still primarily an office operating system, not one meant for the sound studio. It therefore tries hard to sabotage operation at low latencies. Even with ASIO, many background operations from Windows or other applications briefly block the CPU and thus cause drop-outs.

Danger recognized - danger removed! How to systematically remove typical traps and avoid further ones is explained in the Tech Info Tuning Tips for Low Latency operation.

Increasing the latency increases reliability and reduces the risk of drop-outs. On the other hand, every decrease in latency leads to higher system load, because the number of notifications (interrupts) announcing an upcoming data transfer increases. For example, at a latency setting of 1.5 ms (translating to a buffer size of 64 samples) the Hammerfall triggers an interrupt every 1.5 milliseconds, that means 666 times per second!
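The trade-off between buffer size and interrupt load can be tabulated with a short calculation. This is a rough sketch with a made-up function name, and it assumes a sample rate of 44.1 kHz, which the text does not state explicitly for this example. At that rate 64 samples correspond to about 1.45 ms, or roughly 689 interrupts per second; the figure of 666 follows from the rounded value of 1.5 ms.

```python
def interrupt_stats(buffer_samples, sample_rate_hz=44100):
    """Return (buffer period in ms, interrupts per second) for one buffer size.

    Assumption: the card raises exactly one interrupt per buffer.
    """
    period_ms = 1000.0 * buffer_samples / sample_rate_hz
    interrupts_per_second = sample_rate_hz / buffer_samples
    return period_ms, interrupts_per_second


# Halving the buffer size halves the latency but doubles the interrupt rate.
for size in (64, 128, 256, 512, 1024):
    period_ms, rate = interrupt_stats(size)
    print(f"{size:5d} samples -> {period_ms:6.2f} ms, {rate:6.1f} interrupts/s")
```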

But not everything that is technically feasible is also sensible. A practical example: a bass player 3 meters away from the drummer hears the drums with a delay of 9 ms (speed of sound 340 m/s). Even close to the crash cymbals (1 m, tinnitus guaranteed) there are still 3 ms. In view of those numbers, a latency of 6 ms seems more than sufficient. In practice, you can work wonderfully even with a fixed latency of 10 ms. Anyone can verify this for themselves: just delay the audio output of a keyboard on purpose. The difference between the delayed and the non-delayed signal can be felt clearly, but one gets used to it quickly and can play in the groove without any problem.
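The comparison between acoustic delay and system latency is simple arithmetic. The snippet below is a small sketch using the speed of sound of 340 m/s given in the text; the function names are only illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 340.0  # value used in the text


def acoustic_delay_ms(distance_m, speed=SPEED_OF_SOUND_M_PER_S):
    """Time in milliseconds that sound needs to travel distance_m."""
    return 1000.0 * distance_m / speed


def equivalent_distance_m(latency_ms, speed=SPEED_OF_SOUND_M_PER_S):
    """Distance in meters at which sound arrives with the given delay."""
    return speed * latency_ms / 1000.0


print(f"3 m from the drums:  {acoustic_delay_ms(3):.1f} ms")  # about 9 ms
print(f"1 m from the cymbal: {acoustic_delay_ms(1):.1f} ms")  # about 3 ms
print(f"10 ms of latency equals {equivalent_distance_m(10):.1f} m of air")
```

Seen this way, a fixed latency of 10 ms is comparable to standing about 3.4 m away from the sound source.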

This claim will surely be doubted by many studio professionals, who declare a production with a variation of a few milliseconds in the MIDI timing unusable. They are right, but they don't mean a fixed delay, they mean a varying delay, as caused by

Latency Jitter

Latency Jitter means variation in the latency, i.e. in a sense variation in the reaction time of the system. We have released a separate Tech Info about this issue: Low Latency Background: Buffers and Latency Jitter.

Copyright © Matthias Carstens, 2000.

All entries in this Tech Info paper have been thoroughly checked, however no guarantee for correctness can be given. RME cannot be held responsible for any misleading or incorrect information provided throughout this manual. Lending or copying any part or the complete document or its contents is only possible with the written permission from RME.
