Benchmarking Single-Point Performance on National Instruments Real-Time Hardware
Overview
National Instruments Real-Time hardware and software are used in a wide variety of application domains. As part of our commitment to providing up-to-date information on the performance characteristics of our products, NI regularly produces benchmarks that help existing and potential customers design the best hardware and software configurations for their particular needs.
This document provides results from a set of single-point benchmarks that NI R&D ran on a cross section of NI real-time hardware. The benchmarks are intended to indicate the single-point performance of NI hardware and software. In isolation, these benchmarks cannot provide a guide to entire system performance, but they can assist in selecting the appropriate platform for a particular application by comparing the different hardware/software combinations on a set of simple, standard tests.
In this paper, we begin with a description of the three hardware platforms tested, CompactRIO (cRIO), Compact FieldPoint (cFP), and PXI/PCI, follow with a description of the single-point tests, and end with detailed benchmark numbers for each of the tests.
CompactRIO
The NI CompactRIO programmable automation controller (PAC) is a low-cost reconfigurable control and acquisition system designed for applications that require high performance and reliability. The system combines an open embedded architecture with small size, extreme ruggedness, and hot-swappable industrial I/O modules. CompactRIO is powered by reconfigurable I/O (
Figure 1: CompactRIO System
To program the CompactRIO system, you need LabVIEW, the LabVIEW Real-Time Module, and the LabVIEW FPGA Module. In our tests, we used the LabVIEW FPGA Module to gain access to the I/O, while computation, local logging, and host communication were performed on the real-time controller.
Figure 2: cRIO AI/AO Sample Diagram on Real-Time Controller
Compact FieldPoint
Designed for industrial control, the NI Compact FieldPoint programmable automation controller (PAC) offers the flexibility and ease of use of a PC and the reliability of a PLC. With Compact FieldPoint, you can embed all of the intelligence, advanced control, and analysis capabilities of LabVIEW in a small modular package suitable for industrial environments.
Figure 3: Compact FieldPoint System
Compact FieldPoint is programmed using LabVIEW and the LabVIEW Real-Time Module, along with the FieldPoint I/O API.
Figure 4: FieldPoint AI/AO Sample Diagram
PXI/PCI
PCI eXtensions for Instrumentation (PXI) is the open, multivendor standard for measurement and automation with access to a wide variety of I/O and communication modules, including data acquisition, modular instrumentation, reconfigurable I/O (RIO), image acquisition, motion control, Ethernet, serial, CAN, DeviceNet, reflective memory, and more. With PXI, you automatically benefit from the low cost, ease of use, modularity, and flexibility of PC technology. The PXI system is programmed using LabVIEW and the LabVIEW Real-Time Module, and in these tests we used NI DAQmx software and a data acquisition board for access to single-point I/O.
Figure 5: PXI Controller
In addition to the PXI form factor, the same suite of software products can be run on standard off-the-shelf PC hardware.
Figure 6: DAQmx AI/AO Sample Diagram
Single-Point Tests
NI R&D designed a set of single-point tests to gauge the performance of the systems across a variety of program architectures. The table below lists the tests along with a brief description. Each test used 1, 4, and 16 channels.
Table 1: Single-Point Tests
Testing For “Lateness”
We created tests to determine the fastest loop the systems could attain without losing data or being “late.” These loop rates are reported later in this paper. A loop iteration is considered late if the software is unable to receive a sample from the I/O hardware, process that sample, and output the result before the next input sample is ready.
The NI-DAQmx and NI-RIO drivers are capable of hardware-timed single-point operations, and they provide feedback to help ensure that the software keeps up with the hardware clock. The FieldPoint driver is purely software timed, so the driver does not provide this feedback.
See Appendix A for a complete discussion of how lateness checking can be accomplished with each of these three drivers.
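The lateness criterion described above can be sketched in a few lines of Python. This is only an illustration of the definition, not code from the benchmarks: the function names and the measured iteration times are hypothetical, and an iteration is treated as late whenever the time needed to read, process, and write exceeds one sample period.

```python
def find_late_iterations(sample_rate_hz, iteration_times_s):
    """Return indices of loop iterations that overran one sample period.

    iteration_times_s holds the (hypothetical) time in seconds that each
    iteration needed to read a sample, process it, and write the output.
    """
    period_s = 1.0 / sample_rate_hz
    return [i for i, t in enumerate(iteration_times_s) if t > period_s]


def is_rate_sustainable(sample_rate_hz, iteration_times_s):
    """A rate counts as sustainable only if no iteration was late."""
    return not find_late_iterations(sample_rate_hz, iteration_times_s)


# Hypothetical iteration times: 20 us, 24 us, 31 us, 22 us.
times = [20e-6, 24e-6, 31e-6, 22e-6]
print(find_late_iterations(40_000, times))  # sample period is 25 us
print(is_rate_sustainable(30_000, times))   # sample period is about 33.3 us
```

With these made-up numbers, the third iteration misses the 25 µs deadline at 40 kHz, while every iteration fits within the longer period at 30 kHz.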
Polling vs. Interrupts
A key consideration when designing a LabVIEW real-time application is whether the system needs to concurrently perform its time-critical function along with other non-time-critical operations, such as local disk logging or communication with the host. This decision dictates the basic architecture of the real-time application.
Polling can be used as the I/O mode when the system has no non-time-critical responsibilities or, more realistically, when the system uses a state machine to schedule time-critical and non-time-critical tasks to operate sequentially. For most I/O drivers, polling mode is faster than interrupt mode.
Although slower than polling mode, interrupt mode is more common in real-time applications because most applications contain a mix of time-critical and non-time-critical functions running concurrently. Interrupts allow the I/O portion of a diagram to suspend its operation and allow other code, such as communication and logging, to run while the hardware is in the process of acquiring data. Once the hardware has finished its acquisition, it raises an interrupt to notify the software that it should resume its time-critical I/O processing.
As noted in the table, we ran the single-point tests using the appropriate I/O mode to highlight the differences between these two modes of operation.
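The difference between the two I/O modes can be illustrated with ordinary Python threads. This is an analogy only and does not use LabVIEW or any NI driver API: the "hardware" is a hypothetical thread, a blocking wait on an event stands in for sleeping until an interrupt arrives, and a busy loop on a flag stands in for polling.

```python
import threading


def run_io_loop(num_samples, use_interrupt):
    """Consume samples from a simulated hardware thread.

    use_interrupt=True blocks on an event (analogous to sleeping until the
    hardware raises an interrupt); False spins on the flag (polling).
    Returns (samples received, number of idle polls).
    """
    data_ready = threading.Event()
    consumed = threading.Event()
    sample = {"value": None}

    def hardware():
        for i in range(num_samples):
            sample["value"] = i
            consumed.clear()
            data_ready.set()   # acquisition finished
            consumed.wait()    # wait until the software has serviced it

    thread = threading.Thread(target=hardware)
    thread.start()

    received, idle_polls = [], 0
    for _ in range(num_samples):
        if use_interrupt:
            data_ready.wait()  # other threads can run while we sleep
        else:
            while not data_ready.is_set():
                idle_polls += 1  # busy-wait: this thread does no other work
        received.append(sample["value"])
        data_ready.clear()
        consumed.set()
    thread.join()
    return received, idle_polls


print(run_io_loop(5, use_interrupt=True))
print(run_io_loop(5, use_interrupt=False)[0])
```

Both modes receive every sample. The interrupt-style loop never spins, which leaves processor time for other work, whereas the polling loop occupies its thread for the whole wait.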
Polling with Microsecond Wait
A variation on the polling-mode architecture that we did not use is to call the microsecond wait function in the time-critical loop in conjunction with the polling I/O mode. This approach allows the programmer to reserve a predetermined block of time for concurrent non-time-critical code. The polling numbers presented in this paper establish the maximum rate for such a mode, assuming a zero-microsecond wait; increasing the duration of the microsecond wait lengthens the loop period linearly and reduces the loop rate accordingly.
Figure 7: DAQmx Example with Microsecond Wait
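As a rough way to estimate the effect of the wait, the sketch below assumes that the wait simply adds to the loop period measured with a zero-microsecond wait. That additive model and the 100 kHz starting rate are assumptions for illustration, not measured results from these benchmarks.

```python
def loop_rate_with_wait(max_polling_rate_hz, wait_us):
    """Estimate the loop rate when a fixed wait is added to each iteration.

    Assumes the wait adds directly to the zero-wait loop period.
    """
    base_period_s = 1.0 / max_polling_rate_hz
    return 1.0 / (base_period_s + wait_us * 1e-6)


# Hypothetical zero-wait polling rate of 100 kHz (10 us per iteration).
for wait_us in (0, 10, 40):
    rate_hz = loop_rate_with_wait(100_000, wait_us)
    print(f"{wait_us:>2} us wait: about {rate_hz / 1000:.1f} kHz")
```

Under this model, a 10 µs wait on a loop that already takes 10 µs halves the loop rate, and each additional microsecond of wait adds one microsecond to the loop period.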
Real-Time FIFOs
An additional consideration when building systems that require concurrent time-critical and non-time-critical functions is how to transfer data from the time-critical loop to the non-time-critical loop so that it can be logged or communicated back to the host. Real-Time FIFOs, available as part of the LabVIEW Real-Time Module, provide this functionality. Figure 8 shows how tests T3 and T4 use Real-Time FIFOs to provide jitter-free communication between the time-critical code and the communication/logging functions.
Figure 8. Use of Real-Time FIFOs
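The idea of a fixed-size FIFO between the two loops can be sketched as follows. This is a single-threaded toy in Python, not the LabVIEW Real-Time FIFO API; the class name and the choice to reject new samples when the buffer is full are assumptions made for illustration.

```python
from collections import deque


class BoundedFifo:
    """Fixed-capacity FIFO whose write never blocks the producer."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self.overflows = 0

    def write(self, item):
        """Store item if there is room; otherwise count an overflow."""
        if len(self._items) >= self._capacity:
            self.overflows += 1
            return False
        self._items.append(item)
        return True

    def read_all(self):
        """Drain everything currently queued (non-time-critical side)."""
        drained = list(self._items)
        self._items.clear()
        return drained


fifo = BoundedFifo(capacity=4)
for sample in range(6):   # time-critical loop produces six samples
    fifo.write(sample)
logged = fifo.read_all()  # logging loop drains the FIFO afterwards
print(logged, fifo.overflows)
```

Because the write returns immediately whether or not there is room, the producer's timing does not depend on how quickly the reader drains the buffer. The cost is that samples are lost if the reader falls too far behind.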
Buffer Sizes
Buffer size selection is an important consideration when programming with RT FIFOs, and altering the size of a FIFO buffer might produce a large change in the final loop rate a test can achieve. Larger buffers generally provide better performance at the expense of higher memory usage. For these tests, all buffer sizes were fixed at 4 KB to ensure that the tests could be run on even the most memory-constrained devices.
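To get a feel for what a 4 KB buffer means in practice, the following sketch estimates how many loop iterations fit in the buffer and how long the reader has before it fills. The 8-byte sample size, the interpretation of 4 KB as 4096 bytes per FIFO, and one sample per channel per iteration are all assumptions; the 2.2 kHz figure is the CompactRIO T3 result for 16 channels reported in this paper.

```python
def fifo_capacity(buffer_bytes, bytes_per_sample, channels):
    """Number of complete loop iterations a buffer can hold."""
    return buffer_bytes // (bytes_per_sample * channels)


def seconds_until_full(buffer_bytes, bytes_per_sample, channels, loop_rate_hz):
    """Time available to drain the FIFO before it would overflow."""
    iterations = fifo_capacity(buffer_bytes, bytes_per_sample, channels)
    return iterations / loop_rate_hz


# Assumed: 4096-byte buffer, 8-byte samples, 16 channels, 2.2 kHz loop.
print(fifo_capacity(4096, 8, 16))
print(f"{seconds_until_full(4096, 8, 16, 2200) * 1000:.1f} ms")
```

Under these assumptions the buffer holds 32 iterations of 16-channel data, so the non-time-critical loop would need to run at least once every 15 ms or so to avoid losing data.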
Network Published Shared Variables
Test T4 included communication with a host PC, and the method we chose for this communication was TCP/IP. LabVIEW 8.0 introduced the Network Published Shared Variable, a simplified mechanism for communicating between LabVIEW programs across the network. Currently, the network-published shared variable uses a network protocol optimized for supervisory monitoring of large numbers of variables, not for the high-speed streaming these tests require. In future versions of LabVIEW, National Instruments will optimize the network-published shared variable to support the streaming use case. For more information regarding the shared variable, please refer to the white paper titled LabVIEW Shared Variable available on ni.com.
Test Results
Final test results, along with the specific hardware and software versions, are provided in the following sections.
CompactRIO
Hardware
- NI cRIO-9012: Real-Time Controller
- 400 MHz PowerPC processor
- 64 MB RAM
- (4) NI cRIO-9263: 4-Channel, 100 kS/s, 16-bit, ±10 V, Simultaneous-Update Analog Output Module
- (2) NI cRIO-9201: 8-Channel, ±10 V, 500 kS/s, 12-Bit Analog Input Module
- NI cRIO-9102: 8-Slot, 1 M Gate Reconfigurable Embedded Chassis
Software installed on controller
- LabVIEW Real-Time 8.5
- LabVIEW PID Control Toolkit 8.5
- NI-RIO 2.3.0
- NI-VISA 4.2
- NI-VISA Server 4.2
Notes
- PID performed on real-time controller
Results
| Test | 1 Channel | 4 Channels | 16 Channels |
| T1 | 39.5 kHz | 26.1 kHz | 17.1 kHz |
| T2a | 25.7 kHz | 8.2 kHz | 5.3 kHz |
| T2b | 7.1 kHz | 4.1 kHz | 3.5 kHz |
| T3 | 5.8 kHz | 4.3 kHz | 2.2 kHz |
| T4 | 4.8 kHz | 4.6 kHz | 3.7 kHz |
Table 2: CompactRIO 9012 Test Results
Download the code for the CompactRIO benchmarks
Compact FieldPoint
Hardware
- NI cFP-2120: Rugged Intelligent Ethernet Controller Interface for Compact FieldPoint with Removable Storage
- 200 MHz Pentium-class processor
- 128 MB SDRAM
- (4) NI cFP-AIO-600: 8-Channel Combination Analog Input/Analog Output Module
- (4) NI cFP-CB-1: Integrated Connector Block for Wiring to Compact FieldPoint I/O
- NI cFP-BP-8: 8-Slot Backplane
Software installed on controller
- LabVIEW Real-Time 8.5
- LabVIEW PID Control Toolkit 8.5
- FieldPoint Drivers 6.0.0
- FieldPoint VI Manager 6.0.0
Notes
- Analog channels use 0 to 20 mA current mode
Results
| Test | 1 Channel | 4 Channels | 16 Channels |
| T1 | 619.9 Hz | 611.3 Hz | 312 Hz |
| T2a | 605.6 Hz | 609.8 Hz | 313.7 Hz |
| T2b | N/A | N/A | N/A |
| T3 | 629.6 Hz | 616.2 Hz | 289.1 Hz |
| T4 | 617.3 Hz | 608.4 Hz | 278.3 Hz |
Table 3: Compact FieldPoint Test Results
Download the code for the Compact FieldPoint benchmarks
PXI/DAQmx
Hardware
- NI PXI-8106 RT
- 2.16 GHz Intel Core 2 Duo T7400
- 512 MB dual-channel DDR2 RAM
- NI PXI-6071E: 1.25 MS/s, 12-Bit, 64-Analog-Input Multifunction DAQ
- NI PXI-6723: Static and Waveform Analog Output -- 13-Bit, 32 Channels
Software installed on controller
- LabVIEW Real-Time 8.5
- LabVIEW PID Control Toolkit 8.5
- NI-DAQmx 8.6
  - AO Series
  - Digital I/O
  - Multifunction DAQ
  - TIO Series
Notes
- Ethernet driver set to polling
- Legacy USB disabled
- Turn off CPU load display by setting ni-rt.ini token EnableCPULoadDisplay=False
- NI RT Extensions for SMP not installed
Results
| Test | 1 Channel | 4 Channels | 16 Channels |
| T1 | 107.5 kHz | 67.0 kHz | 30.2 kHz |
| T2a | 104.2 kHz | 58.0 kHz | 27.3 kHz |
| T2b | 47.4 kHz | 30.2 kHz | 18.7 kHz |
| T3 | 38.5 kHz | 25.4 kHz | 16.1 kHz |
| T4 | 45.0 kHz | 29.5 kHz | 18.3 kHz |
Table 4: PXI/DAQmx Test Results
Download the code for the DAQmx benchmarks
PXI/RIO
Hardware
- NI PXI-8106 RT
- 2.16 GHz Intel Core 2 Duo T7400
- 512 MB dual-channel DDR2 RAM
- (2) NI PXI-7831R: Reconfigurable Multifunction I/O
Software installed on controller
- LabVIEW Real-Time 8.5
- LabVIEW PID Control Toolkit 8.5
- NI-RIO 2.3
- NI-VISA 4.2
- NI-VISA Server 4.2
Notes
- Ethernet driver set to polling
- Legacy USB disabled
- Turn off CPU load display by setting ni-rt.ini token EnableCPULoadDisplay=False
- NI RT Extensions for SMP not installed
- PID performed on real-time Pentium controller
Results
| Test | 1 Channel | 4 Channels | 16 Channels |
| T1 | 90.4 kHz | 71.9 kHz | 35.7 kHz |
| T2a | 86.4 kHz | 65.7 kHz | 32.2 kHz |
| T2b | 55.4 kHz | 44.3 kHz | 25.0 kHz |
| T3 | 45.7 kHz | 39.6 kHz | 23.4 kHz |
| T4 | 53.8 kHz | 45.0 kHz | 25.4 kHz |
Table 5: PXI/RIO Test Results
Download the code for the RIO benchmarks
PCI/DAQmx
Hardware
- Intel® Core™ 2 Duo Real-Time Desktop
  - Intel® Core™ 2 Duo E6400 - 2.13 GHz
  - Intel® BOXDG965WHMKR Intel® G965 Express motherboard
  - 1 GB DDR2 PC6400 Dual-Channel RAM
  - Maxtor 160 GB SATA hard drive
  - PCI Intel® Pro/1000 MT network adapter (Intel® 8254X chipset)
- NI PCI-6070E: 12-Bit, 1.25 MS/s, 16-Analog-Input Multifunction DAQ
- NI PCI-6723: Static and Waveform Analog Output -- 13-Bit, 32 Channels
Software installed on controller
- LabVIEW Real-Time 8.5
- LabVIEW PID Control Toolkit 8.5
- NI-DAQmx 8.6
  - AO Series
  - Digital I/O
  - Multifunction DAQ
  - TIO Series
Notes
- Ethernet driver set to polling
- Legacy USB disabled
- Turn off CPU load display by setting ni-rt.ini token EnableCPULoadDisplay=False
- NI RT Extensions for SMP not installed
Results
| Test | 1 Channel | 4 Channels | 16 Channels |
| T1 | 111.7 kHz | 64.3 kHz | 26.5 kHz |
| T2a | 103.1 kHz | 54.9 kHz | 22.7 kHz |
| T2b | 41.6 kHz | 36.5 kHz | 20.0 kHz |
| T3 | 40.2 kHz | 38.8 kHz | 20.8 kHz |
| T4 | 36.2 kHz | 33.2 kHz | 16.7 kHz |
Table 6: PCI/DAQmx Test Results
Download the code for the DAQmx benchmarks
PCI/RIO
Hardware
- Intel® Core™ 2 Duo Real-Time Desktop
  - Intel® Core™ 2 Duo E6400 - 2.13 GHz
  - Intel® BOXDG965WHMKR Intel® G965 Express motherboard
  - 1 GB DDR2 PC6400 Dual-Channel RAM
  - Maxtor 160 GB SATA hard drive
  - PCI Intel® Pro/1000 MT network adapter (Intel® 8254X chipset)
- (2) NI PCI-7831R: Reconfigurable Multifunction I/O
Software installed on controller
- LabVIEW Real-Time 8.5
- LabVIEW PID Control Toolkit 8.5
- NI-RIO 2.3
- NI-VISA 4.2
- NI-VISA Server 4.2
Notes
- Ethernet driver set to polling
- Legacy USB disabled
- Turn off CPU load display by setting ni-rt.ini token EnableCPULoadDisplay=False
- NI RT Extensions for SMP not installed
- PID performed on real-time Pentium controller
Results
| Test | 1 Channel | 4 Channels | 16 Channels |
| T1 | 80.9 kHz | 68.4 kHz | 35.3 kHz |
| T2a | 77.4 kHz | 64.5 kHz | 31.9 kHz |
| T2b | 59.3 kHz | 50.6 kHz | 28.9 kHz |
| T3 | 60.6 kHz | 51.5 kHz | 16.2 kHz |
| T4 | 49.7 kHz | 44.3 kHz | 25.2 kHz |
Table 7: PCI/RIO Test Results
Download the code for the RIO benchmarks
Comparative Graphs: Set 1
Comparative Graphs: Set 2
Appendix A
Lateness checking in NI-DAQmx 7.4 & Higher
The DAQmx driver, version 7.4 and higher, provides feedback to guarantee that no input samples have been lost and that all output channels are updated with their corresponding values before the next Sample Clock signal. To achieve this, DAQmx uses the “Wait For Next Sample Clock” VI to force synchronization of the software I/O loop with the Sample Clock signal of one of its I/O tasks.
For more detailed information on DAQmx lateness detection please refer to the DAQmx Online Help or the following document on the NI Developer Zone: DAQmx Hardware-Timed Single-point Lateness Checking
Lateness checking in NI-RIO
Lateness checking on RIO devices is implemented in the FPGA by using a handshaking mechanism to determine if the real-time software VI is able to receive the input samples, process the data, and provide corresponding output values before the next sample clock signal occurs. The handshaking mechanism can use interrupts, or it can poll a user-defined register to signal that a new I/O cycle has begun.
The diagrams for the interrupt mechanism are depicted below in Figure 9 and Figure 10. The FPGA VI generates an interrupt when new input data is available and then waits for the time-critical VI to acknowledge the interrupt.
Figure 9: NI-RIO Diagram (FPGA)
The time-critical VI must service the interrupt within the duration of a single sample period. This includes reading the input samples, processing the samples, and writing the output values back to the FPGA.
Figure 10: NI-RIO diagram (Real-Time)
The following diagram illustrates the interrupt-based handshaking employed on the NI-RIO benchmarking tests:
Figure 11: Interrupt-based handshaking employed on NI-RIO tests
The polling mechanism is implemented in a very similar way. The time-critical VI polls an FPGA register to determine when data is available and does not reset the value of that register until it has finished the I/O tasks (processing and output). At that point, the FPGA is allowed to move on to the next I/O iteration.
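The register-based handshake can be modeled with a small state machine. The sketch below is a toy model in Python, not FPGA or NI-RIO code. The class and method names are hypothetical, and for simplicity it counts a late iteration whenever a new sample clock edge arrives while the register is still set, rather than stalling the FPGA as the benchmark diagrams do.

```python
class HandshakeRegister:
    """Toy model of the FPGA-side register used for lateness checking."""

    def __init__(self):
        self.data_ready = False
        self.late_count = 0

    def sample_clock(self):
        """Called on each sample clock edge: new input data is available."""
        if self.data_ready:
            # The software has not reset the register since the last edge.
            self.late_count += 1
        self.data_ready = True

    def poll(self):
        """The time-critical loop polls this to see if data is available."""
        return self.data_ready

    def acknowledge(self):
        """The time-critical loop calls this after processing and output."""
        self.data_ready = False


reg = HandshakeRegister()
reg.sample_clock()     # edge 1: data available
if reg.poll():
    reg.acknowledge()  # serviced in time
reg.sample_clock()     # edge 2: data available
reg.sample_clock()     # edge 3 arrives before the acknowledge
print(reg.late_count)
```

In this run the first sample is acknowledged before the next edge, but the second is not, so one late iteration is recorded.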
Lateness checking on FieldPoint
Software-timed I/O on FieldPoint does not directly provide a way to determine how fast an I/O loop can be implemented. Several factors, such as the FieldPoint I/O server, I/O bus scheduling, and I/O module update rates, can limit how quickly the desired output is updated or the desired input channel is read. You can still roughly determine the rate at which a given FieldPoint program can execute an I/O loop in one of the following two ways:
1) Using external checking methods, such as a second system with Data Acquisition cards or a traditional scope
2) Wiring the output channels back to the input channels and making sure that a specific I/O pattern is sustained without data loss
The first option requires external equipment that must:
- Provide a known external signal
- Share timing signals with the FieldPoint device
- Adapt sampling and input signal frequencies to the sampling rate used on the FieldPoint device
The second option is a bit simpler since no external equipment is needed, but it places the additional burden of checking the I/O pattern on the FieldPoint system. While the additional overhead impacts the performance of the system, a good estimate of a sustainable I/O rate can still be measured because the impact is relatively small.
For the FieldPoint AI-AO and PID tests, we configured the test
Figure 12 shows two complete stair-step patterns where we compare the analog output values from iteration n with the analog input values of iteration n+1. Sample #15 shows a glitch where the current input value lags behind the expected pattern, and the loop iteration is considered to be “late.”
Figure 12: FP Lateness Checking Graph
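The loopback comparison described above can be sketched as a simple check of output n against input n+1. The function name, the tolerance parameter, and the sample data are hypothetical; real analog readings would need a nonzero tolerance.

```python
def find_late_samples(outputs, inputs, tolerance=0.0):
    """Compare output n with input n+1 and return the late input indices."""
    late = []
    last = min(len(outputs), len(inputs) - 1)
    for n in range(last):
        if abs(inputs[n + 1] - outputs[n]) > tolerance:
            late.append(n + 1)
    return late


# Two stair-step patterns written to the outputs.
outputs = [0, 1, 2, 3, 0, 1, 2, 3]
# Looped-back inputs; at index 4 the reading still shows the previous step.
inputs = [0, 0, 1, 2, 2, 0, 1, 2]
print(find_late_samples(outputs, inputs))
```

Here the input at index 4 should match the output written at index 3 but still shows the earlier value, so that iteration is flagged as late.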
For more information on the elements that affect FieldPoint I/O rates, refer to the following document on the NI Developer Zone: Benchmarking LabVIEW Real-Time FieldPoint Systems.
Appendix B
Running PID Calculations on the FPGA
The tests for the cRIO and PCI/PXI RIO benchmarks were written such that all data acquired through the FPGA modules/board was transferred to the real-time controller for processing. An alternative approach for tests T1 (Analog In + Analog Out) and T2a (Analog In + PID + Analog Out) is to perform all the processing directly on the FPGA. This approach can achieve much faster loop rates than the numbers shown in the respective graphs. Single-channel PID calculations on both the cRIO and PCI/RIO approach 150 kHz and are mainly limited by the performance of the A/D chips.
Appendix C
Archives
Because the data shown reflects the latest hardware and software versions, we have attached a PDF below containing any previously published data.
Legal
This tutorial (this "tutorial") was developed by National Instruments ("NI"). Although technical support of this tutorial may be made available by National Instruments, the content in this tutorial may not be completely tested and verified, and NI does not guarantee its quality in any way or that NI will continue to support this content with each new revision of related products and drivers. THIS TUTORIAL IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND AND SUBJECT TO CERTAIN RESTRICTIONS AS MORE SPECIFICALLY SET FORTH IN NI.COM'S TERMS OF USE (http://ni.com/legal/termsofuse/unitedstates/us/).
