# Scheduling Timing Using Handshaking Signals (FPGA Module)

FlexRIO Help

Edition Date: November 2015

Part Number: 372614J-01


Handshaking refers to communication between two nodes that establishes the parameters for continued communication. For a given block diagram node F in a single-cycle Timed Loop, handshaking determines when the following actions occur:

• F discards data from upstream nodes. An upstream node is any node that sends data to F.
• F accepts data from upstream nodes.
• Downstream nodes discard data from F. A downstream node is any node that receives data from F.
• Downstream nodes accept data from F.

In a single-cycle Timed Loop, handshaking is necessary because multi-cycle nodes need more than one cycle to compute valid data, but the single-cycle Timed Loop forces these nodes to return data every clock cycle. Therefore, multi-cycle nodes do not return valid data every clock cycle. To ensure the numerical accuracy of an algorithm, nodes that depend on this data must know whether the data is invalid or valid.

National Instruments has established a handshaking protocol you can use with certain nodes inside a single-cycle Timed Loop. This protocol involves the following terminals:

• input valid—Specifies that the next data point has arrived for processing.
• output valid—Indicates that the current data point produced by the node is valid and ready to be used by downstream nodes.
• ready for output—Specifies whether the downstream node can accept a new data point.
• ready for input—Indicates whether the node can accept a new data point during the next clock cycle.
 Note  Because this protocol involves four terminals, handshaking in FPGA VIs is sometimes called the four-wire protocol.

## Ensuring that Single-Input Nodes Use Valid Data

Consider three single-input functions in a single-cycle Timed Loop: an upstream function U, a function F, and a downstream function D. These functions take an input value x and produce output data after a certain number of clock cycles. The following figure shows how you can wire U, F, and D to use the handshaking protocol National Instruments recommends.

In the previous figure, the numbers on the wires correspond to the following steps:

1. The value of the input valid terminal of F becomes TRUE when U sends valid data to F. TRUE means that U has produced a valid output value and is sending this value to F.

To use this logic, wire the output valid terminal of U to the input valid terminal of F. F then knows when U has produced a valid output value.
2. After the input valid terminal of F receives a value of TRUE, F begins computing a result. While F computes the result, the output valid terminal of F returns FALSE.

The output valid terminal returns TRUE when both of the following criteria are met:
• At least $L_f$ clock cycles have elapsed since F began computing the result, where $L_f$ is the latency of F. You can see the latency of a function in its configuration dialog box. Access this dialog box by double-clicking the function.
• The ready for output terminal of F is TRUE. This terminal tells F when D is ready to accept an input value from F.

To use this logic, wire the ready for input terminal of D through a Feedback Node to the ready for output terminal of F. Because Feedback Nodes take one clock cycle to execute by default, F receives this value one clock cycle after D sends it.
3. If you wire the output valid terminal of F to the input valid terminal of D, D knows when F has produced a valid value and is sending this value to D. This is the same logic as in step 1. The same process then repeats for D, except that F becomes the upstream function.
4. The ready for input terminal of F returns TRUE if F can receive another valid input data point during the next clock cycle.

To use this logic, wire the ready for input terminal of F through a Feedback Node to the ready for output terminal of U. U then knows when F is ready to accept a new input value. Because the Feedback Node takes one clock cycle to execute by default, U receives this information one clock cycle after F sends it.
 Tip  To achieve a high throughput rate, enter a small value in the Throughput control of F. After F receives a valid input and starts to compute a result, at least $T_f$ clock cycles must elapse before F can receive another valid input, where $T_f$ is the value of the Throughput control of F. Therefore, a small value for the Throughput control means that F can receive another valid input data point sooner than if you specified a large Throughput value. A small Throughput value also means that the ready for input terminal of F can return TRUE earlier than if you specified a large Throughput value.
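The steps above can be approximated with a small cycle-level model. This is a minimal sketch in Python, not LabVIEW code and not the actual node implementation: the class `HandshakeNode`, the driver `run`, and the initial Feedback Node value of FALSE are all assumptions made for illustration, and the downstream function D is assumed to be always ready.

```python
from collections import deque


class HandshakeNode:
    """Simplified model of a node F with latency L_f and throughput T_f (hypothetical)."""

    def __init__(self, latency, throughput, func):
        self.latency = latency          # L_f: cycles before a result can be valid
        self.throughput = throughput    # T_f: minimum cycles between accepted inputs
        self.func = func
        self.in_flight = deque()        # entries: [result, cycles_elapsed]
        self.since_accept = throughput  # lets the node accept on the first cycle
        self.discarded = 0

    def step(self, x, input_valid, ready_for_output):
        """Advance one clock cycle; return (output, output_valid, ready_for_input)."""
        if input_valid:
            if self.since_accept >= self.throughput:
                self.in_flight.append([self.func(x), 0])
                self.since_accept = 0
            else:
                self.discarded += 1     # input arrived before T_f cycles elapsed
        output, output_valid = None, False
        # output valid is TRUE only if L_f cycles elapsed AND downstream is ready.
        if (self.in_flight and self.in_flight[0][1] >= self.latency
                and ready_for_output):
            output, output_valid = self.in_flight.popleft()[0], True
        for item in self.in_flight:
            item[1] += 1
        self.since_accept += 1
        # ready for input describes the NEXT clock cycle.
        ready_for_input = self.since_accept >= self.throughput
        return output, output_valid, ready_for_input


def run(node, samples, cycles):
    """Upstream U sends a sample only when the delayed ready signal is TRUE."""
    pending = deque(samples)
    ready_delayed = False  # assumed initial value of the Feedback Node
    outputs = []
    for cycle in range(cycles):
        send = ready_delayed and bool(pending)
        x = pending.popleft() if send else None
        y, output_valid, ready_now = node.step(x, send, ready_for_output=True)
        if output_valid:
            outputs.append((cycle, y))
        ready_delayed = ready_now  # Feedback Node: visible to U one cycle later
    return outputs


node = HandshakeNode(latency=3, throughput=2, func=lambda v: v * 10)
print(run(node, [1, 2, 3], cycles=10))
```

Under these assumptions, inputs are accepted two cycles apart and each result appears three cycles after its input, so nothing is discarded. Driving `step` with `input_valid` TRUE on consecutive cycles instead increments `discarded`, which mirrors what happens when U ignores the ready for input signal.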

## Ensuring that Multi-Input Nodes Use Valid Data

The High Throughput Add function is a multi-input function because it has two inputs, x and y. In a single-cycle Timed Loop, you must ensure that both x and y are valid during the same clock cycle. If one input is valid and the other is invalid, the function still computes output data from one valid input value and one invalid input value, so the result is not meaningful.

To ensure that a valid x and a valid y arrive during the same clock cycle, add up the latencies of all nodes that feed data forward to x. Then, add up the latencies of all nodes that feed data forward to y. The two latency totals must be equal. If the latency totals are different, add Feedback Nodes to the shorter path (the one with the lower latency total) until the latency totals for the x and y paths are equal.

 Note  By default, a Feedback Node adds one cycle to the latency of a path. However, you can change this delay for each Feedback Node. This design technique helps when nodes in the same path have different numbers of pipeline stages.
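The balancing rule can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function name `feedback_delays_needed` and the example latencies are hypothetical, and it assumes each added Feedback Node contributes its default delay of one cycle.

```python
def feedback_delays_needed(path_latencies):
    """Return the extra delay cycles each input path needs to match the longest path.

    path_latencies maps an input name to the latencies (in clock cycles)
    of the nodes that feed data forward to that input.
    """
    totals = {name: sum(latencies) for name, latencies in path_latencies.items()}
    longest = max(totals.values())
    return {name: longest - total for name, total in totals.items()}


# Hypothetical example: the x path has two nodes (4 + 2 cycles), the y path one (3 cycles).
paths = {"x": [4, 2], "y": [3]}
print(feedback_delays_needed(paths))  # {'x': 0, 'y': 3}
```

In this example the y path is three cycles shorter, so it needs three default Feedback Nodes, or one Feedback Node configured with a delay of three.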

Refer to labview\examples\target_type\FPGA Fundamentals\FPGA Math and Analysis\High-Throughput Math\Vector Normalization\Vector Normalization.lvproj, where target_type is either CompactRIO or R Series depending on the driver you installed, for an example that demonstrates how to balance latency paths in a multi-input handshaking application.

## Nodes that Support Handshaking Terminals

The following nodes support the handshaking terminals:

## Nodes that Support Multi-channel Handshaking

The following node supports handshaking with multiple channels:

For multi-channel nodes, LabVIEW interleaves the values such that the first valid data point goes to the first channel, the second valid data point goes to the second channel, and so on. If the input valid terminal becomes FALSE during a channel scan, no channel is skipped: the next valid input applies to the next channel in the sequence, as shown in the following illustration.

In the illustration, d0, d1, and d2 are three channels receiving data. Notice that each time the input valid terminal becomes FALSE and then TRUE again, the subsequent valid data applies to the next channel in the sequence.
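The interleaving rule can be sketched in a few lines of Python. This is an illustrative model only, not the node's implementation: the function name `interleave` and the sample stream are hypothetical. The key point it shows is that a cycle with input valid FALSE does not advance the channel counter.

```python
def interleave(stream, num_channels):
    """Distribute valid samples across channels in round-robin order.

    stream is a sequence of (value, input_valid) pairs, one per clock cycle.
    """
    channels = [[] for _ in range(num_channels)]
    next_channel = 0
    for value, input_valid in stream:
        if not input_valid:
            continue  # invalid cycle: the channel sequence does not advance
        channels[next_channel].append(value)
        next_channel = (next_channel + 1) % num_channels
    return channels


# Hypothetical stream with two invalid cycles in the middle of a channel scan.
stream = [("a", True), ("b", True), (None, False),
          ("c", True), ("d", True), (None, False), ("e", True)]
d0, d1, d2 = interleave(stream, num_channels=3)
print(d0, d1, d2)  # ['a', 'd'] ['b', 'e'] ['c']
```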

## Determining the Fastest Allowable Throughput Rate

For a series of connected nodes inside a single-cycle Timed Loop, the fastest throughput rate refers to the fewest number of clock cycles after which you can send more data to the series. The fastest throughput rate is equal to the throughput rate of the slowest node (that is, the node with the highest value of its Throughput control) in the series. This node is the bottleneck for all nodes in the series. If the system sends data to this series at a rate that is faster than this fastest rate (that is, the system waits fewer than the fewest number of allowable clock cycles before sending data to the series), LabVIEW discards data points.

To determine the fastest throughput rate for a series of nodes, first determine the throughput rate of each node by checking the Throughput control for that node. You can see this control by double-clicking a node or by displaying the Context Help window of the node. The highest Throughput value in the series, which belongs to the slowest node, is the fastest rate (that is, the fewest clock cycles between samples) at which the system can send data to this series of nodes.

For example, consider a series of nodes where the slowest node has a throughput rate of 32 cycles/sample, the clock rate of the FPGA target is 40 MHz, and the input sample rate is 2 MS/s. The system throughput rate is the clock rate divided by the input sample rate, which is 40,000,000 cycles/2,000,000 samples or 20 cycles/sample. In this situation, the system sends data to the series every 20 cycles, which is faster than the fastest throughput rate of 32 cycles. This difference causes LabVIEW to discard data points.
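The arithmetic in this example can be reproduced with a short Python sketch. The function names are hypothetical, and the Throughput values of 8 and 16 for the other nodes are assumed for illustration; only the slowest node (32 cycles/sample), the 40 MHz clock rate, and the 2 MS/s sample rate come from the example above.

```python
def fastest_throughput(node_throughputs):
    """The slowest node (highest Throughput value) sets the rate for the series."""
    return max(node_throughputs)


def system_throughput(clock_rate_hz, sample_rate_hz):
    """Clock cycles available per input sample."""
    return clock_rate_hz / sample_rate_hz


def discards_data(node_throughputs, clock_rate_hz, sample_rate_hz):
    """True if samples arrive in fewer cycles than the slowest node requires."""
    cycles_per_sample = system_throughput(clock_rate_hz, sample_rate_hz)
    return cycles_per_sample < fastest_throughput(node_throughputs)


nodes = [8, 32, 16]  # Throughput controls in cycles/sample; 8 and 16 are assumed
print(fastest_throughput(nodes))                      # 32
print(system_throughput(40_000_000, 2_000_000))       # 20.0
print(discards_data(nodes, 40_000_000, 2_000_000))    # True
```

With the same nodes and clock, lowering the input sample rate to 1 MS/s gives 40 cycles/sample, which is no longer faster than the 32-cycle bottleneck.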

 Note  The system throughput is inversely proportional to the number of channels in the system, and the channel throughput is proportional to the number of channels in the system. The following equations describe the relationship between the system throughput and the channel throughput.

$$\text{system throughput} = \frac{\text{channel throughput}}{\text{number of channels}}$$

$$\text{channel throughput} = \text{system throughput} \times \text{number of channels}$$

For example, if the system throughput is 200 cycles/sample and the system has four channels, the throughput for each channel is 800 cycles/sample.