
Array Optimization (FPGA IP Builder)

LabVIEW 2013 FPGA IP Builder Help

Edition Date: June 2013

Part Number: 373567C-01


The LabVIEW FPGA IP Builder stores arrays in memory, such as block memory and look-up tables. The memory configuration of an FPGA IP design strongly affects the performance of the overall design, including timing performance and device utilization. Therefore, you must configure the array directives of the algorithm VI properly to achieve the best performance. These directives appear when you select an array or array buffer from the Block Diagram Components list on the Directives page of the Directives Properties dialog box.

Note  The FPGA IP Builder does not store arrays in dynamic RAM (DRAM). The FPGA IP Builder automatically implements small arrays using registers instead of memory.

Partitioning Arrays

Each memory unit can access at most two array elements per cycle, so reading four elements from an array might require two cycles. To improve the bandwidth, you can use the Partition type and Number of partitions directives to split a large array into multiple smaller arrays and access them simultaneously.

The Number of partitions directive specifies the number of partitions for the specified partition type. This directive is valid only for the Block and Cyclic partition types.

Note  LabVIEW automatically handles the partition if the array size is not a multiple of the value you specify for the Number of partitions directive.

The Partition type directive specifies how the FPGA IP Builder splits the array. The FPGA IP Builder supports the following three partition types:

  • Block: Splits an array into equally sized blocks of consecutive elements. For example, you can split [0, 1, 2, 3…N–3, N–2, N–1] into the following two arrays:
    • [0, 1, 2, 3…N/2–1]
    • [N/2…N–3, N–2, N–1]
  • Cyclic: Splits an array into equally sized blocks of interleaved elements. For example, you can split [0, 1, 2, 3…N–3, N–2, N–1] into the following two arrays:
    • [0, 2…N–2]
    • [1, 3…N–3, N–1]
  • Complete: Splits an array into individual elements. For example, you can split [0, 1, 2, 3…N–3, N–2, N–1] into the following individual elements:
    • 0
    • 1
    • 2
    • 3
    • …
    • N–3
    • N–2
    • N–1
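The three partition types can be illustrated with a small Python sketch. This is only a software model of the resulting element layout, not FPGA IP Builder code. The function names are hypothetical, and the sketch assumes that the array size is a multiple of the number of partitions, because the text does not describe how LabVIEW handles a remainder.

```python
def block_partition(array, parts):
    """Split into `parts` blocks of consecutive elements."""
    if len(array) % parts != 0:
        # Assumption: this sketch does not model how LabVIEW handles a remainder.
        raise ValueError("array size must be a multiple of parts in this sketch")
    size = len(array) // parts
    return [array[i * size:(i + 1) * size] for i in range(parts)]


def cyclic_partition(array, parts):
    """Split into `parts` arrays of interleaved elements."""
    if len(array) % parts != 0:
        raise ValueError("array size must be a multiple of parts in this sketch")
    return [array[i::parts] for i in range(parts)]


def complete_partition(array):
    """Split into individual elements."""
    return list(array)


data = list(range(8))  # stands in for [0, 1, 2, 3…N–1] with N = 8
print(block_partition(data, 2))   # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(cyclic_partition(data, 2))  # [[0, 2, 4, 6], [1, 3, 5, 7]]
print(complete_partition(data))   # [0, 1, 2, 3, 4, 5, 6, 7]
```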

If you are processing a large array and want to read the first four elements (0, 1, 2, and 3) without partitioning the original array, you need two cycles to access them. If you use the Block option to partition the array into two blocks, the first four elements are still in the same array, so you still need two cycles to access them. If you use the Cyclic or Complete option to partition the array, you can access the first four elements simultaneously in one cycle.
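The cycle counts in this example can be reproduced with a rough Python model. It is a sketch under stated assumptions, not a description of the tool's scheduler: each partition is treated as one memory unit that serves at most two elements per cycle (the limit given above), different partitions are assumed to work fully in parallel, and the array size is assumed to divide evenly by the number of partitions. The names `bank_of` and `cycles_to_read` are hypothetical.

```python
from collections import Counter
from math import ceil

PORTS_PER_MEMORY = 2  # from the text: at most two elements per memory unit per cycle


def bank_of(index, size, partition, parts=1):
    """Return which partition (memory unit) holds the element at `index`."""
    if partition == "none":
        return 0
    if partition == "block":
        return index // (size // parts)  # consecutive elements share a partition
    if partition == "cyclic":
        return index % parts             # neighbours land in different partitions
    if partition == "complete":
        return index                     # every element is on its own
    raise ValueError(f"unknown partition type: {partition}")


def cycles_to_read(indices, size, partition, parts=1):
    """Cycles needed when the busiest partition limits the access."""
    load = Counter(bank_of(i, size, partition, parts) for i in indices)
    return max(ceil(count / PORTS_PER_MEMORY) for count in load.values())


first_four = [0, 1, 2, 3]
size = 1024  # hypothetical array size
print(cycles_to_read(first_four, size, "none"))       # 2
print(cycles_to_read(first_four, size, "block", 2))   # 2
print(cycles_to_read(first_four, size, "cyclic", 2))  # 1
print(cycles_to_read(first_four, size, "complete"))   # 1
```

The model makes the reason visible: Block partitioning keeps elements 0 through 3 in the same partition, while Cyclic partitioning spreads them over both.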

Specifying Memory Resources

Use the Resource pull-down menu on the Directives page to configure memory resources for arrays. An appropriate configuration can reduce hardware resource usage and improve the timing performance of the resulting FPGA IP. If you do not configure memory resources for an array, the FPGA IP Builder automatically assigns a memory resource to the array.

Use the following guidelines to configure the memory resource for an array:

  • Use ROM resources if you only read elements from an array. ROM requires fewer hardware resources than RAM.
  • Use look-up tables for small arrays and use block memory for large arrays.
  • Use dual-port memory if you want to reduce the number of cycles. Dual-port memory can process two array elements in one cycle but requires more hardware resources than single-port memory.
  • Use Xilinx CORE Generator memory for very large arrays. This type of memory requires more time to generate FPGA IP but returns better results than the non-Xilinx CORE Generator options.

Reducing the Number of Array Buffers

After you create a directives item, you might find one or more items with the name Array Buffer in the Block Diagram Components list on the Directives page. The image below shows an example of an array buffer. The [4] beside the array buffer indicates that this array has 4 elements:

When you click an Array Buffer in the Block Diagram Components list and then click Find on block diagram, LabVIEW highlights the affected array on the block diagram. The image below shows an array buffer highlighted on the block diagram:

LabVIEW automatically generates array buffers based on the algorithm VI to create array copies as necessary. These array buffers affect both the performance and device utilization of the resulting FPGA IP, because they require additional memory and computational time. They might also generate buffer loops in the estimation reports. You might be able to reduce the number of array buffers by modifying the algorithm VI. For example, you can avoid branching the wire that comes from an array.
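As a loose analogy only, the following Python sketch shows why a branched array can force a copy. Python lists are not LabVIEW wires, and the function names are hypothetical. The sketch assumes a case where one consumer still needs the original values while another modifies the array.

```python
def branched(samples):
    """Two consumers of the same array: one reads it, one modifies it."""
    modified = list(samples)  # extra copy, loosely analogous to an array buffer
    modified[0] = 0           # the modifying branch works on the copy
    total = sum(samples)      # the reading branch still sees the original values
    return total, modified


def unbranched(samples):
    """A single chain of operations: read first, then modify in place."""
    total = sum(samples)
    samples[0] = 0            # no second consumer needs the old values, so no copy
    return total, samples


print(branched([1, 2, 3]))    # (6, [0, 2, 3])
print(unbranched([1, 2, 3]))  # (6, [0, 2, 3])
```

Both versions return the same values, but only the first one allocates a second array.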

Manually Unrolling Introduced For Loops

If a VI you build with the FPGA IP Builder contains an Array Subset function, or a Build Array function with array inputs, the FPGA IP Builder creates one or more for loops to implement the function. If you set a value for the Initiation interval directive on the owning VI or owning loop structure, the FPGA IP Builder usually unrolls these for loops and implements the iterations of each for loop in parallel before performing optimizations to achieve your specified initiation interval. However, in some cases the tool cannot unroll these for loops. The FPGA IP Builder then implements the iterations of each such for loop sequentially, which reduces the throughput of your IP.

If your algorithm VI contains an Array Subset function or Build Array function for which the input arrays and output arrays are completely partitioned, you can potentially improve the throughput of the IP by completing the following steps:

  1. Wrap the Array Subset function or Build Array function in a subVI.
  2. Set a value for the Initiation interval directive of the subVI to ensure that the FPGA IP Builder unrolls the introduced for loops and implements each iteration in parallel. An Initiation interval value of 1 is often optimal.
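The effect of unrolling an introduced for loop can be sketched with a toy Python model. This is not FPGA IP Builder output: `array_subset` only mimics an element-by-element loop like the one the tool introduces, and `loop_cycles` is a hypothetical cost model that assumes one cycle per group of iterations executed in parallel. Full unrolling in this model also assumes completely partitioned arrays, so that every element is accessible in the same cycle.

```python
from math import ceil


def array_subset(array, start, length):
    """Array Subset written as an explicit element-by-element loop."""
    out = [None] * length
    for i in range(length):  # stands in for the introduced for loop
        out[i] = array[start + i]
    return out


def loop_cycles(iterations, unroll_factor=1):
    """Toy cost model: iterations in the same unrolled group share one cycle."""
    return ceil(iterations / unroll_factor)


print(array_subset(list(range(16)), 4, 8))  # [4, 5, 6, 7, 8, 9, 10, 11]
print(loop_cycles(8))                       # 8, iterations run sequentially
print(loop_cycles(8, unroll_factor=8))      # 1, loop fully unrolled
```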

 
