Optimizing File I/O in LabVIEW and LabVIEW RT
Overview
This document explains how file I/O works and shows how to optimize file I/O performance on various operating systems, including the LabVIEW RT OS.
How does file I/O work?
Every operating system (OS) is responsible for implementing a file structure. However, the basic concepts are the same for all operating systems. We will begin by looking at the structure of a disk, which is the physical medium on which data is stored. The disk generally consists of several platters. It is helpful to think of a disk as working like a record player. To access the data on the disk, the read/write heads (one per platter surface), attached to a single arm assembly, are moved to the correct location. Each platter surface is divided into concentric tracks. A track is made up of sectors. These sectors are the unit of data transfer. So, while users and applications think in terms of writing records, the operating system breaks these down into sectors, which it then writes one at a time.
When the disk drive is operating, it rotates at a constant speed. To read or write a sector of data, we must position the heads at the starting sector of the correct track. The time required to move the head to the correct track is the "seek time." The time for the disk to rotate until the starting sector is under the head is known as the "rotational delay." The sum of these two numbers is known as the "access time." The time to complete the read or write operation itself is known as the "data transfer time."
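The relationship between these times can be put into numbers. The following is a minimal sketch, not a model of any particular drive: the function names and the example figures (9 ms seek, 7200 RPM) are assumptions chosen for illustration. It uses the common approximation that the average rotational delay is half of one revolution.

```python
def average_rotational_delay_ms(rpm: float) -> float:
    """Average rotational delay: on average the target sector is half a turn away."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def access_time_ms(seek_ms: float, rpm: float) -> float:
    """Access time = seek time + rotational delay (data transfer time not included)."""
    return seek_ms + average_rotational_delay_ms(rpm)


# Hypothetical drive: 9 ms average seek time, spinning at 7200 RPM.
print(round(access_time_ms(9.0, 7200), 2))
```

Note that the access time is paid before any data moves, which is why reducing the number of separate disk accesses matters more than the size of each one.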
What is caching and how does it affect file I/O?
In general, a cache is a buffer that is used as an intermediate and faster data storage medium. More specifically, a disk cache is a buffer in main memory that maintains a local copy of disk sectors. Accessing main memory is significantly faster than accessing the disk. When an application needs to write to a sector on disk, the operating system first checks whether that sector is already in the cache. If it is, no access to the disk is required, and the data is written in main memory. Some time later, the contents of the cache are written out to the disk. If the sector is not found in the cache, it is first brought in from the disk to the cache.
Caching is used to reduce the average access time, by assuming that a sector that was written to once will likely be written to again some time later. There are many different implementations of this idea, with various cache sizes.
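The write path described above can be sketched in a few lines. This is an illustrative toy, not how any particular operating system implements its cache: the class name, the dictionary standing in for the disk, and the 512-byte sector size are all assumptions made for the example.

```python
SECTOR_SIZE = 512  # assumed sector size for this sketch


class WriteBackSectorCache:
    """Toy disk cache: sectors are modified in memory and written out later."""

    def __init__(self, disk):
        self.disk = disk        # dict of sector number -> bytes, stands in for the slow disk
        self.cache = {}         # sector number -> bytearray held in main memory
        self.dirty = set()      # sectors changed in memory but not yet on disk
        self.disk_reads = 0
        self.disk_writes = 0

    def _load(self, sector):
        # On a cache miss, bring the whole sector in from the disk.
        if sector not in self.cache:
            self.disk_reads += 1
            self.cache[sector] = bytearray(self.disk.get(sector, bytes(SECTOR_SIZE)))
        return self.cache[sector]

    def write(self, sector, offset, data):
        if offset < 0 or offset + len(data) > SECTOR_SIZE:
            raise ValueError("write must fit inside one sector")
        buf = self._load(sector)
        buf[offset:offset + len(data)] = data   # the write happens in main memory only
        self.dirty.add(sector)

    def flush(self):
        # "Some time later": write every modified sector back to the disk.
        for sector in sorted(self.dirty):
            self.disk[sector] = bytes(self.cache[sector])
            self.disk_writes += 1
        self.dirty.clear()


cache = WriteBackSectorCache(disk={})
for i in range(3):
    cache.write(0, i * 10, b"0123456789")   # three writes to the same sector
cache.flush()
print(cache.disk_reads, cache.disk_writes)
```

Three application writes to the same sector cost one disk read and one disk write here, which is the saving the cache is designed to provide.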
How can I optimize file I/O in LabVIEW?
Before we discuss optimization methods in LabVIEW, we need to distinguish between two very different types of file I/O operations.
Many applications require sequential access to a file, where the file is read from or written to, in order, from the start of the file to the end. This type of application will benefit greatly from the caching mechanism, since neighboring sectors will likely be in the cache. Alternatively, our application might require random access to the file. That is, we may need to access unrelated, non-neighboring sectors of a file. This requires a very different way of thinking.
One thing to consider is that some file types are not easy to use for random access applications. An ASCII file requires a varying number of bytes for each data element. For example, the number 756 requires 3 bytes of storage while the number 7 requires only 1 byte. Therefore it is not possible to predict an element's location in the file. To find a particular element, you have to scan through the file until you reach it. This makes random access very difficult and very inefficient with ASCII files. The solution is to use file types that are easier to index. Binary files are a very good choice for random access applications, since every element of a given data type uses the same number of bytes in the file. Since we know exactly where a particular element is, we can easily index individual elements in a file.
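LabVIEW code is graphical, so the idea is shown here as a short text-language sketch instead. The record format (an 8-byte little-endian double) and the helper names are assumptions for illustration. Because every record has the same size, the position of element N is simply N times the record size.

```python
import io
import struct

RECORD = struct.Struct("<d")  # assumed format: one 8-byte little-endian double per element


def write_records(f, values):
    """Append each value as a fixed-size binary record."""
    for value in values:
        f.write(RECORD.pack(value))


def read_record(f, index):
    """Jump straight to element `index`; no scanning of earlier elements is needed."""
    f.seek(index * RECORD.size)
    (value,) = RECORD.unpack(f.read(RECORD.size))
    return value


# An in-memory buffer stands in for a file opened in binary mode.
f = io.BytesIO()
write_records(f, [756.0, 7.0, 3.25])
print(read_record(f, 1))   # both 756 and 7 occupy 8 bytes, so element 1 starts at byte 8
```

The same text stored as ASCII ("756", "7", ...) would have no such fixed stride, so the reader would have to parse everything before the element it wants.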
How is LabVIEW RT different?
To maintain determinism, LabVIEW RT does not use a cache. This means that every write and read operation requires an access to the disk.
How can I optimize file I/O in LabVIEW RT?
Since LabVIEW RT does not use a cache, we need to minimize the access time to our disk.
The most important thing to consider is to write out data one sector at a time. In LabVIEW RT on a PXI controller, the sector size is 512 bytes. Writing a sector at a time matters because every read and write operation always executes in 512 byte increments. If your application requests a write of 30 bytes, the OS first has to read the entire 512 byte sector, replace the 30 bytes you are writing, and then write the entire sector back. Similarly, if you want to read 30 bytes, the OS has to read an entire 512 byte sector first, then pull out the 30 bytes needed. In fact, it takes less time to read or write 512 bytes than it does to read or write smaller byte sizes.
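The read-modify-write behavior described above can be modeled with a small sketch. This models only the behavior the text describes, not the actual LabVIEW RT implementation: the function name and the bytearray standing in for the disk are assumptions for illustration.

```python
SECTOR_SIZE = 512  # sector size the text gives for LabVIEW RT on a PXI controller


def uncached_write(disk: bytearray, offset: int, data: bytes) -> int:
    """Write data lying within one sector; return the bytes moved to and from the disk."""
    start = (offset // SECTOR_SIZE) * SECTOR_SIZE
    rel = offset - start
    if rel + len(data) > SECTOR_SIZE:
        raise ValueError("this sketch handles writes within a single sector only")

    if rel == 0 and len(data) == SECTOR_SIZE:
        # Full, aligned sector: it can be written directly.
        disk[start:start + SECTOR_SIZE] = data
        return SECTOR_SIZE

    # Partial sector: read the whole sector, patch it, write the whole sector back.
    sector = bytearray(disk[start:start + SECTOR_SIZE])
    sector[rel:rel + len(data)] = data
    disk[start:start + SECTOR_SIZE] = sector
    return 2 * SECTOR_SIZE


disk = bytearray(4 * SECTOR_SIZE)                       # hypothetical 4-sector disk
print(uncached_write(disk, 100, b"x" * 30))             # small write
print(uncached_write(disk, 512, b"y" * SECTOR_SIZE))    # full aligned sector
```

In this model the 30 byte write moves twice as much data across the disk interface as the full 512 byte write, which is consistent with the observation that writing a whole sector is the faster operation.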
The best performance will be observed if you write 512 bytes at a time, and remain aligned with the file sectors. To ensure that you are starting your write or read at the beginning of a sector, keep your offset restricted to multiples of 512 (0, 512, 1024, etc.). If you wire an offset that is not a multiple of 512, then you are no longer aligned with the file sectors. For example, if you wire an offset of 1 and write 512 bytes, then you actually will have to write to two separate disk sectors. On one sector you'll write 511 bytes, and on the other you'll write the leftover 1 byte. Because we always write in 512 byte chunks, you'll actually write 512 bytes both times. This will significantly degrade the performance of the application.
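The alignment rule is simple arithmetic, sketched below. The helper names are hypothetical and exist only for this example; the 512 byte sector size is the one given in the text.

```python
SECTOR_SIZE = 512  # sector size the text gives for LabVIEW RT on a PXI controller


def is_aligned(offset: int) -> bool:
    """True when the offset falls on a sector boundary (0, 512, 1024, ...)."""
    return offset % SECTOR_SIZE == 0


def sectors_touched(offset: int, length: int) -> list:
    """Sector numbers that a write of `length` bytes starting at `offset` must access."""
    if length <= 0:
        return []
    first = offset // SECTOR_SIZE
    last = (offset + length - 1) // SECTOR_SIZE
    return list(range(first, last + 1))


print(sectors_touched(0, 512))     # aligned write: one sector
print(sectors_touched(1, 512))     # offset of 1: spills into a second sector
print(is_aligned(1024), is_aligned(1))
```

A check like `is_aligned` before each write is one way to catch an offset that has drifted off a sector boundary, for example after writing a header whose size is not a multiple of 512.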
Legal
This tutorial (this "tutorial") was developed by National Instruments ("NI"). Although technical support of this tutorial may be made available by National Instruments, the content in this tutorial may not be completely tested and verified, and NI does not guarantee its quality in any way or that NI will continue to support this content with each new revision of related products and drivers. THIS TUTORIAL IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND AND SUBJECT TO CERTAIN RESTRICTIONS AS MORE SPECIFICALLY SET FORTH IN NI.COM'S TERMS OF USE (http://ni.com/legal/termsofuse/unitedstates/us/).
