#3
Mine is a disk-based backup/restore product. It is multithreaded, which means many backups and restores happen simultaneously. The product has basically three types of interaction with the disk:

1. Read. This is synchronous. Most reads happen on mounted shadow copies (using VSS).
2. Write. FILE_FLAG_WRITE_THROUGH is used, since data integrity is critical to the product.
3. Compute checksums on the files.

Each of the above operations is done in its own dedicated thread, and currently any number of threads can run at any point in time, doing any of the operations in parallel. Each operation targets exactly one LUN. The size of the data varies from a few KB to gigabytes. Also, we may suggest that clients optimize their disk architecture (use RAID, faster disks, etc.), but the clients may choose to ignore our suggestions.

My questions are:

1. We have noticed that when we run a Read thread and a Compute-checksum thread in parallel against the same disk/spindle, the disk throughput degrades considerably, and serializing the two threads gives us much higher throughput overall. Is this behavior expected?
2. If the behavior in 1. is expected, how should we serialize/parallelize the three kinds of threads to achieve optimal performance?
3. Are there Windows APIs to determine the spindles we are working against, so that we can throttle our threads accordingly?
4. Any pointers to case studies, experiments, or white papers/research papers by folks who have done this before?
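For reference, the kind of per-disk serialization described in question 1 can be sketched roughly as below. This is only an illustration, not the product's actual code: `DiskGate`, `disk_id` and `max_concurrent` are hypothetical names, and how a thread learns which disk or LUN it is hitting is assumed to be known already (that is exactly what question 3 asks about).

```python
import threading


class DiskGate:
    """Limits how many I/O-heavy tasks may run against one disk at a time.

    `disk_id` is any hashable key the caller uses to identify a disk/LUN
    (hypothetical; the mapping from file to disk is not shown here).
    """

    def __init__(self, max_concurrent=1):
        self._max = max_concurrent
        self._guard = threading.Lock()   # protects the dictionary below
        self._gates = {}                 # disk_id -> semaphore

    def _gate_for(self, disk_id):
        with self._guard:
            if disk_id not in self._gates:
                self._gates[disk_id] = threading.BoundedSemaphore(self._max)
            return self._gates[disk_id]

    def run(self, disk_id, task, *args):
        # Blocks until a slot for this disk is free, then runs the task.
        with self._gate_for(disk_id):
            return task(*args)
```

A read thread and a checksum thread would both call `gate.run(disk_id, ...)`; with `max_concurrent=1` they take turns on the same disk, while threads working against different disks still run in parallel.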
#4
If I recall correctly, NT and its descendants (unlike Unix) flush dirty data to disk when a file is closed and report any error then.

Preallocate any output file to its final size to avoid the overhead of multiple intermediate allocations (and to maximize the probability that it will be laid out contiguously on disk). I think you used to need to use the Ntxxx native system calls
#5
> If I recall correctly, NT and its descendants (unlike Unix) flush dirty data to disk when a file is closed and report any error then.

No. Nothing occurs when the file is closed. The flushes happen afterwards, done by the lazy writer. Just copy a huge file in the Windows shell and watch the flushing activity after the copy is reported done.

> Preallocate any output file to its final size to avoid the overhead of multiple intermediate allocations (and to maximize the probability that it will be laid out contiguously on disk). I think you used to need to use the Ntxxx native system calls

This will spend lots of time zeroing the newly allocated file. Not a way to make things faster.
#6
Unix-style system-wide sync calls are a fourth (but I don't think Windows offers them).

That's why I explicitly said NOT to do it that way. There at least used to be an NTxxx Create function that accepted a preallocated size parameter that did not result in zeroing out the file.

I think that later on MS added a way to do this without having to resort to undocumented NTxxx functions, but can't remember the details.
#7
That's why I explicitly said NOT to do it that way. There at least used to be an NTxxx Create function that accepted a preallocated size parameter that did not result in zeroing out the file. I think that later on MS added a way to do this without having to resort to undocumented NTxxx functions, but can't remember the details.

Not having a way to preallocate a file without zeroing it out would be really, really dumb (yes, the early documented interface was really, really dumb, but at least they provided an undocumented mechanism to fix that). I could imagine that some zeroing activity would still be required if you populated the preallocated space out of sequence, though, given how 'high water marking' works.
#8
> Unix-style system-wide sync calls are a fourth (but I don't think Windows offers them).

It does: FlushFileBuffers is fsync(). If you open the volume itself (like \\.\c:) and call FlushFileBuffers on that handle, you get a total volume flush, metadata included.

> That's why I explicitly said NOT to do it that way. There at least used to be an NTxxx Create function that accepted a preallocated size parameter that did not result in zeroing out the file.

ZwCreateFile with AllocationSize provided. Well, maybe. NTFS has an on-disk ValidDataLength, so zeroing is not mandatory. For FAT, it surely is mandatory. I think it is a good idea to try it. Some people told me once that this kind of creation _starts a background zeroing procedure_.

> undocumented NTxxx functions, but can't remember the details.

ZwCreateFile is documented for kernel mode.
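As a rough user-mode illustration of the per-file flush mentioned above, here is a minimal sketch using Python's `os.fsync`, which the Python documentation says maps to the MS `_commit()` function on Windows. The helper name `write_durably` is hypothetical, and the volume-handle flush via \\.\c: is deliberately not shown, since it needs elevated rights and is not something I would want to present untested.

```python
import os


def write_durably(path, data):
    """Write `data` to `path` and ask the OS to push it to the device."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # drain Python's own user-space buffer first
        os.fsync(f.fileno())   # then flush the OS cache for this one file
```

This flushes a single file only; it is the fsync()-style call, not a system-wide sync.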
#9
On Feb 11, 12:19 am, Bill Todd <billt... (AT) metrocast (DOT) net> wrote:

> That's why I explicitly said NOT to do it that way. There at least used to be an NTxxx Create function that accepted a preallocated size parameter that did not result in zeroing out the file. I think that later on MS added a way to do this without having to resort to undocumented NTxxx functions, but can't remember the details.
>
> Not having a way to preallocate a file without zeroing it out would be really, really dumb (yes, the early documented interface was really, really dumb, but at least they provided an undocumented mechanism to fix that). I could imagine that some zeroing activity would still be required if you populated the preallocated space out of sequence, though, given how 'high water marking' works.

On NTFS, you can extend a file's allocation by seeking to the size you want with SetFilePointer and then calling SetEndOfFile. This will truncate the file if you have not gone past the end. If you have gone past the end, physical space will be allocated, but not cleared.

The MFT entry for the file on NTFS includes a limit for how much of the allocated space is valid. You *can* seek into the space beyond that limit, and it will read as zeros. If you write into that space, the area between the current valid limit and the position where you write will be zeroed at that point, but not the space past where you write. Note that sparse files are different, and are usually created by seeking past the end and then just writing.

You can muck with the zeroing somewhat with SetFileValidData, which can prevent zeroing even on non-contiguous writes in some cases, exposing the old data in the allocated clusters. As you might expect, most user accounts don't have the SE_MANAGE_VOLUME_NAME privilege (SeManageVolumePrivilege) required to use SetFileValidData. Again, this is NTFS only, and in many cases only on *local* NTFS drives.
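To make the valid-limit bookkeeping a bit more concrete, here is a toy model of it. This is only a sketch of the behavior as described above, not NTFS code: the class `PreallocatedFile` and its fields are hypothetical names, and real NTFS works in clusters with more rules than this.

```python
class PreallocatedFile:
    """Toy model: an allocated size plus a 'valid data' high-water mark."""

    def __init__(self, allocated):
        self.allocated = allocated   # bytes reserved (the SetEndOfFile step)
        self.valid = 0               # high-water mark of valid data
        self.zeroed_bytes = 0        # bytes the file system had to zero-fill

    def write(self, offset, length):
        if offset + length > self.allocated:
            raise ValueError("write extends past the preallocated size")
        if offset > self.valid:
            # The gap between the old valid limit and the write position
            # is zeroed; nothing beyond the end of the write is touched.
            self.zeroed_bytes += offset - self.valid
        self.valid = max(self.valid, offset + length)
```

In this model, filling the file sequentially never zeroes anything, while a first write at offset 1000 costs 1000 bytes of zeroing, which is the out-of-sequence penalty mentioned in the quoted text.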
If you do SetFilePointer/SetEndOfFile to grow a file on FAT, for example, it will zero the space (since there's no concept of a "valid" limit for an allocation in FAT).

NTFS also does anticipatory allocations if you're writing a file sequentially. It tends to attempt an allocation several times the size required by your write, which it will try to physically allocate in as few pieces as possible. The file is physically truncated back down when it's closed.

The exact algorithm has changed several times. In Win2K it basically tried to preallocate 16 times the space that was required to complete the write (IOW, if the write required two additional clusters beyond the current preallocation, NTFS would preallocate 32 additional clusters), up to a limit of 1/1024th of the free space. In WinXP it became an exponentially growing function (the first allocation would be done as-is, the second doubled, the third quadrupled, up to some limit), again with some other limits and whatnot factored in.

Anyway, some semi-useful documentation in an absolutely horrible format (an executable self-extracting compressed Word document - ugh): http://support.microsoft.com/kb/841551
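The arithmetic of the two preallocation schemes above can be sketched as follows. The function names are hypothetical, the "other limits and whatnot" are ignored, the WinXP cap is a made-up parameter because the actual limit is not given, and never returning less than the write needs is my own assumption.

```python
def win2k_prealloc(required_clusters, free_clusters):
    """16x the clusters the write needs, capped at 1/1024 of free space."""
    capped = min(16 * required_clusters, free_clusters // 1024)
    # Assumption: the write itself must still be satisfied.
    return max(required_clusters, capped)


def winxp_prealloc(required_clusters, allocation_index, cap_clusters):
    """As-is for index 0, doubled for 1, quadrupled for 2, up to a cap.

    `cap_clusters` is a placeholder for the unspecified upper limit.
    """
    return min(required_clusters << allocation_index, cap_clusters)
```

With plenty of free space, `win2k_prealloc(2, 10_000_000)` reproduces the two-clusters-become-32 example from the text.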
#10
Thanks - what you describe above I once knew as the supported way to accomplish preallocation (which I don't think originally existed in NT) but had forgotten (at least in any detail - that's why I alluded to it only vaguely above). Now I feel lazy for not having taken the time to rediscover it.