Universal Agent 6.3.0.4 Release Information for D-07295
Important note about the UDM/UDMSRV change delivered for D-07295
UDM and UDMSRV contain a critical fix in the 6.3.0.4 agent maintenance release that prevents duplicate data from being written to a file following a network fault tolerant (NFT) recovery. Duplicate data does not occur in every NFT recovery, and nothing below applies to a file transfer during which no NFT recovery takes place.
Summary of the Problem
Duplicate data can occur when the sending party in a UDM transfer (which can be either a UDM Manager or UDM Server) increments a packet ID before it receives a message from the remote side of the transfer confirming receipt of that packet. The sender does not advance its pointer into the data itself until it gets that acknowledgement.
When a network fault occurs and the message acknowledgement isn’t received, the incremented packet ID may be assigned to the same data. When the NFT recovery completes and the packet is sent with the new ID, the receiving end of the transfer behaves as though it just received a new data packet and writes that data to the file.
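The sequence above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the UDM implementation: the Receiver class, the send_with_early_increment function, and the lose_ack_for parameter are hypothetical names invented for the example, and the receiver is assumed to write any packet whose ID it has not seen before.

```python
class Receiver:
    """Toy receiver (hypothetical): writes any packet whose ID is new."""

    def __init__(self):
        self.seen_ids = set()
        self.output = []

    def receive(self, packet_id, data):
        if packet_id in self.seen_ids:
            return  # duplicate packet ID: discard
        self.seen_ids.add(packet_id)
        self.output.append(data)


def send_with_early_increment(chunks, receiver, lose_ack_for):
    """Send chunks; the ACK for chunk index `lose_ack_for` is lost once."""
    packet_id = 0
    for index, data in enumerate(chunks):
        receiver.receive(packet_id, data)
        packet_id += 1  # ID is incremented before the ACK is confirmed
        if index == lose_ack_for:
            # The ACK never arrives, so the data pointer has not moved.
            # After NFT recovery the same data goes out under the new ID.
            receiver.receive(packet_id, data)
            packet_id += 1
    return receiver.output


print(send_with_early_increment(["A", "B", "C"], Receiver(), lose_ack_for=1))
```

In this model the chunk "B" is written twice, because the receiver has no way to tell that the second packet carries data it already has.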
This problem only affects UDM/UDMSRV versions that support the ACK_WINDOW feature: versions 5.2.0.1 through 6.3.0.3, inclusive.
Summary of the Fix
The 6.3.0.4 agent maintenance release provides a fix to prevent the above situation from occurring. Data sent following an NFT recovery will have the same packet ID that it had in the original transmission attempt. If the receiving end of the transfer actually got the original packet (which means the network fault occurred during acknowledgement), it will discard the packet since it was already written to disk.
If the original packet was never sent, the receiving end will be expecting that packet ID when NFT recovery completes. When it receives that packet, it will acknowledge it and write that packet’s contents to the output file.
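Both recovery cases can be sketched in a few lines. The following is a simplified model and not the actual UDM code: the Receiver class, the transfer function, and the fault_kind values are hypothetical, and the receiver is assumed to track the next packet ID it expects.

```python
class Receiver:
    """Toy model (hypothetical) of a receiver with the 6.3.0.4 behavior."""

    def __init__(self):
        self.next_expected = 0
        self.output = []

    def receive(self, packet_id, data):
        """Return True to represent an acknowledgement."""
        if packet_id < self.next_expected:
            return True  # already written: discard, but still acknowledge
        self.output.append(data)
        self.next_expected = packet_id + 1
        return True


def transfer(chunks, receiver, fault_at, fault_kind):
    """fault_kind is 'ack_lost' (packet arrived, ACK lost) or 'not_sent'."""
    for packet_id, data in enumerate(chunks):
        if packet_id == fault_at and fault_kind == "ack_lost":
            # The packet is delivered, but the fault swallows the ACK.
            receiver.receive(packet_id, data)
        # Normal send, or the re-send after NFT recovery. Either way the
        # packet goes out with its ORIGINAL packet ID.
        if not receiver.receive(packet_id, data):
            raise RuntimeError("no acknowledgement received")
    return receiver.output


print(transfer(["A", "B", "C"], Receiver(), fault_at=1, fault_kind="ack_lost"))
print(transfer(["A", "B", "C"], Receiver(), fault_at=1, fault_kind="not_sent"))
```

Under these assumptions, both calls leave each chunk in the output exactly once.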
Fixed in UDM/UDMSRV versions 6.3.0.4 and later.
Side Effects of the Fix
Unfortunately, the fix has an unintended side effect. The receiving end of the transfer has been able to detect and discard messages with duplicate ACK_WINDOW packet IDs since ACK_WINDOW support was delivered in 5.2.0.1, but receivers earlier than 6.3.0.4 do not acknowledge the discarded messages back to the sender.
The sending party is unable to know whether the packet it sends will be treated by the receiver as a duplicate, so it expects that every packet will be acknowledged. If this acknowledgement isn’t received, the sender behaves as though another network fault has occurred and will enter NFT recovery.
When the recovery completes, the sender will re-send the packet, the receiver will discard it, and the sender will be left waiting for another acknowledgement. This is treated again like a network fault, and the cycle repeats, essentially creating an NFT recovery loop where no additional data is transferred.
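The loop can be modeled as follows. This is an illustrative sketch only: OldReceiver and resend_after_lost_ack are hypothetical names, and max_recoveries is a cap added purely so the simulation terminates; nothing in this note implies the real sender has such a limit.

```python
class OldReceiver:
    """Toy model (hypothetical) of a receiver earlier than 6.3.0.4."""

    def __init__(self):
        self.next_expected = 0
        self.output = []

    def receive(self, packet_id, data):
        """Return True if an acknowledgement is sent back."""
        if packet_id < self.next_expected:
            return False  # duplicate discarded silently, no ACK
        self.output.append(data)
        self.next_expected = packet_id + 1
        return True


def resend_after_lost_ack(receiver, packet_id, data, max_recoveries):
    """Return True if an ACK arrives within max_recoveries re-sends."""
    # Original delivery succeeds, but its ACK is lost in a network fault.
    receiver.receive(packet_id, data)
    for _ in range(max_recoveries):
        # NFT recovery completes; a 6.3.0.4 sender re-sends the same ID.
        if receiver.receive(packet_id, data):
            return True
        # No ACK: the sender treats this as another network fault.
    return False


receiver = OldReceiver()
print(resend_after_lost_ack(receiver, 0, "A", max_recoveries=5))
print(receiver.output)
```

In this model the sender never receives an acknowledgement, and the receiver's output stays at the single chunk it wrote before the fault.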
Solution
The easiest solution to the problem described above is to make sure all parties involved in a UDM transfer are 6.3.0.4 or later.
If the UDM component that sends a file is 6.3.0.4 or later, and the receiving UDM component is 6.3.0.3 or earlier, the NFT recovery loop described above may occur. It does not happen in every case, even under seemingly identical conditions.
If only the receiving side of a UDM transfer is updated to 6.3.0.4 or later, the possibility of duplicate data exists because the 6.3.0.3 or earlier sender will still use an incremented packet ID following an NFT recovery.
Making sure that the sender and receiver are both 6.3.0.4 or later is the best way to ensure both data integrity and network fault tolerance.
Note that in a 3-party transfer, if no data is transferred to or from the UDM Manager, its version is irrelevant.
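The version combinations above can be summarized as a small check. This is a sketch, not a Stonebranch utility: parse_version and transfer_risk are hypothetical helpers, the returned strings are labels invented for the example, and the code follows the wording of this section only (it does not treat versions earlier than 5.2.0.1, which lack ACK_WINDOW, as a separate case). Pass the versions of the components that actually send and receive the data; in a 3-party transfer that may exclude the UDM Manager.

```python
def parse_version(text):
    """Turn '6.3.0.4' into (6, 3, 0, 4) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


FIXED = parse_version("6.3.0.4")  # first release containing the fix


def transfer_risk(sender_version, receiver_version):
    """Classify a sender/receiver pair per the Solution section."""
    sender_fixed = parse_version(sender_version) >= FIXED
    receiver_fixed = parse_version(receiver_version) >= FIXED
    if sender_fixed and receiver_fixed:
        return "ok"
    if sender_fixed:
        # Fixed sender, older receiver: duplicates are not acknowledged.
        return "possible NFT recovery loop"
    # Older sender: a re-sent packet may carry an incremented ID.
    return "possible duplicate data"


print(transfer_risk("6.3.0.4", "6.3.0.3"))
print(transfer_risk("6.3.0.3", "6.3.0.4"))
```

Comparing tuples of integers rather than raw strings avoids misordering versions such as 6.10.0.0 and 6.4.0.0.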