Patch Name: PHKL_10170

Patch Description: s700 10.10 VxFS (JFS) cumulative patch (no OmniStorage)

Creation Date: 97/02/21

Post Date: 97/03/07

Hardware Platforms - OS Releases:
    s700: 10.10

Products: N/A

Filesets:
    JournalFS.VXFS-BASE-KRN

Automatic Reboot?: Yes

Status: General Superseded

Critical:
    Yes
    PHKL_10170: HANG
    PHKL_10134: HANG
    PHKL_9567: HANG
    PHKL_8823: HANG CORRUPTION
    PHKL_7935: PANIC
    PHKL_7580: CORRUPTION
    PHKL_7207: ABORT
    PHKL_7017: PANIC

Path Name: /hp-ux_patches/s700/10.X/PHKL_10170

Symptoms:
    PHKL_10170:
    KI queuedone, enqueue and queuestart traces on JFS may
    contain NULL values in the b_apid and b_upid fields.

    Systems running JFS may hang due to a deadlock problem.

    The setuid bit is removed from a JFS file when root edits
    the file or appends text to it.

    PHKL_10134:
    Processes that access memory mapped files occasionally
    deadlock. Any other process attempting to access the same
    memory mapped file then hangs as well.

    PHKL_9709:
    Each time edquota -t is invoked for a VxFS file system, it
    resets the previously defined file system time limit back
    to the default (7 days).

    PHKL_9567:
    This patch addresses 2 distinct VxFS (JFS) symptoms:

    - When trying to take a file-system snapshot, the mount
      command could fail with the following error message:

        # mount -F vxfs -o snapof=/dev/vg00/vxonline \
            /dev/vg00/vxbackup /vxbackup
        vxfs mount: /dev/vg00/vxbackup is already mounted,
        /vxbackup is busy, or allowable number of mount points
        exceeded

    - The system could hang when manipulating directories.

    PHKL_9265:
    When MMF activity on VxFS files is very high for a given
    process (for example, a process doing a lot of mmap access),
    the vhand process may need to page out some pages to the
    VxFS file. On very rare occasions, the pageout write could
    not be satisfied without waiting for another resource (such
    as memory).
    Then, since vhand cannot wait, the page was marked zomb,
    and a later fault on that page caused the faulting process
    to be killed by the OS.

    PHKL_8823:
    When using edquota, the effective user id in the credential
    structure would sometimes be corrupted. Also, when using
    chown for certain user IDs, the command would fail.

    PHKL_8349:
    "vxfs: mesg 008: vx_direrr - /xxx file system inode x \
        block y error 22"
    followed by erroneous indications that the filesystem is
    corrupted.

    PHKL_7935:
    "panic: data page fault", when using fsadm to resize a
    mounted VxFS filesystem with disk quotas.

    PHKL_7580:
    (1) Applications using ftruncate(2) on VxFS files could
        possibly lose data. This problem was reported with the
        Empress database.

    (2) msync(2) with the MS_SYNC flag on VxFS memory mapped
        files did not work as documented. Stale data could be
        found in the buffer cache when resuming file system
        operations, possibly resulting in data corruption.

    (3) Poor system performance when directories containing
        shared libraries, for example /usr, reside on a VxFS
        file-system.

    PHKL_7207:
    Attempting to remove a linked text file when the original
    file is busy gets ETXTBSY.

    PHKL_7017:
    This fixes two separate VxFS (JFS) problems.
    1) trap type 15 in vx_iget
    2) O_DSYNC is ignored for JFS filesystems

    PHKL_6991:
    Systems with /usr on a VxFS file-system were experiencing
    poor performance.

    PHKL_6953:
    VxFS reports "No space left on device" when reaching quota
    limit rather than "Disc quota exceeded" over NFS.

Defect Description:
    PHKL_10170:
    KI problem: The JFS buffer allocation and IO paths were not
    fully instrumented, so the buffer header b_apid and b_upid
    fields were not updated consistently. The resulting KI
    queuedone, enqueue, and queuestart traces contain NULL
    values in these fields.

    The system can deadlock due to a locking order problem in
    JFS when vx_fast_read() is called from VOP_BREAD.
    When a JFS file is created with the SETUID flag, the setuid
    bit is removed when the file has been edited with vi or
    text has been appended to it; this should only be the case
    when the writer is not root.

    PHKL_10134:
    The problem corrected is a deadlock caused by procedure
    vm_wait_for_io being called with the iglock held and
    releasing the region lock prior to sleeping. Another
    process could then get the region lock and wait for the
    iglock, producing the deadlock. The fix is to call
    vm_wait_for_io at the end of vx_pageout, after the iglock
    has been released.

    PHKL_9709:
    VxFS quota routine vx_getquota() resets the time limit for
    root because it assumes root should not have a quota limit.
    It overlooks the fact that the timelimit fields in root's
    dqblk structure are used to store the file system time
    limit.

    PHKL_9567:
    This patch fixes two different VxFS (JFS) defects:

    - A snapshot could not be mounted if a process was waiting
      arbitrarily long for a file record lock. An application
      using lockf() or fcntl() to get file record locks, and
      holding the locks for a long period of time, could
      prevent a file-system snapshot from being mounted.

    - The VxFS rmdir(2) routine could run into a deadlock
      situation where the directory would be kept locked.
      Processes attempting to access the locked directory
      would then wait forever, and eventually this could cause
      the entire system to hang.

    PHKL_9265:
    Under high MMF pressure, vx_do_pageio called from vhand
    incorrectly marked a page as r_zomb when EAGAIN occurred on
    that page. This has the side effect of killing any process
    that later faults on that page.

    PHKL_8823:
    The "edquota" defect was due to an extra parameter being
    incorrectly passed when calling procedure vx_read1 from
    vx_dqextred. The "chown" defect was due to an uninitialized
    field (ex_elen) in the vx_extent structure when allocated
    by the vx_dqnewid procedure.

    PHKL_8349:
    This problem was mainly seen on striped logical volumes.
    If multiple processes were scanning VxFS directories via
    commands like ls, find, or cpio, they could cause VxFS to
    erroneously assume the filesystem is corrupt, making it
    impossible to remount it until fscked. There would also be
    errors in the syslog referring to vx_direrr. The defect was
    a lack of caching of offsets within the directory block; if
    the offset changed at an inopportune time, the directory
    read would fail and the filesystem would be marked corrupt.

    PHKL_7935:
    Resizing VxFS filesystems online effectively does quick
    unmounts and remounts of the filesystem, switching quickly
    between the two different data areas containing the
    filesystem structure information. The VxFS disk quota
    tracking structures were not updated during the switch,
    with the end result that the disk quota code was accessing
    invalid memory. The fix was to update the disk quota
    structures during the switch.

    PHKL_7580:
    (1) The VxFS file truncation code was breaking an
        assumption in brealloc(), causing delayed-write buffers
        to be discarded instead of being flushed to disk.

    (2) A "purge buffer cache" was not performed by the VxFS
        pageout code. Stale data could then be found in the
        buffer cache when resuming file-system operations after
        an msync(2).

    (3) VxFS used to purge the buffer cache at mmap(2) time,
        and the Dynamic Loader (dld.sl) suffered poor
        performance with shared libraries residing on a VxFS
        file-system. The fix was to purge the buffer cache at
        pageout time, and to flush it at pagein time. The
        previous fix (PHKL_6991) introduced the potential for
        data corruption, since not invalidating (i.e. not
        purging) meant possibly getting stale data from valid
        old buffers.

    Defects #2 and #3 are fixed in 10.20, but #1 is fixed in
    10.30.

    PHKL_7207:
    VxFS did not check whether nlink is 1.

    PHKL_7017:
    JFS neglected to check for the O_DSYNC flag. It only
    checked for O_SYNC.
    In vx_iget, the code dereferenced a NULL pointer.
    PHKL_6991:
    When creating a memory mapped file, VxFS was flushing and
    invalidating the file-related buffers from the buffer
    cache. This behavior caused the dynamic loader (dld.sl) to
    generate a physical I/O each time it read a shared library
    header before calling mmap(), so shared library headers
    were never found in the buffer cache. The fix was to only
    flush (write dirty buffers) and not do the invalidation.

    PHKL_6953:
    Incorrect "No space left on device" errors are generated
    when the filesystem is not actually full. The filesystem in
    question is a VxFS filesystem mounted over NFS from another
    system with quotas enabled on the server. The message
    occurs when a user reaches the hard limit on the mounted
    directory. This is caused by the VxFS code in HP-UX
    interpreting a class of filesystem space allocation
    failures all as ENOSPC. The fix was to correct this
    misinterpretation. With this patch installed, when a user
    exceeds his quota, the error on his terminal will be
    "Disk quota exceeded".
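The distinction restored by PHKL_6953 can be seen from the application side. The following is a minimal user-level sketch in Python, not part of the patch and not the kernel fix itself; the helper names describe_write_failure and try_write are hypothetical. It assumes the platform defines both errno.EDQUOT and errno.ENOSPC, and shows how a program can tell a quota hard limit apart from a genuinely full file system once the correct error is returned.

```python
import errno
import os


def describe_write_failure(exc: OSError) -> str:
    """Classify a failed write as a quota problem, a full device, or other."""
    if exc.errno == errno.EDQUOT:
        return "quota"      # the user reached the quota hard limit
    if exc.errno == errno.ENOSPC:
        return "nospace"    # the file system itself is out of space
    return "other"


def try_write(path: str, data: bytes) -> str:
    """Write data to path and report which class of failure occurred, if any."""
    try:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # force allocation errors to surface here
    except OSError as exc:
        return describe_write_failure(exc)
    return "ok"
```

Before the fix, a client of an affected server would have seen the "nospace" case even though only the user's quota was exhausted.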

SR:
    1653150698 1653161471 1653162297 1653166066 4701329300
    4701329292 1653166983 1653170464 1653182857 1653183699
    1653186502 1653194555 4701309070 4701346650 5003311837
    5003317487 5003348425

Patch Files:
    /usr/conf/lib/libvxfs_base.a(vx_bio.o)
    /usr/conf/lib/libvxfs_base.a(vx_bio1.o)
    /usr/conf/lib/libvxfs_base.a(vx_bsdquota.o)
    /usr/conf/lib/libvxfs_base.a(vx_dirl.o)
    /usr/conf/lib/libvxfs_base.a(vx_inode.o)
    /usr/conf/lib/libvxfs_base.a(vx_mount.o)
    /usr/conf/lib/libvxfs_base.a(vx_rdwri.o)
    /usr/conf/lib/libvxfs_base.a(vx_vm.o)
    /usr/conf/lib/libvxfs_base.a(vx_vnops.o)

what(1) Output:
    /usr/conf/lib/libvxfs_base.a(vx_bio.o):
        vx_bio.c $Date: 97/02/21 11:33:42 $ $Revision: 1.3.89.13 $
            PATCH_10.10 (PHKL_10170)
    /usr/conf/lib/libvxfs_base.a(vx_bio1.o):
        vx_bio1.c $Date: 96/05/31 11:40:57 $ $Revision: 1.3.89.11 $
            PATCH_10.10 (PHKL_7580)
    /usr/conf/lib/libvxfs_base.a(vx_bsdquota.o):
        vx_bsdquota.c $Date: 97/01/15 16:05:32 $ $Revision: 1.3.89.13 $
            PATCH_10.10 (PHKL_9709)
    /usr/conf/lib/libvxfs_base.a(vx_dirl.o):
        vx_dirl.c $Date: 96/08/20 17:44:39 $ $Revision: 1.3.89.5 $
            PATCH_10.10 (PHKL_8349)
    /usr/conf/lib/libvxfs_base.a(vx_inode.o):
        vx_inode.c $Date: 96/03/18 10:56:45 $ $Revision: 1.3.89.11 $
            PATCH_10.10 (PHKL_7017)
    /usr/conf/lib/libvxfs_base.a(vx_mount.o):
        vx_mount.c $Date: 96/07/03 16:19:52 $ $Revision: 1.3.89.10 $
            PATCH_10.10 (PHKL_7935)
    /usr/conf/lib/libvxfs_base.a(vx_rdwri.o):
        vx_rdwri.c $Date: 97/02/21 11:49:41 $ $Revision: 1.3.89.18 $
            PATCH_10.10 (PHKL_10170)
    /usr/conf/lib/libvxfs_base.a(vx_vm.o):
        vx_vm.c $Date: 97/02/14 12:11:11 $ $Revision: 1.3.89.21 $
            PATCH_10.10 (PHKL_10134)
    /usr/conf/lib/libvxfs_base.a(vx_vnops.o):
        vx_vnops.c $Date: 96/12/17 18:12:40 $ $Revision: 1.3.89.17 $
            PATCH_10.10 (PHKL_9567)

cksum(1) Output:
    3841079490 10192 /usr/conf/lib/libvxfs_base.a(vx_bio.o)
    4211935824 4792 /usr/conf/lib/libvxfs_base.a(vx_bio1.o)
    2997803817 27168 /usr/conf/lib/libvxfs_base.a(vx_bsdquota.o)
    1838912048 9152 /usr/conf/lib/libvxfs_base.a(vx_dirl.o)
    3693687270 38392 /usr/conf/lib/libvxfs_base.a(vx_inode.o)
    2780922089 19448 /usr/conf/lib/libvxfs_base.a(vx_mount.o)
    3891689527 26880 /usr/conf/lib/libvxfs_base.a(vx_rdwri.o)
    2930568444 10792 /usr/conf/lib/libvxfs_base.a(vx_vm.o)
    1606142415 24720 /usr/conf/lib/libvxfs_base.a(vx_vnops.o)

Patch Conflicts: None

Patch Dependencies: None

Hardware Dependencies: None

Other Dependencies: None

Supersedes:
    PHKL_6953 PHKL_6991 PHKL_7017 PHKL_7207 PHKL_7580 PHKL_7935
    PHKL_8349 PHKL_8823 PHKL_9265 PHKL_9567 PHKL_9709 PHKL_10134

Equivalent Patches:
    PHKL_10171:
    s800: 10.10

Patch Package Size: 230 Kbytes

Installation Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your Hewlett-Packard support terms
    and conditions for precautions, scope of license,
    restrictions, and limitation of liability and warranties,
    before installing this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Login as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

        cd /tmp
        sh PHKL_10170

    5a. For a standalone system, run swinstall to install the
        patch:

        swinstall -x autoreboot=true -x match_target=true \
            -s /tmp/PHKL_10170.depot

    5b. For a homogeneous NFS Diskless cluster, run swcluster on
        the server to install the patch on the server and the
        clients:

        swcluster -i -b

        This will invoke swcluster in the interactive mode and
        force all clients to be shut down.

        WARNING: All cluster clients must be shut down prior to
                 the patch installation. Installing the patch
                 while the clients are booted is unsupported
                 and can lead to serious problems.

        The swcluster command will invoke an swinstall session
        in which you must specify:

            alternate root path - default is /export/shared_root/OS_700
            source depot path   - /tmp/PHKL_10170.depot

        To complete the installation, select the patch by
        choosing "Actions -> Match What Target Has" and then
        "Actions -> Install" from the Menubar.

    5c. For a heterogeneous NFS Diskless cluster:

        - run swinstall on the server as in step 5a to install
          the patch on the cluster server.

        - run swcluster on the server as in step 5b to install
          the patch on the cluster clients.

    By default swinstall will archive the original software in
    /var/adm/sw/patch/PHKL_10170. If you do not wish to retain a
    copy of the original software, you can create an empty file
    named /var/adm/sw/patch/PATCH_NOSAVE.

    Warning: If this file exists when a patch is installed, the
             patch cannot be deinstalled. Please be careful
             when using this feature.

    It is recommended that you move the PHKL_10170.text file to
    /var/adm/sw/patch for future reference.

    To put this patch on a magnetic tape and install from the
    tape drive, use the command:

        dd if=/tmp/PHKL_10170.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:
    This JFS patch should not be installed on systems using the
    OmniStorage product. If you are using the OmniStorage
    product, obtain the equivalent OmniStorage/JFS patch by the
    same method you used to receive this patch. If you cannot
    locate the patch, please contact your local HP support
    entity.
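
As a supplementary note on the cksum(1) Output section: once the patch is installed, the listed checksum and size pairs can be compared with values gathered on the target system. The following Python sketch shows only the comparison step. The function names parse_cksum_listing and mismatches are hypothetical, and how the observed values are produced for individual archive members is not specified by this document and is left to the administrator.

```python
def parse_cksum_listing(text: str) -> dict:
    """Map each path to its (checksum, size) pair from cksum(1)-style lines."""
    entries = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        # Expect "<checksum> <size> <path>"; skip headings and blank lines.
        if len(parts) != 3 or not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        entries[parts[2].strip()] = (int(parts[0]), int(parts[1]))
    return entries


def mismatches(expected: dict, observed: dict) -> list:
    """Return the paths whose observed values are missing or differ."""
    return sorted(p for p, v in expected.items() if observed.get(p) != v)


# Two entries copied from the cksum(1) Output section of this patch text.
EXPECTED = parse_cksum_listing(
    "cksum(1) Output:\n"
    "3841079490 10192 /usr/conf/lib/libvxfs_base.a(vx_bio.o)\n"
    "4211935824 4792 /usr/conf/lib/libvxfs_base.a(vx_bio1.o)\n"
)
```

Calling mismatches(EXPECTED, observed) with a dictionary of observed values in the same shape returns an empty list when every listed file matches.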