Last modified on November 22, 2002

This readme.txt file is for maintenance release v4.0.2b.

1.0 INTRODUCTION

Welcome to the hp StorageWorks SAN Switch 2/32. Thank you for choosing our
products. This readme.txt contains late-breaking changes that may not be
included in the product documentation. The readme.txt also includes a section
that lists the contents of this CD.

NOTE: The 2Gb HP StorageWorks SAN Switch 2/32 currently (Nov. 22, 2002) ships
with the v4.0.2b version of the firmware installed. The latest firmware
(v4.0.2b) is available on the HP website at http://hp.com.

For specific information related to installing the switch, see the HP
StorageWorks SAN Switch 2/32 documentation.

Firmware v4.0.2b supports the hp StorageWorks SAN Switch 2/32 and hp
StorageWorks Core Switch 2/64. This release is cumulative and includes fixes
from previous patch versions of the Fabric OS.

CD Directory Structure
=============================================================
The HP StorageWorks SAN Switch 2/32 Documentation Kit CD, V4.0.2b contains
the following items:

- Manuals.pdf (StorageWorks SAN Switch 2/32 Documentation; links to all
  documents below and search function)
- DSGGD
  - Docs
    - README.TXT, includes late-breaking changes to documentation
    - AA-RTQVA-TE.pdf, hp StorageWorks SAN Switch 2/32 Version 4.0.x
      Installation Guide
    - AA-RS24A-TE.pdf, hp StorageWorks Fabric OS Version 3.0.x/4.0.x
      Reference Guide
    - AA-RS23A-TE.pdf, hp StorageWorks Fabric OS Procedures Version
      3.0.x/4.0.x User Guide
    - AA-RS22A-TE.pdf, hp StorageWorks Diagnostics and System Error Messages
      Version 3.0.x/4.0.x Reference Guide
    - AA-RS25A-TE.pdf, hp StorageWorks Web Tools Version 3.0.x/4.0.x User
      Guide
    - AA-RTS4A-TE.pdf, hp StorageWorks Advanced Performance Monitoring
      Version 3.0.x/4.0.x User Guide
    - AA-RTSAA-TE.pdf, hp StorageWorks ISL Trunking Version 3.0.x/4.0.x User
      Guide
    - AA-RTSDA-TE.pdf, hp StorageWorks Extended Fabric Version 3.0.x/4.0.x
      User Guide
    - AA-RTSGA-TE.pdf, hp StorageWorks Fabric Watch Version 3.0.x/4.0.x User
      Guide
    - AA-RS26A-TE.pdf, hp StorageWorks Zoning Version 3.0.x/4.0.x User Guide
    - AA-RTS7A-TE.pdf, hp StorageWorks Remote Switch Version 3.0.x/4.0.x
      User Guide
  - Firmware (hp StorageWorks Fibre Channel Firmware and Updates)
    - firmwareupdate.txt, instructions for updating firmware
    - v4.0.2b, switch firmware
    - femib.smiv2 2/15/01
    - festd.smiv2 2/27/01
    - v4_0FA.mib (supports Fibre Alliance MIB v3.0)
    - v4_0FE.mib (v4_0FE.mib is same as festd.smiv2 in v3.0 and refers to
      RFC 2837)
    - v4_0SW.mib (Fibre Channel Switch MIB v4.0)
    - v4_0TRP.mib (Fibre Channel Switch Trap MIB v4.0)
    - j2re-1_3_1_01-win.exe (Sun Java Plug-in for Windows, self-install for
      browser)
  - NTAlpha
    - Cat.exe (mover for upgrading firmware from an Alpha)
    - Rshd.exe (small server for upgrading firmware from an Alpha)
  - NTIntel
    - Cat.exe (mover for downloading firmware from a PC)
    - Rshd.exe (small server for upgrading firmware from PC)
  - Acrobat
    - rp505eng.exe (installer for Acrobat 5.05 Reader plus Search to enable
      viewing of pdf files)
=============================================================

Related Documentation

Additional documentation is available via the HP website at:
http://www.compaq.com/storage/saninfrastructure.html

The PDF files can most easily be read on a PC by installing ACROBAT from the
ACROBAT folder (rp505enu.exe) on the CD and then clicking on the manuals.pdf
file. This will bring up a page that allows you to select any of the manuals
or search all of them for any topic that you need to access. There is also a
help.pdf file in the ACROBAT folder for assistance in using ACROBAT.

2.0 GETTING HELP

Contact HP customer service for technical support. Be prepared to provide the
following information:

* Chassis serial number (from the output of the chassisshow telnet command).
* Chassis license id (from the output of the licenseidshow command).
* Logical switch World Wide Name (WWN), from the output of the switchshow
  telnet command
* Fabric and SAN topology configuration (from the output of the topologyshow
  telnet command)
* Detailed problem description, and troubleshooting steps already performed

3.0 FABRIC WATCH CONFIGURATION FILE

HP supplies the following new configuration file, also included on the CD in
DSGGD\Firmware: FW232.TXT. The configuration file includes the combined HP
and Compaq settings for Fabric Watch. All parameters (except the Fabric Watch
default parameters) can be downloaded to a switch by user admin. Log in as
root to download the Fabric Watch default settings.

IMPORTANT: You do not need to load this file unless you need to restore the
switch to the default settings. Root access is required to change the Fabric
Watch default settings; the custom settings can be changed with admin access.

Performing a configdownload with this file only appends or changes the
parameters contained in the file. It will not change any other parameter on
the switch that may have been changed at the customer site. Remember that an
entire fabric must have the same configuration settings.

4.0 HP Recommendations

HP always recommends redundant fabrics and multi-pathing software for
uptime-sensitive environments. Examples of scenarios protected by redundant
fabrics include but are not limited to:

* Add / Move / Change operations for devices or switches
* Changing the core PID format
* Changing any other fabric-wide parameters
* Erroneous zoning changes / user error
* Major upgrades / changes to fabric architecture
* Physical disasters, e.g. a water pipe breaks above the "fabric A" rack

HP strongly recommends against using drivers that bind by PID. There are
several routine maintenance procedures that may result in a device receiving
a new PID; the core PID format update is just one example of such a procedure.
Examples include but are not limited to:

* Changing "Compatibility Mode" settings
* Changing switch domain IDs
* Merging fabrics
* Relocating devices to new ports or new switches (that is, for Add, Move,
  Change type operations)
* Updating the core PID format
* Using hot spare switch ports to deal with failures

5.0 REQUIREMENTS AND COMPATIBILITY

Web Tools Requirements

Web Tools is installed and run on the switch but is displayed in a web
browser. For optimum Web Tools performance, the following is recommended:

* When running Web Tools with Internet Explorer 5.0 or later in a Windows
  environment, use Java Plug-in version 1.2.2-008 for Windows.
* Before using your web browser, clear the cache files.
* In a multi-switch fabric, the video card in the workstation in use should
  have at least 8MB RAM.
* Consider disabling any software installed on your desktop system which
  enables Internet download scanning capability, such as anti-virus software.

Web Tools Workstation Requirements

The workstation used to manage the switch(es) must meet the following
requirements in order to run Web Tools:

Operating Systems
    Solaris 2.7 or later, Windows 2000, Windows NT 4.0

RAM (for Windows operating systems)
    128 MB or more of RAM for fabrics comprised of 10 switches or fewer
    256 MB or more of RAM for fabrics comprised of 10 to 20 switches
    512 MB or more of RAM for fabrics comprised of more than 20 switches

Disk space
    5 MB or more of free disk space

Web Browsers

Configure one of the following web browsers to work with Web Tools:

* Netscape Communicator 4.5x, 4.7x
* Internet Explorer 5.0, 5.5

NOTE: Netscape 6.x is not supported in this release.

Java Plug-in

The following Java Plug-ins must be installed on the workstation:

* For Windows 2000, or Windows NT 4.0: Java Plug-in V1.3.1 is required.
* For Solaris: Java Plug-in version 1.2.2_02 for Solaris, including the Java
  Plug-in patch created by Sun Microsystems for Solaris.

Java Runtime Environment (JRE)

The following JRE must be installed on the workstation:

* JRE 1_3_1_01

JRE V1.3.1 is required to run the SAN Appliance.

6.0 FABRIC OS v4.0.2b LIMITATIONS AND KNOWN BUGS

configupload

The configupload command does not work on the Microsoft FTP server if no
password is specified.

Fabric Watch

During a switch startup or switch reboot, the Fabric Watch daemon is one of
the last processes to become active. Depending on the size of the fabric,
Fabric Watch might take several minutes to complete its startup sequence.
Until Fabric Watch is completely active, it cannot monitor events occurring
in the fabric. During this time, the switchStatusPolicyShow command will show
the default settings instead of any custom settings.

Trunking and Long Distance Mode

Trunking cannot be used in conjunction with long distance ports (the
portcfglongdistance command). In other words, you cannot trunk a port in long
distance mode.

QuickLoop

QuickLoop is not supported on the SAN Switch 2/32 in either its original hub
emulation mode, or in Fabric Assist mode. This means that direct attachment
of private hosts to the switch is not supported. However, fabric or loop
targets attached to the switch may be included in Fabric Assist Zones for
private hosts attached to other switches in the fabric.

Persistent Error Log

Note that Web Tools and SNMP are not equivalent in functionality to the
command line interfaces with respect to the enhanced error log sub-system.
The output shown by SNMP and Web Tools only displays the last 256 entries of
the volatile log information.

7.0 FIRMWARE UPGRADE PROCEDURE FOR WINDOWS ENVIRONMENTS

Use the following procedure to upgrade the SAN Switch 2/32 to firmware
v4.0.2b:

Note: HP recommends that before upgrading the switch firmware, you save a
copy of the switch configuration file to the FTP server using the
configupload command.

1. Verify an FTP service is running on a Unix or Windows machine.
2. Telnet to the switch.
3. At the login prompt, type admin.
4. Type the password.
5. At the prompt, type firmwaredownload.

   NOTE: This command causes the switch to reset. This will momentarily
   disrupt the attached devices, and will require that existing telnet
   sessions be restarted.

6. Follow the onscreen prompts, shown next:

   Do you want to continue [Y]:
   Server Name or IP Address: 10.10.0.0
   User Name: anonymous
   File Name: release.plist
   Password: xxxx
   FirmwareDownload has started.
   Start to install packages......

   The v4.0.2b configuration files begin to download. This takes
   approximately 5 to 10 minutes.

7. Upon completion of a successful download, you will see the following
   prompts:

   Verification SUCCEEDED
   Firmwaredownload completes successfully.
   FirmwareDownload has completed successfully.
   Broadcast message from root (pts/0) Wed Nov 20 09:23:04 2002...
   The system is going down for reboot NOW !!
   Switch:admin> Connection to host lost.

8. Re-telnet to the switch, as follows:

   C:\>telnet 10.10.0.0
   Fabric OS (Switch)
   Switch login: admin
   Password: xxxx
   5_1:admin> firmwareshow
   Primary partition: v4.0.2b
   Secondary Partition: v4.0.2b
   Switch:admin>

The switch is now running firmware v4.0.2b.

8.0 HARDWARE ZONING

The basic rules for hardware-enforced zoning require that the members of a
zone be either all WWNs or all Port IDentifiers (PIDs). If you mix WWNs and
PIDs, the zone will operate as software (NS) enforced.

9.0 WEB TOOLS CHANGES

Web Tools changes include the following:

* Performance Monitor Changes: In the PortThroughputGraph,
  SwitchAggregateGraph and SidDidGraph the units Bytes/sec were added for the
  Y-axis.
* Tooltip values: Bytes were added for throughput less than 1000; otherwise,
  K, M, or G were added respectively.
* Extended Fabric Tab: In Web Tools, the Switch Administration dialog box was
  added.
* Extended Fabric tab has a new column in the table with the header "VCXLT
  Link Init Enabled", and each row now has a checkbox to display/configure VC
  Translation Link Initialization status.
* Zone Administration Radio Buttons: The default selection is always "Save
  Config". Previously, when Web Tools Zoning was launched, if the "Enable
  Config" button was selected and then the "Apply" button was selected to
  enable a configuration, Web Tools Zoning did not reset the radio button to
  the default after it was done. The radio buttons will now be grayed out
  while committing the configuration and the default selection "Save Config"
  will be reset after each successful submission. The user is required to
  click "Enable Config" to enable a different Zoning configuration.
* Zone Tab: Under the Zone tab, if a Node WWN was added to a zone, after
  applying the change to the switch, the user would see that Node WWN and all
  of the port WWNs for that node displayed as members of the zone. This has
  been changed to be consistent with the telnet session, which displays only
  the Node WWN as a member.

10.0 WEB TOOLS ISSUES AND WORKAROUNDS

Issue: When you use Netscape V4.77 and a network-attached printer, Web Tools
may not be able to print properly if the network is interrupted, causing the
printer to become unavailable after the browser was started.
Workaround: Once the network problem is identified, close the Netscape
browser and attempt the print operation once again. The browser must be shut
down and restarted.

Issue: Older versions of V1.2.2_02 and V1.3.1 Java plug-in for Solaris do not
support the Fabric Event display.
Workaround: Ensure the correct JRE version that supplies the correct patch
version of the Java plug-in is installed.

Issue: The Refresh View button in Fabric View may start blinking 15 seconds
after the fabric has been refreshed, when actually the fabric has not
changed.
Workaround: Ensure the correct JRE version that supplies the correct patch
version of the Java plug-in is installed.

Issue: When a pop-up window requesting a user response is pushed into the
background and refresh is requested, a fatal Internet Explorer error may
occur.
Workaround: None.

Issue: If a Web Tools administrator elects to remove a Web Tools license from
an operational system, with active Web Tools clients enabled, subsequent
operations on the clients may produce error messages and other information
on the client that is difficult to interpret.
Workaround: To eliminate this problem, close down the browser before removing
the Web Tools license.

Issue: There are multiple phases involved with firmware download. Web Tools
notifies the user that it has succeeded with phase 1 and continues on with
the other phases. Firmware activation involves multiple stages. For example,
when Web Tools reports that firmware download has been completed
successfully, this indicates that a basic sanity check, package retrieval,
package unloading and verification were successful. A reboot is required to
activate the newly downloaded firmware. The reboot is done transparently to
the Web Tools user and results in a loss of network connectivity with Web
Tools for approximately 5 minutes.
Workaround: Wait approximately 10 minutes to ensure that all of the
application windows have been restored. If Web Tools fails to respond after
10 minutes, you may need to close all Web Tools application windows and
restart them, or contact the system administrator.

Issue: Switch is accessed immediately following a switch enable or disable,
and inaccurate routing information displays.
Workaround: Following a switch enable or disable, wait at least 25-30 seconds
for the fabric to reconfigure, and for Fabric Shortest Path First (FSPF) to
complete its route calculations.

Issue: Web Tools will appear to freeze or hang if it is not restarted after
an Ethernet IP address is changed via the NetworkConfig View command.
Workaround: Web Tools must be restarted when the Ethernet IP address is
changed via the NetworkConfig View command.

Issue: In the Extended Fabric tab of Switch Admin view, the Long Distance
Setting column of the port table enables a user to set the following mode for
each port:

    L0: Normal
    L1: Medium
    L2: Long
    LE: Extended Normal

In the case of a trunked port, L1, L2 or LE modes are not supported in
V4.0.2b and earlier releases. Currently Web Tools does not distinguish a
trunked port and will accept all 4 settings, regardless of whether the port
is trunked. If the L1, L2, or LE mode setting is enabled on a trunked port,
the data traffic will start to flow across one of the trunk ports and the
other ports will be idle.
Workaround: Do not change the Long Distance setting for ports that are
currently trunked. L1, L2 or LE modes are not supported for trunking ports.

11.0 NEW WEB TOOLS PROCEDURES

Web Tools V4.0 Zone Administration View provides a GUI editing tool for
zoning database configurations. The following procedures explain how to:

* Enable a config
* Disable Zoning
* Save the config through Web Tools

Enabling a Configuration

1. Select the Zone Admin view. The Zone Admin window displays.
2. Select the desired configuration from the Cfg Name drop-down menu.
3. Select the Enable Config radio button.
4. Verify that the enabled configuration is the one that you intend. Note the
   name of the configuration to the right of the radio buttons.
5. Select Apply. The Configuration Confirmation window appears. The desired
   configuration should match the configuration name in the text field next
   to the Enable Config button.
6. Select Yes if you want to proceed with enabling the selected Zone
   Configuration. The three radio buttons will be grayed out during the
   asynchronous committing process. The default, which is "Save Config",
   will reset upon completion.

Disabling a Configuration

Note that this procedure disables zoning on the fabric.

1. Select the Zone Admin view. The Zone Admin window displays.
2. Select the desired configuration from the Cfg Name drop-down menu.
3. Select the Disable Zoning radio button.
4. Verify that the disabled configuration is the one that you intend; note
   the name of the configuration to the right of the radio buttons.
5. Select Apply. The Configuration Confirmation window displays. The
   configuration to be disabled should match the configuration name given in
   the text field next to the Disable Zoning button.
6. Select Yes if you want to proceed with disabling the selected Zone
   Configuration. The three radio buttons will be grayed out during the
   asynchronous committing process. The default, which is "Save Config",
   resets upon completion.

Saving a Configuration

1. Select the Zone Admin view. The Zone Admin window displays.
2. Select the desired configuration from the Cfg Name drop-down menu.
3. Select the Save Config radio button.
4. Verify that the saved configuration is the one that you intend; note the
   name of the configuration to the right of the radio buttons.
5. Select Apply. The Configuration Confirmation window displays. The
   configuration to be saved should match the configuration name given in
   the text field next to the Save Config button.
6. Select Yes if you want to proceed with saving the selected Zone
   Configuration. The three radio buttons will be grayed out during the
   asynchronous committing process. The default, which is "Save Config",
   resets upon completion.

12.0 FABRIC OS CHANGES

ifModeShow and ifModeSet Commands

Changing the link mode is not supported for all network interfaces or for all
Ethernet network interfaces.
These commands are only functional for the "eth0" interface.

Example 1: To advertise all modes of operation, follow this scenario for the
ifModeSet command:

    cp0:admin> ifModeSet eth0
    IMPORTANT: Exercise care when using this command. Forcing the link to an
    operating mode not supported by the network equipment to which it is
    attached may result in an inability to communicate with the system
    through its Ethernet interface.
    CAUTION: It is recommended that you only use this command from the serial
    console port.
    Are you sure you really want to do this? (yes, y, no, n): [no] y
    Proceed with caution.
    Auto-negotiate (yes, y, no, n): [no] y
    Advertise 100 Mbps / Full Duplex (yes, y, no, n): [yes] y
    Advertise 100 Mbps / Half Duplex (yes, y, no, n): [yes] y
    Advertise 10 Mbps / Full Duplex (yes, y, no, n): [yes] y
    Advertise 10 Mbps / Half Duplex (yes, y, no, n): [yes] y
    Committing configuration...done.
    cp0:admin>

Example 2: To force 10 Mbps Half Duplex, follow this scenario for the
ifModeSet command:

    cp0:admin> ifModeSet eth0
    IMPORTANT: Exercise care when using this command. Forcing the link to an
    operating mode not supported by the network equipment to which it is
    attached may result in an inability to communicate with the system
    through its Ethernet interface.
    CAUTION: It is recommended that you only use this command from the serial
    console port.
    Are you sure you really want to do this? (yes, y, no, n): [no] y
    Proceed with caution.
    Auto-negotiate (yes, y, no, n): [no] n
    Force 100 Mbps / Full Duplex (yes, y, no, n): [no]
    Force 100 Mbps / Half Duplex (yes, y, no, n): [no]
    Force 10 Mbps / Full Duplex (yes, y, no, n): [no]
    Force 10 Mbps / Half Duplex (yes, y, no, n): [no] y
    Committing configuration...done.
    cp0:admin>

13.0 SNMP COMMANDS

loopporttest Command

Short Description
    Functional test of L_port M->M path on a loop.

Syntax
    loopporttest [-nframes count][-ports itemlist][-seed pattern][-width pattern_width]

Availability
    admin

Description
    Use this command to verify the operation of the switch by sending frames
    from port M's transmitter, and looping the frames back through an
    external fiber cable, including all the devices on the loop, into port
    M's receiver. This exercises all the switch components from the main
    board to the SFP to the fibre cable to the SFPs (of the devices and the
    switch) and back to the main board.

    The cables and SFPs connected should be of the same technology: a port
    with a short wavelength SFP is connected to another device with a short
    wavelength SFP using a short wavelength cable, a long wavelength port is
    connected to a long wavelength port, and a copper port is connected to a
    copper port.

    Only one frame is transmitted and received at any one time. The port LEDs
    flicker green rapidly while the test is running.

    The test method is as follows:

    1. Determine which ports are L_ports.
    2. Enable ports for cabled loopback mode.
    3. Create a frame F with a data size of 1,024 bytes.
    4. Transmit frame F via port M, with D_ID to the FL port (ALPA = 0).
    5. Pick up the frame from port M, the FL port.
    6. Check if any of the 8 statistic error counters are non-zero: ENC_in,
       CRC_err, TruncFrm, FrmTooLong, BadEOF, Enc_out, BadOrdSet, DiscC3.
    7. Check if the transmit, receive or class 3 receiver counters are stuck
       at some value.
    8. Check if the number of frames transmitted is not equal to the number
       of frames received.
    9. Repeat steps 3 through 8 for all ports present until the number of
       frames requested is reached or all ports are marked bad.

    NOTE: You can specify a payload pattern to be used when executing this
    test. If the pattern is not specified, then every 30 passes a different
    data type is used to generate a new pattern for the frame. The data
    pattern is generated based on each data type.
    Some data types may generate different data patterns on every pass. The
    data types are repeated every 210 passes.

Operands
    This command has the following operands:

    -nframes count
        Specify the number of times (or number of frames per port) to execute
        this test. The default value is 10.

    -ports itemlist
        Specify a list of user ports to test. By default all the user ports
        in the current slot are tested. You can set the current slot by using
        the setslot command.

    -seed pattern
        Specify the seed pattern of the test packets. The data types are:

        * CSPAT:     0x7e, 0x7e, 0x7e, 0x7e,...
        * BYTE_LFSR: 0x69, 0x01, 0x02, 0x05,...
        * CHALF_SQ:  0x4a, 0x4a, 0x4a, 0x4a,...
        * QUAD_NOT:  0x00, 0xff, 0x00, 0xff,...
        * CQTR_SQ:   0x78, 0x78, 0x78, 0x78,...
        * CRPAT:     0xbc, 0xbc, 0x23, 0x47,...
        * RANDOM:    0x25, 0x7f, 0x6e, 0x9a,...

    -width pattern_width
        Specify the width of the test pattern. Valid values are 1, 2, and 4
        (which are byte, word, and quad).

Example
    To perform a loopback port test:

---------------------------------------------
switch:admin> loopporttest -ports 1/0-1/15
Running Loop Port Test .......
Test Complete: "loopporttest" Pass 10 of 10
Duration 0 hr, 0 min & 1 sec (0:0:0:127).
passed.
---------------------------------------------

Diagnostics
    Below are possible error messages if failures are detected:

    DATA INIT PORT_DIED EPI1_STATUS_ERR ERR_STAT ERR_STATS
    ERR_STATS_2LONG ERR_STATS_BADEOF ERR_STATS_BADOS ERR_STATS_C3DISC
    ERR_STATS_CRC ERR_STATS_ENCIN ERR_STATS_ENCOUT ERR_STATS_TRUNC
    ERR_STAT_2LONG ERR_STAT_BADEOF ERR_STAT_BADOS ERR_STAT_C3DISC
    ERR_STAT_CRC ERR_STAT_ENCIN ERR_STAT_ENCOUT ERR_STAT_TRUNC
    FDET_PERR FINISH_MSG_ERR FTPRT_STATUS_ERR LESSN_STATUS_ERR
    MBUF_STATE_ERR MBUF_STATUS_ERR NO_SEGMENT PORT_ABSENT PORT_ENABLE
    PORT_M2M PORT_STOPPED PORT_WRONG RXQ_FRAME_ERR RXQ_RAM_PERR
    STATS STATS_C3FRX STATS_FRX STATS_FTX TIMEOUT XMIT

See Also
    camtest, centralmemorytest, cmemretentiontest, cmitest, itemlist,
    portloopbacktest, portregtest, setslot, spinsilk, sramretentiontest,
    crossporttest

killTelnet Command

Short Description
    A new admin level command, killTelnet, has been added to this release.
    This command enables the administrator to view a list of active CLI
    sessions (connected through telnet or serial port) and terminate a
    session.

Syntax
    killtelnet

Availability
    admin

Description
    Use this command to terminate an open telnet session. The killtelnet
    command is an interactive, menu-driven command. Upon invocation, it
    lists all the current telnet and serial port login sessions. It lists
    information such as the session number, login name, the idle time, the
    IP address of the connection, and the time stamp of when the login
    session was opened. A prompt is then displayed where you can specify the
    session number of the connection you wish to terminate.

Example
    To terminate an open telnet connection:

---------------------------------------------------------
switch:admin> killtelnet
Collecting login information....Done
List of telnet sessions (5 found)
_________________________________________________________________________
Session No USER    TTY     IDLE     FROM             LOGIN@
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0          root    ttyS0   5days    -                8May02
1          root    pts/0   23:40m   192.167.172.90   8May02
2          root    pts/1   5days    192.167.172.90   8May02
3          admin   pts/2   1.00s    192.167.132.56   5:19pm
4          admin   pts/3   10.00s   192.167.133.83   5:23pm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter Session Number to terminate (q to quit) 4
Collecting process information... Done.
You have opted to terminate the telnet session:-
logged in as "admin ", from "192.168.133.83 "
since " 5:23pm" and has been inactive for "10.00s ",
the current command executed being: "/bin/sh /fabos/".
The device entry is: "pts/3 ".
This action will effectively kill these process(es):-
                     USER       PID ACCESS COMMAND
/dev/pts/3           root     11404 f....  rbash
                     root     11428 f....  chkdefaultpassw
                     root     11717 f....  passwd
Please Ensure (Y/[N]): y
killing session.... Done!
Collecting login information....Done
List of telnet sessions (4 found)
_________________________________________________________________________
Session No USER    TTY     IDLE     FROM             LOGIN@
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0          root    ttyS0   5days    -                8May02
1          root    pts/0   23:41m   192.167.172.90   8May02
2          root    pts/1   5days    192.167.172.90   8May02
3          admin   pts/2   12.00s   192.167.132.56   5:19pm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Enter Session Number to terminate (q to quit) q
switch:admin>
---------------------------------------------------------

CAUTION: The list of open sessions displayed with killtelnet includes the
user's current session. Make sure you do not kill your own telnet session.

portCfgDelayFlogi Command

Short Description
    A new admin level command, portCfgDelayFlogi, has been added to this
    release. This command enables a user to configure a port such that the
    FLOGI Accept for that port will not be sent out until all the routes are
    set up.

Syntax
    portCfgDelayFlogi [slot/]port, mode

Availability
    admin

Description
    Use this command to delay the FLOGI Accept for the port until routes are
    set up for the entire fabric.

    After a link disruption (caused by, for example, hafailover), certain
    hosts do not query the Name Server before re-establishing communication
    with the targets. Those devices remember the port ID of the target with
    which they were communicating and transmit PLOGI to the targets soon
    after receiving the FLOGI Accept. Since the device bypasses the Name
    Server, it is possible that the routes for remote targets are not set up
    at that time. If the host does not receive a response to its PLOGI, the
    I/Os that were in progress before the link disruption do not resume.

    By using this command, the user can specify the ports that have this
    behavior so that the switch will not send an FLOGI Accept until all
    routes are set up. That way, I/Os will resume after the link disruption.

    The configuration is saved in non-volatile memory and is persistent
    across switch reboots and power cycles.

Operands
    This command has the following operands:

    slot  Specify the slot number in a switch. The slot number must be
          followed by a slash ( / ) and the port number.
    port  Specify a port number (0 - 15).
    mode  Specify a value of 1 to delay the FLOGI Accept for this port until
          route setup is complete. Specify a value of 0 to de-configure the
          delay of the FLOGI Accept for this port.

Example
    To configure switch port 3 as a "delayed FLOGI" port:

    switch:admin> portcfgdelayflogi 1/3, 1
    done.
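
The operand rules above can be sketched as a small script-side helper that
validates the values and formats the command line before it is typed or sent
to the switch. This is only an illustration in Python: the function name
build_portcfgdelayflogi is hypothetical, it is not part of Fabric OS, and it
does not contact a switch. The ranges checked are the ones stated under
Operands.

```python
from typing import Optional


def build_portcfgdelayflogi(port: int, mode: int, slot: Optional[int] = None) -> str:
    """Format a portcfgdelayflogi command line (hypothetical helper).

    Follows the documented syntax: portCfgDelayFlogi [slot/]port, mode
    """
    # The readme gives the valid port range as 0 - 15.
    if not 0 <= port <= 15:
        raise ValueError("port must be in the range 0-15")
    # mode 1 delays the FLOGI Accept; mode 0 removes the delay.
    if mode not in (0, 1):
        raise ValueError("mode must be 0 or 1")
    if slot is not None and slot < 0:
        raise ValueError("slot must not be negative")
    # The slot, when given, is followed by a slash and the port number.
    target = f"{slot}/{port}" if slot is not None else str(port)
    return f"portcfgdelayflogi {target}, {mode}"


if __name__ == "__main__":
    # Reproduces the command used in the Example above.
    print(build_portcfgdelayflogi(3, 1, slot=1))
```

Rejecting out-of-range values before the command is issued keeps a typing
mistake from reaching the switch.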

See Also
    portShow, switchShow, configure, portCfgLport, haFailover

14.0 Command Enhancements

supportShow Command

Enhancement
    The supportShow command with no additional parameters will now output
    information for all 32 ports within a single logical switch. The output
    from the following commands has been added to the output of the
    supportShow command:

    * fabStateShow
    * fabSwitchShow
    * chassisShow
    * fanShow
    * historyShow

portCfgLongDistance Command

Restriction
    Extended links in the same SAN can be either 1 Gigabit or 2 Gigabit but
    not both. If you have 1 Gigabit Extended links in the SAN, you cannot
    add a 2 Gigabit Extended link to the same SAN.

Enhancement
    A new parameter was added to this command. The new syntax is as follows:

    portCfgLongDistance [slot/]port[, distance_level][, linkinitmode]

linkinitmode Operand
    Only applies to long distance setup. It does not affect a normal link.
    It is added to ensure the long distance link initialization sequence. It
    is not required for long distance, but may be useful when initiating L1
    and L2 links.

linkinitmode Usage
    Specify 1 to activate the long distance link initialization sequence.
    Specify 0 to deactivate this mode. The default value is 0 (disabled).
    This operand is optional.

    The following example is for a 100km link on a port in a V4.0 switch
    with long distance link initialization protocol enabled:

----------------------------------------------------------
switch:admin> portCfgLongDistance 2/3 L2 1
done.
----------------------------------------------------------

portcfgshow Command

This command is enhanced by the addition of VC Link Init mode data. The
setting of this option is shown as blank (..) when the long distance link
initialization option is turned off and on. This value is set by the
portCfgLongDistance command.

timeout Command

The syntax for this command is as follows:

    timeout [interval]

Use this command with no operands to display the current telnet timeout
value.
Use this command with the interval operand to set the timeout value to the
specified number of minutes. Using a timeout value of 0 disables the timeout
functionality so that login sessions are never disconnected.

firmwareCommit Command

Use this command to copy an updated firmware image on the primary partition
to the secondary partition and commit both partitions of a CP to an updated
version of the firmware. This must be done after each firmwaredownload, after
the switch has been rebooted and a sanity check has been performed to make
sure the new image is fine.

15.0 PERSISTENT ERROR LOG ENHANCEMENT

The Error Log sub-system is enhanced to save a maximum of 1,536 messages in
RAM, that is, a total of 256 messages for each error message level (Panic,
Critical, Error, Warning, Info, and Debug). In addition, important messages
are stored in a separate persistent error log to guarantee that they are not
lost in case of power outage or system reboot.

The enhancement prevents messages of lesser severity from overwriting
messages of greater severity. For example, Warning messages cannot overwrite
Error, Critical or Panic messages.

Features of the Persistent Error Log

The persistent error log has the following features:

* Preserved across power cycles and system reboots.
* Default capacity to store 1,024 error log entries.
* May be resized at run time without having to reboot the switch or the
  system.
* May be resized at run time to configure a maximum of 2,048 entries. In
  other words, the persistent error log can be resized anywhere between 1,024
  and 2,048 entries.
* Implemented as a circular buffer. When more than the maximum number of
  entries are added to the persistent log, old entries are overwritten by new
  entries.
* All error messages of levels Panic and Critical are automatically saved in
  the persistent log.
* Guarantees that critical or panic level messages are not lost in the event
  of unexpected system reboot or fail-over.
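
The buffering behavior described above can be illustrated with a short Python
model. This is a sketch under stated assumptions, not the Fabric OS
implementation: the class name ErrorLogModel and its methods are hypothetical,
the RAM log is modeled as one 256-entry circular buffer per level (which gives
the 1,536-message total and keeps lower severities from overwriting higher
ones), and only Panic and Critical messages are copied to the persistent log.

```python
from collections import deque

LEVELS = ("Panic", "Critical", "Error", "Warning", "Info", "Debug")
RAM_ENTRIES_PER_LEVEL = 256          # 6 levels x 256 = 1,536 messages in RAM
PERSISTENT_MIN, PERSISTENT_MAX = 1024, 2048
ALWAYS_PERSISTED = {"Panic", "Critical"}


class ErrorLogModel:
    """Toy model of the volatile and persistent error logs (hypothetical)."""

    def __init__(self, persistent_size: int = PERSISTENT_MIN):
        # One circular buffer per level: a flood of Warning messages can
        # only push out older Warning messages, never Error or above.
        self.ram = {lvl: deque(maxlen=RAM_ENTRIES_PER_LEVEL) for lvl in LEVELS}
        self.persistent = deque(maxlen=self._check_size(persistent_size))

    @staticmethod
    def _check_size(size: int) -> int:
        if not PERSISTENT_MIN <= size <= PERSISTENT_MAX:
            raise ValueError("persistent log size must be 1,024 to 2,048 entries")
        return size

    def ram_capacity(self) -> int:
        return RAM_ENTRIES_PER_LEVEL * len(LEVELS)

    def log(self, level: str, message: str) -> None:
        if level not in self.ram:
            raise ValueError(f"unknown level: {level}")
        self.ram[level].append(message)
        if level in ALWAYS_PERSISTED:
            # Circular buffer: the oldest entry is dropped when full.
            self.persistent.append((level, message))

    def resize_persistent(self, size: int) -> None:
        # Resizing at run time keeps the newest entries that still fit.
        self.persistent = deque(self.persistent, maxlen=self._check_size(size))


if __name__ == "__main__":
    log = ErrorLogModel()
    for i in range(300):
        log.log("Warning", f"warning {i}")
    log.log("Critical", "critical 0")
    print(log.ram_capacity(), len(log.ram["Warning"]), len(log.persistent))
```

In this model the 300 Warning messages leave the Critical entry untouched in
both logs, which is the property the enhancement is meant to provide.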
errDump and errShow Commands
The errDump and errShow commands display a superset of the persistent log messages saved during previous system run time cycles and the error log messages generated during the current run time cycle. The errDump command provides options to display one of three views: all errors (the previous persistent log and the current run time log), only errors from the current run time cycle, or only errors from the persistent error log. An option is also provided to clear the persistent error log (errClear -p).

With the addition of a persistent log, the errDump and errShow commands display output of both the persistent error log and the volatile error log (RAM). However, the output shown by SNMP and Web Tools displays only the last 256 entries of the volatile log.

Modified Error Log Commands
Three Fabric OS commands were modified and four were added for the persistent error logging functionality. The following commands were modified:

errDump Command

Short Description
Display the error log, without page breaks.

Syntax
errDump [-s swinst] [-p] [-a]

Availability
admin

Description
Use this command to display the error log, showing entries in the log without any page breaks. It is identical to errShow, except that errShow prompts the user to type return between each log entry.

The output of the errDump command includes the error/event history recorded in the persistent error log and the errors/events logged in the current run time cycle. This command also provides options to display ONLY those error/event messages that are saved in the persistent error log, or ONLY those messages generated during the current run time cycle. All important error log messages, regardless of their message severity level, are stored in persistent storage as they are logged.

Both the persistent error log and the run time log are limited in space and managed as circular buffers. When either log overflows, old entries are replaced by new entries.
The persistent error log is saved across system reboots and power cycles and can be resized at run time.

Operands
This command has the following operands:

-p  Display messages from the persistent error log.

-a  Display messages from the active error log. This displays error log messages logged during the current run time cycle that are present in the volatile memory (RAM).

-s swinst  This is an optional parameter to specify a switch instance number. This parameter is required on the Standby CP. This parameter should not be used on the Active CP. Not used with SAN Switch 2/32. You must follow -s with the switch instance number where the command is to be executed. Valid values for switch instance are 0 (for the switch instance associated with slots 1 through 4) or 1 (for the switch instance associated with slots 6 through 10).

Example
To display the error log without page breaks:
----------------------------------------------------------------
switch:admin> errDump
----------------------------------------------------------------

See Also
errShow, errSaveLvlSet, errSaveLvlShow, errNvLogSizeSet, errNvLogSizeShow

errShow Command
New operands have been added to this command.

Short Description
Display the error log.

Syntax
errShow [-s swinst] [-a] [-p]

Availability
admin

Description
Use this command to display the error log, prompting the user to type return between each log entry. It is identical to errDump, except that errDump displays all entries without page breaks.

The output of the errShow command includes the errors/events recorded in the persistent error log during previous run time cycles and the error/event messages logged in the current run time cycle. This command also provides options to display ONLY those error log messages that are saved in the persistent log, or ONLY those messages that are logged during the current run time cycle. All important error log messages, regardless of their message severity level, are stored in persistent storage as they are logged.
Both the persistent error log and the run time log are limited in space and managed as circular buffers. When either log overflows, old entries are replaced by new entries.

Operands
This command has the following operands:

-p  Display messages from the persistent error log.

-a  Display messages from the active error log. This displays the error log messages generated during the current run time cycle.

-s swinst  This is an optional parameter to specify a switch instance number. Not used with SAN Switch 2/32. This parameter is required on the Standby CP. This parameter should not be used on the Active CP. You must follow -s with the switch instance number where the command is to be executed. Valid values for switch instance are 0 (for the switch instance associated with slots 1 through 4) or 1 (for the switch instance associated with slots 6 through 10).

Example
To display the error log with page breaks:
--------------------------------------------------------------------
switch:admin> errShow

Error 14
--------
0x304 (fabos): Jun 14 11:57:52
Switch: 0, Warning FW-STATUS_SWITCH, 3, Switch status changed from HEALTHY/OK to Marginal/Warning

Type <CR> to continue, Q<CR> to stop:
--------------------------------------------------------------------

See Also
errDump, errSaveLvlSet, errSaveLvlShow, errNvLogSizeSet, errNvLogSizeShow

errClear Command
New operands have been added to this command.

Short Description
Clear the switch error log.

Syntax
errClear [-s swinst] [-p]

Availability
admin

Description
Use this command to clear the error log for a particular switch instance. If no operand is specified, this command clears the error log in RAM; the persistent error log is not cleared. If the -p option is specified, ONLY the persistent error log is cleared and the error log in RAM is not cleared.

Operands
This command has the following operands:

-s swinst  This is an optional parameter to specify a switch instance number. Not used with SAN Switch 2/32.
This parameter is required on the Standby CP. This parameter should not be used on the Active CP. You must follow -s with the switch instance number where the command is to be executed. Valid values for switch instance are 0 (for the switch instance associated with slots 1 through 4) or 1 (for the switch instance associated with slots 6 through 10).

-p  Clear messages ONLY from the persistent error log.

Example
The following example shows how to clear the current run time error log on the Active CP:

switch:admin> errClear

The following example shows how to clear the persistent error log on the Active CP:

switch:admin> errClear -p

The following example shows how to clear the current run time error log on the Standby CP, for switch instance 0:

switch:admin> errClear -s 0

The following example shows how to clear the persistent error log on the Standby CP, for switch instance 0:

switch:admin> errClear -s 0 -p

See Also
errDump, errShow, errNvLogSizeSet, errNvLogSizeShow

errSaveLvlSet Command

Short Description
Set the error save level of a switch. This command is new for Fabric OS V4.0.2b.

Syntax
errSaveLvlSet [-s swinst] lvl

Availability
admin

Description
Use this command to control the types of messages that are saved in the persistent error log. Message types are based on the message severity levels. By default, all messages of type Panic and Critical are saved in the persistent log. To save messages of levels less severe than Critical, use this command to specify a new message save level. The new message save level is not persistent across a reboot. It is in effect only for the current run time cycle.

Operands
This command has the following operands:

-s swinst  This is an optional parameter to specify a switch instance number. Not used with SAN Switch 2/32. This parameter is required on the Standby CP. This parameter should not be used on the Active CP. You must follow -s with the switch instance number where the command is to be executed.
Valid values for switch instance are 0 (for the switch instance associated with slots 1 through 4) or 1 (for the switch instance associated with slots 6 through 10).

lvl  Message severity level. Error log messages are saved if their severity level is numerically less than or equal to this level (a lower number is more severe). The valid values are:

    Critical  1
    Error     2
    Warning   3
    Info      4
    Debug     5

errSaveLvlShow Command

Short Description
Show the current error save level setting of a switch. This command is new for Fabric OS V4.0.2b.

Syntax
errSaveLvlShow [-s swinst]

Availability
admin

Description
Use this command to find out the current value of the persistent error log save level for a given switch instance.

Operands
This command has the following operands:

-s swinst  This is an optional parameter to specify a switch instance number. Not used with SAN Switch 2/32. This parameter is required on the Standby CP. This parameter should not be used on the Active CP. You must follow -s with the switch instance number where the command is to be executed. Valid values for switch instance are 0 (for the switch instance associated with slots 1 through 4) or 1 (for the switch instance associated with slots 6 through 10).

Example
The following example shows how to display the current error log save level:

switch:admin> errSaveLvlShow

See Also
errSaveLvlSet, errNvLogSizeSet, errNvLogSizeShow, errShow

errNvLogSizeSet Command

Short Description
Resize the persistent error log. This command is new for Fabric OS V4.0.2b.

Syntax
errNvLogSizeSet [-s swinst] number_of_entries

Availability
admin

Description
Use this command to resize the persistent error log of a switch to the new size specified by the operand number_of_entries. The persistent error log is resized immediately after the successful execution of this command.

Operands
This command has the following operands:

-s swinst  Not used with SAN Switch 2/32. This is an optional parameter. It specifies a switch instance number.
This parameter is required on the Standby CP. This parameter should not be used on the Active CP. You must follow -s with the switch instance number where the command is to be executed. Valid values for switch instance are 0 (for the switch instance associated with slots 1 through 4) or 1 (for the switch instance associated with slots 6 through 10).

number_of_entries  Specify the new persistent error log size as a number of error log entries. The error log can be resized only within specified limits. This command fails if an attempt is made to set the persistent error log size outside the range of valid values. Valid values are from 1024 to 2048.

Example
The following example shows how to resize the persistent error log to 1500 entries:
-------------------------------------------------------------------
switch:admin> errNvLogSizeSet 1500
-------------------------------------------------------------------

See Also
errNvLogSizeShow, errSaveLvlShow, errShow

errNvLogSizeShow Command

Short Description
Show the current persistent (non-volatile) error log configuration of a switch. This command is new for Fabric OS V4.0.2b.

Syntax
errNvLogSizeShow [-s swinst]

Availability
admin

Operands
This command has the following operands:

-s swinst  Not used with SAN Switch 2/32. This is an optional parameter. This parameter is required on the Standby CP. This parameter should not be used on the Active CP. You must follow -s with the switch instance number where the command is to be executed. Valid values for switch instance are 0 (for the switch instance associated with slots 1 through 4) or 1 (for the switch instance associated with slots 6 through 10).

Example
The following example shows how to display the persistent error log configuration:
-------------------------------------------------------------------
switch:admin> errNvLogSizeShow
-------------------------------------------------------------------
NOTE: By default, the persistent error log stores up to 1,024 entries.

The following example shows how to display the persistent error log configuration on the Standby CP, for switch instance 0:
-------------------------------------------------------------------
switch:admin> errNvLogSizeShow -s 0
-------------------------------------------------------------------

See Also
errNvLogSizeSet, errSaveLvlShow, errShow

16.0 MIB FILES

The following MIB files are supported:
* v4_0FA.mib (supports Fibre Alliance MIB v3.0)
* v4_0FE.mib (v4_0FE.mib is the same as festd.smiv2 in v3.0)
* v4_0SW.mib (HP Fibre Channel Switch MIB V4.0)
* v4_0TRP.mib (HP Fibre Channel Switch Trap MIB V4.0)

IMPORTANT: Make sure that SNMP-FRAMEWORK-MIB and RFC1155-SMI are loaded before loading v4_0FE.mib.

17.0 FABRIC WATCH DAEMON STARTUP

During a switch startup or switch reboot, the Fabric Watch daemon is one of the last processes to become active. Depending on the size of the fabric, Fabric Watch may take several minutes to complete its startup sequence. Until Fabric Watch is completely active, it cannot monitor events occurring in the fabric. During this time, the switchStatusPolicyShow command shows the default settings instead of any custom settings.

18.0 UPDATING THE CORE SWITCH PID FORMAT

Updating the Core Switch PID Format is necessary when upgrading an existing SAN to support the hp StorageWorks SAN Switch 2/32 or hp StorageWorks Core Switch 2/64. When a switch with more than sixteen ports is introduced into an existing fabric, this parameter needs to be set on all switches in the fabric.

IMPORTANT: HP always recommends redundant fabrics and multi-pathing software for uptime-sensitive environments. If a redundant SAN architecture is in place, the Core PID update can take place without application downtime.
To ensure maximum ease of administration, this parameter can and should be proactively set on a fabric before it ever enters production, whether or not an upgrade to larger switches is planned.

Updating Core PID Format Switch Procedure
This process should be executed as part of either the online or offline update process when updating an existing fabric. However, it may be implemented in a stand-alone manner on a non-production fabric, or on a switch that has not yet joined a fabric.

Before executing this procedure, ensure that all switches in the fabric are running Fabric OS versions that support the new addressing mode. HP recommends 2.6.0c or later for 1-Gb/s switches, 3.0.2c or later for 2-Gb/s edge switches, and 4.0.0 or later for the Core Switch.

IMPORTANT: All switches running any version of Fabric OS 4.x are shipped with the Core Switch PID Format enabled, so it is not necessary to perform the PID format change on these switches.

The actual process of changing the PID format is quite simple. Telnet into each switch in the fabric, and disable the switch using the switchDisable command. Use the configure command to change the parameter as in the following example:

Example:
----------------------------------------------------------------
switch:admin> switchdisable
switch:admin> configure
Configure...
  Fabric parameters (yes, y, no, n): [no] yes
    Domain: (1..239) [1]
    R_A_TOV: (4000..120000) [10000]
    E_D_TOV: (1000..5000) [2000]
    Data field size: (256..2112) [2112]
    Sequence Level Switching: (0..1) [0]
    Disable Device Probing: (0..1) [0]
    Suppress Class F Traffic: (0..1) [0]
    SYNC IO mode: (0..1) [0]
    VC Encoded Address Mode: (0..1) [0]
    Core Switch PID Format: (0..1) [0] 1
    Per-frame Route Priority: (0..1) [0]
    Long Distance Fabric: (0..1) [0]
    BB credit: (1..27) [16]
  Virtual Channel parameters (yes, y, no, n): [no]
  Switch Operating Mode (yes, y, no, n): [no]
  Zoning Operation parameters (yes, y, no, n): [no]
  RSCN Transmission Mode (yes, y, no, n): [no]
  Arbitrated Loop parameters (yes, y, no, n): [no]
  System services (yes, y, no, n): [no]
  Portlog events enable (yes, y, no, n): [no]
Committing configuration...done.
switch:admin> switchenable
----------------------------------------------------------------

Use cfgEnable [active_zoning_config] on the switch to update zoning to use the new PID format. This does not change the definition of zones in the fabric, but causes the lowest level tables in the zoning database to be updated with the new PID format setting.

Example:
----------------------------------------------------------------
switch:admin> cfgEnable myZoningCfg
----------------------------------------------------------------

Finally, use the switchEnable command to re-enable the switch. Once this is done on every switch in the fabric, all switches in the fabric will operate with the new addressing mode.

NOTE: For more information on upgrading to larger port count switches, refer to the HP SAN Design Guide and HP support services.

19.0 EXTENSIVE INFORMATION ON THE PID FORMAT

This section provides best practices for updating an existing production SAN to the new PID format. In addition to the core PID format update process, there are a number of very common scenarios in which a device may be assigned a new PID.
Therefore the procedures included in this section are applicable to other areas of SAN administration, and should be generally useful to any SAN administrator.

Overview
A Port Identifier (PID) is one of two addressing mechanisms used in Fibre Channel. In data networking terms, it is analogous to specifying the physical switch and port a device is attached to. It is not analogous to an IP address. There are numerous situations in which a device's PID may change. PIDs are assigned by a Fibre Channel switch when a device logs into the fabric. An example PID might look like this: 011F00.

The other Fibre Channel addressing mechanism is the World Wide Name (WWN). This is analogous to an Ethernet MAC address. A device's WWN never changes. WWNs are assigned by the factory when a device is manufactured. An example WWN might look like this: 10:00:00:60:69:51:0e:8b

The method switches use for assigning PIDs has changed between the 16-port switches and the hp StorageWorks SAN Switch 2/32 or hp StorageWorks Core Switch 2/64. The old PID format was XX1YZZ, where "Y" was a hexadecimal digit that specified a particular port on a switch and "1" was a constant. (The use of the constant was based on an overly conservative reading of the Fibre Channel standards.) XX was used for the domain ID and ZZ for the AL_PA. Since all switches had sixteen or fewer ports at the time that method was established, one hexadecimal digit was entirely adequate.

The new format is XXYYZZ, where "YY" represents a port. Using the entire middle byte for the port allows addressing up to 256 ports per switch. This change was necessary to support products with more than sixteen ports.

When a switch with the new core PID format is introduced into an existing fabric, the core PID format needs to be set on all switches in the fabric to prevent segmentation. The Core PID format is always set on the hp StorageWorks SAN Switch 2/32 and hp StorageWorks Core Switch 2/64.
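
The two layouts described above can be made concrete with a short decoding sketch. The following Python is illustrative only and is not part of Fabric OS; the function names parse_pid and to_core_pid are hypothetical. It assumes a PID is written as six hexadecimal digits, as in the examples in this section.

```python
def parse_pid(pid, core_format):
    """Split a six-digit hexadecimal PID into (domain, port, al_pa).

    core_format=False decodes the old XX1YZZ layout;
    core_format=True decodes the new XXYYZZ layout.
    """
    if len(pid) != 6:
        raise ValueError("a PID is expected to have six hexadecimal digits")
    value = int(pid, 16)
    domain = (value >> 16) & 0xFF   # XX: domain ID
    middle = (value >> 8) & 0xFF    # 1Y (old) or YY (new)
    al_pa = value & 0xFF            # ZZ: AL_PA
    if core_format:
        port = middle               # whole byte: up to 256 ports
    else:
        if middle >> 4 != 1:
            raise ValueError("old-format PIDs carry the constant 1")
        port = middle & 0x0F        # one digit: up to 16 ports
    return domain, port, al_pa


def to_core_pid(old_pid):
    """Rewrite an old-format PID in the new core PID format."""
    domain, port, al_pa = parse_pid(old_pid, core_format=False)
    return f"{domain:02X}{port:02X}{al_pa:02X}"


print(parse_pid("011F00", core_format=False))
print(to_core_pid("011F00"))
```

With the example PID from this section, port F on domain 01 is addressed as 011F00 in the old format and as 010F00 in the new format, which is why a device that is bound by PID sees its address change when the format is updated.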
In order to make this change, it is necessary to schedule an outage for the fabric. This does not require application downtime if redundant fabrics are used. If redundant fabrics are not used, there are numerous failure cases and even routine maintenance scenarios that will result in application downtime. This is true for any currently available Fibre Channel technology.

In new installations, the PID format should always be set to the new addressing method before the fabric enters production. When updating an existing SAN, there are several scenarios that must be evaluated before changing the PID format.

Proactively setting the core PID format on all new fabrics before they enter production will prevent the need to update those fabrics in the future. This is strongly recommended as a step in the deployment of all new fabrics. There is no difference in the behavior of a fabric with either PID format; changing to the new format during deployment will merely save administrative effort later on.

Some device drivers map logical disk drives to physical Fibre Channel counterparts by PID. An example in a Windows HBA driver would be "Drive E: = PID 011F00". Most drivers can either dynamically change PID mappings or use the WWN of the Fibre Channel disk for mapping, not the PID. For example, "Drive E: = WWN 10:00:00:60:69:51:0e:8b". For those few drivers that use static PID binding, when the format is changed (for example, PID 011F00 becomes 010F00), the mapping breaks and must be manually fixed. (The driver still has "Drive E: = PID 011F00" but the actual device address is now "010F00".) This can be done by rebooting the host, or by using a manual update procedure on the host. This is discussed in more detail in the following sections.

In the more typical case where WWN or dynamic PID binding is used, changing the device's PID does not affect the mapping. However, before updating the PID format, it is necessary to determine whether or not any devices in the SAN bind by PID.
In every case where devices bind by PID, any such procedure will become difficult or impossible to execute without downtime.

In some cases, device drivers allow the user to manually specify persistent bindings by PID. In these cases, such devices must be identified and an appropriate update procedure created. If possible, the procedure should involve changing from PID binding to WWN binding.

The Recommended Approaches portion of this section discusses in more detail the process of updating to the new PID format. This starts with evaluating a production SAN to see which devices, if any, bind by PID. Then either an online or offline update procedure is chosen to perform the actual update.

The Frequently Asked Questions portion uses a Q&A format to discuss the issues surrounding a core PID format update.

Finally, the Detailed Procedures portion provides examples of step-by-step instructions for certain PID-bound devices. These procedures are applicable to a broad class of routine maintenance tasks; indeed, they would apply to these devices in many scenarios with any Fibre Channel switch in any addressing mode.

While this section is not comprehensive, it should give a SAN administrator the information needed to plan and execute a successful core PID format update, and also provide useful information for other SAN management tasks.

Recommended Approaches
To update to the new PID format in a non-production or non-critical environment, the procedure can be very simple. Schedule downtime for all devices attached to the SAN, and perform the offline update procedure below. This is the easiest process, but the most disruptive to the environment.

The process of updating to the new PID format online in a production environment is slightly more complex. It is broken down into two phases: the evaluation phase and the update phase.
At the end of the evaluation phase, all information needed to safely update to the new PID format will be in hand, and an update procedure will be created. Depending on the results of the evaluation phase and the SAN's uptime requirements, either the online or offline update process will be used in the update phase.

Evaluation Phase
In this phase, the SAN is evaluated to determine how each device driver will respond to the PID format change, and how the SAN's multi-pathing software will respond to a fabric service interruption.

Data Collection
In many sites, detailed documentation about the SAN is kept up-to-date. If so, it may be possible to skip the Data Collection step. However, some sites will not already have this information at hand. In these cases, it is necessary to perform a site survey, collecting information on each device in the SAN.

The purpose is to find any devices that bind to PIDs, and to find out how the multi-pathing software will respond to the process. Learning this has broad applicability: PID-bound devices will not be able to perform seamlessly in many routine maintenance or failure scenarios, and improperly configured multi-pathing software may cause downtime rather than avoid it.

Any kind of device could bind by PID. While the vast majority of devices do not do so, each device should be evaluated prior to attempting an online update. This is a non-comprehensive list of information to collect, which would be both generally useful and relevant to the PID update process:
* HBA driver versions
* Fabric OS versions
* RAID array microcode versions
* SCSI bridge code versions
* JBOD drive firmware versions
* Multi-pathing software versions
* HBA time-out values
* Multi-pathing software time-out values
* Kernel time-out values

In addition to looking for information about the code revisions in use, look at the way each device is used.
Some device drivers do not automatically bind by PID, but allow the operator to manually create a PID binding. For example, persistent binding of PIDs to logical drives may be done in many HBA drivers. Make a list of all devices that are configured this way. If manual PID binding has been done, consider changing to WWN binding.

This is a non-comprehensive list of device types that may be manually configured to bind by PID:
* HBA drivers (persistent binding)
* RAID arrays (LUN access control)
* SCSI bridges (LUN mapping)

Data Analysis
Once you have collected the code versions of each device on the SAN, evaluate them to find out if any of the devices automatically bind by PID. Do this in cooperation with the support providers for each device on the SAN. Some providers may simply be able to answer this question; in other cases, it may be necessary to perform empirical testing.

Most devices do not bind by PID when running up-to-date drivers. However, some older driver versions may behave this way. Whenever possible, HP recommends using up-to-date drivers that do not bind by PID, as binding by PID creates management difficulty in a number of scenarios.

The drivers shipping by default with HP-UX and AIX at the time of this writing still bind by PID, and so detailed procedures are provided for these operating systems in the final section of this overview. Similar procedures can be developed for other operating systems that run HBA drivers that bind by PID. HP recommends upgrading to WWN-binding drivers if these are available.

There is no inherent PID binding problem with either AIX or HP-UX. It is the HBA drivers shipping with these operating systems that bind by PID. Both OSs are expected to release HBA drivers that bind by WWN, and these may already be available through some support channels. Work with the appropriate support provider to get more information about driver availability.
Also evaluate whether or not devices that are manually bound by PID can be migrated to WWN binding. If not, determine what procedures will be required to change their bindings.

It is also important to understand how multi-pathing software will react when one of the two fabrics is taken offline. If the time-outs are set correctly, then switchover between fabrics should be transparent to the users of the system.

Empirical Testing
For some devices, it may not be possible to determine whether or not they bind by PID by asking the support provider. In these cases, empirical testing is recommended. If you are not sure about a device, work with the support provider to create a test environment.

Create as close a match as practical between the test environment and the production environment, and perform an update using the Online Update procedure. Devices that bind by PID will be unable to adapt to the new format, and one of three approaches will need to be taken with them:
* A plan can be created for working around the device driver's limitations in such a way as to allow an online update. See the Detailed Procedures section for examples.
* The device can be upgraded to drivers that do not bind by PID.
* Downtime can be scheduled to reset the device during the core PID update process, which generally allows the mapping to be rebuilt.

If the first or second approach is used, the procedures should again be validated in the test environment.

Some multi-pathing software installations handle fabric failover gracefully and quickly. There are quite a few variables that affect this, including but not limited to:
* HBA time-out values
* Multi-pathing software time-out values
* Kernel time-out values

If the behavior of multi-pathing software in the SAN is not well understood, a test should be run to determine this as well. For installations where the multi-pathing software does handle switchover automatically and seamlessly, the update process is greatly simplified.
Update Plan Creation
The specific procedures used to update to the new PID format will vary on a site-by-site basis. However, there are general guidelines for the update process options. These guidelines are provided to give SAN administrators a starting point for creating site-specific procedures, not to be the complete procedures themselves.

The online update process is intended only for use in uptime-critical dual-fabric environments with multi-pathing software. If dual fabrics are not in place, there are a number of routine maintenance scenarios that require scheduled downtime, the core PID migration process being only one example. High-uptime environments should always use a redundant fabric SAN architecture.

The general format for an online update plan is as follows:
* Back up all data. Verify backups.
* Verify that I/O continues over the other fabric.
* Disable all switches one at a time in the fabric to be updated.
* Verify that I/O continues over the other fabric after each switch disable.
* Change the PID format on each switch in the fabric. (Procedure provided below.)
* One at a time, re-enable the switches in the updated fabric. In a core/edge network, enable the core switches first.
* Once the fabric has re-converged, use the cfgEnable command to update zoning. (Procedure provided below.)
* For any devices manually bound by PID, update their bindings. This may involve changing them to the new PIDs, or may (preferably) involve changing to WWN binding.
* For any devices automatically bound by PID, two options exist:
  - Execute a custom procedure to rebuild the device tree online. Examples are given in the Detailed Procedures section of this readme.txt.
  - Reboot the device to rebuild the device tree. Some operating systems require a special command to do this, for example "boot -r" in Solaris.
* For devices that do not bind by PID or have had their PID binding updated, mark online or re-associate the disk devices with the multi-pathing software and resume I/O over the updated fabric.
* Repeat with the other fabric(s).

The general format for an offline update plan is as follows:
1. Back up all data. Verify backups.
2. Shut down all hosts and storage devices attached to the SAN.
3. Disable all switches in the fabric to be updated.
4. Change the PID format on each switch in the fabric. (Procedure provided below.)
5. One at a time, re-enable the switches in the updated fabric. In a core/edge network, enable the core switches first.
6. Once the fabric has re-converged, use the cfgEnable command to update zoning. (Procedure provided below.)
7. Repeat steps 3 through 6 for all fabrics in the SAN.
8. Bring the devices online in the order appropriate to the SAN. This usually involves starting up the storage arrays first, and the hosts last.
9. For any devices manually bound by PID, bring the device back online, but do not start applications. Update their bindings and reboot again if necessary. This may involve changing them to the new PIDs, or may (preferably) involve changing to WWN binding.
10. For any devices automatically bound by PID, reboot the device to rebuild the device tree. (Some operating systems require a special command to do this, for example "boot -r" in Solaris.)
11. For devices that do not bind by PID or have had their PID binding updated, bring them back up and resume I/O.
12. Verify that all I/O has resumed correctly.

Migrating from manual PID binding (for example, persistent binding on an HBA) to manual WWN binding, and upgrading drivers to versions that do not bind by PID, can often be done before setting the core PID format. This is advantageous because it reduces the number of variables in the update process.
Update Phase

Online Update

Provided that careful planning, testing, and general due diligence have been performed, it should be safe to update the core PID format parameter in a live production environment. This requires dual fabrics with multi-pathing software.

1. Create a detailed update plan. Guidelines are provided above.
2. Schedule a time for the update when the least critical traffic is running. This is a best practice for any kind of online add, move, change, upgrade, or update. At the very least, avoid running backups during the update process, as tape drives tend to be very sensitive to I/O interruption.
3. Back up all data with verification, then update one fabric at a time, verifying I/O at each step in the process.

Offline Update

It is possible to execute an offline update with less advance planning. However, it requires that all devices attached to the fabric be offline. Which option to choose depends on the uptime requirements of the site: high-uptime sites should all have dual fabrics with multi-pathing software, so the online update is an option for those SANs. Single-fabric sites must use offline procedures.

1. Create an update plan. Guidelines are provided above.
2. Schedule an outage for all devices attached to the SAN.
3. Back up all data with verification, then update all fabrics in the SAN at the same time.
4. Bring all devices back online, and verify I/O.

Hybrid Update

It is possible to combine the online and offline methods for fabrics where only a few devices bind by PID. Since any hybrid procedure will be extremely customized, it is necessary to work closely with the SAN service provider in these cases.

PID Frequently Asked Questions

Q: What is a PID?
A: A PID is a Port Identifier. PIDs are used by the routing and zoning services in Fibre Channel fabrics to identify ports in the network. They are not used to uniquely identify a device; the World Wide Name (WWN) does that.

Q: What situations can cause a PID to change?
A: Many scenarios cause a device to receive a new PID. For example, unplugging the device from one port and plugging it into a different port will cause this. (This might happen when cabling around a bad port, or when moving equipment around.) Another example is changing the domain ID of a switch, which might be necessary when merging fabrics, or changing compatibility mode settings.

Q: Why do some devices handle a PID change well, and some poorly?
A: Some older device drivers behave as if a PID uniquely identifies a device. These device drivers should be updated if possible to use WWN binding instead. A device's WWN never changes, unlike its PID. PID binding creates problems in many routine maintenance scenarios, and should always be avoided. Fortunately, very few device drivers still behave this way, and these are expected to be updated as well.

Q: Must I schedule downtime for my SAN to perform the PID update?
A: Generally, no. Provided that you have dual fabrics and are certain that no devices in your SAN bind by PID, the update process is simple and user-transparent. Some effort is needed to confirm this before attempting an online update, and if that effort is less attractive than the downtime, you may prefer to schedule an outage.

Q: Must I stop all traffic on the SAN before performing the update?
A: If you are running dual fabrics with multi-pathing software, you can update one fabric at a time. Move all traffic onto one fabric in the SAN, update the other fabric, move the traffic onto the updated fabric, and update the final fabric. Without dual fabrics, stopping traffic is highly recommended. This is the case for many routine maintenance situations, so dual fabrics are always recommended for uptime-sensitive environments.

Q: How can I avoid having to change PID formats on fabrics in the future?
A: The core PID format can be set on switches other than the hp StorageWorks SAN Switch 2/32 and hp StorageWorks Core Switch 2/64 on a fabric at initial installation. The update could also be opportunistically combined with any scheduled outage. Setting the format proactively, far in advance of adopting higher port count switches, is the best way to ensure administrative ease.

Q: Where can I get more information on upgrading to larger switches?
A: Refer to the SAN Design Guide and HP support.

Detailed Procedures

These procedures are not intended to be comprehensive. They provide a starting point from which a SAN administrator could develop a site-specific procedure for a device that binds automatically by PID and cannot be rebooted due to uptime requirements.

HP/UX

1. Back up all data. Verify backups.
2. If you are not using multi-pathing software, stop all I/O going to all volumes connected through the switch/fabric to be updated.
3. If you are not using multi-pathing software, unmount the volumes from their mount points using umount. The proper usage is umount followed by the mount point. For example: umount /mnt/jbod
4. If you are using multi-pathing software, use that software to remove one fabric's devices from its configuration.
5. Deactivate the appropriate volume groups using vgchange. The proper usage is vgchange -a n followed by the volume group path. For example: vgchange -a n /dev/jbod
6. Make a backup copy of the volume group directory using tar from within /dev. For example: tar -cf /tmp/jbod.tar jbod
7. Export the volume group using vgexport. The proper usage is vgexport -m followed by the map file and the volume group path. For example: vgexport -m /tmp/jbod_map /dev/jbod
8. Log in to each switch in the fabric.
9. Issue the command switchDisable.
10. Issue the command configure and change the Core Switch PID Format to 1.
11. Issue the command switchEnable. Enable the core switches first, then the edges.
12.
Once you have done this to all switches in the fabric and verified that it has re-converged properly, issue the command cfgEnable [effective zone configuration] on one of the switches in that fabric. For example: cfgEnable my_zones
13. Clean the lvmtab file by using the command vgscan.
14. Change to /dev and extract the tar file that was created in step 6. For example: tar -xf /tmp/jbod.tar
15. Import the volume groups using vgimport. The proper usage is vgimport -m followed by the map file, the volume group path, and the physical volume paths. For example: vgimport -m /tmp/jbod_map /dev/jbod /dev/dsk/c64t8d0 /dev/dsk/c64t9d0
16. Activate the volume groups using vgchange. The proper usage is vgchange -a y followed by the volume group path. For example: vgchange -a y /dev/jbod
17. If you are not using multi-pathing software, mount all devices again and restart I/O. For example: mount /mnt/jbod

IMPORTANT: If you are using multi-pathing software, re-enable the affected path.

The preceding steps do not "clean up" the results from ioscan. When viewing the output of ioscan, notice that the original entry is still there, but now has a status of NO_HW.

    # ioscan -funC disk
    Class  I    H/W Path                 Driver  S/W State  H/W Type  Description
    ------------------------------------------------------------------------------------
    disk   0    0/0/1/1.2.0              adisk   CLAIMED    DEVICE    SEAGATE ST39204LC
                /dev/dsk/c1t2d0   /dev/rdsk/c1t2d0
    disk   1    0/0/2/1.2.0              adisk   CLAIMED    DEVICE    HP DVD-ROM 304
                /dev/dsk/c3t2d0   /dev/rdsk/c3t2d0
    disk   319  0/4/0/0.1.2.255.14.8.0   adisk   CLAIMED    DEVICE    SEAGATE ST336605FC
                /dev/dsk/c64t8d0  /dev/rdsk/c64t8d0
    disk   320  0/4/0/0.1.18.255.14.8.0  adisk   NO_HW      DEVICE    SEAGATE ST336605FC
                /dev/dsk/c65t8d0  /dev/rdsk/c65t8d0

To remove the original (outdated) entry, use the command rmsf (remove special file). The proper usage is rmsf -a -v followed by the path to the device special file. For example: rmsf -a -v /dev/dsk/c65t8d0

18. Validate that the entry is removed by using the command ioscan -funC disk. Notice in the table below that the NO_HW entry is no longer listed.

    het46 (HP-50001)> ioscan -funC disk
    Class  I    H/W Path                 Driver  S/W State  H/W Type  Description
    -------------------------------------------------------------------------------------
    disk   0    0/0/1/1.2.0              adisk   CLAIMED    DEVICE    SEAGATE ST39204LC
                /dev/dsk/c1t2d0   /dev/rdsk/c1t2d0
    disk   1    0/0/2/1.2.0              adisk   CLAIMED    DEVICE    HP DVD-ROM 304
                /dev/dsk/c3t2d0   /dev/rdsk/c3t2d0
    disk   319  0/4/0/0.1.2.255.14.8.0   adisk   CLAIMED    DEVICE    SEAGATE ST336605FC
                /dev/dsk/c64t8d0  /dev/rdsk/c64t8d0

19. Repeat for all fabrics.

AIX Procedure

1. Back up all data. Verify backups.
2. If you are not using multi-pathing software, stop all I/O going to all volumes connected through the switch or fabric to be updated.
3. If you are not using multi-pathing software, unmount the volumes from their mount points using umount. The command usage is umount followed by the mount point. For example: umount /mnt/jbod
4. If you are not using multi-pathing software, vary off the volume groups once their file systems are unmounted. The command usage is varyoffvg followed by the volume group name. For example: varyoffvg datavg
5. If you are using multi-pathing software, use that software to remove one fabric's devices from its configuration.
6. Remove the device entries for the fabric you are migrating. For example, if the HBA for that fabric is fcs0, execute the command: rmdev -Rdl fcs0
7. Log in to each switch in the fabric.
8. Issue the switchDisable command.
9. Issue the configure command and change the Core Switch PID Format to 1.
10. Issue the cfgEnable [effective_zone_configuration] command. For example: cfgEnable my_config
11. Issue the switchEnable command. Enable the core switches first, then the edges.
12. Rebuild the device entries for the affected fabric using the cfgmgr command. For example: cfgmgr -v
    This command may take several minutes to complete.
13. If you are not using multi-pathing software, vary on the disk volume groups. The proper usage is varyonvg followed by the volume group name. For example: varyonvg datavg
14. If you are not using multi-pathing software, mount all devices again and restart I/O.
    For example: mount /mnt/jbod

If you are using multi-pathing software, re-enable the affected path.

Repeat for all fabrics.

----------------------------------------------------------------
Copyright November 22, 2002, Hewlett Packard Company. All rights reserved.
Product names mentioned herein may be trademarks and/or registered trademarks of their respective companies.