Known issues and limitations

The following tables list known issues and limitations that exist in VxFlex OS 3.0.1.2. Each table is sorted by issue severity (from high to low).

NOTE:

If an issue was reported by customers, the customers' Service Request (SR) numbers appear in the "Issue number & SR number" column, correlating customer-reported issues with the VxFlex OS (SCI) issue numbers.

Table 1. Known issues and limitations—AMS
Issue number & SR number Problem summary Workaround
SCI-56173

SR# 18510788

In very rare cases, when AMS tries to connect to the LIA service, a timeout error occurs because the LIA process crashes endlessly. In the LIA exception file exp.0, for example, the following line is recorded multiple times:

Termination due to signal 11. PID 20598 Faulting address (nil). errno 0

To reinstall LIA with its original configuration, perform the following steps (a verification sketch follows the steps):

1. Make a copy of the LIA cfg files: mkdir -p /tmp/lia/cfg; cp /opt/emc/scaleio/lia/cfg/* /tmp/lia/cfg/

2. Uninstall LIA: rpm -e EMC-ScaleIO-lia

3. Install LIA: rpm -ivh /root/install/EMC-ScaleIO-lia-<version>.rpm

4. Copy back the config files: rm -rf /opt/emc/scaleio/lia/cfg/*; cp /tmp/lia/cfg/* /opt/emc/scaleio/lia/cfg/

5. Restart LIA: pkill lia
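Step 5 assumes that the LIA service is restarted automatically after the process is killed. To verify the reinstallation, a quick check (assuming rpm and pgrep are available on the node) is:

rpm -q EMC-ScaleIO-lia
pgrep -l lia

The first command prints the installed LIA package version; the second confirms that the lia process is running again.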

SCI-50342 When using iDRAC 3.21.26.22 or lower, the AMS upgrade to 3.0.1.1 (which includes iDRAC 4.00.00.00) might fail with the following error message in the AMS and iDRAC Job Queue logs: 'unable to extract a payload'. Upgrade iDRAC manually to 3.34.34.34, and then click Retry in AMS to perform the upgrade to 4.00.00.00. This must be done on all relevant nodes. Refer to the iDRAC upgrade guidelines on how to perform a manual firmware upgrade, or refer to the Upgrade VxFlex OS Guide.
SCI-47595 In some scenarios, following a non-disruptive upgrade (NDU) from v2.6.x to v3.0.x, some of the nodes might not exit maintenance mode. 1. In AMS, go to the Backend tab > Storage in the SDS view.

2. Look for the SDSs that are in maintenance mode and expand their device lists.

3. Find devices that are in an error state, right-click them, and clear the device errors.

When all device errors are cleared, the SDSs can exit maintenance mode.

SCI-42253 When many requests are sent to the 'perccli' utility, it may become stuck and respond with an empty answer. If this happens during deployment, the deployment may fail during VD creation on the physical disks. Wait until the sampler successfully samples the 'perccli' utility, and then click "Retry".
SCI-41950

SR# 13244526

AMS might report false hardware alerts after receiving incomplete data from iDRAC. For example:

SIO07.05.1000004 NODE_FAILED_TO_CONNECT_TO_BMC_CLOSED

SIO07.02.1000001 STORAGE_CONTROLLER_INVALID_STATE_CLOSED

SIO07.05.0000015 NODE_INVALID_CMOS_BATTERY

SIO07.03.0000004 CPU_SOCKET_INVALID_TEMPERATURE

SIO07.05.0000002 NODE_INVALID_VOLTAGE

SIO07.03.1000007 CPU_SOCKET_INVALID_VOLTAGE_CLOSED

These are not actual alerts, but rather false positives caused by incorrect data received from iDRAC (for example, due to network issues between the AMS and the BMC).

Compare the AMS alerts with the iDRAC status; if an issue is not reported in the iDRAC console, it is a false-positive alert.
SCI-39387

SR# 12011318

AMS requires the certificate imported for its "renew certificate" process to be an intermediate CA certificate. However, AMS can successfully import a non-intermediate-CA certificate, that is, one which is not valid for signing further certificates. If that happens, it negatively impacts the ability to manage and support the system when there is a need to perform a function that is not available through the AMS, or when the AMS is not available. Generate a new certificate for the AMS that is valid for signing further certificates, and perform the AMS "renew certificate" process using the new certificate.
SCI-41564 In the AMS GUI, when performing an ownership migration without specifying a Monitor user, AMS hangs in the "taking ownership" state. This state blocks the addition of new nodes. Re-run the query with a Monitor user specified. AMS returns to a normal state after a connectivity error is generated.
SCI-44806 When performing a "Replace SVM" procedure on a v2.6.x system from SLES 12.2 to CentOS 7.5, the Upload OVA stage might fail after the SVM has been successfully replaced. If this happens, the error message "There are no nodes that require OVA deployment" is generated. Restart the AMS service, and the Replace SVM Upload OVA process should continue as expected.
Table 2. Known issues and limitations—AMS GUI
Issue number & SR number Problem summary Workaround
SCI-16900 When trying to remove a standby MDM while the SVM is running and the MDM is down, a failure message is returned. This occurs because the AMS also tries to remove the RPM. The error message can be ignored; the MDM running on the node is no longer part of the VxFlex OS system.
SCI-40019 When a new software version is added to the AMS repository, it may cause future operations, such as "Add Node", to fail with a "version too old" message. None
SCI-46339 Having a different number of NVDIMMs per node across a cluster may create a problem if you manually try to change the default AMS wizard selection of SDS devices at the "Add Devices" stage of the wizard. Manually changing this selection can violate the rules by which AMS assigns SDS devices to Storage Pools, which creates an error alert in the deployment wizard. However, the alerts are not specific enough to indicate how to reverse the last (manual) action. 1. Do not click the "Abort" button. If you click "Abort", you will have to start the deployment all over again, including re-installation of the nodes.

2. Click "Close".

3. In the "Backend > Devices" view, filter the display using "by SDSs".

4. For each SDS, check how many Acceleration Pools it has. Usually there will be one NVDIMM per Acceleration Pool.

5. From "System Settings", click "Add Nodes". You will be back to the last step of the deployment wizard.

6. You can only add SDS devices to a Storage Pool accelerated by an Acceleration Pool which has an assigned NVDIMM on the same node.

SCI-23351 In AMS, when upgrading the GUI (Windows Server), the installation path option is modifiable, although it should be unavailable during an upgrade. Avoid changing the installation path in the upgrade wizard.
SCI-54877 During NDU, after a node has finished the reboot cycle for the firmware, driver, and SDC upgrade, the following issue may occur in rare cases on ESXi nodes: AMS sends a request to vCenter for the ESXi to exit maintenance mode, but the ESXi patch upgrade sometimes changes the ESXi certificate, so a reconnect process is needed. During the reconnect process, AMS receives the following error: Command Exiting node from maintenance mode failed: "Could not open VMware connection on Node <Node Name> with IP <Node Mgmt IP> due to VMWare exception". The underlying VMware exception is: VI SDK invoke exception:java.net.ConnectException: Connection refused: connect. If this error message is displayed in the AMS GUI, click Retry.
Table 3. Known issues and limitations—Gateway
Issue number & SR number Problem summary Workaround
SCI-50867 When running the upgrade process via the VxFlex OS Gateway, a problem may occur when upgrading RHEL-based nodes. When the upgrade process tries to check the IP address of a node, the following message may appear for some of the RHEL nodes: "Could not get IPs of <ip address>". The VxFlex OS upgrade still relies on the ifconfig command, which must be installed separately on the host.

Once it is installed, click Retry on the VxFlex OS Installer window.
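On RHEL 7, the ifconfig command is provided by the net-tools package. Assuming the node has a configured yum repository, a typical installation command is:

yum install -y net-tools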

SCI-12370 In some scenarios, when using IPv6, Installation Manager (IM) might fail to identify that several IP addresses represent the same physical node. This can result in redundant install/upgrade operations. None
SCI-20141 The "auto collect logs" feature starts due to an error and prevents the user from doing anything in the VxFlex OS Installer until log collection is finished. 1. Stop the automatic log collection.

2. Disable the feature, so that it does not start again until you finish your task.

3. Enable the "auto collect logs" feature again.

SCI-38801 The Replace SVM utility does not address static routes configured in the system. If you have static routes configured in your SLES SVM, reconfigure them after you run the "Replace SVM" process on your newly created CentOS SVMs.
SCI-42967 When initiating an upgrade of a ScaleIO/VxFlex OS system (from v2.6 and below to v3.0) with a large number of objects in the system, in some rare scenarios, the VxFlex OS Gateway might fail to complete the retrieval of the system topology from the MDM. Perform an MDM ownership switch to resolve the issue.
SCI-47833 The SVM patching feature might not work properly on a Windows Server-based VxFlex OS Gateway. Use a Linux-based VxFlex OS Gateway.
SCI-5466 Configuration changes of an existing system using a modified CSV file are not supported. Use one of the VxFlex OS management user interfaces to configure the system.
SCI-40372 During the VxFlex OS Gateway rpm upgrade flow, when lockbox is configured, the upgrade displays an error message: "missing information in mdm credentials (username or password) - cannot update lockbox". Ignore this message, as the Lockbox has been configured successfully.
SCI-13157 When trying to collect logs using the VxFlex OS Installation Manager while the system is utilizing 100% of the disk space on all nodes, the log collection operation takes a very long time, and eventually a misleading time-out error is returned. None
Table 4. Known issues and limitations—GUI
Issue number & SR number Problem summary Workaround
SCI-41101 Mapping a very large number (thousands) of volumes to an SDC in a single GUI session might fail due to connectivity issues. Use the CLI/API, and/or map volumes gradually in smaller groups.
SCI-43176 During a system upgrade, while trying to log in using the GUI, an "Internal Error #34" message might appear in the login window. Retry login.
SCI-41817 In the GUI, when all snapshots are expanded in a V-tree, the 60th snapshot cannot be viewed. Navigate to the required snapshot from this preset, and then switch to the Volumes preset to see all the relevant information.
SCI-42781 When configuring NVDIMMs in systems using the GUI, all discovered memory modules should be assigned to Acceleration Pools, and cannot be left undefined. Before adding NVDIMM devices, use the Unmount NVDIMM option to choose a region or regions and unmount only those regions. The rest of the regions will be available for AMS. Note: There is one region per NVDIMM.
SCI-42944 A VxFlex OS v3.0 GUI connected to a ScaleIO v2.0.1.x cluster is missing the Inflight checksum option. Use the CLI to enable checksum for a storage device.
Table 5. Known issues and limitations—MDM
Issue number & SR number Problem summary Workaround
SCI-47388 RFcache devices are not addressed when invoking the CLI command:

--update_sds_original_paths

The command updates the path configuration of all devices with the currently assigned SDS device path.

Use the following SCLI command to fix the device path:

scli --update_device_original_path

Usage: scli --update_device_original_path (--device_id <ID> | ((--sds_id <ID> | --sds_name <NAME> | --sds_ip <IP> [--sds_port <PORT>]) (--device_name <NAME> | --device_path <PATH>)))

Description: Changes the device's original path configuration to the path currently assigned to the device

Parameters:

--sds_id <ID> SDS ID

--sds_name <NAME> SDS name

--sds_ip <IP> SDS IP address

--sds_port <PORT> Port assigned to the SDS

--device_id <ID> Device ID

--device_name <NAME> Device Name

--device_path <PATH> SDS storage device path or file path
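For example, to fix the original path of a single RFcache device using the syntax above, with hypothetical SDS and device names (run within an authenticated scli session):

scli --update_device_original_path --sds_name SDS_01 --device_name rfcache_dev_01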

SCI-11969 Under heavy load on the device where VxFlex OS is installed on a slave MDM, the slave MDM might become temporarily out of sync (degraded). This happens because Master MDM updates cannot be written to the device within the one-second timeout period. As the degradation is temporary, the risk to the system is minimal and it is auto-corrected. To avoid this issue, it is recommended to install the VxFlex OS MDM on faster media (NVMe/SSD). In larger systems, it is also recommended to install the MDMs on separate machines.
SCI-16315 After configuring virtual IP addresses, if the Master MDM discovers that its virtual IP addresses are unreachable, it will try to perform a switch-over. Virtual IP addresses may be unreachable because the data network switch is down and the cluster is actually using a different network. If no MDM is able to obtain the virtual IP addresses, the MDM processes might shut down. Once the network problem is fixed, start the MDM processes again using create_service.
SCI-19588 In an extremely non-uniform Storage Pool configuration (SDS or Fault Set that accounts for more than half of the Storage Pool capacity), some of the capacity will not be used by the system, even though it appears to be "free".
SCI-41601 The snapshot policy mechanism allows for offsets of up to 1 minute in snapshot creation, due to delays caused by MDM reboots and switch-overs. This might present an inconsistent snapshot view. The snapshot's creation time is a display-only attribute of the snapshot, and has no effect on snapshot maintenance or the order in which snapshots are eventually deleted.
SCI-42408 When using the snapshot capability with Fine Granularity (FG) Storage Pools, the Base volume physical capacity and Snapshot physical capacity size calculation (post compression) in the GUI/CLI might not be accurate. This issue will be addressed in a future release.
SCI-8508 It is not possible to add an MDM when the network latency is greater than 200 msec. None
SCI-12999 When an LDAP user is assigned to both Security and BackEndConfig groups, upon CLI login, the message "User role is SuperUser" appears, even if the user is not assigned to all the groups. Since the actual permissions are set according to the assigned groups, the message can be ignored.
SCI-27564 Original snapshot deletion time might not represent the actual deletion time. Deviations might occur due to MDM crashes or switch-overs of up to 1 minute per crash. Actual deletion time can be calculated from system information with the help of Customer Support.
SCI-11046 When there are device-related and SDS-related oscillating failures in the system and an MDM switch-over occurs, those oscillating failures may not be updated in the current Master MDM. When an additional MDM switch-over occurs, these oscillating failure counters will be available.
SCI-14632 Changing the size of a device that is in use by an SDS is not supported. To resize a device, first remove the device from the VxFlex OS system, resize it, and add it back to the system.
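A sketch of this flow using scli, with hypothetical object names (the --remove_sds_device and --add_sds_device command names are assumptions based on the standard scli device commands; verify the exact syntax against the CLI reference for your version):

scli --remove_sds_device --sds_name SDS_01 --device_name dev_01
(resize the underlying disk or partition)
scli --add_sds_device --sds_name SDS_01 --device_path /dev/sdb --storage_pool_name SP_01
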
SCI-21795

By default, the SDS will not add new devices that allow less than ~50 MB/s in 1 MB writes.

During the disk initialization phase, the SDS writes ~200 MB. This might be prominent when using multiple partitions on a device.

When using multiple partitions, add the devices to the SDS one by one. Change the SDS add-new-device timeout by changing the following parameter in conf.txt: mdm_to_tgt_net__send_timeout (in milliseconds).
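For example, to raise the timeout to 120 seconds (an illustrative value; the key=value format shown for conf.txt, and the need to restart the service afterwards, are assumptions — verify against the configuration guide for your version):

mdm_to_tgt_net__send_timeout=120000
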
Table 6. Known issues and limitations—Network
Issue number & SR number Problem summary Workaround
SCI-11405 SDS IP addresses must not be ambiguous. For instance, 127.0.0.1 must not be used, as it refers to several machines. None
SCI-12038 The MDM and SDSs might restart due to a known issue in glibc version 2.12-1.166 or earlier on RHEL 6. The issue is likely to occur when there is heavy traffic on the network. Update glibc to version 2.12-1.167. More information can be found in Red Hat Bugzilla (Bug 1243824): https://bugzilla.redhat.com/show_bug.cgi?id=1243824
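For example, on a yum-managed RHEL 6 node, assuming the fixed package is available in the configured repositories:

yum update glibc
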
SCI-27540

SR# 08520639, 07724183

The SDS connectivity test (SDS network test) tool might return inconsistent results in networks with configuration issues (routing, MTU, and so on), and when non-VxFlex OS traffic is running on the data subnet (SDS-SDS, SDC-SDS). None.
Table 7. Known issues and limitations—SDC
Issue number & SR number Problem summary Workaround
SCI-50039

SR# 16490613

In SLES 12.4 installations, SCSI commands issued via the SG_IO ioctl to /dev/scini* devices will fail and cause a kernel "oops". None.
SCI-11026 An ESXi host might not recognize a VxFlex OS volume resize operation. Perform a re-scan of the ESXi host storage adapters.
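The re-scan can be performed from the vSphere client, or directly on the host with esxcli, for example:

esxcli storage core adapter rescan --all
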
SCI-2763 When uninstalling a Linux SDC while I/O is running, the process might fail and generate the following error message: "Module scini is in use". Reboot the node.
Table 8. Known issues and limitations—SDS
Issue number & SR number Problem summary Workaround
SCI-44997 Addition of pre-partitioned NVMe disks to a VxFlex OS system will cause removal of the partitions instead of failing the operation. Prior to adding new disks to a VxFlex OS system, make sure that they do not contain any valuable data.
SCI-44515 In rare cases, when deleting a large number of volumes with snapshots while an SDS reboot occurs, the deletion can complete in the absence of the rebooting SDS. As a result, the devices on that SDS are automatically attached as "new" devices. Despite being marked as new, these devices still hold data residing in NVRAM (from before the reboot). This data can be erased only after the devices finish their attachment as "new" devices, but the SDS does not attach the devices because it does not have enough space in NVRAM for both the old NVRAM data and the new data. Remove the disks from the SDS and add them back again.
SCI-38954 In a hyper-converged Linux environment, if more than 2,000 volumes are mapped to a given SDC, restarting the SDS on the same machine may cause the SDS devices on that machine to enter an error state. Resolve this error state by using the "clear device error" command.
SCI-43259 An attempt to migrate a volume towards an unbalanced Storage Pool where one of the devices is completely full will produce a "No space in destination SP" message. Make sure that the target Storage Pool is balanced before migrating volumes.
SCI-15736 When almost all capacity in a VxFlex OS system is used and the system is in maintenance mode, read I/Os may fail. This is because, to ensure consistent reads, when a read is performed on a new location, the data must also be written to the temporary copy. None
SCI-44410 Volume snapshot deletion seems stuck, or may take a long time to complete. Snapshot deletion depends on the system status, and will not complete until the system rebuild is over.
SCI-35732 When a disk has failed in an ESXi HCI node, the Storage VM might freeze. This results in SDS failure and the commencement of a rebuild operation. 1. Shut down the SVM.

2. Enter the ESXi host into maintenance mode (shut down or migrate any VMs located on the host).

3. Reboot the host.

4. Identify the faulty device and remove it from the SVM, using "edit virtual machine".

5. Start the SVM. The SDS should start, the device should be removed, and a rebuild should be initiated.

SCI-3526 Multipath devices cannot be added as SDS devices. None
Table 9. Known issues and limitations—vSphere VxFlex OS plug-in
Issue number & SR number Problem summary Workaround
SCI-26137 When re-mapping a volume to an SDC on an ESXi node, the device appears to be in detached state. This occurs because when the vSphere VxFlex OS plug-in unmaps the volume, it first detaches the device to make sure that it is not being used, and the ESXi "remembers" that detached state. Perform an attach command on the device using vSphere web client, PowerCLI, etc.
SCI-15183 The vSphere VxFlex OS plug-in does not allow unmapping a volume from the SDC when the SDC is disconnected. Unmap using the VxFlex OS GUI or VxFlex OS CLI.
SCI-28108 vSphere VxFlex OS plug-in: When deploying a system with a mix of storage and acceleration devices on both VMDK and RDM datastores, the deployment is not successful and generates the error: "Cannot create datastore. Error details: VI SDK invoke exception:com.vmware.vim25.HostConfigFault". Do not use mixed VMDK environments, because they are not supported.
SCI-38905 When installing VxFlex OS using the vSphere VxFlex OS plug-in, and rolling back from a failed installation, upon re-launching the installation wizard, some of the previously chosen configuration parameters might be missing. Cancel the operation, and start deployment again.
SCI-9912 In VMware environments, when an MDM cluster configuration fails, only the 'Roll-Back entire deployment' button appears. There is no 'Roll-Back failed Tasks' option. Roll back the entire system and re-deploy.
SCI-13862 When using the vSphere VxFlex OS plug-in to add SDS devices to an existing system, the plug-in uses the device ID identifier (which is not unique across nested/virtual ESXi hosts) to check whether the device was already added. Attempting to add an SDS device post-deployment might therefore fail. Perform this operation using the VxFlex OS GUI or CLI instead.
SCI-19609 In an ESXi environment, when the SDC was installed manually using the CLI, an attempt to upgrade the SDC using the vSphere VxFlex OS plug-in fails with a 1009 error message, which indicates that an unexpected error was encountered. Set a name for the SDC to be used by the VxFlex OS system, in the following format:

ESX-<IP_ADDRESS_OF_ESX>

For example: ESX-10.103.110.54

Alternatively, upgrade the SDC manually.
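A sketch of setting the SDC name with scli, using the IP address from the example above (the --rename_sdc flags shown are an assumption — verify the exact syntax against the CLI reference for your version):

scli --rename_sdc --sdc_ip 10.103.110.54 --new_name ESX-10.103.110.54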

SCI-22879 In the vSphere VxFlex OS plug-in, if the password field of a non-selected ESXi host is empty in the "Pre-Deployment Actions" screen, the "Run" button is disabled and the operation cannot be started. Enter the vCenter/datacenter password; it auto-fills the ESXi hosts below it.
SCI-27158 In the vSphere VxFlex OS plug-in, during device removal from an Acceleration Pool, it might not be possible to close/cancel the pop-up window. The issue occurs when trying to exit the credentials page. Enter the correct credentials and click OK, or refresh the web browser.
SCI-33443 In some cases, when performing a restart to the vSphere-client service, the vSphere VxFlex OS plug-in is deleted from the vCenter. Register the plug-in again to the vCenter. Refer to the Deploy VxFlex OS Guide for the detailed procedure.
SCI-33572 In some cases, when upgrading the vSphere VxFlex OS plug-in (by unregistering the old version and registering the new version), the plug-in remains at the old version. Clear the browser's cache. After the cache is cleared, the SWF file is automatically downloaded again.
SCI-38603 In the vSphere installation wizard, switching back and forth between installation screens during deployment might lose the replicate-selection option (in the Add Devices screen). Close the wizard and restart the deployment.
SCI-26831 Using the vSphere VxFlex OS plug-in to map an unnamed volume to an SDC fails. Make sure the volume to be mapped has a name prior to mapping it.
SCI-35880 During plug-in deployment, in some cases an error is raised because of a timeout, and the following message is displayed: "Failed to setSdsPerformanceProfile - SDS does not exist." Click Retry to continue with the deployment.
SCI-7385 When running the PluginSetup script, a message may appear indicating that the script is not trusted. The script is trusted; you can select Always trust, and the message will not be shown again. None