Intel CIM error after update/upgrade of an ESXi host

After a host update or upgrade, it is not uncommon to see errors that never appeared on the host before. The following error is one example of a class of errors that share the same cause.

The error

I upgraded a 6.5 host (installed from the HPE ESXi image almost two years ago) to the latest HPE ESXi 6.7 U3 image. After the final boot, the following error appeared in /var/log/vmkernel.log every few seconds.

WARNING: LinuxThread: 423: sfcb-intelcim: Error cloning thread: -1 (bad0117)
User: 3173: sfcb-intelcim: wantCoreDump:sfcb-intelcim signal:11 exitCode:0 coredump:enable
UserDump: 3130: sfcb-intelcim: Dumping cartel 2103260 (from world 2103267) to file /var/lo
UserDump: 3258: sfcb-intelcim: Userworld(sfcb-intelcim) coredump complete

A core dump did not sound very encouraging, and because the messages appeared every few seconds, I wanted to get rid of them.
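On a busier log it is less obvious which component is making the noise. As a minimal sketch, the following Python tallies the process name in each message. It assumes lines shaped like the excerpt above (module, line number, process name, message); real vmkernel.log lines typically carry a timestamp and CPU prefix as well, which the regular expression tolerates because it searches rather than anchors. The function name `tally_processes` is made up for this illustration.

```python
import re
from collections import Counter

# Sample lines copied from the excerpt above (the third one is truncated in the source).
SAMPLE = """\
WARNING: LinuxThread: 423: sfcb-intelcim: Error cloning thread: -1 (bad0117)
User: 3173: sfcb-intelcim: wantCoreDump:sfcb-intelcim signal:11 exitCode:0 coredump:enable
UserDump: 3130: sfcb-intelcim: Dumping cartel 2103260 (from world 2103267) to file /var/lo
UserDump: 3258: sfcb-intelcim: Userworld(sfcb-intelcim) coredump complete
"""

# Assumed shape: "<module>: <line number>: <process>: <message>".
LINE_RE = re.compile(r"(?P<module>\w+): \d+: (?P<process>[\w.-]+): ")


def tally_processes(text):
    """Count how many log lines mention each process name."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            counts[match.group("process")] += 1
    return counts


if __name__ == "__main__":
    # Print the noisiest processes first.
    for process, count in tally_processes(SAMPLE).most_common():
        print(f"{count:5d}  {process}")
```

In the excerpt every line points at sfcb-intelcim, which is what makes the culprit easy to guess here.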

The solution

Because every message contained the string intelcim, it was easy to guess where the error came from. First, I listed the installed software packages (VIBs) containing this string.

esxcli software vib list | grep -i intel

This showed the VIB intelcim-provider (0.5-3.3), installed on 27 June 2018. Software that old can plausibly cause errors, so I uninstalled it with the following commands.

  • Because this is a CIM provider, stop the CIM agent on the host
    esxcli system wbem set --enable false
  • Uninstall the VIB
    esxcli software vib remove -n intelcim-provider
  • Start the CIM agent on the host again
    esxcli system wbem set --enable true

After that, these errors no longer appeared.

The reason

So why is this old software still on the host after an upgrade? It is quite simple: intelcim-provider was part of the HPE 6.5 image but is not part of the 6.7 image. Therefore the upgrade neither updates the VIB nor removes it.

vCenter Tag based VM placement

Introduction

The idea behind tag based placement is quite simple. It helps administrators keep VMs, or individual VMDKs, on the desired datastores, based on tags. For example, a VM is defined to be located in datacenter 1 (DC1). With tag based placement you assign the tag “Storage DC1” to all datastores in DC1 and configure a policy that is assigned to the VMs in DC1. All wizards in vCenter take these policies into account.

Necessary components:

  • Tag-category,
  • Tags for datastores,
  • Storage Policy Based Management (SPBM) policy for tag based management.

Requirements:

  • Datastores used for placement must be tagged. A policy can also select datastores that are not tagged with a specific tag. Either way, some datastores must be tagged.
  • Each VM/VMDK that should be placed by policy needs a policy assigned to it.
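The matching logic can be pictured with a small model. This is only a sketch of the idea, not how vCenter implements it: the classes `Datastore` and `PlacementPolicy`, the datastore names and the `include` flag are all made up for illustration. A policy either requires a tag on the datastore or requires its absence.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Datastore:
    name: str
    tags: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class PlacementPolicy:
    name: str
    tag: str
    include: bool = True  # True: datastore must carry the tag; False: it must not

    def is_compatible(self, datastore):
        has_tag = self.tag in datastore.tags
        return has_tag if self.include else not has_tag


def compatible_datastores(policy, datastores):
    """Return the names of all datastores the policy would accept."""
    return [ds.name for ds in datastores if policy.is_compatible(ds)]


# Hypothetical inventory for a two-datacenter environment.
DATASTORES = [
    Datastore("ds-dc1-01", frozenset({"Storage DC1"})),
    Datastore("ds-dc1-02", frozenset({"Storage DC1"})),
    Datastore("ds-dc2-01", frozenset({"Storage DC2"})),
    Datastore("ds-untagged"),
]

if __name__ == "__main__":
    dc1 = PlacementPolicy("Placement DC1", tag="Storage DC1")
    not_dc1 = PlacementPolicy("Not DC1", tag="Storage DC1", include=False)
    print(compatible_datastores(dc1, DATASTORES))
    print(compatible_datastores(not_dc1, DATASTORES))
```

The second policy shows why untagged datastores can still take part in placement: an exclude rule accepts every datastore that does not carry the tag.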

Basic setup in H5 Client

For this post I used the current 6.7 U3 version of vCenter. As an example, I demonstrate the setup for an environment with two datacenters (DC1, DC2).

  • First, create a tag-category. To do so, go to Menu –> Tags & Custom Attributes –> CATEGORIES and press NEW.
    In this example I use this category only for datastores.
  • Then create a tag for each placement option. In this example I create a tag for each datacenter (“Storage DC1” and “Storage DC2”). Switch to TAGS and press NEW.
    Select the right tag-category.
  • Now assign your tags to your datastores. You can do this by right-clicking one or more datastores and selecting the tag.
    Assign your tags to all datastores that should be used for placement.
  • Create Storage Policy Based Management (SPBM) policies, in this example one policy for each DC. These policies will be assigned to your VMs or VMDKs. Go to Menu –> Policies and Profiles –> VM Storage Policies and press Create VM Storage Policy.
    Go through the wizard:
    Fill in Name and Description.
    Select the options for the policy. If you want to use this policy just for placement, select the option Enable tag based placement rules.
    Select your tag-category and decide whether the policy should include or exclude your tagged datastores. In this example I want to include tagged datastores.
    Then select the appropriate tag.
    A nice feature of the wizard is the ability to view the datastores that are compatible and incompatible with this policy.

 

During administration

There are at least two tasks in which a wizard asks you where to place a VM: VM creation and Storage vMotion.

VM Creation

A VM that is being created has no policy applied yet, so all datastores are shown as compatible.

Here you can select the VM Storage Policy of your choice, and you will then see the datastores compatible with this policy.

When you select a compatible datastore, the compatibility check is marked green. When the datastore is not compatible, you see a warning. This warning does not prevent you from creating the VM!

Storage vMotion

Storage vMotion works very similarly. There may already be a policy applied. In the wizard you can change the applied policy just as at VM creation. The difference is that here you can also change the policy per VMDK. Click Configure per disk and select Browser at the Configuration File or the VMDK you want to migrate.

Then choose the policy in the selection window.

Apply another policy (without migration)

To apply another policy to one or more VMs, go to VM –> Configure –> Policies.

Apply another policy to the whole VM or to each VMDK separately.

When you apply a policy that the VM’s current location does not comply with, Compliance Status changes to Noncompliant. Migrating to a compatible datastore resolves this status.

Change policy

When you change an existing policy that is already applied to VMs, you will get a Compliance Status of Out of Date. Just run Check compliance in the GUI or on the CLI (see later in this post) to get the current state.

 

Useful PowerCLI commands

Show tagged datastores (category: CategoryName)

Get-Datastore | Get-TagAssignment -Category CategoryName

Show un-tagged datastores (category: CategoryName)

$TaggedDSs = @((Get-Datastore | Get-TagAssignment -Category CategoryName).Entity)
ForEach ($DS in Get-Datastore) {if ($DS -notin $TaggedDSs) {$DS.name}}

Show all VMs without an applied storage policy

Get-VM | Get-SpbmEntityConfiguration | where {$_.StoragePolicy -eq $null}

Check policy compliance for a specific VM (VMname)

Get-SpbmEntityConfiguration -VM VMname -CheckComplianceNow

Show all VMs that are not compliant with their applied storage policy. This command only shows a VM when the location of its configuration files is not compliant.

Get-SpbmEntityConfiguration -VMsOnly | where {$_.ComplianceStatus -ne "compliant"}

This command shows an entry for each VMDK or VM whose location does not meet the policy.

Get-SpbmEntityConfiguration | where {$_.ComplianceStatus -ne "compliant" } | select @{N="VM"; E={get-vm -Id ($_.id).Split("/")[0]}}, Entity, StoragePolicy, ComplianceStatus, TimeOfCheck

 

Limitations

  • As you can see in this post, it is not mandatory to place a VM on a compatible datastore. Only a warning is displayed.
  • There is a pre-defined alarm in vCenter, “VM storage compliance alarm”. As I understand it, this alarm should fire when a VM/VMDK is not compliant with its storage policy. At least for a placement policy, I did not manage to trigger this alarm at all. Even VMware support could not resolve this.
  • I also found no way to get an alarm icon in the vCenter inventory when a VM is not compliant with its placement policy, so non-compliant VMs cannot be seen at first glance in the GUI. To find non-compliant VMs in the GUI you have to go to either:
    • Menu –> Policies and Profiles –> VM Storage Policies –> Select Policy –> VM Compliance
    • Select VM –> Configure –> Policies
  • Currently there is no connection between storage placement and host placement. This means that when you use tag based placement as in my example to place VMs in one of several datacenters, nothing automatically runs the VM on a host in the same datacenter. To achieve this, you can write a script that runs on a regular schedule and migrates VMs or places them in the corresponding DRS group.
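The decision logic of such a script could look roughly like this. It is a sketch under assumptions: the inventory dictionary, the VM names and the function `find_mismatched_vms` are hypothetical, and collecting the real data (datastore tag and current host per VM) as well as the actual migration or DRS group change are left out.

```python
# Hypothetical inventory: VM name -> (datacenter of the datastore tag, datacenter of the current host).
INVENTORY = {
    "vm-app-01": ("DC1", "DC1"),
    "vm-app-02": ("DC1", "DC2"),
    "vm-db-01": ("DC2", "DC2"),
    "vm-db-02": ("DC2", "DC1"),
}


def find_mismatched_vms(inventory):
    """Return {vm: target datacenter} for VMs whose host is not in the datacenter of their storage."""
    return {
        vm: storage_dc
        for vm, (storage_dc, host_dc) in sorted(inventory.items())
        if storage_dc != host_dc
    }


if __name__ == "__main__":
    for vm, target_dc in find_mismatched_vms(INVENTORY).items():
        # A real script would migrate the VM or move it to the matching DRS group here.
        print(f"{vm}: should run on a host in {target_dc}")
```

Treating the storage location as the source of truth is a design choice of this sketch; it matches the example in this post, where the datastore tag defines the VM's datacenter.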

IMHO

This is a nice application of vCenter tags. It makes it simpler to keep VMs on the desired datastores. But it is not a complete solution to this problem, because there is no alarm for a VM running on an incompatible datastore and because the placement on an ESXi host cannot be managed.

In my opinion, using storage policies can be cumbersome. If you use policies for just one task like placement, they are handy, because you only need as many policies as you have groups of datastores. A huge limitation of storage policies is that only one policy can be applied to a VM or VMDK. So when you need more options in a policy and they do not fit all VMs, you need to create as many policies as you have distinct combinations of options. This can become overwhelming very quickly.

New in 3PAR SSMC 3.6: Topology Insight

A new, and badly documented, feature in StoreServ Management Console 3.6 is Topology Insights. At first it sounds like the Maps feature you probably already know. But Topology View shows an end-to-end view: VM <–> VMDK <–> Datastore <–> Host <–> Virtual Volume <–> System, enriched with performance information! This short post is about this feature and how to get it.


PSP rule for active/standby controller arrays (like Nimble Storage)

First, what does active/standby mean? For this blog post, it means the array has at least two controllers: one presents ALL LUNs to the hosts, while the second controller presents NO active LUN paths to the hosts. Only in case of a manual or automatic failover does the second controller take over ALL LUN presentations and active paths. Examples of such arrays are:

  • [very old one] HP MSA 1000
  • HPE Nimble Storage
  • Dell EMC SC Series Compellent

VMware changed a default setting in ESXi 6.7 that controls the handling of broken storage paths. This setting is action_OnRetryErrors and is part of the Storage Array Type Plug-In (SATP). In 6.0 and 6.5 the default was OFF; since 6.7 the default is ON. Put simply: with this setting turned on, a controller failover can lead to loss of device access! More information about the behavior of action_OnRetryErrors can be found here and here.


Bad deployment and storage size estimation in vCenter migration wizard

When migrating vCenter from Windows to the Photon appliance, the wizard estimates the deployment size (vCPU and memory) and the storage size, based on data from the migration assistant running on the source vCenter. Quite often the estimated storage size is much higher than what the Windows vCenter actually uses. When researching the issue, freeing up disk space is the commonly suggested fix. Most often this does not work properly.

The last vCenter I had to migrate to 6.7 managed about 1000 VMs. Based on my experience, I expected a medium deployment size and a large storage size. But the wizard showed an X-Large deployment size (up to 35k VMs!) and an X-Large storage size (2TB!), with no possibility to change them. The current DB (external SQL Server) was about 50GB in size. So X-Large was just not acceptable.

After some analysis, it turned out that one table in the database consumed more than 33GB of the 50GB DB size. This table was VPX_TEXT_ARRAY, and it had more than 8 million rows. According to KB 2005333 it is safe to delete some of these rows. As outlined in the KB, you can run the SQL statement:

DELETE FROM VPX_TEXT_ARRAY
WHERE NOT EXISTS(SELECT 1 FROM VPX_ENTITY WHERE ID=VPX_TEXT_ARRAY.MO_ID)

in the vCenter DB. Of course, the vCenter services have to be shut down first! So read the KB carefully. In my case, deleting all but about 5k rows took nearly half an hour. Furthermore, the database log file grew to about double the size the table had consumed, so make sure the log file has enough space to grow. I did not try it, but in this blog post the necessary rows are copied to a temporary table, the original table is truncated and the rows are moved back. As I understand it, this should be much quicker. After the deletion I shrank the DB and log files.

When this was done, the migration wizard let me select a deployment size between small and X-Large and a storage size between default (smallest) and X-Large.
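To see which rows the DELETE statement from the KB removes, here is a minimal sketch against an in-memory SQLite database. The two tables are heavily simplified stand-ins (only ID and MO_ID come from the statement; the TEXT_VALUE column and all rows are invented), and SQLite is not SQL Server, so this only illustrates the anti-join logic, not timing or log growth.

```python
import sqlite3


def build_demo_db():
    """Create simplified, hypothetical versions of the two tables with a few rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE VPX_ENTITY (ID INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE VPX_TEXT_ARRAY (MO_ID INTEGER, TEXT_VALUE TEXT)")
    conn.executemany("INSERT INTO VPX_ENTITY (ID) VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO VPX_TEXT_ARRAY (MO_ID, TEXT_VALUE) VALUES (?, ?)",
        [(1, "keep"), (2, "keep"), (3, "orphan"), (4, "orphan"), (4, "orphan")],
    )
    conn.commit()
    return conn


def delete_orphans(conn):
    """Delete rows whose MO_ID has no matching entity; return the number of deleted rows."""
    cursor = conn.execute(
        "DELETE FROM VPX_TEXT_ARRAY "
        "WHERE NOT EXISTS (SELECT 1 FROM VPX_ENTITY WHERE ID = VPX_TEXT_ARRAY.MO_ID)"
    )
    conn.commit()
    return cursor.rowcount


if __name__ == "__main__":
    conn = build_demo_db()
    print("deleted:", delete_orphans(conn))
    print("remaining:", conn.execute("SELECT MO_ID, TEXT_VALUE FROM VPX_TEXT_ARRAY").fetchall())
```

Only rows that still reference an existing entity survive, which is why the table in my case shrank from more than 8 million rows to about 5k.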

Hint

If you want to know details about the deployment and storage sizes of the VCSA, you can run CD-ROM/Disk:\vcsa-cli-installer\win32\vcsa-deploy --supported-deployment-sizes. You get a table of all size combinations with the deployed vCPUs, memory and storage. Additionally, you see for how many hosts and VMs each size is recommended.

VCSA 6.7 U3 sends no log data to syslog server

Recently a customer noticed that none of his vCenter Server Appliances (VCSA) running 6.7 U3 (6.7.0.40000, that is 6.7 build 14368073) sent any data to the syslog server any more. Rebooting the VCSA or restarting the service gets the system to send data again, but only for a few minutes. Changing the syslog settings in Appliance Management (VAMI, port 5480) also makes the VCSA send logs for a few minutes.

VMware support already knows this issue: rsyslogd is causing the problem. It will be fixed in 6.7 U3 P01. VMware recently released vCenter 6.7 U3a (6.7.0.41000), but this version did not fix the issue. I hope P01 is coming soon.

Implement VMware Skyline for vSphere

In this post I will describe how to implement VMware Skyline by deploying the Skyline Collector in your datacenter. If you don’t know it, Skyline is a cloud-based proactive support platform. At the time of writing, these products are supported: vSphere, NSX, vSAN, vRealize Operations and Horizon. You should also know that Skyline is included when you have an active Production Support or Premier Services subscription!

To read more about Skyline, visit: https://www.vmware.com/support/services/skyline.html

If you don’t know Skyline, I would definitely recommend trying it in the Hands-on Lab HOL-2015-01-SDC at:
https://labs.hol.vmware.com/HOL
A tip: after creating your temporary account within the lab, open Skyline Advisor in your local browser using this account.
