My plan was to write an article discussing basic homelab configurations to get things started, but Broadcom released VCF 9.0.2 on the 20th of January. So I decided to upgrade my homelab over the weekend instead of finalizing the configurations needed for the next steps.
I have to be honest: I set aside the whole weekend for the upgrade because of the glitches we had upgrading to 9.0.1, just to have enough time to resolve any issues.
I will not cover how to download files from the Broadcom download portal and upload them to the offline depot. Instead, I write up the prerequisites based on the findings and issues I had during the upgrade. Everything else was straightforward, just ticking off one checkbox after the other.
Based on the four bullets listed, I will now provide the details for every single one. Hopefully it will save someone time. As always, better to start prepared than to use a trial-and-error approach.
Downloading the files and needed binaries from the Broadcom portal is simple. The catch with the offline depot is the /metadata folder inside the /PROD folder: it contains a file called productVersionCatalog.json, which lists all the products, their versions, and so on. Without an updated version of this file, neither Fleet Manager nor SDDC Manager will detect the additional binaries provided by the offline depot. As soon as you get your hands on the complete /metadata folder, just replace it, restart the LCM services in both Fleet Manager and SDDC Manager, and the new content is available.
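Sketched as commands, the swap could look like this. The paths are examples from my notes, not fixed locations: adjust DEPOT_ROOT to wherever your offline depot lives and NEW_METADATA to wherever you unpacked the fresh download.

```shell
# Hypothetical paths: adjust to your environment.
DEPOT_ROOT=/data/offline-depot/PROD
NEW_METADATA=/tmp/download/metadata

# Keep the old catalog around, then swap in the complete /metadata folder.
mv "$DEPOT_ROOT/metadata" "$DEPOT_ROOT/metadata.bak"
cp -a "$NEW_METADATA" "$DEPOT_ROOT/metadata"

# Afterwards, restart the LCM service on both Fleet Manager and
# SDDC Manager so they pick up the new productVersionCatalog.json:
#   systemctl restart lcm
```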
Looking at the different appliances we have in VCF9, these need a configured backup target:
From past experience I know that the different appliances support different protocols for backup. So I first took a look at the supported protocols per appliance and figured out that the common denominator is SFTP. As important as backups are, if I use a single backup target I refuse to enable several different protocols. Every additional protocol needs to be configured and managed, and adds attack vectors through open ports. Yes, it is still a homelab and security is not the main focus, but if it is simple to add a bit more security, I take that chance.
My first idea was to configure an SFTP server on my QNAP NAS, but after a few minutes I figured out that the configuration is fairly complex to implement. So I decided to go with a dedicated Linux server acting as the SFTP target for appliance backups.
As always, before starting to ramp up a new service I created forward and reverse DNS entries!
I use Ubuntu Server as the OS, but this is not meant to start a discussion about the different Linux distributions. Implementing an SFTP server is straightforward: during the OS installation I checked the box to install the OpenSSH server, and after the installation I reconfigured the SSHD configuration.
sudo vi /etc/ssh/sshd_config
The reason is to enable the SFTP subsystem with basic authentication to provide at least a minimal level of security. I left the replaced lines commented out to preserve the initial configuration. That is not necessary, but that's the way I prefer to change config files.
# override default of no subsystems
# Subsystem sftp /usr/lib/openssh/sftp-server
Subsystem sftp internal-sftp
# Example of overriding settings on a per-user basis
# Match User anoncvs
# X11Forwarding no
# AllowTcpForwarding no
# PermitTTY no
# ForceCommand cvs server
# custom configuration
HostKeyAlgorithms=+ssh-rsa
PubkeyAcceptedAlgorithms=+ssh-rsa
Match Group sftpusers
ChrootDirectory %h
X11Forwarding no
AllowTcpForwarding no
ForceCommand internal-sftp
I added this block as text so that you can copy and paste it into your own configuration file. In the end mine looks like this:
The two algorithm parameters are important for VMware products to work.
Finally add a group and a user for authentication.
sudo groupadd sftpusers
sudo useradd -m -G sftpusers -s /sbin/nologin sftp-backup
sudo passwd sftp-backup
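A quick sanity check from any client machine; sftp.lab.local is an example hostname, replace it with the DNS name you created for the server:

```shell
# The account should open an SFTP session after the password prompt...
sftp sftp-backup@sftp.lab.local

# ...while an interactive SSH login must not yield a shell
# (nologin shell plus ForceCommand internal-sftp).
ssh sftp-backup@sftp.lab.local
```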
Finally: file system permissions and the backup folder.
I added a dedicated mount point to store the backups: firstly to increase manageability, and secondly to prevent Linux from malfunctioning because of a filled partition. I attached a volume at the /backupData mount point. Assuming the folder and mount point are configured, the next step is to change the ownership to make it work.
sudo chown sftp-backup:sftpusers /backupData
Restart the SSH service, add the needed firewall rules, and we are good to go.
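On Ubuntu this boils down to something like the following; the ufw line is an assumption on my part, adapt it to whatever host firewall you actually run:

```shell
sudo sshd -t                  # validate the config first; non-zero exit on errors
sudo systemctl restart ssh    # the service is called "ssh" on Ubuntu
sudo ufw allow 22/tcp         # assuming ufw as the host firewall
```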
While doing the first backups I found out that it does not work for VCF Automation. The reason: VCFA needs access to the lost+found folder inside the backup directory. For some reason it had root:root configured as ownership, so I changed this one to sftp-backup:sftpusers too.
sudo chown -R sftp-backup:sftpusers /backupData/lost+found
Now we have a backup target via the SFTP protocol for our appliances. For vCenter I used the VAMI to configure it, the NSX Manager is configured inside NSX, and VCFA & Fleet Manager are configured via the fleet management inside VCF Operations.
The backup scheduler for Fleet and VCFA is configured in the "Backup Settings" section above the "SFTP Settings".
Next on the list is the SDDC Manager offline depot connection. I found multiple articles describing how to import custom certificates via the Developer Center and the API. As I am using a self-signed certificate from the QNAP certificate app itself, I can't get a CA certificate and am therefore not able to create the chain for validation. So SDDC Manager refused to establish an encrypted connection. My next idea was: if encryption is not working, let's try the same trick with SDDC Manager that I used with the VCF Installer appliance. I logged in and changed the application-prod.properties file.
The file is located here: /opt/vmware/vcf/lcm/lcm-app/conf/application-prod.properties
PLEASE NOTE: before changing the file, the permissions need to be changed because the file is read-only. Please change them back to 400 once the change is implemented.
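A minimal sketch of that sequence on the SDDC Manager shell:

```shell
cd /opt/vmware/vcf/lcm/lcm-app/conf
chmod 600 application-prod.properties   # make the file writable for the edit
vi application-prod.properties          # apply the change
chmod 400 application-prod.properties   # lock it down again afterwards
```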
I decided to add the parameter lcm.depot.adapter.httpsEnabled=false in the LCM DEPOT PROPERTIES section and it worked perfectly fine.
To get there, log in to SDDC Manager via SSH as the user vcf and run su - root to gain access to the configuration file.
Last step is to restart the LCM service to apply the changes.
systemctl restart lcm
Voila: the offline depot connection in SDDC Manager worked like a charm and I was able to download the needed binaries to the fleet management. If the storage space on the Fleet Manager fills up, just delete unused binaries to free up space.
As we already know, the NSX Edges don't like AMD Ryzen CPUs. Thanks to William Lam (at least I found the information on his blog) there is now an ESX host based fix. Log in to each ESX host via SSH as root and issue the following command:
echo 'cpuid.brandstring = "AMD EPYC Ryzen 9 9955HX"' >> /etc/vmware/config
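If you have several hosts, a small loop from a jump box saves some typing. The hostnames are examples, and the grep guard is my addition to keep the line from being appended twice on repeated runs:

```shell
# The exact config line from above, built once so the quoting stays sane.
BRAND_LINE='cpuid.brandstring = "AMD EPYC Ryzen 9 9955HX"'

for host in esx01.lab.local esx02.lab.local esx03.lab.local; do
  # Append the line only if it is not already present on that host.
  ssh root@"$host" "grep -qF '$BRAND_LINE' /etc/vmware/config || echo '$BRAND_LINE' >> /etc/vmware/config"
done
```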
That makes the ESX host believe it has an AMD EPYC CPU built in. Whether this fix is persistent across further ESX host upgrades? I don't know; we will see once 9.0.3 becomes available.
Once all prerequisites are met, we can start the upgrade procedure. Keep in mind: stick with the Broadcom recommended order, otherwise things might break, the upgrade won't work, or you maneuver yourself into a messed-up state.
The correct order that I found is:
All products not listed can be upgraded at your own preference, that's at least my understanding. Please consult the product interoperability matrix before finalizing your individual upgrade procedure.
During the upgrade it was sometimes a bit challenging for me to find the "upgrade" button to click, but it is always the same: if you're confronted with a new UI, it takes some time to get used to it.
Some buttons are pretty obvious.
Others are a bit hidden and, at least for me at first, a bit misleading, like installing the upgrade for the VCF Fleet Manager.
You need to click "New Patch" to select the upgrade binaries. All other products show that there is a new patch / upgrade available and lead directly to the guided upgrade process.
During the upgrade of the ESX hosts you will receive some errors about the NVMe disk controller and NVMe devices. Please double-check that all warnings and errors are only about those devices. If so, you can skip the errors and continue with the upgrade.
I hope my article helps your VCF homelab upgrade procedure run really smoothly. Please don't forget: the workarounds discussed in this article are for your homelab and should not be used in any production environment.