Installing EL7 onto EL5 Xen hosts

With Red Hat recently releasing RHEL 7 (and CentOS getting their rebuild out the door shortly after), I decided to take the opportunity to start upgrading some of my ageing RHEL/CentOS (EL) systems.

My personal co-location server is a trusty P4 3.0GHz box running EL 5 for both host and Xen guests. Xen has lost some popularity in favour of HVM solutions like KVM, but it’s still a great hypervisor and runs Linux guests really nicely, even on hardware as old as mine that lacks HVM CPU extensions.

Considering that EL 5, 6 and 7 are all still supported by Red Hat, I would expect installing EL 7 as a guest on EL 5 to be easy. To be fair to Red Hat it mostly is; the installation itself was pretty standard.

Like EL 5 guests, EL 7 guests can be installed entirely from the command line using the standard virt-install command – for example:

$ virt-install --paravirt \
 --name MyCentOS7Guest \
 --ram 1024 \
 --vcpus 1 \
 --location http://mirror.centos.org/centos/7/os/x86_64/ \
 --file /dev/lv_group/MyCentOS7Guest \
 --network bridge=xenbr0

One issue I had is that the installer no longer prompts for the network information it needs to download the rest of the installer. Instead it assumes you have a DHCP server, an assumption that isn’t always correct. If you want to force it to use a static address, append the following parameters to the virt-install command:

 -x 'ip=192.168.1.20 netmask=255.255.255.0 dns=8.8.8.8 gateway=192.168.1.1'
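Putting the two pieces together, here is a minimal sketch that assembles the static network arguments and prints the complete virt-install command for review instead of running it. The helper name build_net_args is made up for illustration, and the addresses are just the example values used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of virt-install): assemble the installer's
# static network arguments.
# Arguments: IP address, netmask, gateway, DNS server.
build_net_args() {
    printf 'ip=%s netmask=%s dns=%s gateway=%s' "$1" "$2" "$4" "$3"
}

NET_ARGS=$(build_net_args 192.168.1.20 255.255.255.0 192.168.1.1 8.8.8.8)

# Print the command for review. Removing the leading "echo" runs it, with
# "$NET_ARGS" passed to -x as a single quoted argument.
echo virt-install --paravirt \
    --name MyCentOS7Guest \
    --ram 1024 \
    --vcpus 1 \
    --location http://mirror.centos.org/centos/7/os/x86_64/ \
    --file /dev/lv_group/MyCentOS7Guest \
    --network bridge=xenbr0 \
    -x "$NET_ARGS"
```

Note that the printed line loses the quotes around the network arguments, so if you copy and paste it you need to put them back.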

The installer will proceed and give you the option of either using VNC to get a graphical installer or accepting the more basic text mode installer. In my case I went with the text mode installer. This is generally fine for average installations, except that it doesn’t give you a lot of control over partitioning.

Installation completed successfully, but I was not able to subsequently boot the new guest, with an error being thrown about pygrub being unable to find the boot partition.

# xm create -c vmguest
Using config file "./vmguest".
Traceback (most recent call last):
  File "/usr/bin/pygrub", line 774, in ?
    raise RuntimeError, "Unable to find partition containing kernel"
RuntimeError: Unable to find partition containing kernel
No handlers could be found for logger "xend"
Error: Boot loader didn't return any data!
Usage: xm create <ConfigFile> [options] [vars]

 

Xen works a little differently to VMware/KVM/VirtualBox in that it doesn’t try to emulate hardware unnecessarily in paravirtualised mode, so there’s no BIOS. Instead Xen ships with a tool called pygrub, which is essentially an application that implements grub: it reads the guest’s /boot filesystem, displays a grub interface using the config in /boot, and once a kernel is selected, grabs the kernel and associated information and launches the guest with it.

Generally this works well. You can certainly boot any of your EL 5 guests with it, as well as other Linux distributions with Xen paravirtualised compatible kernels (support is merged into the upstream kernel these days).

However RHEL has moved on a bit since 2007 adding a few new tricks, such as replacing Grub with Grub2 and moving from the typical ext3 boot partition to an xfs boot partition. These changes confuse the much older utilities written for Xen, leaving it unable to read the boot loader data and launch the guest.

The two main problems come down to:

  1. EL 5 can’t read the xfs boot partition created by default by the EL 7 installer. Even if you install the optional xfs packages provided by CentOS Plus/CentOS Extras, you still can’t read the filesystem because its version of xfs is too new for the EL 5 tools to comprehend.
  2. The version of pygrub shipped with EL 5 doesn’t have working support for Grub2. Technically it’s supposed to according to Red Hat, but I suspect they forgot to merge in the fixes needed to make EL 7 boot.

I hope that Red Hat fix this deficiency soon; presumably there will be Red Hat customers wanting to do exactly what I’m doing who will apply some pressure for a fix. Until then, if you want to get your shiny new EL 7 guests installed, I have a bunch of workarounds for those who are not faint of heart.

 

For these instructions, I’m assuming that your guest is installed to /dev/lv_group/vmguest, however these instructions should work equally for image files or block devices.

Firstly, we need to check what the state of the /boot partition is – we need to make sure it is an ext3 volume, or convert it if not. If you installed via the limited text mode installer, it will be an xfs partition, however if you installed via VNC, you might be able to change the type to ext3 and avoid the next few steps entirely.

We use kpartx -a to map the partitions inside the block device to /dev/mapper entries so we can manipulate their contents, and kpartx -d to remove the mappings again when we’re done. We then use the good ol’ file command to check what type of filesystem is on the first partition (which is presumably /boot).

# kpartx -a /dev/lv_group/vmguest
# file -sL /dev/mapper/vmguestp1
/dev/mapper/vmguestp1: SGI XFS filesystem data (blksz 4096, inosz 256, v2 dirs)
# kpartx -d /dev/lv_group/vmguest
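If you want to script this check across several guests, a rough sketch along these lines could turn the description printed by file into a short filesystem name. The function name fs_type_from_file_output is made up for illustration, and it only recognises the handful of filesystems relevant to this post.

```shell
#!/bin/sh
# Hypothetical helper: map the description printed by `file -sL` to a
# short filesystem name. Anything unrecognised is reported as "unknown".
fs_type_from_file_output() {
    case "$1" in
        *"XFS filesystem"*)  echo xfs ;;
        *"ext4 filesystem"*) echo ext4 ;;
        *"ext3 filesystem"*) echo ext3 ;;
        *"ext2 filesystem"*) echo ext2 ;;
        *)                   echo unknown ;;
    esac
}

# Example using the output shown above:
fs_type_from_file_output "/dev/mapper/vmguestp1: SGI XFS filesystem data (blksz 4096, inosz 256, v2 dirs)"
```

Against a live system you would call it as fs_type_from_file_output "$(file -sL /dev/mapper/vmguestp1)" while the kpartx mappings are in place.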

Being xfs, we’re probably unable to do much with it. If we install xfsprogs (from CentOS Extras) and map the partitions again, we can verify it’s unreadable by the host OS:

# yum install xfsprogs
# kpartx -a /dev/lv_group/vmguest
# xfs_check /dev/mapper/vmguestp1
bad sb version # 0xb4b4 in ag 0
bad sb version # 0xb4a4 in ag 1
bad sb version # 0xb4a4 in ag 2
bad sb version # 0xb4a4 in ag 3
WARNING: this may be a newer XFS filesystem.
#

Technically you could fix this by upgrading the kernel, but EL 5’s kernel is a weird monster that includes all manner of Xen patches that were never merged upstream, so it’s not a simple (or even feasible) operation.

We can convert the filesystem from xfs to ext3 by using another newer Linux system. First we need to export the boot volume into an image file (the partitions must still be mapped with kpartx -a):

# dd if=/dev/mapper/vmguestp1 | bzip2 > /tmp/boot.img.bz2

Then copy the file to another host, where we will unpack it and recreate the image file with ext3 and the same contents.

$ bunzip2 boot.img.bz2
$ mkdir tmp1 tmp2
$ sudo mount -t xfs -o loop boot.img tmp1/
$ sudo cp -av tmp1/. tmp2/
$ sudo umount tmp1/
$ mkfs.ext3 -F boot.img
$ sudo mount -t ext3 -o loop boot.img tmp1/
$ sudo cp -av tmp2/. tmp1/
$ sudo umount tmp1/
$ sudo rm -rf tmp1 tmp2
$ mv boot.img boot-new.img
$ bzip2 boot-new.img

Copy the new file (boot-new.img.bz2) back to the Xen host server and replace the guest’s /boot volume with it.

# kpartx -a /dev/lv_group/vmguest
# bzcat boot-new.img.bz2 > /dev/mapper/vmguestp1
# kpartx -d /dev/lv_group/vmguest

 

Having fixed the filesystem, Xen’s pygrub will be able to read it, however your guest still won’t boot. :-( On the plus side, it throws a more useful error showing that it could access the filesystem, but couldn’t parse some data inside it.

# xm create -c vmguest
Using config file "./vmguest".
Using <class 'grub.GrubConf.Grub2ConfigFile'> to parse /grub2/grub.cfg
Traceback (most recent call last):
  File "/usr/bin/pygrub", line 758, in ?
    chosencfg = run_grub(file, entry, fs)
  File "/usr/bin/pygrub", line 581, in run_grub
    g = Grub(file, fs)
  File "/usr/bin/pygrub", line 223, in __init__
    self.read_config(file, fs)
  File "/usr/bin/pygrub", line 443, in read_config
    self.cf.parse(buf)
  File "/usr/lib64/python2.4/site-packages/grub/GrubConf.py", line 430, in parse
    setattr(self, self.commands[com], arg.strip())
  File "/usr/lib64/python2.4/site-packages/grub/GrubConf.py", line 233, in _set_default
    self._default = int(val)
ValueError: invalid literal for int(): ${next_entry}
No handlers could be found for logger "xend"
Error: Boot loader didn't return any data!

At a glance, it looks like pygrub can’t handle the special variables/functions used in the EL 7 grub configuration file, however even if you remove them and simplify the configuration down to the core basics, it will still blow up.

# xm create -c vmguest
Using config file "./vmguest".
Using <class 'grub.GrubConf.Grub2ConfigFile'> to parse /grub2/grub.cfg
WARNING:root:Unknown image directive load_video
WARNING:root:Unknown image directive if
WARNING:root:Unknown image directive else
WARNING:root:Unknown image directive fi
WARNING:root:Unknown image directive linux16
WARNING:root:Unknown image directive initrd16
WARNING:root:Unknown image directive load_video
WARNING:root:Unknown image directive if
WARNING:root:Unknown image directive else
WARNING:root:Unknown image directive fi
WARNING:root:Unknown image directive linux16
WARNING:root:Unknown image directive initrd16
WARNING:root:Unknown directive source
WARNING:root:Unknown directive elif
WARNING:root:Unknown directive source
Traceback (most recent call last):
  File "/usr/bin/pygrub", line 758, in ?
    chosencfg = run_grub(file, entry, fs)
  File "/usr/bin/pygrub", line 604, in run_grub
    grubcfg["kernel"] = img.kernel[1]
TypeError: unsubscriptable object
No handlers could be found for logger "xend"
Error: Boot loader didn't return any data!
Usage: xm create <ConfigFile> [options] [vars]

Create a domain based on <ConfigFile>
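The linux16 and initrd16 directives in those warnings are the lines that name the kernel and initrd. As a quick sketch, something like the following pulls them out of a Grub2 config so you can see what the old parser is skipping. The function name list_boot_files is made up, and the menu entry fed to it is an invented sample rather than a copy of a real EL 7 grub.cfg.

```shell
#!/bin/sh
# Hypothetical helper: read a Grub2 config on stdin and print the
# directive and file path from every linux16/initrd16 line.
list_boot_files() {
    awk '$1 == "linux16" || $1 == "initrd16" { print $1, $2 }'
}

# Invented sample menu entry, for illustration only.
list_boot_files <<'EOF'
menuentry 'CentOS Linux 7 (Core)' {
        load_video
        linux16 /vmlinuz-3.10.0-123.el7.x86_64 root=/dev/mapper/centos-root ro
        initrd16 /initramfs-3.10.0-123.el7.x86_64.img
}
EOF
```

With the guest’s /boot mounted somewhere readable, you would run it as list_boot_files < grub2/grub.cfg from that mount point.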

At this point it’s pretty clear that pygrub won’t be able to parse the configuration file, so you’re left with two options:

  1. Copy the kernel and initrd file from the guest to somewhere on the host and set Xen to boot directly using those host-located files. However, updating the guest’s kernel then becomes a pain.
  2. Backport a working pygrub to the old Xen host and use that to boot the guest. This requires no changes to the Grub2 configuration and means your guest will seamlessly handle kernel updates.
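For reference, option 1 comes down to replacing the bootloader line in the guest’s Xen config file with kernel, ramdisk and extra settings. The sketch below prints those lines. The function name direct_boot_config, the /var/lib/xen/boot/vmguest directory and the root=/dev/xvda2 argument are all assumptions for illustration; use wherever you copied the files and whatever root device your guest actually has.

```shell
#!/bin/sh
# Hypothetical helper: print the xm config lines for booting a guest
# directly from a kernel and initrd stored on the host.
# Arguments: kernel path, initrd path, kernel command line.
direct_boot_config() {
    printf 'kernel = "%s"\n' "$1"
    printf 'ramdisk = "%s"\n' "$2"
    printf 'extra = "%s"\n' "$3"
}

# Paths and root device below are assumed example values.
direct_boot_config \
    /var/lib/xen/boot/vmguest/vmlinuz-3.10.0-123.el7.x86_64 \
    /var/lib/xen/boot/vmguest/initramfs-3.10.0-123.el7.x86_64.img \
    "root=/dev/xvda2 ro"
```

The catch is the one already mentioned: every kernel update inside the guest means copying the new kernel and initrd out to the host and editing the config again.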

Because option 2 is harder and more painful, I naturally chose to go down that path, backporting the latest upstream Xen pygrub source code to EL 5. It’s not quite vanilla, as I had to make some tweaks to rip out a couple of newer features that were breaking it on EL 5, so I’ve packaged up my version of pygrub and made it available in both source and binary formats.

Download Jethro’s pygrub backport here

Installing this *will* replace the version installed by the Xen package, which means an update to that package on the host will undo these changes. I thought about installing it to another path or making an RPM, but my hope is that Red Hat get their Xen package fixed and make this whole blog post redundant, so I haven’t invested that level of effort.

Copy to your server and unpack with:

# tar -xkzvf xen-pygrub-6f96a67-JCbackport.tar.gz
# cd xen-pygrub-6f96a67-JCbackport

Then you can build the source into a python module and install with:

# yum install xen-devel gcc python-devel
# python setup.py build
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.4
creating build/lib.linux-x86_64-2.4/grub
copying src/GrubConf.py -> build/lib.linux-x86_64-2.4/grub
copying src/LiloConf.py -> build/lib.linux-x86_64-2.4/grub
copying src/ExtLinuxConf.py -> build/lib.linux-x86_64-2.4/grub
copying src/__init__.py -> build/lib.linux-x86_64-2.4/grub
running build_ext
building 'fsimage' extension
creating build/temp.linux-x86_64-2.4
creating build/temp.linux-x86_64-2.4/src
creating build/temp.linux-x86_64-2.4/src/fsimage
gcc -pthread -fno-strict-aliasing -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fPIC -I../../tools/libfsimage/common/ -I/usr/include/python2.4 -c src/fsimage/fsimage.c -o build/temp.linux-x86_64-2.4/src/fsimage/fsimage.o -fno-strict-aliasing -Werror
gcc -pthread -shared build/temp.linux-x86_64-2.4/src/fsimage/fsimage.o -L../../tools/libfsimage/common/ -lfsimage -o build/lib.linux-x86_64-2.4/fsimage.so
running build_scripts
creating build/scripts-2.4
copying and adjusting src/pygrub -> build/scripts-2.4
changing mode of build/scripts-2.4/pygrub from 644 to 755

# python setup.py install

Naturally I recommend reviewing the source code and making sure it’s legit (you do trust random blogs right?), but if you can’t get it to build/lack build tools/like gambling, I’ve included pre-built binaries in the archive and you can skip the build step and just do:

# python setup.py install

Then do a quick check to make sure pygrub throws its help message, rather than any nasty errors indicating something went wrong.

# /usr/bin/pygrub

 

We’re almost ready to try booting again! First create a directory that the new pygrub expects:

# mkdir /var/run/xend/boot/

Then launch the machine creation – this time, it should actually boot and run through the usual systemd startup process. If you installed with /boot set to ext3 via the installer, everything should just work and you’ll be up and running!

If you had to do the xfs to ext3 conversion trick, the bootup process will explode with scary errors like the following:

.......
[ TIME ] Timed out waiting for device dev-disk-by\x2duuid-245...95b2c23.device.
[DEPEND] Dependency failed for /boot.
[DEPEND] Dependency failed for Local File Systems.
[DEPEND] Dependency failed for Relabel all filesystems, if necessary.
[DEPEND] Dependency failed for Mark the need to relabel after reboot.
[  101.134423] systemd-journald[414]: Received request to flush runtime journal from PID 1
[  101.658465] type=1305 audit(1405735466.679:4): audit_pid=476 old=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:auditd_t:s0 res=1
Welcome to emergency mode! After logging in, type "journalctl -xb" to view
system logs, "systemctl reboot" to reboot, "systemctl default" to try again
to boot into default mode.
Give root password for maintenance
(or type Control-D to continue):

The issue is that the conversion of the filesystem changed its UUID, plus the filesystem type in /etc/fstab no longer matches.

We can fix this easily by dropping to the recovery shell by entering the root password above and executing the following commands:

guest# sed -i -e '/boot/ s/UUID=[0-9a-fA-F-]*/\/dev\/xvda1/' /etc/fstab
guest# sed -i -e '/boot/ s/xfs/ext3/' /etc/fstab
guest# cat /etc/fstab | grep '/boot'

Make sure the grep returns a valid /boot line; it should now be using /dev/xvda1 as the device and ext3 as the filesystem.
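If you’d like to see what those substitutions do before touching the real file, here is a small sketch that applies them to a sample line instead. UUIDs contain hex letters as well as digits, so the pattern here matches a-f too. The function name fix_boot_line and the UUID in the sample are both invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read fstab lines on stdin and, on lines mentioning
# "boot", swap the UUID for /dev/xvda1 and the filesystem type for ext3.
fix_boot_line() {
    sed -e '/boot/ s|UUID=[0-9a-fA-F-]*|/dev/xvda1|' \
        -e '/boot/ s|xfs|ext3|'
}

# Invented sample fstab; only the /boot line should change.
fix_boot_line <<'EOF'
/dev/mapper/centos-root / xfs defaults 1 1
UUID=0a1b2c3d-1111-2222-3333-444455556666 /boot xfs defaults 1 2
/dev/mapper/centos-swap swap swap defaults 0 0
EOF
```

Running fix_boot_line < /etc/fstab inside the guest prints the result without modifying anything, which makes for a cheap dry run.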

Finally, stop and start the instance (reboots seem to hang for me):

guest# shutdown -h now
# xm create -c vmguest

It should now boot correctly! Go forth and enjoy your new VM!

CentOS Linux 7 (Core)
Kernel 3.10.0-123.el7.x86_64 on an x86_64

This is certainly a hack – doing this backport of pygrub solved my personal issue, but it’s entirely possible it may break other things, so do your own testing and determine whether it’s suitable for you and your environment or not.

virt-viewer remote access tricks

Sometimes I need to connect directly to the console of my virtual machines. Typically this is when working with development or experimental VMs where SSH/RDP/VNC isn’t working for whatever reason, or when I’m installing a new OS entirely.

To view virtual machines managed by libvirt (with either KVM or Xen), you use the virt-viewer command, which launches a window and establishes a VNC or SPICE connection into the virtual machine.

Historically I’ve just run this by SSHing into the virtual machine host and then using X11 forwarding to display the virtual machine window on my laptop. However this performs really badly on slow connections, particularly 3G where it’s almost unusable due to the design of X11 forwarding not being particularly efficient.

However virt-viewer has the capability to run locally and connect to a remote server, either directly to the libvirt daemon, or via an SSH tunnel. To do the latter, the following command will work for KVM (qemu) based hypervisors:

virt-viewer --connect qemu+ssh://user@host.example.com/system vmnamehere

With the above, you’ll have to enter your SSH password twice – first to establish the connection to the hypervisor and secondly to establish a tunnel to the VM’s VNC/SPICE session – you’ll probably quickly decide to get some SSH keys/certs setup to prevent annoyance. ;-)

This performs way faster than X11 forwarding, plus the UI of virt-viewer stays much more responsive, including grabbing/ungrabbing of the local keyboard/mouse, even if the connection or server is lagging badly.

If you’re using Xen with libvirt, the following should work (I haven’t tested this; it’s based on the man page and some common sense):

virt-viewer --connect xen+ssh://user@host.example.com/ vmnamehere

If you’re willing to open up the right ports on your server’s firewall and are sending all traffic via a secure connection (eg VPN), you can drop the +ssh and use --direct to connect directly to the hypervisor and VM without port forwarding via SSH.
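To keep the variants straight, here is a small sketch that builds the connection URI for each case. The function name libvirt_uri is made up, and the host and user are the placeholder values from the commands above. As far as I’m aware, a URI with no +ssh falls back to libvirt’s default remote transport (TLS), which needs setting up on the server first.

```shell
#!/bin/sh
# Hypothetical helper: build a libvirt connection URI.
# Arguments: driver (qemu or xen), host, optional SSH user.
# With a user the URI tunnels over SSH; without one it connects directly.
libvirt_uri() {
    driver=$1
    host=$2
    user=${3:-}
    case "$driver" in
        qemu) path=/system ;;
        *)    path=/ ;;
    esac
    if [ -n "$user" ]; then
        printf '%s+ssh://%s@%s%s\n' "$driver" "$user" "$host" "$path"
    else
        printf '%s://%s%s\n' "$driver" "$host" "$path"
    fi
}

libvirt_uri qemu host.example.com user
libvirt_uri xen host.example.com user
libvirt_uri qemu host.example.com
```

The direct form would then be used as virt-viewer --direct --connect "$(libvirt_uri qemu host.example.com)" vmnamehere.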

How Jethro Geeks – IRL

A number of friends are always quite interested in how my personal IT infrastructure is put together, so I’m going to try and do one post a week ranging from physical environments, desktop, applications, server environments, monitoring and architecture.

Hopefully this is of interest to some readers. I’ll be upfront and advise that not everything is perfect in this setup; like any large environment, there are always ongoing upgrade projects. Considering my environment is larger than some small ISPs, it’s not surprising that there are areas of poor design or legacy components, but I’ll try to be honest about these deficiencies and where I’m working to make improvements.

If you have questions or things you’d like to know my solution for, feel free to comment on any of the posts in this series. :-)

 

Today I’m examining my physical infrastructure, including my workstation and my servers.

My setup has changed a lot since last year following my move to Auckland, and is now based primarily around my laptop and gaming desktop.

All the geekery, all the time

This is probably my most effective setup yet, the table was an excellent investment at about $100 off Trademe, with enough space for 2 workstations plus accessories in a really comfortable and accessible form factor.

 

My laptop is a Lenovo Thinkpad X201i, with an Intel Core i5 CPU, 8GB RAM, 120GB SSD and a 9-cell battery for long run time. It was running Fedora, but I recently shifted to Debian so I could upskill on the Debian variations some more, particularly around packaging.

I tend to dock it and use the external LCD mostly when at home, but it’s quite comfortable to use directly and I often do when out and about for work – I just find it’s easier to work on projects with the larger keyboard & screen so it usually lives on the dock when I’m coding.

This machine gets utterly hammered. I run this laptop 24×7 and typically have to reboot about once a month, usually because of a system crash when docking or on suspend/resume, something I blame the crappy Lenovo BIOS for.

 

I have an older desktop running Windows XP for gaming, it’s a bit dated now with only a Core 2 Duo and 3GB RAM – kind of due for a replacement, but it still runs the games I want quite acceptably, so there’s been little pressure to replace – plus since I only really use it about once a week, it’s not high on my investment list compared to my laptop and servers.

Naturally, there are the IBM Model M keyboards for both systems. I love these keyboards more than anything (yes Lisa, more than anything <3 ) and I’m really going to be sad when I have to work in an office again with other people who don’t share my love for loud clicky keyboards.

The desk is a bit messy ATM with several phones and routers lying about for some projects I’ve been working on. I go through stages ranging from extreme OCD tidiness to surrendering to the chaos… fundamentally I just have too much junk to go on it, so I’m trying to downsize the amount of stuff I have. ;-)

 

Of course this is just my workstations – there’s a whole lot going on in the background with my two physical servers where the real stuff happens.

A couple of years back, I had a lab with 2x 42U racks which I really miss. These days I’m running everything on two physical machines using Xen and KVM virtualisation for all services. Having the racks was just so expensive and difficult; I’d consider doing it again if I bought a house, but when renting it’s far better to be as mobile as possible.

The primary server is my colocation box which runs in a New Zealand data center owned by my current employer:

Forever Alone :'( [thanks to my colleagues for that]

It’s an IBM xseries 306m, with a 3.0GHz P4 CPU, 8GB of RAM and 2x 1TB enterprise grade SATA drives, running CentOS (RHEL clone). It’s not the fastest machine, but it’s more than speedy enough for running all my public-facing production services.

It’s a vendor box as that enabled me to have 3 yrs onsite NBD repair support for it. These days I keep a complete hardware spare onsite since it’s too old to be supported by IBM any longer.

To provide security isolation and easier management, services are spread across a number of Xen virtual machines based on type and risk of attack, this machine runs around 8 virtual machines performing different publicly facing services including running my mail servers, web servers, VoIP, IM and more.

 

For anything not public-facing or critical production, there’s my secondary server, which is a “whitebox” custom build running a RHEL/CentOS/JethroHybrid with KVM for virtualisation, running from home.

Whilst I run this server 24×7, it’s not critical for daily life, so I’m able to shut it down for a day or so when moving house or internet providers and not lose my ability to function – having said that, an outage for more than a couple days does get annoying fast….

Mmmmmm my beautiful monolith

This attractive black monolith packs a quad core Phenom II CPU, custom cooler, 2x SATA controllers, 16GB RAM, 12x 1TB hard drives in full tower Lian Li case. (slightly out-of-date spec list)

I’m running RHEL with KVM on this server which allows me to run not just my internal production Linux servers, but also other platforms including Windows for development and testing purposes.

It exists to run a number of internal production services, file shares and all my development environment, including virtual Linux and Windows servers, virtual network appliances and other test systems.

These days it’s getting a bit loaded, I’m using about 1 CPU core for RAID and disk encryption and usually 2 cores for the regular VM operation, leaving about 1 core free for load fluctuations. At some point I’ll have to upgrade, in which case I’ll replace the M/B with a new one to take 32GB RAM and a hex-core processor (or maybe octo-core by then?).

 

To avoid nasty sudden poweroff issues, there’s an APC UPS keeping things running and a cheap LCD and ancient crappy PS/2 keyboard attached as a local console when needed.

It’s a pretty large full tower machine, so I expect to be leaving it in NZ when I move overseas for a while, as it’s just too hard to ship and move around with. If I end up staying overseas for longer than originally planned, I may need to consider replacing both physical servers with a single colocated rackmount box to drop running costs and to solve the EOL status of the IBM xseries.

 

The little black box on the bookshelf with antennas is my Mikrotik Routerboard 493G, which provides wifi and wired networking for my flat, with a GigE connection into the server which does all the internet firewalling and routing.

Other than the Mikrotik, I don’t have much in the way of production networking equipment – all my other kit is purely development only and not always connected and a lot of the development kit I now run as VMs anyway.

 

Hopefully this is of some interest, I’ll aim to do one post a week about my infrastructure in different areas, so add to your RSS reader for future updates. :-)

Virtualbox Awesomeness

Work recently upgraded us to the latest MS Office edition for our platform. Most of our staff run MacOS, but we have a handful of Windows users and one dedicated Linux user (guess who?) who received MS Office 2010 for Windows.

I’ve been using MS Office 2007 under Wine for several years, it was never perfect, but about 90% of the functionality worked with some exceptions such as PDF export and certain UI and performance artifacts.

With the 2010 upgrade I decided to instead switch to using Windows under a VM on my laptop to avoid any headaches and to fix the missing features and performance issues experienced running Office under Wine.

Whilst I’m a fan of Xen and KVM, they aren’t so well suited for desktop virtualisation as they’re designed more for server environments and don’t offer some of the more desktop focused features such as seamless integration, video acceleration and easy point & click management interfaces.

Instead I went with VirtualBox thanks to it being mostly open source (open source with exception for a few extensions for USB 2.0 forwarding and network boot) and with a pretty good reputation as a decent VM application.

It also has some of the user-friendly desktop features you’d expect such as being able to forward USB hardware through to guest, mounting any folder on the host as a network share (without needing to setup samba) and 2D/3D video acceleration.

But the real killer feature for me was the seamless windows feature, which allows me to boot the virtual windows desktop and Windows applications alongside my Linux applications smoothly and without the nastiness of an RDP window.

Windows & Linux application windows running together concurrently.

Sadly it’s not quite good enough to run the latest Windows games in, as the 3D acceleration is quite basic, but it’s magnificent for just about any other non-multimedia application.

The only glitch I found is that if you have dual screens, you can only run the Windows session on one screen at a time, although VirtualBox does allow moving the session between monitors whilst running, so it’s not too big a deal.

The other annoying issue I had with virtualbox is that it uses image files for storing the guest VMs and it doesn’t appear possible to get it to use an LVM volume instead – so in my case, I waste a bit of space and performance for unnecessary filesystem formatting to store the Windows VM. I guess this is a feature that only a small subset of users would want so it’s not particularly high priority for them to add it.

I’m running Win7 with 2 virtual cores and 1GB of RAM on top of a host with an Intel Core i5 CPU (with hardware virtualisation enabled), 8GB RAM and an Intel 320 series SSD and it’s pretty damn snappy.

As a side note, the seamless window integration also works for Linux-based guests, so you could also do the same on top of a Windows host, or even Linux-on-Linux if desired.