Oracle VM and multiple local disks

For my Oracle VM test environment I have a server available with multiple internal disks of different sizes and speeds. So I was wondering whether it is possible to use all these disks together for my virtual machines in Oracle VM.

If all the disks had been the same size and speed, I could easily have used the internal RAID controller to put them in a mirror, stripe or RAID5 set and end up with one large volume, i.e. one disk, for my Oracle VM. However, due to the differences in the characteristics of the disks (speed/size), this is not a good idea. So I started to look in Oracle VM Manager (the Java console) to see what is possible.

It soon became clear to me that Oracle VM is designed for a different architecture: the intended setup is to have a (large) SAN box with shared storage that is available to multiple servers. All these servers can then be put in a server pool, sharing the same storage. This setup allows live migration of running machines to another physical server. Of course this makes sense, because it fits nicely in the concept of grid computing: if any physical server fails, just restart your virtual machine on another one, and add machines according to your performance needs. But it doesn’t help me: I don’t have one storage box with multiple servers, I have one server with multiple disks.

So I started to browse a little through the executables of the OVM installation, and under /usr/lib/ovs I found the ovs-makerepo script. As far as I can tell (from what I can find on the internet, because there is not much clear documentation on this), the architecture is as follows: when installing OVM, you have a /boot, a / and a swap partition (just as in traditional Linux), and OVM requires one large partition to be used for virtual machines, which will be mounted under /OVS. In this partition you find a subdirectory “running_pool”, which contains all the virtual machines that you have created and that you can start, and a subdirectory “seed_pool”, which contains templates you can start from when creating new machines. There are also “local”, “remote” and “publish_pool”, but they were irrelevant to me at the moment and I didn’t try to figure out what they are used for.
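
For reference, a listing of /OVS then looks roughly like this (a sketch; the exact contents may vary per installation):

   [root@nithog ~]# ls /OVS
   local  publish_pool  remote  running_pool  seed_pool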

With this in mind I can install Oracle VM on my first disk and end up with 4 partitions on /dev/sda:

   Filesystem 1K-blocks     Used Available Use% Mounted on
   /dev/sda1     248895    25284    210761  11% /boot
   (sda2 is swap)
   /dev/sda3    4061572   743240   3108684  20% /
   /dev/sda4   24948864 22068864   2880000  89% /OVS

Now I want to add the space of my second disk (/dev/sdb) to this setup. First I create one large partition on the disk using fdisk. Then I create an OCFS2 file system on it as follows:

[root@nithog ovs]# mkfs.ocfs2 /dev/sdb1
mkfs.ocfs2 1.2.7
Filesystem label=
Block size=4096 (bits=12)
Cluster size=4096 (bits=12)
Volume size=72793694208 (17771898 clusters) (17771898 blocks)
551 cluster groups (tail covers 31098 clusters, rest cover 32256 clusters)
Journal size=268435456
Initial number of node slots: 4
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 4 block(s)
Formatting Journals: done
Writing lost+found: done
mkfs.ocfs2 successful
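
To double-check that the device now really carries an OCFS2 file system, the ocfs2-tools package ships a mounted.ocfs2 utility; a quick detect looks roughly like this (output abbreviated from memory, so treat it as a sketch):

   [root@nithog ovs]# mounted.ocfs2 -d
   Device     FS     UUID                                  Label
   /dev/sdb1  ocfs2  ...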

Initially I created the file system as ext3, which worked well. However, there was one strange thing. This is what you get:

  • Create a new (paravirtualized) (Linux) virtual machine in this new (ext3-based) repository (see later how exactly)
  • Specify a disk of e.g. 2Gb
  • Complete the wizard
  • This prepares a machine on which you can start the Linux installer on the console to install the machine (do not start installing yet)
  • Now look in …/running_pool/machine_name and see a file of 2Gb
  • Now run du -sk on …/running_pool/machine_name and see that only 20Kb is used
  • From the moment you start to partition the disk inside the virtual machine, the output of “du -sk” grows by the same amount as the data you really put in it. So it behaves a bit like ‘dynamic provisioning’ (see the quick check after this list).
  • Note however that ls -l shows a file of 2Gb at all times
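
A quick way to see both numbers side by side (the path and file name here are just an example, not necessarily what the wizard creates):

   # ls -l reports the apparent size: the full 2Gb
   [root@nithog ~]# ls -l /OVS/running_pool/machine_name/system.img
   # du -sk reports only the blocks that are actually allocated
   [root@nithog ~]# du -sk /OVS/running_pool/machine_name/system.img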

I don’t know at the moment whether this behaviour is caused by the fact that the file system is ext3, but anyway, I leave it up to you to judge whether this is an advantage or a disadvantage.

Now when trying to add my new sdb1 partition as an extra repository, I got:

Usage:

[root@nithog ~]# /usr/lib/ovs/ovs-makerepo
 usage: /usr/lib/ovs/ovs-makerepo <source> <shared> <description>
        source: block device or nfs path to filesystem
        shared: filesystem shared between hosts?  1 or 0
        description: descriptive text to be displayed in manager

Execution:

   [root@nithog ovs]# /usr/lib/ovs/ovs-makerepo /dev/sdb1 0 "Repo on disk 2" 
   ocfs2_hb_ctl: Unable to access cluster service while starting heartbeat
   mount.ocfs2: Error when attempting to run /sbin/ocfs2_hb_ctl: "Operation not permitted"
   Error mounting /dev/sdb1

It seems the script expects something like a cluster, but I just have a standalone node… I think this script is intended to add a shared repository to a cluster of nodes. No problem, let’s try to convert our standalone machine to a one-node cluster by creating the file /etc/ocfs2/cluster.conf:

cluster:
        node_count = 1
        name = ocfs2
node:
        ip_port = 7777
        ip_address = 10.7.64.160
        number = 1
        name = nithog
        cluster = ocfs2

Note that the indented lines MUST start with a <TAB>, followed by the parameter and its value. After creating this file I could do:

   [root@nithog ovs]# /etc/init.d/o2cb online ocfs2
   Starting O2CB cluster ocfs2: OK
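
Before going further, it doesn’t hurt to verify that the one-node cluster is really online; the same init script has a status target for this (the exact output differs per version, so this is just a sketch):

   [root@nithog ovs]# /etc/init.d/o2cb status

It should report the ocfs2 cluster as online.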

Then I could run the repository script again:
   [root@nithog ovs]# /usr/lib/ovs/ovs-makerepo /dev/sdb1 0 "Repo on disk 2"
   Initializing NEW repository /dev/sdb1
   SUCCESS: Mounted /OVS/877DECC5B658433D9E0836AFC8843F1B
   Updating local repository list.
   ovs-makerepo complete

As you can see, an extra subdirectory is created in the /OVS file system, with a strange UUID as its name. Under this directory my new file system /dev/sdb1 is mounted. This file system is a real new repository, because under /OVS/877DECC5B658433D9E0836AFC8843F1B you also find the running_pool and seed_pool directories. It is also listed in /etc/ovs/repositories (but it is NOT recommended to edit this file manually).
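
This is easy to verify; on my system it looked roughly like this (output abbreviated, so treat it as a sketch):

   [root@nithog ovs]# mount | grep sdb1
   /dev/sdb1 on /OVS/877DECC5B658433D9E0836AFC8843F1B type ocfs2 (rw,...)
   [root@nithog ovs]# ls /OVS/877DECC5B658433D9E0836AFC8843F1B
   running_pool  seed_pool  ...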

Then I looked in the Oracle VM Manager (the Java-based web GUI), but I didn’t find any trace of this new repository. It looks as if this GUI is not (yet) designed to handle multiple repositories. However, I started to figure out whether my new disk could really be used for virtual machines, and my results are:

  • When creating a new virtual machine, there is no way to specify in which repository it should end up
  • It seems to end up in the repository with the most free space (but I should do more testing to be 100% certain)
  • When adding a new disk to an existing virtual machine (an extra file on the Oracle VM level), the file ends up in the same repository, even in the same directory, as the initial files of your virtual machine. If there is NOT enough free space on that disk, Oracle VM will NOT put your file in another repository on another disk.
  • You can move the datafiles of your virtual machine to any other location while the machine is not running, provided you change the reference to the file in /etc/xen/<machine_name> (see the sketch after this list)
  • So at the Xen level you can apparently put your VM datafiles in any directory; the concept of repositories seems to be Oracle VM specific.
  • So if you create a new virtual machine and Oracle puts it in the wrong repository, it is not difficult at all to move it afterwards to another filesystem/repository; it just requires a little manual intervention. However, it seems recommended to always keep your machines in an Oracle VM repository, in the running_pool, because only then can they be managed by the Oracle VM GUI.
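
As a sketch of such a manual move (all names below are hypothetical, and the guest must be shut down first):

   # move the disk image into the running_pool of the repository on the second disk
   [root@nithog ~]# mkdir -p /OVS/877DECC5B658433D9E0836AFC8843F1B/running_pool/myvm
   [root@nithog ~]# mv /OVS/running_pool/myvm/system.img \
         /OVS/877DECC5B658433D9E0836AFC8843F1B/running_pool/myvm/system.img
   # then point the disk entry in the Xen config file to the new location, e.g.:
   # disk = ['file:/OVS/877DECC5B658433D9E0836AFC8843F1B/running_pool/myvm/system.img,xvda,w']
   [root@nithog ~]# vi /etc/xen/myvm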

I am sure that many of these things have an obvious explanation, but I have to admit that I didn’t read the manuals of OCFS and Oracle VM completely from start to end.

Conclusion: Oracle VM seems to be capable of having multiple repositories on different disks, but the GUI is not yet ready to handle them. With a minimum of manual intervention, however, it is easy to do all the desired tasks from the command line.


3 Responses to Oracle VM and multiple local disks

  1. Paul says:

    The reason that du -sk does not show a file size of 2GB is that the file is a sparse file. I am unsure whether OCFS supports sparse files.

    To create a 4GB sparse file, do something like this:
    dd if=/dev/zero of=system.img bs=1024 seek=4194304 count=0

    The file should be created in under a second, ls will show it as being 4GB, but it will utilise no disk space.

  2. Thanks for the informative article.

    I did much of the same trailblazing, and I documented a procedure to add storage to a VM guest (outside of VM Manager) without moving the initial installation files. It is done basically by the steps below (sketched in shell form after the list):

    – make the new device available to the VM host
    – on your new storage device, create a new repository using ovs-makerepo (as you have done in your test)
    – create an empty image file
    – create a symlink in the original repository to the new img file
    – shut down the guest
    – add a reference to the new img file link in vm.cfg
    – start the guest
    – log on to the guest, run fdisk -l, and you should see your new cross-repository device
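
    In shell form the procedure looks roughly like this (all names hypothetical):

      # create an empty (sparse) image file in the new repository
      dd if=/dev/zero of=/OVS/<new_repo_uuid>/running_pool/myvm/extra.img bs=1024 seek=4194304 count=0
      # symlink it into the guest's original repository directory
      ln -s /OVS/<new_repo_uuid>/running_pool/myvm/extra.img /OVS/running_pool/myvm/extra.img
      # after shutting down the guest, reference the link in vm.cfg, e.g.:
      # disk = ['file:/OVS/running_pool/myvm/system.img,xvda,w',
      #         'file:/OVS/running_pool/myvm/extra.img,xvdb,w']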

    You are correct about VM Manager’s limitations. It will not know of the new disks and will not allow you to add cross-repository storage. Hopefully more functionality is to come. Also, I’ve read the entire manual and it doesn’t cover this topic in much detail.

  3. Tomas says:

    To answer the question of whether ocfs2 supports sparse files: the answer is yes. Quoted from http://oss.oracle.com/projects/ocfs2/dist/documentation/v1.6/ocfs2-1_6-usersguide.pdf : “OCFS2 Release 1.4 was released in July 2008. It was available on all three Enterprise Linux distributions, namely, Oracle Linux, Red Hat’s EL and Novell’s SLES. The new features in that release included sparse files, unwritten extents, inline-data, and shared writeable mmap.”

