The ultimate story about OCR, OCRMIRROR and 2 storage boxes – Chapter 1

Sep 1, 2009

Scenario 1: loss of ocrmirror, both nodes up

(This is the followup of article “Introduction“)

Facts
  • CRS is running on all nodes
  • The storage box containing the OCRmirror is made unavailable to both hosts (simulating a crash of one storage box).

What happens?
Read the rest of this entry »


The ultimate story about OCR, OCRMIRROR and 2 storage boxes – Introduction

Aug 28, 2009

Some time ago I wrote a blog about stretched clusters and the OCR. The final conclusion at that time was that there was no easy way to get your OCR safe on both storages, and hence I disrecommended clusters with 2 storage boxes. However, after some more investigation I may have to change my mind. I did extended testing on the OCR and in this blog I want to share my experiences. Read the rest of this entry »


Apex beats Access

Sep 21, 2008

This year I became part of the parents council of my childrens school. And like every small organisation in Belgium (football clubs, gymnastics, schools, …) it is the time of year to do the annual mussel-event for fund raising (mmm, delicious). My school happened to use an Access-IIS-ASP application to do the billing when people leave the event (do the calculation for their food and drink consumptions). But o fortuna, the evening before the event, the application turned out to systematically crash when entering data, and murphy oh murphy, the author of it had just become father and could not be reached. Panic of course. Read the rest of this entry »


A really recommended ASM patch – failing lun

Aug 6, 2008

The following is a real life experience about failing disks/luns and how ASM reacts to this. We used 10.2.0.4 on Solaris with a HDS storage box and MPXio. We made an ASM diskgroup of 2 mirrorred disks. Then we made the luns unavailable to the hosts (hide lun). The result was not really what we expected. Read the rest of this entry »


Oracle VM and multiple local disks

Jun 9, 2008

For my Oracle VM test environment I have a server available with multiple internal disks of different size and speed. So I was wondering if it is possible to have all these disks used together for my virtual machines in Oracle VM.

If all disks would have been the same size and speed, I could easily use the internal raid controller to put them in mirror, stripe or raid5 and end up with one large volume, alias disk, for my Oracle VM. However due to the differences in characteristics of the disks (speed/size) this is not a good idea. So I started to look in Oracle VM Manager (the java console) to see what is possible.

It turned out soon to me that Oracle VM is designed for a different architecture: in fact the desired setup is to dispose of a (large) SAN box with shared storage that is available to multiple servers. Then all these servers can be put in a server pool, sharing the same storage. This setup allows live migration of running machines to another physical server. Of course this makes sense because it fits nicely in the concept of grid computing: if any physical server fails, just restart your virtual machine on another one, and add machines according to your performance needs. But it doesn’t help me: I don’t have got one storage with multiple servers, but I have one server with multiple disks.

So I started to browse a little in all the executables of the OVM installation, and I found under /usr/lib/ovs the ovs-makerepo script. According to me the architecture is as follows (as far as I can find on the internet, because there is not much clear documentation on this): when installing OVM, you have a /boot a / and a swap partition (just as in traditional linux) and OVM requires one large partition to be used for virtual machines, which will be mounted under /OVS. In this partition you find subdirectories “running_pool” which contains all the virtual machines that you have created and that you can start, and a subdirectory “seed_pool” which contains templates you can start from for creating new machines. There is also “local”, “remote” and “publish_pool”, however they were irrelevant for me at the moment and I didn’t try to figure out what they are used for.

With this in mind I can install Oracle VM on my first disk and end up with 4 partitions on /dev/sda:

   Filesystem 1K-blocks     Used Available Use% Mounted on
   /dev/sda1     248895    25284    210761  11% /boot
   (sda2 is swap)
   /dev/sda3    4061572   743240   3108684  20% /
   /dev/sda4   24948864 22068864   2880000  89% /OVS

With this in mind I now want to add the space on my second disk (/dev/sdb) to this setup. So first I create one large partition on the disk using fdisk. Then I create an ocfs file system on it as follows:

[root@nithog ovs]# mkfs.ocfs2 /dev/sdb1
mkfs.ocfs2 1.2.7
Filesystem label=
Block size=4096 (bits=12)
Cluster size=4096 (bits=12)
Volume size=72793694208 (17771898 clusters) (17771898 blocks)
551 cluster groups (tail covers 31098 clusters, rest cover 32256 clusters)
Journal size=268435456
Initial number of node slots: 4
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 4 block(s)
Formatting Journals: done
Writing lost+found: done
mkfs.ocfs2 successful

Initially I created the file system as ext3 which worked well. However there was one strange thing. This is what you get:

  • Create a new (paravirtualized) (linux) virtual machine in this new (ext3-based) repository (see later how exactly)
  • Specify a disk of e.g. 2Gb
  • Complete the wizard
  • This prepares a machine where you can start using the linux installer on the console to install the machine (do not start to install yet)
  • Now look in …/running_pool/machine_name and see a file of 2Gb
  • Now do du -sk on …/running_pool/machine and see that only 20Kb is used
  • From the moment you start to partition your disk inside the virtual machine, the output of “du -sk” grows the same amount as the data you really put in it. So it behaves a bit like ‘dynamic provisioning’.
  • Note however that ls -l shows a file of 2Gb at any time

I don’t know for the moment if this behaviour is caused by the fact that the file system is ext3, but anyway, I leave it up to you to judge if this is an advantage or a disadvantage.

Now when trying to add my new sdb1 partition as an extra repository, I got:

Usage:

[root@nithog ~]# /usr/lib/ovs/ovs-makerepo
 usage: /usr/lib/ovs/ovs-makerepo <source> <shared> <description>
        source: block device or nfs path to filesystem
        shared: filesystem shared between hosts?  1 or 0
        description: descriptive text to be displayed in manager

Execution:

   [root@nithog ovs]# /usr/lib/ovs/ovs-makerepo /dev/sdb1 0 "Repo on disk 2" 
   ocfs2_hb_ctl: Unable to access cluster service while starting heartbeat mount.ocfs2: 
   Error when attempting to run /sbin/ocfs2_hb_ctl: "Operation not permitted" 
   Error mounting /dev/sdb1

Seems like the script expects something like a cluster, but I just have a standalone node… I think that this script is intended to add a shared repository to a cluster of nodes. No problem, let’s try to convert our standalone machine to a one-node cluster: create the file /etc/ocfs2/cluster.conf:

cluster:
        node_count = 1
        name = ocfs2
node:
        ip_port = 7777
        ip_address = 10.7.64.160
        number = 1
        name = nithog
        cluster = ocfs2

Note that the indented lines MUST start with a <TAB> and then the parameter with its value. After creating this file I could do:

   [root@nithog ovs]# /etc/init.d/o2cb online ocfs2
   Starting O2CB cluster ocfs2: OK

and then
[root@nithog ovs]# /usr/lib/ovs/ovs-makerepo /dev/sdb1 0 "Repo on disk 2" Initializing NEW repository /dev/sdb1 SUCCESS: Mounted /OVS/877DECC5B658433D9E0836AFC8843F1B Updating local repository list. ovs-makerepo complete

As you can see, an extra subdirectory is created in the /OVS file system, with a strange UUID as its name. Under this directory my new file system /dev/sdb1 is mounted. This file system is a real new repository, because under /OVS/877DECC5B658433D9E0836AFC8843F1B you find as well the running_pool and seed_pool directories. It is also listed in /etc/ovs/repositories (but it is NOT recommended to edit this file manually).

Then I looked in the Oracle VM Manager (the java based web gui) but I didn’t find anything of this new repository. It looks as if this gui is not (yet) designed to handle multiple repositories. However I started to figure out if my new disk could really be used for virtual machines, and my results are:

  • When creating a new virtual machine, you have no chance of specifying in which repository it has to come
  • It seems to come in the repository where there is the most amount of free space (but I should do more testing to get 100% certainty)
  • When adding a new disk to an existing virtual machine (an extra file on oracle-vm level) the file will come in the same repository, even the same directory as where the initial files of your virtual machine are located. If there is NOT enough free space on the disk, Oracle VM will NOT put your file in another repository on another disk.
  • You can move the datafiles of your virtual machine to any other location while the machine is not running, and while changing the reference to the file in /etc/xen/<machine_name>
  • So actually it looks that on xen-level you can put your vm datafiles in any directory; the concept of the repositories seems to be oracle-vm specific.
  • So if you create a new virtual machine and Oracle puts it in the wrong repository, it is not difficult at all to move it afterwards to another filesystem/repostory. It just requires a little manual intervention. However it seems recommended to keep your machines always in an oracle-vm repository, in the running_pool, because only in that way it can be managed by the Oracle-vm gui.

I am sure that there are many things that have an abvious explanation, but I have to admit that I didn’t read the manuals of ocfs and oracle vm completely from the start to the end. Also I think that Oracle

Conclusion: Oracle VM seems to be capable of having multiple repositories on different disks, but the GUI is not ready to handle them. But with a minimum of manual intervention, it is easy to do all desired tasks in command-line mode.


ASM and large amount of files: follow-up

Apr 20, 2008

As promised I will keep you up to date about the problem in my previous post about the issues with ASM on Solaris when a large amount of files exists (or have existed). I can confirm that there is a fix now that completely solves the problem.

Read the rest of this entry »


Warning: ASM and large amount of files (on Solaris)

Mar 5, 2008

This post is to warn you of a potential problem you may encounter. A combination where it definitely occurs is 10.2 Rac and ASM on Solaris. The problem is the following: due to a lot of db activity, no cleanup of archivelog files, multiple databases present and multiple instances per database, the amount of archivelog files has grown up to +100.000 in ASM (in my case). As a result it turns out that a query on v$asm_file lasts 1 minute or even more, up to 3 minutes. Not a problem on itself, wasn’t it that the query on v$asm_file turns out to block all file manipulation operations in ASM, including the creation/deletion of archivelogs or registering archivelogs in the controlfile. This generates CF enqueue waits in the databases using asm (for lgwr and arc) and very soon waits by user processes on log file sync becasue the lgwr is blocked. In this way your production may be frozen until the query on v$asm_file ends or is interrupted. Knowing that emagent can access v$asm_file, and creation and registering or archivelogs as well, this problem may occur very often, especially in data guard environments where a lot of archivelog manipulation is done.

Read the rest of this entry »