The ultimate story about OCR, OCRMIRROR and 2 storage boxes – Chapter 4

Scenario 4: Loss of ocrmirror from the non-OCR MASTER

This is a follow-up of chapter 3.
Let’s try to do the same thing as in scenario 3, but this time hiding the lun from the node that is NOT the OCR MASTER, while crs is running on both nodes.

What happens?

So we hide the ocrmirror lun from node 1, because after the previous test node 2 is still the master. At the moment of hiding the lun, nothing appears in any logfile on any node. This is an interesting fact, because when we removed the ocrmirror from the ocr master in scenario 3, we immediately got the messages “problem writing the buffer …” and “Vote information on disk 0 [/dev/oracle/ocr] is adjusted from [1/2] to [2/2]” in the crsd logfile. So this indicates that only the ocr master constantly reads from and writes to the ocr/ocrmirror and hence detects IO errors immediately. The non-ocr-master apparently does not touch the devices at all.
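
By the way, a quick way to see which node currently is the ocr master is grepping the crsd logfiles for the master messages; a small sketch (the log paths are the ones of this cluster, and the exact message text may differ per patch level):

# run on each node; the most recent "OCR MASTER" line tells when that node last became master
(nodea01 /app/oracle/bin) $ grep "OCR MASTER" /app/oracle/crs/log/nodea01/crsd/crsd.log | tail -1
(nodeb01 /app/oracle/bin) $ grep "OCR MASTER" /app/oracle/crs/log/nodeb01/crsd/crsd.log | tail -1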

To prove that the ocrmirror is really invisible to the non-ocr-master, we do on node 1:

(nodea01 /app/oracle/bin) $ dd if=/dev/oracle/ocrmirror of=/dev/null bs=64k count=1
dd: /dev/oracle/ocrmirror: open: I/O error
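
The same quick read test can of course be run for both devices on both nodes; a small sketch (device names as used in this cluster):

# read the first 64k of each OCR device; an I/O error means the lun is not visible
for dev in /dev/oracle/ocr /dev/oracle/ocrmirror
do
  dd if=$dev of=/dev/null bs=64k count=1 > /dev/null 2>&1 && echo "$dev: readable" || echo "$dev: NOT readable"
done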

ocrcheck on node 2 (the master) has no problem, as it still sees both devices:

         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File integrity check succeeded
         Cluster registry integrity check succeeded

But doing ocrcheck on node 1 gives:

PROT-602: Failed to retrieve data from the cluster registry

and its alert file says:

2008-07-23 09:33:43.221
[client(14867)]CRS-1011:OCR cannot determine that the OCR content contains the latest updates. Details in /app/oracle/crs/log/nodea01/client/ocrcheck_14867.log.

and the associated client logfile shows:

Oracle Database 10g CRS Release 10.2.0.4.0 Production Copyright 1996, 2008 Oracle. All rights reserved.
2008-07-23 09:33:43.210: [OCRCHECK][1]ocrcheck starts…
2008-07-23 09:33:43.220: [ OCRRAW][1]proprioini: disk 0 (/dev/oracle/ocr) doesn’t have enough votes (1,2)
2008-07-23 09:33:43.221: [ OCRRAW][1]proprinit: Could not open raw device
2008-07-23 09:33:43.221: [ default][1]a_init:7!: Backend init unsuccessful : [26]
2008-07-23 09:33:43.221: [OCRCHECK][1]Failed to access OCR repository: [PROC-26: Error while accessing the physical storage]
2008-07-23 09:33:43.221: [OCRCHECK][1]Failed to initialize ocrchek2
2008-07-23 09:33:43.221: [OCRCHECK][1]Exiting [status=failed]…
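
(For reference: the CRS alert file and the ocrcheck client logfiles both live under the CRS home; on this cluster that is /app/oracle/crs, and the alert file is typically named alert<nodename>.log. A sketch of where to look:)

(nodea01 /app/oracle/bin) $ tail -5 /app/oracle/crs/log/nodea01/alertnodea01.log
(nodea01 /app/oracle/bin) $ ls -ltr /app/oracle/crs/log/nodea01/client/ | tail -5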

Now let’s see whether node 1 really has a problem with this or not.

(nodea01 /app/oracle/bin) $ crsstat
HA Resource                                   Target     State
-----------                                   ------     -----
ora.ARES.ARES1.inst                           ONLINE     ONLINE on nodea01
ora.ARES.ARES2.inst                           ONLINE     ONLINE on nodeb01
ora.ARES.db                                   ONLINE     ONLINE on nodeb01
ora.AMIGO.AMIGO1.inst                         ONLINE     ONLINE on nodea01
ora.AMIGO.AMIGO2.inst                         ONLINE     ONLINE on nodeb01
ora.AMIGO.db                                  ONLINE     ONLINE on nodeb01
ora.nodea01.ASM1.asm                          ONLINE     ONLINE on nodea01
ora.nodea01.LSNRARES_NODEA01.lsnr             ONLINE     ONLINE on nodea01
ora.nodea01.LSNRAMIGO_NODEA01.lsnr            ONLINE     ONLINE on nodea01
ora.nodea01.gsd                               ONLINE     ONLINE on nodea01
ora.nodea01.ons                               ONLINE     ONLINE on nodea01
ora.nodea01.vip                               ONLINE     ONLINE on nodea01
ora.nodeb01.ASM2.asm                          ONLINE     ONLINE on nodeb01
ora.nodeb01.LSNRARES_NODEB01.lsnr             ONLINE     OFFLINE
ora.nodeb01.LSNRAMIGO_NODEB01.lsnr            ONLINE     OFFLINE
ora.nodeb01.gsd                               ONLINE     ONLINE on nodeb01
ora.nodeb01.ons                               ONLINE     ONLINE on nodeb01
ora.nodeb01.vip                               ONLINE     ONLINE on nodea01

Obviously not. We can still query the ocr without any problem. Even modifying the OCR from node 1 succeeds:

srvctl add service -d ARES -s aressrv -r ARES1 -a ARES2

gives in the crs logfile:

2008-07-23 09:41:04.656: [ CRSRES][46723] Resource Registered: ora.ARES.aressrv.cs
2008-07-23 09:41:05.786: [ CRSRES][46724] Resource Registered: ora.ARES.aressrv.ARES1.srv

This again seems to indicate that all OCR manipulation (read + write) goes through the ocr master, i.e. node 2, who still sees both ocr and ocrmirror. And still, during all these actions, nothing has appeared in any logfile on node 2.
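
To double-check that the write really made it into the OCR, you can query the new service from node 1 (and remove the test service again afterwards); a small sketch using the standard srvctl commands:

(nodea01 /app/oracle/bin) $ srvctl config service -d ARES              # aressrv should be listed
(nodea01 /app/oracle/bin) $ srvctl remove service -d ARES -s aressrv   # remove the test service again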

Now let’s do an interesting test and stop the master. We may assume that node 1 will then become the new master, however node 1 at this time sees only one device with one vote. So, you would think, it cannot run CRS like that…
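
(A sketch of the test itself: stop the stack as root on node 2, while keeping a tail on the crsd logfile of node 1 to watch the master transition; paths as used on this cluster.)

# on node 2, as root
(nodeb01 /) # crsctl stop crs
# on node 1, watch the new master messages appear
(nodea01 /app/oracle/bin) $ tail -f /app/oracle/crs/log/nodea01/crsd/crsd.log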

Let’s see what happens. After stopping crs on node 2, the crs logfile on node 1 shows:

2008-07-23 09:44:08.406: [ OCRMAS][25]th_master:13: I AM THE NEW OCR MASTER at incar 2. Node Number 1
2008-07-23 09:44:08.415: [ OCRRAW][25]proprioo: for disk 0 (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my votes (1), total votes (2)
2008-07-23 09:44:08.418: [ OCRRAW][25]propriowv_bootbuf: Vote information on disk 0 [/dev/oracle/ocr] is adjusted from [1/2] to [2/2]

This makes sense. Node 1 becomes the master because node 2 is leaving. However it evaluates its configuration and sees an ocr with one vote and no ocrmirror. This violates rule 3, and hence it updates the vote count (he can do that, since he is the new master), and luckily he does NOT decide to crash…

And indeed all seems as expected now on node 1:

         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File unavailable

The situation is now that the unavailable ocrmirror still has one vote (because node 1 could not update its vote count) and the ocr has just received 2 votes from node 1.

Now we restart crs again on node 2 and we see in its logfile:

2008-07-23 09:47:19.583: [ OCRRAW][1]proprioo: for disk 0 (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets (1), 1st set (1385758746,1866209186), 2nd set (0,0) my votes (2), total votes (2)
2008-07-23 09:47:19.583: [ OCRRAW][1]proprioo: for disk 1 (/dev/oracle/ocrmirror), id match (1), my id set (1385758746,1866209186) total id sets (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my votes (1), total votes (2)

Node 2 may be a little confused, because when both ocr and ocrmirror are available, he would expect each of them to have one vote…

Now let’s do an ocrcheck on node 2 again and we get:

         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File needs to be synchronized with the other device

Aha, this is the right output when the ocr has 2 votes and the ocrmirror has only 1 vote.

So let’s do what Oracle tells us in the output above and synchronize the ocrmirror with the other device again.

So we run “ocrconfig -replace ocrmirror /dev/oracle/ocrmirror” on node 2 (who can see both ocr and ocrmirror). Bad luck, this fails, because node 2 is not the master. Node 1 has become the master (see above), and node 1 cannot see the ocrmirror, so it cannot verify or correct the vote count on the ocrmirror. Hence this last command gives in the alert file of node 1:

2008-07-23 09:57:26.251: [ OCROSD][35]utdvch:0:failed to open OCR file/disk /dev/oracle/ocrmirror, errno=5, os err string=I/O error
2008-07-23 09:57:26.251: [ OCRRAW][35]dev_replace: master could not verify the new disk (8)
[ OCRSRV][35]proas_replace_disk: Failed in changing configurations in the Master 8
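
For reference, the repair itself is nothing more than re-presenting the lun (the storage commands for that are site specific and omitted here) and repeating the replace; a sketch:

# once node 1 (the current master) sees the lun again:
(nodea01 /app/oracle/bin) $ dd if=/dev/oracle/ocrmirror of=/dev/null bs=64k count=1   # should no longer give an I/O error
# then repeat the replace, as root, on either node:
(nodeb01 /) # ocrconfig -replace ocrmirror /dev/oracle/ocrmirror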

After making the lun visible again on node 1 and repeating the last command, all succeeds again without error. The crs logfile of node 1 then shows:

2008-07-23 10:04:48.419: [ OCRRAW][33]proprioo: for disk 0 (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets (1), 1st set (1385758746,1866209186), 2nd set (0,0) my votes (2), total votes (2)
2008-07-23 10:04:48.419: [ OCRRAW][33]propriogid:1: INVALID FORMAT
2008-07-23 10:04:48.484: [ OCRRAW][33]propriowv_bootbuf: Vote information on disk 1 [/dev/oracle/ocrmirror] is adjusted from [0/0] to [1/2]
2008-07-23 10:04:48.485: [ OCRRAW][33]propriowv_bootbuf: Vote information on disk 0 [/dev/oracle/ocr] is adjusted from [2/2] to [1/2]
2008-07-23 10:04:48.557: [ OCRMAS][25]th_master: Deleted ver keys from cache (master)
2008-07-23 10:04:48.557: [ OCRMAS][25]th_master: Deleted ver keys from cache (master)

and in the same file on node 2 we see:

2008-07-23 10:04:48.492: [ OCRRAW][40]proprioo: for disk 0 (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my votes (1), total votes (2)
2008-07-23 10:04:48.493: [ OCRRAW][40]proprioo: for disk 1 (/dev/oracle/ocrmirror), id match (1), my id set (1385758746,1866209186) total id sets (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my votes (1), total votes (2)
2008-07-23 10:04:48.504: [ OCRMAS][25]th_master: Deleted ver keys from cache (non master)
2008-07-23 10:04:48.504: [ OCRMAS][25]th_master: Deleted ver keys from cache (non master)

The fact that the logfile on node 2 shows these messages indicates that the vote update done by the master (node 1) is propagated in some way to the other nodes, who in turn update some kind of local cache (I think).
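
A final ocrcheck on both nodes should now report both devices as healthy again; e.g.:

# run on both nodes; both devices should again show "Device/File integrity check succeeded"
(nodea01 /app/oracle/bin) $ ocrcheck
(nodeb01 /app/oracle/bin) $ ocrcheck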

Conclusion of Scenario 4

Hiding the lun from the non-ocr-master still can’t confuse CRS. However, it takes longer for the cluster to detect that there is a problem with the storage, as only the master detects I/O errors on the ocr/ocrmirror. But in the end the situation can be recovered again without downtime.

So you should be convinced by now that we can’t confuse CRS, right?… Then you don’t know me yet: I still have scenario 5. Read on in the next chapter.
