<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Oracle at work</title>
	<atom:link href="http://geertdepaep.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://geertdepaep.wordpress.com</link>
	<description>An extract of my Oracle related opinions</description>
	<lastBuildDate>Fri, 04 Nov 2011 09:01:45 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='geertdepaep.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Oracle at work</title>
		<link>http://geertdepaep.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://geertdepaep.wordpress.com/osd.xml" title="Oracle at work" />
	<atom:link rel='hub' href='http://geertdepaep.wordpress.com/?pushpress=hub'/>
		<item>
		<title>The ultimate story about OCR, OCRMIRROR and 2 storage boxes &#8211; Conclusion</title>
		<link>http://geertdepaep.wordpress.com/2009/10/15/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-conclusion/</link>
		<comments>http://geertdepaep.wordpress.com/2009/10/15/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-conclusion/#comments</comments>
		<pubDate>Thu, 15 Oct 2009 14:00:29 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[High availability]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/?p=85</guid>
		<description><![CDATA[This is a follow-up of chapter 5. The most important thing in this story is the fact that it is perfectly possible to configure your Oracle RAC cluster with 2 storage boxes in a safe way. You just need an independent location for the 3rd voting disk, but if you have that, you can be [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=geertdepaep.wordpress.com&amp;blog=2187749&amp;post=85&amp;subd=geertdepaep&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>This is a follow-up of <a href="http://geertdepaep.wordpress.com/2009/10/07/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-5/">chapter 5</a>.<br />
The most important thing in this story is the fact that it is perfectly possible to configure your Oracle RAC cluster with 2 storage boxes in a safe way. You just need an independent location for the 3rd voting disk, but if you have that, you can be sure that your cluster will remain running when one of those storage boxes fails. You will even be able to repair it <em>without downtime</em> after e.g. buying a new storage box (call Uptime for good prices&#8230;:)</p>
<p>So were all these tests then really needed? Yes, I do think so, for the following reasons:
<ul><span id="more-85"></span><br />	
<li>Seeing that it works gives much more confidence than just reading it in the Oracle doc, or on Metalink</li>
<p>	
<li>The story of the vote count is really interesting. There is (almost) nothing to find about this in the Oracle doc or on Metalink. With the information in this blog, you will be able to better understand and interpret the error messages in the log files. You will also know better when (not) to update the vote count manually.</li>
<p>	
<li>The concept of OCR master is nice to know. Again, it gives you more insight into the messages in the logfiles.</li>
<p></ul>
<p>But apart from these straightforward conclusions, there is one thing I find most interesting. The different scenarios have produced different output, and in one case (scenario 5) even real error messages, although they all did the same thing: removing the ocrmirror. With the different scenarios above, you know <span style="text-decoration:underline;"><strong>why</strong></span> the output can be different. So if you ever have to handle a RAC case with Oracle support, and you get the reply &#8220;we are unable to reproduce your case&#8221;, you may now be able to give them more info about which parameters can make a difference (who is ocr master, where is crs stopped, &#8230;). Otherwise it can be so frustrating that something fails in your situation and does work in somebody else&#8217;s situation.<br />But now I may be getting too philosophical (which I tend to get after another good &#8220;<a href="http://en.wikipedia.org/wiki/Trappist_beer" target="_blank">Trappist</a>&#8221;)&#8230;</p>
<p>Good luck with it!</p>
<p>P.S. And yes, I do have them all in my cellar&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2009/10/15/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-conclusion/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixy.gif?x-id=2348ceb8-73c2-8036-99cd-19880b542588" medium="image" />
	</item>
		<item>
		<title>The ultimate story about OCR, OCRMIRROR and 2 storage boxes &#8211; Chapter 5</title>
		<link>http://geertdepaep.wordpress.com/2009/10/12/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-5/</link>
		<comments>http://geertdepaep.wordpress.com/2009/10/12/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-5/#comments</comments>
		<pubDate>Mon, 12 Oct 2009 14:00:46 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[High availability]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/?p=84</guid>
		<description><![CDATA[Scenario 5: Loss of ocrmirror from non-ocr-master &#8211; reloaded This is a follow-up of chapter 4. In this final scenario, we do the same thing as in scenario 4. I.e. while crs is running on both nodes, we hide the ocrmirror from the non-ocr-master node, which is node 2 now. So node 1 is the [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=geertdepaep.wordpress.com&amp;blog=2187749&amp;post=84&amp;subd=geertdepaep&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<h3>Scenario 5: Loss of ocrmirror from non-ocr-master &#8211; reloaded</h3>
<p>This is a follow-up of <a href="http://geertdepaep.wordpress.com/2009/10/07/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-4/">chapter 4</a>.<br />
In this final scenario, we do the same thing as in scenario 4, i.e. while crs is running on both nodes, we hide the ocrmirror from the non-ocr-master node, which is node 2 now. <br />So node 1 is the master, we hide ocrmirror from node 2 and we verify on node 2:<br />
<blockquote>(nodeb01 /app/oracle/crs/log/nodeb01) $ dd  if=/dev/oracle/ocrmirror of=/dev/null bs=64k count=1<br />dd:  /dev/oracle/ocrmirror: open: I/O error</p></blockquote>
<p>What happens?<br /><span id="more-84"></span></p>
<p>As we know from  scenario 4, ocrcheck on node 2 now fails with:<br />
<blockquote>(nodeb01 /app/oracle/crs/log/nodeb01) $ ocrcheck<br />PROT-602:  Failed to retrieve data from the cluster registry</p></blockquote>
<p>On node 1 all is ok. This is still the same as scenario 4, but in scenario 4 we then stopped crs on the ocr master, which could see both luns. In this scenario we will instead stop crs on the non-master node (node 2), which can see only ocr.</p>
<p>And now it gets interesting&#8230;.<br />
<blockquote>-bash-3.00# crsctl stop crs<br />OCR initialization failed accessing  OCR device: PROC-26: Error while accessing the physical  storage</p></blockquote>
<p>Did I say &#8220;really interesting&#8221;? We don&#8217;t seem to be able to stop crs anymore on the non-ocr-master node. Maybe it is worth referring to the RAC FAQ on Metalink that says &#8220;If the corruption happens while the Oracle Clusterware stack is up and running, then the corruption will be tolerated and the Oracle Clusterware will continue to function without interruptions&#8221;. That&#8217;s true, but it doesn&#8217;t seem to say anything about stopping crs. Anyway, the real &#8220;playing&#8221; continues:</p>
<p>Let&#8217;s try to tell Oracle CRS that the ocr is the correct version to continue with, and kindly ask it to increase its vote count to 2. We do this on node 2 and get:<br />
<blockquote>ocrconfig -overwrite<br />PROT-19: Cannot proceed while clusterware is  running. Shutdown clusterware first</p></blockquote>
<p>Deadlock on node 2! We can&#8217;t stop crs, but in order to try to correct the problem, crs has to be down&#8230;</p>
<p>Moreover, at this time, it is not possible anymore to modify the OCR. Both nodes now give:<br />
<blockquote>(nodea01 /app/oracle/crs/log/nodea01/client) $ srvctl remove service -d ARES -s aressrv<br />PRKR-1007 : getting of cluster database ARES configuration failed, PROC-5: User does not have permission to perform a cluster registry operation on this key. Authentication error [User does not have permission to perform this operation] [0]<br />PRKO-2005 : Application error: Failure in getting Cluster Database Configuration for: ARES</p></blockquote>
<p>And running the above command on either node always produces the following in the alert logfile of node 1 (which is the master):<br />
<blockquote>[&nbsp; OCRAPI][29]a_check_permission_int: Other doesn&#8217;t have permission</p></blockquote>
<p>Note: &#8220;srvctl add service&#8221; doesn&#8217;t work either.</p>
<p>Now it seems like things are really messed up. We have never seen permission errors before. Please be aware now that the steps below are the steps I took trying to get things right again. There may be other options, but I only did this scenario once, with the steps below:</p>
<p>As the original root cause of the problem was making the ocrmirror unavailable, let&#8217;s try to tell the cluster to forget about this ocrmirror, and continue only with ocr, which is still visible on both nodes.</p>
<p>So in order to remove ocrmirror from the configuration, we do as root on node 2:<br />
<blockquote>-bash-3.00# ocrconfig -replace ocrmirror &#8220;&#8221;</p></blockquote>
<p>Note: specifying an empty string (&#8220;&#8221;) is used to remove the raw device from the configuration.</p>
<p>At that time in the crs logfile of node 1:<br />
<blockquote>2008-07-23 11:11:18.136: [&nbsp; OCRRAW][29]proprioo: for disk 0 (/dev/oracle/ocr), id match (0), my id set (1385758746,1028247821) total id sets (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my votes (1), total votes (2)<br />2008-07-23 11:11:18.136: [&nbsp; OCRRAW][29]propriowv_bootbuf: <font color="#ff6600">Vote information on disk 0 [/dev/oracle/ocr] is adjusted from [1/2] to [2/2]</font><br />2008-07-23 11:11:18.195: [&nbsp; OCRMAS][25]th_master: Deleted ver keys from cache (master)<br />2008-07-23 11:11:18.195: [&nbsp; OCRMAS][25]th_master: Deleted ver keys from cache (master)</p></blockquote>
<p>That looks ok. We will be left with one ocr device having 2 votes. This is intended behaviour.</p>
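<p>If you want to follow these vote adjustments across a whole crsd logfile instead of reading them by eye, a small script can pick them out. The following is only a minimal sketch: the regular expression is based on the message format shown in the excerpts of this post (10.2.0.4), and the function name <code>parse_vote_adjustment</code> is my own invention, not an Oracle tool.</p>

```python
import re

# Pattern for the "Vote information ... is adjusted" message as it appears
# in the crsd log excerpts of this post; other releases may word it differently.
VOTE_ADJUST_RE = re.compile(
    r"Vote information on disk (\d+) \[([^\]]+)\] "
    r"is adjusted from \[(\d+)/(\d+)\] to \[(\d+)/(\d+)\]"
)


def parse_vote_adjustment(line):
    """Return a dict describing a vote adjustment, or None if the line has none."""
    match = VOTE_ADJUST_RE.search(line)
    if match is None:
        return None
    disk, path, old_votes, old_total, new_votes, new_total = match.groups()
    return {
        "disk": int(disk),
        "path": path,
        "before": (int(old_votes), int(old_total)),
        "after": (int(new_votes), int(new_total)),
    }


if __name__ == "__main__":
    sample = (
        "2008-07-23 11:11:18.136: [  OCRRAW][29]propriowv_bootbuf: Vote information "
        "on disk 0 [/dev/oracle/ocr] is adjusted from [1/2] to [2/2]"
    )
    print(parse_vote_adjustment(sample))
```

<p>Feeding it every line of the logfile and keeping the non-empty results gives a compact history of which device got how many votes, and when.</p>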
<p>In the alert file of node 1, we see:<br />
<blockquote>2008-07-23 11:11:18.125<br />[crsd(26268)]CRS-1010:The OCR mirror location /dev/oracle/ocrmirror was removed.</p></blockquote>
<p>and in the crs logfile of node 2:<br />
<blockquote>2008-07-23 11:11:18.155: [&nbsp; OCRRAW][34]proprioo: for disk 0 (/dev/oracle/ocr), id match (1), my id set (1385758746,1028247821) total id sets (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1028247821) my votes (2), total votes (2)<br />2008-07-23 11:11:18.223: [&nbsp; OCRMAS][25]th_master: Deleted ver keys from cache (non master)<br />2008-07-23 11:11:18.223: [&nbsp; OCRMAS][25]th_master: Deleted ver keys from cache (non master)</p></blockquote>
<p>(node 2 updates its local cache) and in the alert file of node 2:<br />
<blockquote>2008-07-23 11:11:18.150<br />[crsd(10831)]CRS-1010:The OCR mirror location /dev/oracle/ocrmirror was removed.</p></blockquote>
<p>Now we do an ocrcheck on node 2:
<pre>(nodeb01 /app/oracle/crs/log/nodeb01) $ ocrcheck</pre>
<pre>Status of Oracle Cluster Registry is as follows :</pre>
<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Version&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 2</pre>
<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Total space (kbytes)&nbsp;&nbsp;&nbsp;&nbsp; :&nbsp;&nbsp;&nbsp;&nbsp; 295452</pre>
<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Used space (kbytes)&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; :&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 5600</pre>
<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Available space (kbytes) :&nbsp;&nbsp;&nbsp;&nbsp; 289852</pre>
<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ID&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : 1930338735</pre>
<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Device/File Name&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; : /dev/oracle/ocr</pre>
<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Device/File integrity check succeeded</pre>
<pre>&lt;br /&gt;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Device/File not configured</pre>
<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Cluster registry integrity check succeeded</pre>
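<p>When you repeat such checks on several nodes, it can help to turn the ocrcheck output into something a script can compare. Here is a minimal sketch, assuming the plain &#8220;key : value&#8221; layout shown above; <code>parse_ocrcheck</code> is a hypothetical helper name, and the status lines without a key (such as &#8220;Device/File integrity check succeeded&#8221;) are simply skipped.</p>

```python
def parse_ocrcheck(output):
    """Parse the 'key : value' lines of ocrcheck output into a dict of lists.

    Repeated keys (one 'Device/File Name' per configured device) are
    collected into lists so that no value is lost.
    """
    fields = {}
    for raw_line in output.splitlines():
        key, sep, value = raw_line.partition(" : ")
        if not sep or not value.strip():
            continue  # header and status lines have no 'key : value' shape
        fields.setdefault(key.strip(), []).append(value.strip())
    return fields


if __name__ == "__main__":
    sample = """Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     295452
         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
"""
    print(parse_ocrcheck(sample))
```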
<p>Now the configuration looks ok again, but the error remains on node 2 (we do this as user oracle):<br />
<blockquote>(nodeb01 /app/oracle/crs/log/nodeb01) $ srvctl remove service -d ARES -s aressrv<br />PRKR-1007 : getting of cluster database ARES configuration failed, PROC-5: User does not have permission to perform a cluster registry operation on this key. Authentication error [User does not have permission to perform this operation] [0]<br />PRKO-2005 : Application error: Failure in getting Cluster Database Configuration for: ARES</p></blockquote>
<p>However doing the same command <b>as root</b> on node 2 succeeds:<br />
<blockquote>-bash-3.00# srvctl remove service -d ARES -s aressrv<br />aressrv PREF: ARES1 AVAIL: ARES2<br />Service aressrv is disabled.<br />Remove service aressrv from the database ARES? (y/[n]) y</p></blockquote>
<p>After this, managing the resources as user oracle succeeds again:<br />
<blockquote>(nodeb01 /app/oracle/crs/log/nodeb01) $ srvctl add service -d ARES -s aressrv2 -r ARES1<br />(nodeb01 /app/oracle/crs/log/nodeb01) $ srvctl remove service -d ARES -s aressrv2<br />aressrv2 PREF: ARES1 AVAIL:<br />Remove service aressrv2 from the database ARES? (y/[n]) y</p></blockquote>
<p>At this point, unfortunately, the internals end. At the time of my testing, I had no time to investigate this further, and since then I have had no time to build and test a similar setup (that&#8217;s why this blog posting took so long; I would have loved to do more research on this). However, I remember doing some more testing at a customer site (but I have no transcript of that, so no details to write here) and I can still tell the following:</p>
<p>For some reason, the ownership of the ARES resource in OCR seems to be changed from oracle to root. Another way to get out of this is to use the following commands:<br />
<blockquote>&nbsp;crs_getperm <br />&nbsp;crs_setperm  -o oracle | -g dba</p></blockquote>
<p>This allows you to change the ownership back to oracle, and then all will become ok again. </p>
<p>I can&#8217;t say where it went wrong. Maybe I have done something as root, instead of oracle, without knowing (however I double checked my transcripts). I think it went wrong at the moment where I first tried to stop crs as root on node 2 and then did an &#8220;ocrconfig -overwrite&#8221; as root on node 2. I wonder if something has then been sent to node 1 (who is ocr master), i.e. as root, that may have changed some permission in the ocr&#8230;? If anyone has time and resources to investigate this further, please don&#8217;t hesitate to do so, and inform me about the results. In this way, you may gain perpetual honour in my personal in-memory list of great Oracle guys.</p>
<h5>Conclusion</h5>
<p>Although crs is very robust and 2 storage boxes are ok, there may be a situation where you get unexpected error messages. Hopefully this chapter will help you in getting out of this without problems, and strengthen your confidence in Oracle RAC.</p>
<p>Let&#8217;s make a final conclusion in the next chapter&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2009/10/12/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-5/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixy.gif?x-id=26f1f002-5cc0-8558-b287-ed7397ebcd19" medium="image" />
	</item>
		<item>
		<title>The ultimate story about OCR, OCRMIRROR and 2 storage boxes &#8211; Chapter 4</title>
		<link>http://geertdepaep.wordpress.com/2009/10/07/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-4/</link>
		<comments>http://geertdepaep.wordpress.com/2009/10/07/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-4/#comments</comments>
		<pubDate>Wed, 07 Oct 2009 14:00:03 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[High availability]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/?p=83</guid>
		<description><![CDATA[Scenario 4: Loss of ocrmirror from the non-OCR MASTER This is a follow-up of chapter 3. Let&#8217;s try to do the same thing as scenario 3, however now hiding the lun from a node NOT being the OCR MASTER, while crs is running on both nodes. What happens? So we hide the ocrmirror lun from [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=geertdepaep.wordpress.com&amp;blog=2187749&amp;post=83&amp;subd=geertdepaep&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<h3>Scenario 4: Loss of ocrmirror from the non-OCR MASTER</h3>
<p>This is a follow-up of <a href="http://geertdepaep.wordpress.com/2009/10/02/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-3/">chapter 3</a>.<br />
Let&#8217;s try to do the same thing as scenario 3, however now hiding the lun from a node NOT being the OCR MASTER, while crs is running on both nodes.</p>
<p>What happens?<br />
<span id="more-83"></span></p>
<p>So we hide the ocrmirror lun from node 1, because after the previous test node 2 is still the master. At the moment of hiding the lun nothing appears in any logfile on any node. This is an interesting fact, because when we removed the ocrmirror from the ocr master in scenario 3, we got the messages &#8220;problem writing the buffer &#8230;&#8221; and &#8220;Vote information on disk 0 [/dev/oracle/ocr] is adjusted from [1/2] to [2/2]&#8221; immediately in the crsd logfile. So this indicates that only the ocr master reads/writes(?) constantly in the ocr/ocrmirror and hence detects IO errors immediately. The non-ocr-master doesn&#8217;t do anything.</p>
<p>To prove that the ocrmirror is really invisible to the non-ocr-master, we do on node 1:</p>
<blockquote><p>(nodea01 /app/oracle/bin) $ dd if=/dev/oracle/ocrmirror of=/dev/null  bs=64k count=1<br />
dd: /dev/oracle/ocrmirror: open: I/O  error</p></blockquote>
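<p>The same one-block read test can be scripted if you want to check several devices on several nodes. The following is a minimal sketch of the idea behind the dd command above; <code>can_read_device</code> is a hypothetical helper of my own, and it only proves that one block can be read, nothing more.</p>

```python
import os


def can_read_device(path, block_size=64 * 1024):
    """Try to read one block from path, like 'dd bs=64k count=1'.

    Returns (True, None) on success, or (False, error_text) when the
    open or the read fails, e.g. with an I/O error on a hidden lun.
    """
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return False, exc.strerror
    try:
        os.read(fd, block_size)
    except OSError as exc:
        return False, exc.strerror
    finally:
        os.close(fd)
    return True, None


if __name__ == "__main__":
    # The device paths are the ones used in this post; adapt them to your setup.
    for device in ("/dev/oracle/ocr", "/dev/oracle/ocrmirror"):
        print(device, can_read_device(device))
```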
<p>ocrcheck on node 2 (the master) has no problem, as it  still sees both devices:</p>
<pre>         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File integrity check succeeded
         Cluster registry integrity check succeeded</pre>
<p>But  doing ocrcheck on node 1 gives:</p>
<blockquote><p>PROT-602: Failed to retrieve data from the cluster  registry</p></blockquote>
<p>and its alert file says:</p>
<blockquote><p>2008-07-23 09:33:43.221<br />
[client(14867)]CRS-1011:OCR cannot  determine that the OCR content contains the latest updates. Details in  /app/oracle/crs/log/nodea01/client/ocrcheck_14867.log.</p></blockquote>
<p>and the  associated client logfile shows:</p>
<blockquote><p>Oracle Database 10g CRS Release 10.2.0.4.0 Production Copyright  1996, 2008 Oracle. All rights reserved.<br />
2008-07-23 09:33:43.210:  [OCRCHECK][1]ocrcheck starts&#8230;<br />
2008-07-23 09:33:43.220: [  OCRRAW][1]proprioini: <span style="color:#ff6600;">disk 0 (/dev/oracle/ocr) doesn&#8217;t have enough votes  (1,2)</span><br />
2008-07-23 09:33:43.221: [ OCRRAW][1]proprinit: Could not open raw  device<br />
2008-07-23 09:33:43.221: [ default][1]a_init:7!: Backend init  unsuccessful : [26]<br />
2008-07-23 09:33:43.221: [OCRCHECK][1]Failed to access  OCR repository: [PROC-26: <span style="color:#ff6600;">Error while accessing the physical  storage</span>]<br />
2008-07-23 09:33:43.221: [OCRCHECK][1]Failed to initialize  ocrchek2<br />
2008-07-23 09:33:43.221: [OCRCHECK][1]Exiting  [status=failed]&#8230;</p></blockquote>
<p>Now let&#8217;s see if node 1 <strong>really</strong> has a  problem with this or not?</p>
<pre>(nodea01 /app/oracle/bin) $ crsstat
HA Resource                                   Target     State
-----------                                   ------     -----
ora.ARES.ARES1.inst                           ONLINE     ONLINE on nodea01
ora.ARES.ARES2.inst                           ONLINE     ONLINE on nodeb01
ora.ARES.db                                   ONLINE     ONLINE on nodeb01
ora.AMIGO.AMIGO1.inst                         ONLINE     ONLINE on nodea01
ora.AMIGO.AMIGO2.inst                         ONLINE     ONLINE on nodeb01
ora.AMIGO.db                                  ONLINE     ONLINE on nodeb01
ora.nodea01.ASM1.asm                          ONLINE     ONLINE on nodea01
ora.nodea01.LSNRARES_NODEA01.lsnr             ONLINE     ONLINE on nodea01
ora.nodea01.LSNRAMIGO_NODEA01.lsnr            ONLINE     ONLINE on nodea01
ora.nodea01.gsd                               ONLINE     ONLINE on nodea01
ora.nodea01.ons                               ONLINE     ONLINE on nodea01
ora.nodea01.vip                               ONLINE     ONLINE on nodea01
ora.nodeb01.ASM2.asm                          ONLINE     ONLINE on nodeb01
ora.nodeb01.LSNRARES_NODEB01.lsnr             ONLINE     OFFLINE
ora.nodeb01.LSNRAMIGO_NODEB01.lsnr            ONLINE     OFFLINE
ora.nodeb01.gsd                               ONLINE     ONLINE on nodeb01
ora.nodeb01.ons                               ONLINE     ONLINE on nodeb01
ora.nodeb01.vip                               ONLINE     ONLINE on nodea01</pre>
<p>Obviously  not. I seem to be able to query the ocr without any problem. Even modifying the  OCR succeeds from node 1:</p>
<blockquote><p>srvctl add service -d ARES -s aressrv -r ARES1 -a  ARES2</p></blockquote>
<p>gives in the crs logfile:</p>
<blockquote><p>2008-07-23 09:41:04.656: [ CRSRES][46723] Resource Registered:  ora.ARES.aressrv.cs<br />
2008-07-23 09:41:05.786: [ CRSRES][46724] Resource  Registered: ora.ARES.aressrv.ARES1.srv</p></blockquote>
<p>This seems to indicate again that all OCR manipulation (read + write) goes through the ocr master (node 2, who still sees both ocr and ocrmirror). Still, during all these actions nothing has appeared in any logfile of node 2 (the master who still sees both luns).</p>
<p>Now let&#8217;s do an interesting test and stop the master. We may assume that node 1 will become the new master then, however node 1 at this time sees only one device with one vote. So it cannot run CRS like that&#8230;</p>
<p>Let&#8217;s see what happens. After stopping crs on node 2, the crs logfile on node 1 shows:</p>
<blockquote><p>2008-07-23 09:44:08.406: [ OCRMAS][25]th_master:13: <span style="color:#ff8000;">I AM THE NEW OCR MASTER</span> at incar 2. Node Number  1<br />
2008-07-23 09:44:08.415: [ OCRRAW][25]proprioo: for disk 0  (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets  (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my votes  (1), total votes (2)<br />
2008-07-23 09:44:08.418: [ OCRRAW][25]propriowv_bootbuf:  <span style="color:#ff8000;">Vote information on disk 0 [/dev/oracle/ocr] is adjusted  from [1/2] to [2/2]</span></p></blockquote>
<p>This makes sense. Node 1 becomes the  master because node 2 is leaving. However it evaluates its configuration and sees  an ocr with one vote and no ocrmirror. This violates rule 3 and hence it updates  the vote count (he can do that, he is the new master), and luckily he does NOT decide to crash&#8230;</p>
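<p>Since so much depends on who the OCR master is, it is handy to be able to find the master changes in a crsd logfile quickly. Below is a minimal sketch that scans log lines for the election message shown above. The pattern is taken from this one log excerpt only, and <code>find_master_changes</code> is a name I made up for the illustration.</p>

```python
import re

# Matches the election message as shown in the log excerpt above;
# whitespace is matched loosely because the excerpts are not consistent.
NEW_MASTER_RE = re.compile(
    r"I AM THE NEW OCR MASTER at incar\s+(\d+)\.\s+Node Number\s+(\d+)"
)


def find_master_changes(log_lines):
    """Yield (incarnation, node_number) for every election message found."""
    for line in log_lines:
        match = NEW_MASTER_RE.search(line)
        if match:
            yield int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = [
        "2008-07-23 09:44:08.406: [ OCRMAS][25]th_master:13: "
        "I AM THE NEW OCR MASTER at incar 2. Node Number 1",
        "2008-07-23 09:44:08.418: [ OCRRAW][25]propriowv_bootbuf: ...",
    ]
    for incarnation, node in find_master_changes(sample):
        print("node", node, "became master at incarnation", incarnation)
```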
<p>And indeed all seems as  expected now on node 1:</p>
<pre>         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File unavailable</pre>
<p>The  situation is now that the unavailable ocrmirror still has one vote (because node 1 could not  update its vote count) and the ocr has just received 2 votes from node 1.</p>
<p>Now  we restart crs again on node 2 and we see in its logfile:</p>
<blockquote><p>2008-07-23 09:47:19.583: [ OCRRAW][1]proprioo: for disk 0 (<span style="color:#ff8000;">/dev/oracle/ocr</span>), id match (1), my id set  (1385758746,1866209186) total id sets (1), 1st set (1385758746,1866209186), 2nd  set (0,0) <span style="color:#ff8000;">my votes (2)</span>, total votes (2)<br />
2008-07-23  09:47:19.583: [ OCRRAW][1]proprioo: for disk 1 (<span style="color:#ff8000;">/dev/oracle/ocrmirror</span>), id match (1), my id set  (1385758746,1866209186) total id sets (2), 1st set (1385758746,1866209186), 2nd  set (1385758746,1866209186) <span style="color:#ff8000;">my votes (1)</span>, total votes  (2)</p></blockquote>
<p>Node 2 may be a little confused, because when both ocr and ocrmirror are available, he would expect each of them to have one vote&#8230;</p>
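<p>To keep the different situations apart, I find it useful to write the observations down as a tiny model. Please note that this is only a toy summary of what the tests in this chapter showed for a configuration with 2 votes in total. It is an assumption of mine, not Oracle&#8217;s actual algorithm, and <code>classify_votes</code> is a made-up name.</p>

```python
def classify_votes(visible_votes, total_votes=2):
    """Toy classification of the vote counts a node can see.

    visible_votes maps a device path to the number of votes stored on it,
    for the devices this node can actually read. This only summarises the
    situations observed in this post; it is not Oracle's algorithm.
    """
    seen = sum(visible_votes.values())
    if seen == total_votes:
        return "ok"
    if seen > total_votes:
        return "needs synchronization"
    return "not enough votes"


if __name__ == "__main__":
    # Node 1 after the lun was hidden: only ocr, holding 1 of the 2 votes.
    print(classify_votes({"/dev/oracle/ocr": 1}))
    # Node 2 after the restart: ocr holds 2 votes, ocrmirror still holds 1.
    print(classify_votes({"/dev/oracle/ocr": 2, "/dev/oracle/ocrmirror": 1}))
```

<p>The first case corresponds to the &#8220;doesn&#8217;t have enough votes (1,2)&#8221; message seen earlier, the second to the vote counts node 2 reports after its restart.</p>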
<p>Now let&#8217;s do an ocrcheck again on node 2 and get:</p>
<pre>         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File needs to be synchronized with the other device</pre>
<p>Aha, this is the right output when ocr has 2 votes and ocrmirror has 1 vote.</p>
<p>So let&#8217;s do what Oracle tells us in the output above and synchronize the ocrmirror again with the other device:</p>
<p>So we do an &#8220;ocrconfig -replace ocrmirror /dev/oracle/ocrmirror&#8221; on node 2 (who can see both ocr and ocrmirror). Bad luck, this fails, because node 2 is not the master. Node 1 has become the master (see above) and node 1 cannot see ocrmirror. So node 1 cannot verify or correct the vote count on ocrmirror. Hence this last command produces the following in the alert file of node 1:</p>
<blockquote><p>2008-07-23 09:57:26.251: [ OCROSD][35]utdvch:0:<span style="color:#ff6600;">failed to open OCR  file/disk /dev/oracle/ocrmirror</span>, errno=5, os err string=I/O error<br />
2008-07-23  09:57:26.251: [ OCRRAW][35]dev_replace: master could not verify the new disk  (8)<br />
[ OCRSRV][35]proas_replace_disk: Failed in changing configurations in the  Master 8</p></blockquote>
<p>After making the lun visible again on node 1 and repeating the last command, everything succeeds without error. The crs logfile of node 1 then shows:</p>
<blockquote><p>2008-07-23 10:04:48.419: [ OCRRAW][33]proprioo: for disk 0  (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets  (1), 1st set (1385758746,1866209186), 2nd set (0,0) my votes (2), total votes  (2)<br />
2008-07-23 10:04:48.419: [ OCRRAW][33]propriogid:1: INVALID  FORMAT<br />
2008-07-23 10:04:48.484: [ OCRRAW][33]propriowv_bootbuf: <span style="color:#ff6600;">Vote  information on disk 1 [/dev/oracle/ocrmirror] is adjusted from [0/0] to  [1/2]</span><br />
2008-07-23 10:04:48.485: [ OCRRAW][33]propriowv_bootbuf: <span style="color:#ff6600;">Vote  information on disk 0 [/dev/oracle/ocr] is adjusted from [2/2] to  [1/2]</span><br />
2008-07-23 10:04:48.557: [ OCRMAS][25]th_master: Deleted ver keys from  cache (master)<br />
2008-07-23 10:04:48.557: [ OCRMAS][25]th_master: Deleted ver  keys from cache (master)</p></blockquote>
<p>and in the same file on node 2 we see:</p>
<blockquote><p>2008-07-23 10:04:48.492: [ OCRRAW][40]proprioo: for disk 0  (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets  (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my votes  (1), total votes (2)<br />
2008-07-23 10:04:48.493: [ OCRRAW][40]proprioo: for disk  1 (/dev/oracle/ocrmirror), id match (1), my id set (1385758746,1866209186) total  id sets (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my  votes (1), total votes (2)<br />
2008-07-23 10:04:48.504: [ OCRMAS][25]th_master:  Deleted ver keys from cache (non master)<br />
2008-07-23 10:04:48.504: [  OCRMAS][25]th_master: Deleted ver keys from cache (non  master)</p></blockquote>
<p>The fact that the logfile on node 2 shows these messages indicates that the vote update done by the master (node 1) is propagated in some way to the other nodes, which in turn update some kind of local cache (I think).</p>
<h5>Conclusion of Scenario 4</h5>
<p>Hiding the lun from the non-ocr-master  still can&#8217;t confuse CRS. However it takes longer for the cluster to detect that  there is a problem with the storage, as only the master is able to detect io  errors on the ocr/ocrmirror. But in the end it can be recovered again without downtime.</p>
<p>So you should be convinced now that we can&#8217;t confuse CRS, right?&#8230;  Then you don&#8217;t know me yet, I still have scenario 5. Read on in the next chapter.</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2009/10/07/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-4/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixy.gif?x-id=56b76717-1c12-8ad9-a3c5-f9a220bcfbd9" medium="image" />
	</item>
		<item>
		<title>The ultimate story about OCR, OCRMIRROR and 2 storage boxes &#8211; Chapter 3</title>
		<link>http://geertdepaep.wordpress.com/2009/10/02/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-3/</link>
		<comments>http://geertdepaep.wordpress.com/2009/10/02/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-3/#comments</comments>
		<pubDate>Fri, 02 Oct 2009 15:36:39 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[High availability]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/?p=82</guid>
		<description><![CDATA[Scenario 3: Loss of OCRmirror from the OCR MASTER only This is a followup of chapter 2. As we have seen in scenario 1, the OCR MASTER will update the votecount. Now let&#8217;s hide the ocrmirror from only 1 node: the node being the OCR MASTER, while the other node continues to see the ocrmirror. [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=geertdepaep.wordpress.com&amp;blog=2187749&amp;post=82&amp;subd=geertdepaep&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<h3>Scenario 3: Loss of OCRmirror from the OCR MASTER only</h3>
<p>This is a follow-up of <a href="http://geertdepaep.wordpress.com/2009/09/19/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-2/">chapter 2</a>.</p>
<p>As we have  seen in scenario 1, the OCR MASTER will update the votecount. Now let&#8217;s hide the  ocrmirror from only 1 node: the node being the OCR MASTER, while the other node continues to  see the ocrmirror. Will CRS get confused about this?</p>
<p>Note: while doing this  test, crs is running on both nodes.<br />
<span id="more-82"></span></p>
<p>In this scenario, node 2 is the OCR  MASTER. In fact I haven&#8217;t found any command to query who is the master. The only  way to find out is to compare the crsd logfiles on all nodes to find the most  recent message &#8220;I AM THE NEW OCR MASTER&#8221;. If anyone knows a better way for  determining this, please let me know.</p>
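<p>A minimal sketch of that manual comparison in Python: it scans the crsd log text of each node for the most recent &#8220;I AM THE NEW OCR MASTER&#8221; line and returns the node that logged it. The function name, the way the logs are passed in, and the node 1 sample line are assumptions made up for illustration; only the node 2 line and the timestamp format come from real crsd logs in this series. It also assumes the clocks of the nodes are reasonably in sync.</p>

```python
import re
from datetime import datetime

# Timestamp at the start of a crsd.log line, e.g. "2008-07-18 15:34:38.504:"
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}):")
MARKER = "I AM THE NEW OCR MASTER"


def find_ocr_master(logs):
    """Return (node, timestamp) of the newest master announcement, or None.

    `logs` maps a node name to the text of its crsd.log (hypothetical input).
    """
    newest = None
    for node, text in logs.items():
        for raw in text.splitlines():
            line = raw.strip()
            match = STAMP.match(line)
            if match is None or MARKER not in line:
                continue
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            if newest is None or when > newest[1]:
                newest = (node, when)
    return newest


sample_logs = {
    # Made-up line for illustration only
    "node1": "2008-07-18 15:20:01.100: [ OCRMAS][23]th_master:13: "
             "I AM THE NEW OCR MASTER at incar 1. Node Number 1",
    # Line as logged by crsd on node 2
    "node2": "2008-07-18 15:34:38.504: [ OCRMAS][23]th_master:13: "
             "I AM THE NEW OCR MASTER at incar 2. Node Number 2",
}
print(find_ocr_master(sample_logs))
```

<p>In practice you would fill the dictionary with the contents of the crsd.log of every node; the node with the latest announcement is the current master.</p>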
<p>So when hiding the ocrmirror from  node 2, we see in its alert file, as expected:</p>
<blockquote><p>2008-07-23 09:14:53.921<br />
[crsd(8215)]CRS-1006:The OCR location  <span style="color:#ff8000;">/dev/oracle/ocrmirror is inaccessible</span>. Details in  /app/oracle/crs/log/nodeb01/crsd/crsd.log.</p></blockquote>
<p>and in its  logfile:</p>
<blockquote><p>2008-07-23 09:14:53.920: [ OCROSD][14]utwrite:3: problem writing the  buffer 1a33000 buflen 4096 retval -1 phy_offset 143360 retry 0<br />
2008-07-23  09:14:53.920: [ OCROSD][14]utwrite:4: problem writing the buffer errno 5  errstring I/O error<br />
2008-07-23 09:14:53.922: [ OCRRAW][34]propriowv_bootbuf:  <span style="color:#ff8000;">Vote information on disk 0 [/dev/oracle/ocr] is  adjusted from [1/2] to [2/2]</span></p></blockquote>
<p>Nothing appears in the logfiles of the non-ocr-master, i.e. node 1. So up to now the situation is identical to scenario 1: it is the ocr master who updates the vote count after losing the other ocr.</p>
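<p>If you want to follow these vote adjustments across many log lines, a small parser helps. The sketch below extracts the disk number, device path and the old and new vote information from a &#8220;propriowv_bootbuf&#8221; line. The function name and the returned dictionary layout are my own choices for illustration, and reading [1/2] as &#8220;votes/total votes&#8221; is an assumption based on the log wording, not on Oracle documentation.</p>

```python
import re

# Matches e.g.
# "Vote information on disk 0 [/dev/oracle/ocr] is adjusted from [1/2] to [2/2]"
VOTE_CHANGE = re.compile(
    r"Vote information on disk (\d+) \[([^\]]+)\] is adjusted "
    r"from \[(\d+)/(\d+)\] to \[(\d+)/(\d+)\]"
)


def parse_vote_change(line):
    """Return a dict describing the vote adjustment in `line`, or None."""
    # Collapse repeated whitespace first: the log lines contain double spaces.
    match = VOTE_CHANGE.search(" ".join(line.split()))
    if match is None:
        return None
    disk, path, old_votes, old_total, new_votes, new_total = match.groups()
    return {
        "disk": int(disk),
        "path": path,
        "old": (int(old_votes), int(old_total)),  # assumed (votes, total)
        "new": (int(new_votes), int(new_total)),
    }


sample_line = (
    "2008-07-23 09:14:53.922: [ OCRRAW][34]propriowv_bootbuf:  "
    "Vote information on disk 0 [/dev/oracle/ocr] is  adjusted from [1/2] to [2/2]"
)
print(parse_vote_change(sample_line))
```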
<p>The ocrcheck on node 2 (master) now gives:</p>
<pre>         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File unavailable
         Cluster registry integrity check succeeded</pre>
<p>But  the output on node 1 (non-master) is different:</p>
<pre>         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File needs to be synchronized with the other device
         Cluster registry integrity check succeeded</pre>
<p>This  makes sense, because node 2 cannot see the device (device/file unavailable)  while node 1 sees both devices with different vote count (2 votes for ocr and 1  vote for ocrmirror, so it asks to resync just as in scenario 1 after the  ocrmirror was visible again).</p>
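<p>To compare what each node reports, the ocrcheck output can be reduced to one status per device. This is a rough sketch that only relies on the line layout of the outputs shown in this series; the function name is hypothetical and the parser has not been checked against other ocrcheck versions.</p>

```python
def parse_ocrcheck(output):
    """Map each OCR device path to the status line that follows it."""
    statuses = {}
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Device/File Name"):
            # "Device/File Name : /dev/oracle/ocr" gives the device path
            current = line.split(":", 1)[1].strip()
        elif current is not None and line.startswith("Device/File"):
            # The first Device/File line after a name is that device's status
            statuses[current] = line
            current = None
    return statuses


node1_output = """
         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File needs to be synchronized with the other device
         Cluster registry integrity check succeeded
"""
for device, status in parse_ocrcheck(node1_output).items():
    print(device, "=", status)
```

<p>Running the same function on the output of both nodes makes the difference between &#8220;unavailable&#8221; (master) and &#8220;needs to be synchronized&#8221; (non-master) easy to spot.</p>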
<p>So now I want to try to confuse CRS. I will try to resync the ocrmirror again from node 1, which will update the vote count of each device to 1. Technically this is possible because node 1 can see both devices, but if it succeeds node 2 will be left with one ocr device having one vote, and we know from rule 3 that CRS cannot run in that case. Will crs then crash on node 2?&#8230;</p>
<p>So we do on <span style="text-decoration:underline;">node 1</span>:</p>
<pre>-bash-3.00# ocrconfig -replace ocrmirror /dev/oracle/ocrmirror</pre>
<p>Bad  luck, it fails with</p>
<pre>PROT-21: Invalid parameter</pre>
<p>Very clear, right&#8230; The interesting part however appears in  the crsd logfile of <span style="text-decoration:underline;">node 2</span>:</p>
<blockquote><p>2008-07-23 09:19:34.712: [ OCROSD][32]utdvch:0:failed to open OCR  file/disk /dev/oracle/ocrmirror, errno=5, os err string=I/O error<br />
2008-07-23  09:19:34.712: [ OCRRAW][32]dev_replace: master could not verify the new disk  (8)<br />
[ OCRSRV][32]proas_replace_disk: Failed in changing configurations in the  Master 8</p></blockquote>
<p>So this teaches us that when the ocrconfig command is issued on node 1, which is not the master, the request is sent to node 2, the master, and node 2 executes it. What does NOT happen is that crs crashes, or that node 1 takes over the mastership from node 2. Nice to know.<br />
Very illogical, however, is that when running the last command above, the crs alert file of node 2 shows</p>
<blockquote><p>2008-07-23 09:19:34.711<br />
[crsd(8215)]CRS-1007:The OCR/OCR mirror  location was replaced by /dev/oracle/ocrmirror.</p></blockquote>
<p>This is WRONG. The ocrmirror was not replaced. The message should be: &#8220;<strong>Trying</strong> to replace the OCR/OCR mirror location by /dev/oracle/ocrmirror&#8221;. Just so you know.<br />
The logs on node 1 are correct. Find the latest log  in the &#8220;client&#8221; directory and read:</p>
<blockquote><p>Oracle Database 10g CRS Release 10.2.0.4.0 Production Copyright  1996, 2008 Oracle. All rights reserved.<br />
2008-07-23 09:19:34.694: [  OCRCONF][1]ocrconfig starts&#8230;<br />
2008-07-23 09:19:34.716: [  OCRCLI][1]proac_replace_dev:[/dev/oracle/ocrmirror]: Failed. Retval  [8]<br />
2008-07-23 09:19:34.716: [ OCRCONF][1]<span style="color:#ff8000;">The  input OCR device</span> either is identical to the other device or <span style="color:#ff8000;">cannot be opened</span><br />
2008-07-23 09:19:34.716: [  OCRCONF][1]Exiting [status=failed]&#8230;</p></blockquote>
<p>Conclusion: we cannot  confuse the crs!</p>
<p>After making the ocrmirror visible again and reissuing  the replace command on node 1, we get in the crs logfile on node 2 (master):</p>
<blockquote><p>2008-07-23 09:27:15.384: [ OCRRAW][32]proprioo: for disk 0  (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets  (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my votes  (2), total votes (2)<br />
2008-07-23 09:27:15.384: [ OCRRAW][32]propriogid:1:  INVALID FORMAT<br />
2008-07-23 09:27:15.516: [ OCRRAW][32]propriowv_bootbuf: Vote  information on disk 1 [<span style="color:#ff9900;">/dev/oracle/ocrmirror</span>] is adjusted from [0/0] to <span style="color:#ff9900;"> [1/2]</span><br />
2008-07-23 09:27:15.517: [ OCRRAW][32]propriowv_bootbuf: Vote  information on disk 0 [<span style="color:#ff9900;">/dev/oracle/ocr</span>] is adjusted from [2/2] to  <span style="color:#ff9900;">[1/2]</span><br />
2008-07-23 09:27:15.518: [ OCRMAS][25]th_master: Deleted ver keys from  cache (master)<br />
2008-07-23 09:27:15.628: [ OCRMAS][25]th_master: Deleted ver  keys from cache (master)</p></blockquote>
<p>and the crs logfile of node 1  (non-master):</p>
<blockquote><p>2008-07-23 09:27:15.543: [ OCRRAW][36]proprioo: for disk 0  (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets  (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my votes  (1), total votes (2)<br />
2008-07-23 09:27:15.543: [ OCRRAW][36]proprioo: for disk  1 (/dev/oracle/ocrmirror), id match (1), my id set (1385758746,1866209186) total  id sets (2), 1st set (1385758746,1866209186), 2nd set (1385758746,1866209186) my  votes (1), total votes (2)<br />
2008-07-23 09:27:15.571: [ OCRMAS][25]th_master:  Deleted ver keys from cache (non master)<br />
2008-07-23 09:27:15.572: [  OCRMAS][25]th_master: Deleted ver keys from cache (non  master)</p></blockquote>
<p>and the client logfile of node 1:</p>
<blockquote><p>Oracle Database 10g CRS Release 10.2.0.4.0 Production Copyright  1996, 2008 Oracle. All rights reserved.<br />
2008-07-23 09:27:15.346: [  OCRCONF][1]ocrconfig starts&#8230;<br />
2008-07-23 09:27:15.572: [  OCRCONF][1]Successfully replaced OCR and set block 0<br />
2008-07-23 09:27:15.572:  [ OCRCONF][1]Exiting [status=success]&#8230;</p></blockquote>
<p>and all is ok  again. Each device has one vote again, and we are back in the &#8216;normal&#8217; situation.</p>
<h5>Conclusion</h5>
<p>We cannot confuse the CRS when ocr or ocrmirror disappears  from the ocr master node only.</p>
<p>But what if it disappears from the non-master node&#8230;? That&#8217;s stuff for the next chapter.</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2009/10/02/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-3/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixy.gif?x-id=97fc21f8-f4ea-81ba-af14-fd9c2f512178" medium="image" />
	</item>
		<item>
		<title>The ultimate story about OCR, OCRMIRROR and 2 storage boxes &#8211; Chapter 2</title>
		<link>http://geertdepaep.wordpress.com/2009/09/19/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-2/</link>
		<comments>http://geertdepaep.wordpress.com/2009/09/19/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-2/#comments</comments>
		<pubDate>Sat, 19 Sep 2009 19:29:55 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[High availability]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/?p=81</guid>
		<description><![CDATA[Scenario 2: loss of ocrmirror, both nodes down (This is the follow-up of chapter 1) Let&#8217;s investigate the vote count a little further by doing the following test: First stop crs on both nodes Then make the lun with ocrmirror unavailable to both nodes What happens? Let&#8217;s check the ocr status before starting crs on [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=geertdepaep.wordpress.com&amp;blog=2187749&amp;post=81&amp;subd=geertdepaep&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<h3>Scenario 2: loss of ocrmirror, both nodes down</h3>
<p>(This is the follow-up of <a href="http://geertdepaep.wordpress.com/2009/09/01/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-1/">chapter 1</a>)</p>
<p>Let&#8217;s investigate the  vote count a little further by doing the following test:</p>
<ul>
<li>First stop crs on both nodes</li>
<li>Then make the lun with ocrmirror unavailable to both  nodes</li>
</ul>
<p>What happens?<br /><span id="more-81"></span></p>
<p>Let&#8217;s check the ocr status before starting crs on any  node:</p>
<blockquote><p>bash-3.00# ocrcheck<br />PROT-602: Failed to retrieve data from the  cluster registry</p>
</blockquote>
<p>The crs alert file shows:</p>
<blockquote><p>2008-07-18 15:57:36.438<br />[client(24204)]CRS-1011:OCR cannot  determine that the OCR content contains the latest updates. Details in  /app/oracle/crs/log/nodea01/client/ocrcheck_24204.log.</p>
</blockquote>
<p>and  the mentioned ocrcheck_24204.log file:</p>
<blockquote><p>Oracle Database 10g CRS Release 10.2.0.4.0 Production Copyright  1996, 2008 Oracle.<br />All rights reserved.<br />2008-07-18 15:57:36.405:  [OCRCHECK][1]ocrcheck starts&#8230;<br />2008-07-18 15:57:36.437: [  OCRRAW][1]proprioini: <span style="color:rgb(255,128,0);">disk 0 (/dev/oracle/ocr)  doesn&#8217;t<br />have enough votes (1,2)</span><br />2008-07-18 15:57:36.438: [  OCRRAW][1]proprinit: Could not open raw device<br />2008-07-18 15:57:36.438: [  default][1]a_init:7!: Backend init unsuccessful : [26]<br />2008-07-18  15:57:36.439: [OCRCHECK][1]Failed to access OCR repository: <span style="color:rgb(255,128,0);">[PROC-26: Error while accessing the physical  storage]</span><br />2008-07-18 15:57:36.439: [OCRCHECK][1]Failed to initialize  ocrchek2<br />2008-07-18 15:57:36.439: [OCRCHECK][1]Exiting  [status=failed]&#8230;</p>
</blockquote>
<p>I didn&#8217;t try to start the CRS at this time, but I am sure it would result in the same error messages. Note the colored messages. The second one explains what the real problem is: one of the ocr devices is unavailable (error while accessing the physical storage). This is exactly the information you need to troubleshoot a failing crs start. The other message tells us more about the internals: the remaining ocr has only 1 vote, which isn&#8217;t enough. So that&#8217;s rule 3 in the world of CRS. Read and remember once and for all:</p>
<ol>
<li>Rule 1: CRS can start if it finds 2 ocr devices each having one vote (the  normal case)</li>
<li>Rule 2: CRS can start if it finds 1 ocr having 2 votes (the case after losing the ocrmirror).</li>
<li>Rule 3: CRS CANNOT start if it finds only one ocr device having only 1  vote</li>
</ol>
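<p>The three rules can be written down as a tiny check. Note that treating them as &#8220;the visible devices must hold at least 2 votes together&#8221; is a simplification assumed for this sketch; the tests here only establish the three listed cases, and the function name is hypothetical.</p>

```python
def crs_can_start(visible_votes):
    """Apply the three rules to the vote counts of the visible OCR devices.

    `visible_votes` lists the vote count of each OCR device the node can
    read, e.g. [1, 1] in the normal case or [2] after losing the ocrmirror.
    """
    # Rule 1: two devices with one vote each  (total 2, can start)
    # Rule 2: one device holding two votes    (total 2, can start)
    # Rule 3: one device holding one vote     (total 1, cannot start)
    # Assumption: the rules boil down to a minimum of 2 visible votes.
    return sum(visible_votes) >= 2


print(crs_can_start([1, 1]))  # rule 1
print(crs_can_start([2]))     # rule 2
print(crs_can_start([1]))     # rule 3
```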
<p>Now if this is a production environment and we really want to get the cluster + databases up, how do we proceed? Well, we can do so by manually telling the cluster that the remaining ocr is valid and up-to-date. Note however that this is an important decision. It is up to you to know that the remaining ocr is valid. If you have been playing too much with missing luns, adding services, missing the other lun etc&#8230; the contents of the &#8216;invisible&#8217; ocrmirror may be more recent than those of the visible ocr. If in that case you tell crs that the ocr is valid, you may lose important information from your ocrmirror. Anyway, in most cases you will know very well what to do, and issue <i>as root</i>:</p>
<blockquote><p>ocrconfig -overwrite</p>
</blockquote>
<p>Now find the most recent file  in $ORA_CRS_HOME/log/nodename/client and see that it contains:</p>
<blockquote><p>Oracle Database 10g CRS Release 10.2.0.4.0 Production Copyright  1996, 2008 Oracle.<br />All rights reserved.<br />2008-07-18 15:59:56.828: [  OCRCONF][1]ocrconfig starts&#8230;<br />2008-07-18 15:59:58.644: [  OCRRAW][1]propriowv_bootbuf: <span style="color:rgb(255,128,0);">Vote information on  disk<br />0 [/dev/oracle/ocr] is adjusted from [1/2] to [2/2]</span><br />2008-07-18  15:59:58.644: [ OCRCONF][1]Successfully overwrote OCR configuration  on<br />disk<br />2008-07-18 15:59:58.644: [ OCRCONF][1]Exiting  [status=success]&#8230;</p>
</blockquote>
<p>So now we are in the situation of scenario  1: one ocr device available having 2 votes. This gives:</p>
<pre>Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     295452
         Used space (kbytes)      :       5112
         Available space (kbytes) :     290340
         ID                       : 1930338735
         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File unavailable

         Cluster registry integrity check succeeded</pre>
<p>And  the crs startup happens without problem:</p>
<pre>-bash-3.00# crsctl start crs
Attempting to start CRS stack
The CRS stack will be started shortly</pre>
<p>Note  however that you still have to recover from this as in scenario 1 using  &#8220;ocrconfig -replace ocrmirror /dev/&#8230;&#8221; once the storage box containing the ocrmirror is available  again.</p>
<h5>Conclusion of scenario 2</h5>
<p>When losing an ocr or ocrmirror while crs is down on both nodes, Oracle is not able to update the vote count of the remaining ocr (no crs processes are running to do this). As a consequence it is up to you to do that by using the &#8220;overwrite&#8221; option of ocrconfig. After this, CRS can start as normal, and later on you can recover when the ocrmirror becomes available again or when you can use another new device for ocrmirror.</p>
<p>So this looks great, let&#8217;s buy that additional storage box now.</p>
<p>But I am not satisfied yet. Until now we had &#8216;clean errors&#8217;, i.e. <span style="text-decoration:underline;">both</span> nodes were up or down, and the storage disappeared from <span style="text-decoration:underline;">both</span> nodes at the same time. Let&#8217;s <i>play</i> a little more in the next chapters&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2009/09/19/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-2/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixy.gif?x-id=9d2ba1ef-b8ab-8aac-9510-3488c1a68683" medium="image" />
	</item>
		<item>
		<title>The ultimate story about OCR, OCRMIRROR and 2 storage boxes &#8211; Chapter 1</title>
		<link>http://geertdepaep.wordpress.com/2009/09/01/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-1/</link>
		<comments>http://geertdepaep.wordpress.com/2009/09/01/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-1/#comments</comments>
		<pubDate>Tue, 01 Sep 2009 19:44:42 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[High availability]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/?p=90</guid>
		<description><![CDATA[Scenario 1: loss of ocrmirror, both nodes up (This is the followup of article &#8220;Introduction&#8220;) Facts CRS is running on all nodes The storage box containing the OCRmirror is made unavailable to both hosts (simulating a crash of one storage box). What happens? The crs alertfile ($ORA_CRS_HOME/log/hostname/alert.log) of node 1 shows: 2008-07-18 15:30:23.176 [crsd(6563)]CRS-1006:The OCR [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=geertdepaep.wordpress.com&amp;blog=2187749&amp;post=90&amp;subd=geertdepaep&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<h3>Scenario 1: loss of ocrmirror, both nodes up</h3>
<p>(This is the follow-up of the article &#8220;<a href="http://geertdepaep.wordpress.com/2009/08/28/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-introduction/">Introduction</a>&#8221;)</p>
<h5>Facts</h5>
<ul>
<li>CRS is running on all nodes</li>
<li>The storage box containing the OCRmirror is made unavailable to both hosts (simulating a crash of one storage box).</li>
</ul>
<p>What happens?<br />
<span id="more-90"></span></p>
<p>The crs alertfile  ($ORA_CRS_HOME/log/hostname/alert.log) of node 1 shows:</p>
<blockquote><p>2008-07-18 15:30:23.176<br />
[crsd(6563)]CRS-1006:The OCR location  /dev/oracle/ocrmirror is inaccessible. Details in  /app/oracle/crs/log/nodea01/crsd/crsd.log.</p></blockquote>
<p>And the CRS  logfile of node 1 shows:</p>
<blockquote><p>2008-07-18 15:30:23.176: [ OCROSD][14]utwrite:3: problem writing the  buffer 1c03000 buflen 4096 retval -1 phy_offset 102400 retry 0<br />
2008-07-18  15:30:23.176: [ OCROSD][14]utwrite:4: problem writing the buffer errno 5  errstring I/O error<br />
2008-07-18 15:30:23.177: [ OCRRAW][768]propriowv_bootbuf:  <span style="color:#ff8000;">Vote information on disk 0 [/dev/oracle/ocr] is  adjusted from [1/2] to [2/2]</span></p></blockquote>
<p>There is nothing in the crs alertfile or crsd logfile of node 2 (although node 2 can&#8217;t see the lun either).<br />
On both nodes we have:</p>
<pre>(/app/oracle) $ ocrcheck
Status of Oracle Cluster Registry is as follows :
         Version                  :          2
         Total space (kbytes)     :     295452
         Used space (kbytes)      :       5112
         Available space (kbytes) :     290340
         ID                       : 1930338735
         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File unavailable
         Cluster registry integrity check succeeded</pre>
<p>CRS  continues to work <span style="color:#ff8000;">normally</span> on both nodes</p>
<h5>Discussion</h5>
<p>This test indicates that the loss of the ocrmirror leaves  the cluster running normally. In other words, a crash of a storage box would  allow us to continue our production normally. Great!</p>
<p>However I&#8217;m not easily satisfied and hence I still have a lot of questions: how to recover from this, what happens internally, can we now change/update the ocr, &#8230;? Let&#8217;s investigate.</p>
<p>The most interesting things are colored in the output above.  The fact that the ocrmirror device file is unavailable makes sense. Remember  however the other message: vote count updated from 1 to 2.<br />
Let&#8217;s see what  happens if we now stop and start CRS on node 1 (while ocrmirror is still  unavailable):<br />
Stopping CRS on node 1 happens as usual, no error messages.  However at the time of stopping CRS on node 1, we see a very interesting message  in the crsd logfile of <strong>node 2</strong>:</p>
<blockquote><p>2008-07-18 15:34:38.504: [ OCRMAS][23]th_master:13: <span style="color:#ff8000;">I AM THE NEW OCR MASTER</span> at incar 2. Node Number  2<br />
2008-07-18 15:34:38.511: [ OCRRAW][23]proprioo: for disk 0  (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets  (1), 1st set (1385758746,1866209186), 2nd set (0,0) <span style="color:#ff8000;">my votes (2), total votes (2)</span><br />
2008-07-18  15:34:38.514: [ OCROSD][23]utread:3: problem reading buffer 162e000 buflen 4096  retval -1 phy_offset 106496 retry 0<br />
2008-07-18 15:34:38.514: [  OCROSD][23]utread:4: problem reading the buffer errno 5 errstring I/O  error<br />
2008-07-18 15:34:38.559: [ OCRMAS][23]th_master: Deleted ver keys from  cache (master)</p></blockquote>
<p>I am the new master??? So it looks as if node 1 was the master until we stopped CRS there. This ties in with the fact that, when the lun became unavailable, only node 1 wrote messages in its logfiles. At that time, nothing was written into the logfile of node 2, because node 2 was not the master! A very interesting concept: <span style="text-decoration:underline;">in a RAC cluster, one node is the crs master and is responsible for updating the vote count in the OCR.</span> I never read that in the doc&#8230;. Also note that the new master identifies that the ocr has 2 votes now: &#8220;my votes (2)&#8221;.</p>
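<p>The &#8220;my votes&#8221; and &#8220;total votes&#8221; values can be pulled out of such a proprioo line with a short regular expression. This is only a sketch based on the log lines quoted here: the function name and result layout are hypothetical, and the meaning of the other fields in the line (id sets and so on) is left alone.</p>

```python
import re

# Matches e.g. "proprioo: for disk 0 (/dev/oracle/ocr), ... my votes (2), total votes (2)"
PROPRIOO = re.compile(
    r"proprioo: for disk (\d+) \(([^)]+)\).*?"
    r"my votes \((\d+)\), total votes \((\d+)\)"
)


def parse_proprioo(line):
    """Return disk, path and vote counts from a proprioo line, or None."""
    # Collapse repeated whitespace first: the log lines contain double spaces.
    match = PROPRIOO.search(" ".join(line.split()))
    if match is None:
        return None
    disk, path, my_votes, total_votes = match.groups()
    return {
        "disk": int(disk),
        "path": path,
        "my_votes": int(my_votes),
        "total_votes": int(total_votes),
    }


sample_line = (
    "2008-07-18 15:34:38.511: [ OCRRAW][23]proprioo: for disk 0  "
    "(/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) "
    "total id sets  (1), 1st set (1385758746,1866209186), 2nd set (0,0) "
    "my votes (2), total votes (2)"
)
print(parse_proprioo(sample_line))
```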
<p>Also, at the time of stopping CRS on  node 1, the crs alert file of node 2 showed:</p>
<blockquote><p>2008-07-18 15:34:38.446<br />
[evmd(18282)]CRS-1006:The OCR location  <span style="color:#ff8000;">/dev/oracle/ocrmirror is inaccessible</span>. Details in  /app/oracle/crs/log/nodeb01/evmd/evmd.log.<br />
2008-07-18  15:34:38.514<br />
[crsd(18594)]CRS-1006:The OCR location /dev/oracle/ocrmirror is  inaccessible. Details in  /app/oracle/crs/log/nodeb01/crsd/crsd.log.<br />
2008-07-18  15:34:38.558<br />
[crsd(18594)]CRS-1005:The OCR upgrade was completed. Version has  changed from 169870336 to 169870336. Details in  /app/oracle/crs/log/nodeb01/crsd/crsd.log.<br />
2008-07-18  15:34:55.153</p></blockquote>
<p>So it looks as if node 2 checks the availability of the ocrmirror again and sees that it is not available.</p>
<p>Now let&#8217;s start CRS on node 1 again. Maybe it becomes master again?&#8230; Not really. The only thing we see in the crsd logfile is:</p>
<blockquote><p>2008-07-18 15:39:19.603: [ CLSVER][1] Active Version from  OCR:10.2.0.4.0<br />
2008-07-18 15:39:19.603: [ CLSVER][1] Active Version and  Software Version are same<br />
2008-07-18 15:39:19.603: [ CRSMAIN][1] Initializing  OCR<br />
2008-07-18 15:39:19.619: [ OCRRAW][1]proprioo: for disk 0  (/dev/oracle/ocr), id match (1), my id set (1385758746,1866209186) total id sets  (1), 1st set (1385758746,1866209186), 2nd set (0,0) my votes (2), total votes  (2)</p></blockquote>
<h5>Recovery</h5>
<p>Now how do we get things back to normal? Let&#8217;s first make the lun visible again on the SAN switch. At that time nothing happens in any logfile, so CRS doesn&#8217;t seem to poll to see if the ocrmirror is back. However, when we now run ocrcheck, we get:</p>
<pre>Status of Oracle Cluster Registry is as follows :
Version                  :          2
Total space (kbytes)     :     295452
Used space (kbytes)      :       5112
Available space (kbytes) :     290340
ID                       : 1930338735
Device/File Name         : /dev/oracle/ocr
Device/File integrity check succeeded
Device/File Name         : /dev/oracle/ocrmirror
<span style="color:#ff8000;">Device/File needs to be synchronized with the other device</span>

Cluster registry integrity check succeeded</pre>
<p>Again, this makes sense. While the ocrmirror was unavailable, you may have added services, instances or whatever, so the contents of the (old) ocrmirror may differ from those of the current ocr. In our case, however, nothing was changed at cluster level, so in theory the contents of ocr and ocrmirror should still be the same. Still, we get the message above. Anyway, the way to synchronize the ocrmirror is to issue <em>as root</em>:</p>
<blockquote><p>ocrconfig -replace ocrmirror  /dev/oracle/ocrmirror</p></blockquote>
<p>This will copy the contents of the ocr  over the ocrmirror being located at /dev/oracle/ocrmirror. In other words, it  will create a new ocrmirror in location /dev/oracle/ocrmirror as a copy of the  existing ocr. Be careful with the syntax; do not use &#8220;-replace <span style="text-decoration:underline;">ocr</span>&#8221; when  the ocr<span style="text-decoration:underline;">mirror</span> is corrupt.<br />
At  that time, we see in the crs logfile on both nodes:</p>
<blockquote><p>2008-07-18 15:51:06.254: [ OCRMAS][25]th_master: Deleted ver keys  from cache (non master)</p>
<p>2008-07-18 15:51:06.263: [ OCRRAW][30]proprioo:  for disk 0 (<span style="color:#ff8000;">/dev/oracle/ocr</span>), id match (1),  my id set (1385758746,1866209186) total id sets (2), 1st set  (1385758746,1866209186), 2nd set (1385758746,1866209186) <span style="color:#ff8000;">my votes (1)</span>, total votes (2)</p>
<p>2008-07-18  15:51:06.263: [ OCRRAW][30]proprioo: for disk 1 (<span style="color:#ff8000;">/dev/oracle/ocrmirror</span>), id match (1), my id set  (1385758746,1866209186) total id sets (2), 1st set (1385758746,1866209186), 2nd  set (1385758746,1866209186) <span style="color:#ff8000;">my votes (1)</span>,  total votes (2)</p>
<p>2008-07-18 15:51:06.364: [ OCRMAS][25]th_master: Deleted  ver keys from cache (non master)</p></blockquote>
<p>and in the alert file:</p>
<blockquote><p>2008-07-18 15:51:06.246<br />
[crsd(13848)]CRS-1007:<span style="color:#ff8000;">The OCR/OCR mirror location was replaced by  /dev/oracle/ocrmirror</span>.</p></blockquote>
<p>Note again the highlighted  messages above: each ocr again has 1 vote. And all is ok again:</p>
<pre>Status of Oracle Cluster Registry is as follows :
Version                  :          2
Total space (kbytes)     :     295452
Used space (kbytes)      :       5112
Available space (kbytes) :     290340
ID                       : 1930338735
Device/File Name         : /dev/oracle/ocr
Device/File integrity check succeeded
Device/File Name         : /dev/oracle/ocrmirror
Device/File integrity check succeeded
Cluster registry integrity check succeeded</pre>
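<p>Since CRS does not seem to report by itself that the ocrmirror is back but out of sync, one could check the ocrcheck output from a script. The following is a minimal sketch that only parses text in the layout shown in this post; how ocrcheck is invoked (path, user) is left out on purpose, and the function names are hypothetical.</p>

```python
def ocr_device_status(ocrcheck_output):
    """Map each 'Device/File Name' to the status line that follows it."""
    status = {}
    device = None
    for raw in ocrcheck_output.splitlines():
        line = raw.strip()
        if line.startswith("Device/File Name"):
            device = line.split(":", 1)[1].strip()
        elif line.startswith("Device/File") and device is not None:
            status[device] = line
            device = None
    return status


def devices_needing_attention(ocrcheck_output):
    """Devices whose status line is anything other than a successful integrity check."""
    return [
        dev
        for dev, text in ocr_device_status(ocrcheck_output).items()
        if "integrity check succeeded" not in text
    ]


sample = """Device/File Name         : /dev/oracle/ocr
Device/File integrity check succeeded
Device/File Name         : /dev/oracle/ocrmirror
Device/File needs to be synchronized with the other device

Cluster registry integrity check succeeded"""
print(devices_needing_attention(sample))
```

<p>Note that the sketch flags every status line that is not a successful integrity check, so it does not tell you which repair is needed; for the situation in this post that repair is the ocrconfig -replace ocrmirror command shown earlier.</p>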
<h5>Conclusion of scenario 1</h5>
<p>Losing the storage box containing the ocrmirror is no problem (the same is true for losing the ocr while the ocrmirror remains available). Moreover, it can be recovered without having to stop the cluster (the restart of CRS on node 1 above was for educational purposes only). This corresponds with what is stated in the RAC FAQ in Metalink Note 220970.1: &#8220;If the corruption happens while the Oracle Clusterware stack is up and running, then the corruption will be tolerated and the Oracle Clusterware will continue to function without interruptions&#8221; (however, I think that the logfiles above give you much more insight into what really happens).</p>
<p>However, another important concept is the story of the vote count. The test above shows that CRS is able to start if it finds 2 ocr devices each having one vote (the normal case) or if it finds 1 ocr having 2 votes (the case after losing the ocrmirror). Note that at the moment of the failure, the vote count of the ocr could be increased by Oracle from 1 to 2, because CRS was running.</p>
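<p>One way to read the vote counts is as a simple majority rule. The sketch below is my own model, not something taken from the Oracle documentation: it assumes that CRS can start when the OCR devices it can read hold more than half of the total votes. That assumption is consistent with the two cases observed above; the third call is a case this chapter did not test.</p>

```python
def crs_can_start(votes_on_readable_devices, total_votes=2):
    """Assumed rule: the readable OCR devices must hold more than half of all votes."""
    return 2 * sum(votes_on_readable_devices) > total_votes


# Normal case: ocr and ocrmirror are both readable, one vote each.
print(crs_can_start([1, 1]))
# ocrmirror lost while CRS was running: the ocr was updated to 2 votes.
print(crs_can_start([2]))
# Only one device readable and it still has a single vote (not tested in this chapter).
print(crs_can_start([1]))
```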
<p>In the next chapter, we will do this over again, but with both nodes down&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2009/09/01/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-chapter-1/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixy.gif?x-id=f53e83da-ad0e-8e9f-a158-e92ab0f7931b" medium="image" />
	</item>
		<item>
		<title>The ultimate story about OCR, OCRMIRROR and 2 storage boxes &#8211; Introduction</title>
		<link>http://geertdepaep.wordpress.com/2009/08/28/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-introduction/</link>
		<comments>http://geertdepaep.wordpress.com/2009/08/28/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-introduction/#comments</comments>
		<pubDate>Fri, 28 Aug 2009 18:57:01 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[High availability]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/?p=71</guid>
		<description><![CDATA[Some time ago I wrote a blog about stretched clusters and the OCR. The final conclusion at that time was that there was no easy way to get your OCR safe on both storages, and hence I advised against clusters with 2 storage boxes. However, after some more investigation I may have to change my mind. [...] ]]></description>
			<content:encoded><![CDATA[<p>Some time ago I wrote <a href="http://geertdepaep.wordpress.com/2008/01/24/my-experiences-with-ocrmirror-voting-disks-and-stretched-clusters/" target="_blank">a blog about stretched clusters and the OCR</a>. The final conclusion at that time was that there was no easy way to get your OCR safe on both storages, and hence I advised against clusters with 2 storage boxes. However, after some more investigation I may have to change my mind. I did extensive testing on the OCR and in this blog I want to share my experiences.<span id="more-71"></span></p>
<p>This is the setup:</p>
<ul>
<li>2-node RAC cluster (10.2.0.4 on Solaris), located in 2 server rooms</li>
<li>2 storage boxes, one in each server room</li>
<li>ASM mirroring of all data (diskgroups with normal redundancy)</li>
<li>One voting disk on one storage box, 2nd voting disk on the other box, 3rd voting disk on nfs on a server in a 3rd location (outside the 2 server rooms)</li>
</ul>
<p>For the components above, this setup is safe against server room failure:</p>
<ul>
<li>The data is mirrored in ASM and will remain available on the other box.</li>
<li>The cluster can continue because it still sees 2 voting disks (one in the surviving server room and one on nfs).</li>
</ul>
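<p>The voting disk reasoning can be written down in a few lines. The sketch below assumes the usual rule that a node must be able to reach a strict majority of the voting disks to stay in the cluster; the site names are just labels for the setup described above.</p>

```python
def survives_loss_of(site, voting_disks):
    """True if a strict majority of voting disks remains reachable after losing one site."""
    remaining = sum(count for name, count in voting_disks.items() if name != site)
    total = sum(voting_disks.values())
    return 2 * remaining > total


# One voting disk per storage box plus one on nfs in a third location.
layout = {"server_room_1": 1, "server_room_2": 1, "nfs_third_location": 1}
for site in layout:
    print(site, survives_loss_of(site, layout))

# Without the third location, losing either room leaves only 1 of 2 voting disks.
print(survives_loss_of("server_room_1", {"server_room_1": 1, "server_room_2": 1}))
```

<p>This is why the third voting disk sits outside the 2 server rooms: with only two voting disks, no single room failure leaves a majority.</p>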
<p>But what about the OCR?</p>
<p>We did what looks logical: OCR on storage box 1 and OCRmirror on storage box 2, resulting in:</p>
<pre>         Device/File Name         : /dev/oracle/ocr
                                    Device/File integrity check succeeded
         Device/File Name         : /dev/oracle/ocrmirror
                                    Device/File integrity check succeeded</pre>
<p>Now we can start playing. For the uninitiated reader, &#8220;playing&#8221; means: closing ports on the fibre switches in such a way that a storage box becomes totally unavailable to the servers. This simulates a storage box failure.</p>
<p>The result is a story of 5 chapters and a conclusion. Please stand by for the upcoming blog posts.</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2009/08/28/the-ultimate-story-about-ocr-ocrmirror-and-2-storage-boxes-introduction/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>

		<media:content url="http://img.zemanta.com/pixy.gif?x-id=cb4142d1-67b1-81e4-85dc-d3b4617f91e1" medium="image" />
	</item>
		<item>
		<title>Apex beats Access</title>
		<link>http://geertdepaep.wordpress.com/2008/09/21/apex-beats-access/</link>
		<comments>http://geertdepaep.wordpress.com/2008/09/21/apex-beats-access/#comments</comments>
		<pubDate>Sun, 21 Sep 2008 19:18:23 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[Apex]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/2008/09/21/apex-beats-access/</guid>
		<description><![CDATA[This year I became part of the parents&#8217; council of my children&#8217;s school. And like every small organisation in Belgium (football clubs, gymnastics, schools, &#8230;) it is the time of year to do the annual mussel-event for fund raising (mmm, delicious). My school happened to use an Access-IIS-ASP application to do the billing when people [...] ]]></description>
			<content:encoded><![CDATA[<p>This year I became part of the parents&#8217; council of my children&#8217;s school. And like every small organisation in Belgium (football clubs, gymnastics, schools, &#8230;) it is the time of year to do the annual <a href="http://en.wikipedia.org/wiki/Mussel">mussel</a>-event for fund raising (mmm, delicious). My school happened to use an Access-IIS-ASP application to do the billing when people leave the event (calculating what they owe for their food and drinks). But o fortuna, the evening before the event, the application turned out to crash systematically when entering data, and murphy oh murphy, its author had just become a father and could not be reached. Panic of course.<span id="more-45"></span></p>
<p>As it happened, I had an old laptop with Oracle XE (and hence Apex) installed, so I said: give me two hours. The data model was very simple: one table with all food, drinks and prices, a second table for the history of all orders and payments, and then some screens multiplying amounts by price and summing it all up.</p>
<p>And then you see how great Apex is. In two hours indeed, I could make a fully functional, nice-looking and robust application with an operational and an admin part, including various reports. It worked like a charm and the school team was astonished that such a core part of their administration could be made in so little time. Compared to the alternative they would otherwise have used (Excel, without keeping any history of consumption, which would prevent us from planning our inventory for next year), this was a great solution. I hope it may open the way for more Apex applications in the school, because Apex is really a great tool.</p>
<p>P.S. I won&#8217;t use my blog for not telling that I won&#8217;t go to OOW with no flight number and not staying in any hotel, as I think that it is quite a waste of bandwidth spending complete blog posts on the practical arrangements of OOW trips. I am really waiting for the real technical posts and experiences of all those people there.</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2008/09/21/apex-beats-access/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>
	</item>
		<item>
		<title>A really recommended ASM patch &#8211; failing lun</title>
		<link>http://geertdepaep.wordpress.com/2008/08/06/a-really-recommended-asm-patch-failing-lun/</link>
		<comments>http://geertdepaep.wordpress.com/2008/08/06/a-really-recommended-asm-patch-failing-lun/#comments</comments>
		<pubDate>Wed, 06 Aug 2008 15:22:30 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[High availability]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/2008/08/06/a-really-recommended-asm-patch-failing-lun/</guid>
		<description><![CDATA[The following is a real life experience about failing disks/luns and how ASM reacts to this. We used 10.2.0.4 on Solaris with a HDS storage box and MPXio. We made an ASM diskgroup of 2 mirrored disks. Then we made the luns unavailable to the hosts (hide lun). The result was not really what we [...] ]]></description>
			<content:encoded><![CDATA[<p>The following is a real life experience about failing disks/luns and how ASM reacts to this. We used 10.2.0.4 on Solaris with a HDS storage box and MPXio. We made an ASM diskgroup of 2 mirrored disks. Then we made the luns <strong>un</strong>available to the hosts (hide lun). The result was not really what we expected.<span id="more-16"></span>After hiding the lun, the ASM alert file shows the following &#8216;normal&#8217; messages:</p>
<blockquote><p>15:26:57+: Errors in file /app/oracle/admin/+ASM/bdump/+asm1_gmon_10611.trc:<br />
15:26:57+: ORA-27091: unable to queue I/O<br />
15:26:57+: ORA-27072: File I/O error<br />
15:26:57+: SVR4 Error: 5: I/O error<br />
15:26:57+: Additional information: 4<br />
15:26:57+: Additional information: 2048<br />
15:26:57+: Additional information: -1</p></blockquote>
<p>and some time later</p>
<blockquote><p>15:32:23+: WARNING: offlining disk 0.4042332304 (ARESDATAA) with mask 0x3<br />
15:32:23+: WARNING: offlining disk 0.4042332304 (ARESDATAA) with mask 0x3</p></blockquote>
<p>and</p>
<blockquote><p>15:32:33+: WARNING: kfk failed to open a disk[/dev/oracle/asm/aresdata-a]<br />
15:32:33+: Errors in file /app/oracle/admin/+ASM/udump/+asm1_ora_18313.trc:<br />
15:32:33+: ORA-15025: could not open disk '/dev/oracle/asm/aresdata-a'<br />
15:32:33+: ORA-27041: unable to open file<br />
15:32:33+: SVR4 Error: 5: I/O error<br />
15:32:33+: Additional information: 3<br />
15:32:33+: NOTE: PST update: grp = 1, dsk = 0, mode = 0x6<br />
15:32:35+: NOTE: PST update: grp = 1, dsk = 0, mode = 0x4<br />
15:32:35+: NOTE: group ARESDATA: relocated PST to: disk 0001 (PST copy 0)<br />
15:32:35+: NOTE: PST update: grp = 1, dsk = 0, mode = 0x4<br />
15:32:35+: NOTE: cache closing disk 0 of grp 1: ARESDATAA<br />
15:32:35+: NOTE: cache closing disk 0 of grp 1: ARESDATAA<br />
15:33:50+: WARNING: PST-initiated drop disk 1(780265568).0(4042332304) (ARESDATAA)<br />
15:33:50+: NOTE: PST update: grp = 1<br />
15:33:50+: NOTE: group ARESDATA: relocated PST to: disk 0001 (PST copy 0)<br />
15:33:50+: NOTE: requesting all-instance membership refresh for group=1<br />
15:33:50+: NOTE: membership refresh pending for group 1/0x2e81e860 (ARESDATA)<br />
15:33:50+: SUCCESS: refreshed membership for 1/0x2e81e860 (ARESDATA)<br />
15:33:53+: SUCCESS: PST-initiated disk drop completed<br />
15:33:53+: SUCCESS: PST-initiated disk drop completed<br />
15:33:56+: NOTE: starting rebalance of group 1/0x2e81e860 (ARESDATA) at power 1<br />
15:33:56+: Starting background process ARB0<br />
15:33:56+: ARB0 started with pid=21, OS id=19285<br />
15:33:56+: NOTE: assigning ARB0 to group 1/0x2e81e860 (ARESDATA)<br />
15:33:56+: NOTE: F1X0 copy 1 relocating from 0:2 to 1:2<br />
15:33:56+: NOTE: F1X0 copy 2 relocating from 1:2 to 0:2<br />
15:33:56+: NOTE: F1X0 copy 3 relocating from 65534:4294967294 to 65534:4294967294<br />
15:33:56+: NOTE: X-&gt;S down convert bast on F1B3 bastCount=2<br />
&#8230;<br />
15:34:14+: NOTE: group ARESDATA: relocated PST to: disk 0001 (PST copy 0)<br />
15:34:14+: WARNING: offline disk number 0 has references (1394 AUs)<br />
15:34:14+: NOTE: PST update: grp = 1<br />
15:34:14+: NOTE: group ARESDATA: relocated PST to: disk 0001 (PST copy 0)</p></blockquote>
<p>However, every time we query the v$asm_disk view, we experience hangs of 30..90 seconds. The same happens when adding a tablespace in a database. It looks as if ASM, whenever it tries to access the failed disk, waits for OS timeouts. I assume that every operation that needs access to ASM disks (e.g. data file autoextend that needs to allocate space in the asm disk, creation of an archivelog, &#8230;) suffers from these timeouts. Not really acceptable for a production environment. I do want to mention that dd, and even orion for asynch IO, detect the error immediately without waiting for any timeout.</p>
<p>You can clearly see the ASM waits when you truss the server process of the asm sqlplus session when you do a select on v$asm_disk. For each failed disk you get:</p>
<blockquote><p>9377/1:		17.3106	open("/dev/oracle/asm/aresdata-a", O_RDWR|O_DSYNC) Err#5 EIO<br />
9377/1:		18.3195	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		18.3198	open("/dev/oracle/asm/aresdata-a", O_RDWR|O_DSYNC) Err#5 EIO<br />
9377/1:		19.3295	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		19.3298	open("/dev/oracle/asm/aresdata-a", O_RDWR|O_DSYNC) Err#5 EIO<br />
9377/1:		20.3395	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		20.3398	open("/dev/oracle/asm/aresdata-a", O_RDWR|O_DSYNC) Err#5 EIO<br />
9377/1:		21.3495	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		21.3497	open("/dev/oracle/asm/aresdata-a", O_RDWR|O_DSYNC) Err#5 EIO<br />
9377/1:		22.3595	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		22.3598	open("/dev/oracle/asm/aresdata-a", O_RDWR|O_DSYNC) Err#5 EIO<br />
9377/1:		22.3605	open("/dev/oracle/asm/aresdata-a", O_RDONLY|O_DSYNC) Err#5 EIO<br />
9377/1:		23.3695	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		23.3697	open("/dev/oracle/asm/aresdata-a", O_RDONLY|O_DSYNC) Err#5 EIO<br />
9377/1:		24.3795	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		24.3798	open("/dev/oracle/asm/aresdata-a", O_RDONLY|O_DSYNC) Err#5 EIO<br />
9377/1:		25.3895	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		25.3897	open("/dev/oracle/asm/aresdata-a", O_RDONLY|O_DSYNC) Err#5 EIO<br />
9377/1:		26.3995	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		26.3998	open("/dev/oracle/asm/aresdata-a", O_RDONLY|O_DSYNC) Err#5 EIO<br />
9377/1:		27.4095	nanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0<br />
9377/1:		27.4097	open("/dev/oracle/asm/aresdata-a", O_RDONLY|O_DSYNC) Err#5 EIO<br />
9377/1:		27.4105	write(5, " * * *   2 0 0 8 - 0 7 -".., 27)	= 27<br />
9377/1:		27.4106	write(5, "\n", 1)				= 1<br />
9377/1:		27.4109	write(5, " W A R N I N G :   k f k".., 62)	= 62<br />
9377/1:		27.4109	write(5, "\n", 1)				= 1<br />
9377/1:		27.4111	close(6)					= 0<br />
9377/1:		27.4111	open("/app/oracle/admin/+ASM/bdump/alert_+ASM2.log", O_WRONLY|O_APPEND|O_CREAT|O_LARGEFILE, 0660) = 6<br />
9377/1:		27.4111	time()						= 1217417698<br />
9377/1:		27.4119	writev(6, 0xFFFFFFFF7FFF8080, 3)		= 88</p></blockquote>
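<p>So it tries to open each failed disk 6 times in read-write mode and 6 times again in read-only mode, sleeping about one second between attempts. In the trace above, that is a loss of roughly 10 valuable seconds per failed disk&#8230;</p>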
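<p>Counting these attempts by hand gets tedious with more than one failed disk, so a small script can do it. The following is a sketch that assumes truss lines in the layout shown above (pid/lwp, elapsed timestamp, system call) with plain double quotes around the path; the function name and the result layout are made up for the example.</p>

```python
import re

# Matches failed opens such as:
#   9377/1:  17.3106  open("/dev/oracle/asm/aresdata-a", O_RDWR|O_DSYNC) Err#5 EIO
FAILED_OPEN = re.compile(
    r'^\S+:\s+(\d+\.\d+)\s+open\("([^"]+)",\s*(O_RDWR|O_RDONLY)\S*\)\s+Err#5 EIO'
)


def summarize_failed_opens(truss_text):
    """Per device: (read-write attempts, read-only attempts, seconds from first to last attempt)."""
    summary = {}
    for line in truss_text.splitlines():
        match = FAILED_OPEN.match(line.strip())
        if not match:
            continue
        stamp = float(match.group(1))
        device = match.group(2)
        mode = match.group(3)
        entry = summary.setdefault(
            device, {"O_RDWR": 0, "O_RDONLY": 0, "first": stamp, "last": stamp}
        )
        entry[mode] += 1
        entry["last"] = stamp
    return {
        dev: (e["O_RDWR"], e["O_RDONLY"], round(e["last"] - e["first"], 4))
        for dev, e in summary.items()
    }


sample = "\n".join([
    '9377/1:\t\t17.3106\topen("/dev/oracle/asm/aresdata-a", O_RDWR|O_DSYNC) Err#5 EIO',
    "9377/1:\t\t18.3195\tnanosleep(0xFFFFFFFF7FFF8FE0, 0xFFFFFFFF7FFF8FD0) = 0",
    '9377/1:\t\t18.3198\topen("/dev/oracle/asm/aresdata-a", O_RDWR|O_DSYNC) Err#5 EIO',
    '9377/1:\t\t22.3605\topen("/dev/oracle/asm/aresdata-a", O_RDONLY|O_DSYNC) Err#5 EIO',
])
print(summarize_failed_opens(sample))
```

<p>Fed with the complete trace above, it should report 6 read-write and 6 read-only attempts on /dev/oracle/asm/aresdata-a, spread over roughly 10 seconds.</p>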
<p>At the same time, the os messages file generates the following messages every second:</p>
<blockquote><p>Jul 30 11:01:05 node1 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g60060e800562f400000062f4000000d5 (ssd63):<br />
Jul 30 11:01:05 node1 offline or reservation conflict</p>
<p>Jul 30 11:01:06 node1 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ssd@g60060e800562f400000062f4000000d5 (ssd63):<br />
Jul 30 11:01:06 node1 offline or reservation conflict</p></blockquote>
<p>We would expect that ASM is intelligent enough to detect that the disk failed, but obviously it keeps trying to access it including the waits and timeouts.<br />
FYI, the state of the disks after the failure has become:</p>
<pre>DISKGROUP  PATH                            Total   Used  %Usd ST  Header    FAILGROUP   STATE      DiskName
---------- ------------------------------ ------ ------ ----- --- --------- ----------- ---------- ---------------
ARESDATA   /dev/oracle/asm/aresdata-b      46068   1396     3 ONL MEMBER    B           MOUNTED    ARESDATAB

ARESDATA                                   46068   1396     3 OFF UNKNOWN   A           MOUNTED    ARESDATAA</pre>
<p>I opened an SR on Metalink and uploaded all possible traces I could generate (*). And guess what, for some reason (maybe (*)), I immediately got an excellent engineer who identified the problem straight away as a known bug, and asked development to provide a patch for 10.2.0.4 (which did not exist yet at that time). It took only 5 days for the patch to become available, and that patch solves the problem completely. After applying it, every select on v$asm_disk returns immediately.</p>
<p>This is it:</p>
<blockquote><p>Patch 6278034<br />
Description WHEN SWITCHING OFF ONE ARRAY CONTAINING ONE FAILGROUP, THE PERFORMANCE TURNS BAD<br />
Product RDBMS Server<br />
Release 10.2.0.3, 10.2.0.4<br />
Platform: Sun Solaris SPARC (64-bit)<br />
Last Updated 04-AUG-2008<br />
Size 97K (99336 bytes)<br />
Classification General</p></blockquote>
<p>The patch exists for 10.2.0.3 as well. I would recommend installing it in the oracle_home where ASM runs. However, I have no idea whether the problem also applies to non-Solaris environments.</p>
<p>Note: To resync the disks after the lun is available again, use the ALTER DISKGROUP some_diskgroup ADD FAILGROUP x DISK '/dev/path/somedevice' NAME some_new_name [FORCE] command. This is not so straightforward: it turns out that trying to offline or drop the disk will not work. For example:</p>
<p>==================== OVERVIEW OF ASM DISKS ======================================</p>
<p><span style="font-size:xx-small;font-family:'Courier New',Courier,monospace;">DISKGROUP  PATH                            Total   Used  %Usd ST  Header    FAILGROUP   STATE      DiskName        MOUNT_S<br />
---------- ------------------------------ ------ ------ ----- --- --------- ----------- ---------- --------------- -------<br />
ARESARCHA  /dev/oracle/asm/aresarch-a      30708  11226    36 ONL MEMBER    AREAARCHA   NORMAL     AREAARCHA       CACHED<br />
ARESARCHB  /dev/oracle/asm/aresarch-b      30708  11202    36 ONL MEMBER    ARESARCHB   NORMAL     ARESARCHB       CACHED<br />
ARESDATA   /dev/oracle/asm/aresdata-a      46068   1412     3 ONL MEMBER    A           NORMAL     ARESDATAA       CACHED</span></p>
<p><span style="font-size:xx-small;font-family:'Courier New',Courier,monospace;">ARESDATA                                       0      0       OFF UNKNOWN   B           HUNG       ARESDATAB       MISSING</span></p>
<p>Trying to add it with the same name as before:<br />
SQL&gt; alter diskgroup ARESDATA add failgroup B disk '/dev/oracle/asm/aresdata-b' name ARESDATAB force;<br />
alter diskgroup ARESDATA add failgroup B disk '/dev/oracle/asm/aresdata-b' name ARESDATAB force<br />
*<br />
ERROR at line 1:<br />
ORA-15032: not all alterations performed<br />
ORA-15010: name is already used by an existing ASM disk</p>
<p>Adding it using a new name:</p>
<p>SQL&gt; alter diskgroup ARESDATA add failgroup B disk '/dev/oracle/asm/aresdata-b' name ARESDATAB2 force;<br />
alter diskgroup ARESDATA add failgroup B disk '/dev/oracle/asm/aresdata-b' name ARESDATAB2 force<br />
*<br />
ERROR at line 1:<br />
ORA-15032: not all alterations performed<br />
ORA-15034: disk '/dev/oracle/asm/aresdata-b' does not require the FORCE option</p>
<p>SQL&gt; alter diskgroup ARESDATA add failgroup B disk '/dev/oracle/asm/aresdata-b' name ARESDATAB2;</p>
<p>Diskgroup altered.</p>
<p>I assume that the need to use the &#8216;force&#8217; option depends on the kind of error you got.</p>
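<p>If you have to do this for many disks, the statement can be generated. The sketch below only builds the text of the statement used above and picks a disk name that is not yet in use (ARESDATAB becomes ARESDATAB2, and so on). The helper names and the numbering scheme are my own choice; the names already in use would have to come from v$asm_disk.</p>

```python
def next_free_name(base, names_in_use):
    """Return base if it is unused, otherwise base2, base3, ... (the first unused one)."""
    if base not in names_in_use:
        return base
    suffix = 2
    while f"{base}{suffix}" in names_in_use:
        suffix += 1
    return f"{base}{suffix}"


def add_disk_statement(diskgroup, failgroup, path, base_name, names_in_use, force=False):
    """Build the 'alter diskgroup ... add failgroup ... disk ... name ...' statement as text."""
    name = next_free_name(base_name, names_in_use)
    statement = (
        f"alter diskgroup {diskgroup} add failgroup {failgroup} "
        f"disk '{path}' name {name}"
    )
    return statement + " force" if force else statement


print(
    add_disk_statement(
        "ARESDATA", "B", "/dev/oracle/asm/aresdata-b",
        "ARESDATAB", {"ARESDATAA", "ARESDATAB"},
    )
)
```

<p>The force flag is left off by default because, as shown above, ASM rejects it when it is not required.</p>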
<p>Sometimes I see that the rebalance does not start automatically. Then you get the following status:<br />
DISKGROUP  PATH                            Total   Used  %Usd ST  Header    FAILGROUP   STATE      DiskName        MOUNT_S<br />
---------- ------------------------------ ------ ------ ----- --- --------- ----------- ---------- --------------- -------<br />
ARESARCHA  /dev/oracle/asm/aresarch-a      30708  11322    36 ONL MEMBER    AREAARCHA   NORMAL     AREAARCHA       CACHED<br />
ARESARCHB  /dev/oracle/asm/aresarch-b      30708  11298    36 ONL MEMBER    ARESARCHB   NORMAL     ARESARCHB       CACHED<br />
ARESDATA   /dev/oracle/asm/aresdata-a      46068   1412     3 ONL MEMBER    A           NORMAL     ARESDATAA       CACHED<br />
/dev/oracle/asm/aresdata-b      46068      2     0 ONL MEMBER    B           NORMAL     ARESDATAB2      CACHED</p>
<p>ARESDATA                                       0      0       OFF UNKNOWN   B           FORCING    ARESDATAB       MISSING</p>
<p>In that case, start it manually using:</p>
<p>SQL&gt; alter diskgroup ARESDATA rebalance;</p>
<p>Diskgroup altered.</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2008/08/06/a-really-recommended-asm-patch-failing-lun/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>
	</item>
		<item>
		<title>Oracle VM and multiple local disks</title>
		<link>http://geertdepaep.wordpress.com/2008/06/09/oracle-vm-and-multiple-local-disks/</link>
		<comments>http://geertdepaep.wordpress.com/2008/06/09/oracle-vm-and-multiple-local-disks/#comments</comments>
		<pubDate>Mon, 09 Jun 2008 20:06:46 +0000</pubDate>
		<dc:creator>pier00</dc:creator>
				<category><![CDATA[High availability]]></category>
		<category><![CDATA[Oracle vm repositories disks]]></category>

		<guid isPermaLink="false">http://geertdepaep.wordpress.com/2008/06/09/oracle-vm-and-multiple-local-disks/</guid>
		<description><![CDATA[For my Oracle VM test environment I have a server available with multiple internal disks of different size and speed. So I was wondering if it is possible to have all these disks used together for my virtual machines in Oracle VM. If all disks would have been the same size and speed, I could [...]]]></description>
			<content:encoded><![CDATA[<p>For my Oracle VM test environment I have a server with several internal disks of different sizes and speeds. So I wondered whether all these disks could be used together for my virtual machines in Oracle VM.</p>
<p>If all disks had been the same size and speed, I could simply have used the internal RAID controller to mirror or stripe them, or put them in RAID 5, and ended up with one large volume (a single disk, as far as Oracle VM is concerned). However, because the disks differ in speed and size, this is not a good idea. So I started looking in Oracle VM Manager (the Java console) to see what is possible.</p>
<p>It soon became clear to me that Oracle VM is designed for a different architecture: the intended setup is a (large) SAN box with shared storage that is available to multiple servers. All these servers can then be put in a server pool, sharing the same storage. This setup allows live migration of running machines to another physical server. Of course this makes sense, because it fits nicely in the concept of grid computing: if a physical server fails, just restart your virtual machine on another one, and add machines according to your performance needs. But it doesn&#8217;t help me: I don&#8217;t have one storage box with multiple servers, I have one server with multiple disks.</p>
<p>So I started to browse through the executables of the OVM installation, and under /usr/lib/ovs I found the ovs-makerepo script. As far as I understand it (based on what I could find on the internet, because there is not much clear documentation on this), the architecture is as follows: when installing OVM, you get a /boot, a / and a swap partition (just as in traditional Linux), and OVM requires one large partition for virtual machines, which is mounted under /OVS. In this partition you find a subdirectory &#8220;running_pool&#8221;, which contains all the virtual machines that you have created and can start, and a subdirectory &#8220;seed_pool&#8221;, which contains templates you can use as a starting point for new machines. There are also &#8220;local&#8221;, &#8220;remote&#8221; and &#8220;publish_pool&#8221;, but they were irrelevant to me at the time and I didn&#8217;t try to figure out what they are used for.</p>
<p>With this in mind I can install Oracle VM on my first disk and end up with 4 partitions on /dev/sda:</p>
<pre>
   Filesystem 1K-blocks     Used Available Use% Mounted on
   /dev/sda1     248895    25284    210761  11% /boot
   (sda2 is swap)
   /dev/sda3    4061572   743240   3108684  20% /
   /dev/sda4   24948864 22068864   2880000  89% /OVS
</pre>
<p>Now I want to add the space on my second disk (/dev/sdb) to this setup. First I create one large partition on the disk using fdisk. Then I create an OCFS2 file system on it as follows:</p>
<pre>
[root@nithog ovs]# mkfs.ocfs2 /dev/sdb1
mkfs.ocfs2 1.2.7
Filesystem label=
Block size=4096 (bits=12)
Cluster size=4096 (bits=12)
Volume size=72793694208 (17771898 clusters) (17771898 blocks)
551 cluster groups (tail covers 31098 clusters, rest cover 32256 clusters)
Journal size=268435456
Initial number of node slots: 4
Creating bitmaps: done
Initializing superblock: done
Writing system files: done
Writing superblock: done
Writing backup superblock: 4 block(s)
Formatting Journals: done
Writing lost+found: done
mkfs.ocfs2 successful
</pre>
<p>Initially I created the file system as ext3, which worked well. However, I noticed one strange thing. This is what you get:</p>
<ul>
<li>Create a new paravirtualized Linux virtual machine in this new (ext3-based) repository (see below for how exactly)</li>
<li>Specify a disk of e.g. 2 GB</li>
<li>Complete the wizard</li>
<li>This prepares a machine on which you can run the Linux installer from the console (do not start the installation yet)</li>
<li>Now look in &#8230;/running_pool/machine_name and see a file of 2 GB</li>
<li>Now run du -sk on &#8230;/running_pool/machine and see that only 20 KB is used</li>
<li>From the moment you start to partition your disk inside the virtual machine, the output of &#8220;du -sk&#8221; grows by the amount of data you actually put in it. So it behaves a bit like &#8216;dynamic provisioning&#8217;.</li>
<li>Note, however, that ls -l shows a file of 2 GB at all times</li>
</ul>
<p>For the moment I don&#8217;t know whether this behaviour is caused by the file system being ext3, but in any case I leave it up to you to judge whether it is an advantage or a disadvantage.</p>
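<p>The difference between ls -l and du described above is what you would expect from a sparse file: the file gets its full apparent size up front, while disk blocks are only allocated once data is written. ext3 supports sparse files, so this is a plausible explanation, although how Oracle VM actually creates its image files is not confirmed here. The following Python sketch reproduces the effect on a Unix file system with sparse file support; the file name System.img and the helper names are made up for the illustration.</p>

```python
import os
import tempfile


def apparent_and_allocated(path):
    """Return (apparent size, allocated size) of a file, both in bytes."""
    st = os.stat(path)
    # st_size is the length that "ls -l" reports. st_blocks counts the
    # 512-byte units really allocated on disk, which is what "du" sums up.
    # st_blocks is only available on Unix-like systems.
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, size_bytes):
    """Create a file with the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        img = os.path.join(tmp, "System.img")  # hypothetical image file name
        make_sparse_file(img, 2 * 1024**3)     # 2 GiB, as in the example above
        apparent, allocated = apparent_and_allocated(img)
        print("apparent size (ls -l):", apparent)
        print("allocated size (du):  ", allocated)
```

<p>On a file system that supports sparse files, the apparent size is the full 2 GiB while the allocated size stays close to zero until data is written into the file.</p>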
<p>Now when trying to add my new sdb1 partition as an extra repository, I got:</p>
<p>          <em>Usage:</em>        </p>
<pre>
[root@nithog ~]# /usr/lib/ovs/ovs-makerepo
 usage: /usr/lib/ovs/ovs-makerepo &lt;source&gt; &lt;shared&gt; &lt;description&gt;
        source: block device or nfs path to filesystem
        shared: filesystem shared between hosts?  1 or 0
        description: descriptive text to be displayed in manager
</pre>
<p>          <em>Execution:</em>        </p>
<pre>
   [root@nithog ovs]# /usr/lib/ovs/ovs-makerepo /dev/sdb1 0 "Repo on disk 2"
   ocfs2_hb_ctl: Unable to access cluster service while starting heartbeat mount.ocfs2:
   Error when attempting to run /sbin/ocfs2_hb_ctl: "Operation not permitted"
   Error mounting /dev/sdb1
</pre>
<p>It seems the script expects something like a cluster, but I just have a standalone node&#8230; I think this script is intended to add a shared repository to a cluster of nodes. No problem, let&#8217;s try to convert our standalone machine to a one-node cluster by creating the file /etc/ocfs2/cluster.conf:</p>
<pre>
cluster:
        node_count = 1
        name = ocfs2
node:
        ip_port = 7777
        ip_address = 10.7.64.160
        number = 1
        name = nithog
        cluster = ocfs2
</pre>
<p>Note that the indented lines MUST start with a &lt;TAB&gt;, followed by the parameter and its value. After creating this file I could do:</p>
<pre>
   [root@nithog ovs]# /etc/init.d/o2cb online ocfs2
   Starting O2CB cluster ocfs2: OK
and then
   [root@nithog ovs]# /usr/lib/ovs/ovs-makerepo /dev/sdb1 0 "Repo on disk 2"
   Initializing NEW repository /dev/sdb1
   SUCCESS: Mounted /OVS/877DECC5B658433D9E0836AFC8843F1B
   Updating local repository list.
   ovs-makerepo complete
</pre>
<p>As you can see, an extra subdirectory is created in the /OVS file system, with a UUID as its name. My new file system /dev/sdb1 is mounted under this directory. This file system is a real new repository, because under /OVS/877DECC5B658433D9E0836AFC8843F1B you also find the running_pool and seed_pool directories. It is also listed in /etc/ovs/repositories (but editing this file manually is NOT recommended).</p>
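<p>Because the indented lines in /etc/ocfs2/cluster.conf must start with a tab, generating the file instead of typing it can avoid whitespace mistakes. The following is only a minimal Python sketch: the function name render_cluster_conf is hypothetical, and the stanza order, key names and node numbering simply mirror the one-node example above rather than any official template.</p>

```python
def render_cluster_conf(cluster_name, nodes, ip_port=7777):
    """Render the text of a cluster.conf for the given nodes.

    nodes is a list of (node_name, ip_address) tuples. Node numbers are
    assigned in list order starting at 1, as in the example above.
    Every parameter line starts with a literal tab character.
    """
    lines = [
        "cluster:",
        "\tnode_count = %d" % len(nodes),
        "\tname = %s" % cluster_name,
    ]
    for number, (node_name, ip_address) in enumerate(nodes, start=1):
        lines += [
            "node:",
            "\tip_port = %d" % ip_port,
            "\tip_address = %s" % ip_address,
            "\tnumber = %d" % number,
            "\tname = %s" % node_name,
            "\tcluster = %s" % cluster_name,
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Values taken from the one-node example above.
    print(render_cluster_conf("ocfs2", [("nithog", "10.7.64.160")]), end="")
```

<p>The sketch only prints the text; writing it to /etc/ocfs2/cluster.conf (as root) is left as a separate, deliberate step.</p>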
<p>Then I looked in Oracle VM Manager (the Java-based web GUI), but I found no trace of this new repository. It looks as if this GUI is not (yet) designed to handle multiple repositories. Nevertheless I tried to find out whether my new disk could really be used for virtual machines, and my results are:</p>
<ul>
<li>When creating a new virtual machine, you cannot specify which repository it is placed in</li>
<li>It seems to be placed in the repository with the most free space (but I would need to do more testing to be 100% certain)</li>
<li>When adding a new disk to an existing virtual machine (an extra file at Oracle VM level), the file is placed in the same repository, even the same directory, as the initial files of your virtual machine. If there is NOT enough free space on that disk, Oracle VM will NOT put your file in another repository on another disk.</li>
<li>You can move the datafiles of your virtual machine to any other location while the machine is not running, provided you also change the reference to the file in /etc/xen/&lt;machine_name&gt;</li>
<li>So at Xen level it looks as if you can put your VM datafiles in any directory; the concept of repositories seems to be Oracle VM specific.</li>
<li>So if you create a new virtual machine and Oracle VM puts it in the wrong repository, it is not difficult at all to move it to another file system/repository afterwards. It just requires a little manual intervention. However, it seems advisable to always keep your machines in an Oracle VM repository, in the running_pool, because only then can they be managed by the Oracle VM GUI.</li>
</ul>
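<p>The manual move described in the list above comes down to copying the image files and then rewriting the path in the machine&#8217;s configuration. The Python sketch below covers only the rewriting step. It assumes that the configuration refers to image files with the usual Xen disk syntax (file: followed by an absolute path); the function name relocate_disk_paths and the machine name testvm are hypothetical.</p>

```python
def relocate_disk_paths(config_text, old_root, new_root):
    """Return config_text with image paths under old_root moved to new_root.

    Only "file:" references that start with old_root are rewritten;
    everything else in the configuration is left untouched.
    """
    old_prefix = "file:" + old_root.rstrip("/") + "/"
    new_prefix = "file:" + new_root.rstrip("/") + "/"
    return config_text.replace(old_prefix, new_prefix)


if __name__ == "__main__":
    # Hypothetical configuration fragment for a machine called testvm.
    config = "disk = ['file:/OVS/running_pool/testvm/System.img,hda,w']\n"
    moved = relocate_disk_paths(
        config,
        "/OVS/running_pool/testvm",
        "/OVS/877DECC5B658433D9E0836AFC8843F1B/running_pool/testvm",
    )
    print(moved, end="")
```

<p>Passing the machine&#8217;s own directory as old_root, rather than /OVS as a whole, avoids touching paths that already point into another repository. As noted in the list, the machine should not be running while its files are moved.</p>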
<p>I am sure that many of these things have an obvious explanation, but I have to admit that I didn&#8217;t read the OCFS and Oracle VM manuals completely from start to finish. Also I think that Oracle</p>
<p>Conclusion: Oracle VM seems to be capable of having multiple repositories on different disks, but the GUI is not ready to handle them. With a minimum of manual intervention, however, it is easy to do all the desired tasks from the command line.</p>
]]></content:encoded>
			<wfw:commentRss>http://geertdepaep.wordpress.com/2008/06/09/oracle-vm-and-multiple-local-disks/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/4fbfb550fb3f85bbaf5284bb99e4df59?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">pier00</media:title>
		</media:content>
	</item>
	</channel>
</rss>
