Fix Solr Core Locking

When you have Solr running on the Azure web app, Solr indexes get locked intermittently and need the app restart to resolve it the locking error as shown below.

sitecore_master_index_rebuild: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Index dir 'C:\home\site\wwwroot\server\solr\sitecore_master_index_rebuild\data\index/' of core 'sitecore_master_index_rebuild' is already locked. The most likely cause is another Solr server (or another solr core in this server) also configured to use this directory; other possible causes may be specific to lockType: native

I have applied the solution provided in here Stack Exchange Question with no luck. So I had to revert it and took advantage of Azure’s built-in Auto-Heal feature to address this issue without manual intervention.

What is Azure Auto-Heal?

Auto-healing isĀ a mitigation action you can take when your app is having unexpected behavior. You can set your own rules based on request count, slow request, memory limit, and HTTP status code to trigger mitigation actions. Use the tool to temporarily mitigate an unexpected behavior until you find the root cause.

As we know the Solr Core locking is a known issue, I have configured the Auto-Heal on our Solr web app to recycle the web app when the Locking issue happens.

Follow the steps to configure Auto-Heal on the Solr web app.

Login to Azure Portal and go to Solr web app

Go to Diagnose and solve problems

Select Diagnostic Tools

Select Auto-Heal under Proactive tools

Turn on Custom Auto-Heal Rules Enabled

Under Define Conditions select Status Codes

Click on Add Status Codes rules

Set After how many requests you want this condition to kick in? as 2

Set What should be the status code for these requests? as 500

Set What is the time interval (in seconds) in which the above condition should be met? as 60

Click Ok

Set Configure Action as Recycle Process

Click on Start Time under Override when Action executes (Optional)

Select Modify Startup time and set Specify the time (in seconds) after process startup after which auto-heal should kick in ? as 300

Your settings for Auto-Heal as follows

Click Save

This will restart the web app and Auto-Heal will trigger after 300 seconds as in configuration. The Auto-Heal will recycle the app when it notices 2 requests with a response status code of 500 (Internal Server Error).

When the index locking issue occurs, the requests will throw an error 500 response making the Auto-Heal come into action and recycle the app to resolve it.

Spread the love

Leave a Reply

Your email address will not be published. Required fields are marked *