When you have Solr running on the Azure web app, Solr indexes get locked intermittently and need the app restart to resolve it the locking error as shown below.
sitecore_master_index_rebuild: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Index dir 'C:\home\site\wwwroot\server\solr\sitecore_master_index_rebuild\data\index/' of core 'sitecore_master_index_rebuild' is already locked. The most likely cause is another Solr server (or another solr core in this server) also configured to use this directory; other possible causes may be specific to lockType: native
I have applied the solution provided in sitecore.stackexchange.com here Stack Exchange Question with no luck. So I had to revert it and took advantage of Azure’s built-in Auto-Heal feature to address this issue without manual intervention.
What is Azure Auto-Heal?
Auto-healing is a mitigation action you can take when your app is having unexpected behavior. You can set your own rules based on request count, slow request, memory limit, and HTTP status code to trigger mitigation actions. Use the tool to temporarily mitigate an unexpected behavior until you find the root cause.
As we know the Solr Core locking is a known issue, I have configured the Auto-Heal on our Solr web app to recycle the web app when the Locking issue happens.
Follow the steps to configure Auto-Heal on the Solr web app.
Login to Azure Portal and go to Solr web app
Go to Diagnose and solve problems
Select Diagnostic Tools
Select Auto-Heal under Proactive tools
Turn on Custom Auto-Heal Rules Enabled
Under Define Conditions select Status Codes
Click on Add Status Codes rules
Set After how many requests you want this condition to kick in? as 2
Set What should be the status code for these requests? as 500
Set What is the time interval (in seconds) in which the above condition should be met? as 60
Set Configure Action as Recycle Process
Click on Start Time under Override when Action executes (Optional)
Select Modify Startup time and set Specify the time (in seconds) after process startup after which auto-heal should kick in ? as 300
Your settings for Auto-Heal as follows
This will restart the web app and Auto-Heal will trigger after 300 seconds as in configuration. The Auto-Heal will recycle the app when it notices 2 requests with a response status code of 500 (Internal Server Error).
When the index locking issue occurs, the requests will throw an error 500 response making the Auto-Heal come into action and recycle the app to resolve it.