Linux : R1Soft Backup Server manager stuck on “This may take several minutes” forever

By | July 31, 2021

I recently came across an issue where the R1Soft backup server service had just been restarted and was showing the page that indicates the service is still starting:

Idera Server Backup Manager

This may take several minutes

If you are the administrator, check the log files for details.
Otherwise, please check back soon or contact your administrator for help.

This normally takes only a few minutes, but here the page stayed that way indefinitely, even after 72 hours, while the server was idle (very little CPU or disk I/O activity).

This kind of issue is usually caused by disk safe corruption or malformation, or by pending (uncommitted) rollback actions left over from an unclean service or server shutdown.

1. To find out, first have a look at the R1Soft server logs:

/usr/sbin/r1soft/log/server.log

2. If that is the case, you may find entries similar to this one:

2021-07-26 21:28:46,137 ERROR - Failed to update Disk Safe cache entry for Disk Safe at '/bkFS1/DiskSafes/SERVER/3abcc46d-b24c-4885-881b-e2cfc68af862'
com.r1soft.backup.server.facade.DiskSafeException: Errors encountered while cleaning recovery point
at com.r1soft.backup.server.facade.DiskSafeFacade.cleanupDSMetaData(DiskSafeFacade.java:7250)
at com.r1soft.backup.server.cache.DiskSafeCache.initCache(DiskSafeCache.java:192)
at com.r1soft.backup.server.cache.DiskSafeCache.start(DiskSafeCache.java:116)
at com.r1soft.StorageCacheWrapper.start(StorageCacheWrapper.java:78)
at com.r1soft.Main.Rb(Main.java:452)
at com.r1soft.Main.run(Main.java:319)
Caused by: com.r1soft.backup.server.disksafe.metadata.MetaDataException: Error while aquiring Disk Safe cache lock for '/bkFS1/DiskSafes/SERVER/3abcc46d-b24c-4885-881b-e2cfc68af862'
at com.r1soft.backup.server.disksafe.metadata.DiskSafeManager.openDiskSafe(DiskSafeManager.java:278)
at com.r1soft.backup.server.facade.DiskSafeFacade.cleanupDSMetaData(DiskSafeFacade.java:7243)
... 5 more
Caused by: com.r1soft.backup.server.disksafe.metadata.DiskSafeCacheEntry$DiskSafeCacheLockException
at com.r1soft.backup.server.disksafe.metadata.DiskSafeCacheEntry.lock(DiskSafeCacheEntry.java:86)
at com.r1soft.backup.server.disksafe.metadata.DiskSafeCacheEntry.lock(DiskSafeCacheEntry.java:59)
at com.r1soft.backup.server.disksafe.metadata.DiskSafeManager.openDiskSafe(DiskSafeManager.java:262)
... 6 more
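When several disk safes are affected, reading the log by eye gets tedious. The sketch below pulls the unique disk safe paths out of the "Failed to update Disk Safe cache entry" lines. The function name list_failed_disk_safes is only illustrative, and the pattern assumes the message format shown in the excerpt above; adjust it if your log lines differ.

```shell
# Print the unique Disk Safe paths named in cache-entry errors.
# The log file defaults to the path from step 1; pass another file as $1.
list_failed_disk_safes() {
    log="${1:-/usr/sbin/r1soft/log/server.log}"
    grep "Failed to update Disk Safe cache entry" "$log" \
        | sed -n "s/.*Disk Safe at '\([^']*\)'.*/\1/p" \
        | sort -u
}
```

Calling list_failed_disk_safes with no argument reads the default log and prints one path per line.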

3. Stop the R1Soft server service:

/etc/init.d/cdp-server stop

4. Browse to the metadata folder of the affected (faulty) disk safe. Example:

/path/to/SERVER/disksafe/3abcc46d-b24c-4885-881b-e2cfc68af862/metadata2

5. Look for any “*.rollback” files:

ls -la | grep rollback

5.1. If you see any files ending with the .rollback extension, rename them by appending a “.bak” extension. Example:

mv MetadataControlPanelInstanceUserData.rollback MetadataControlPanelInstanceUserData.rollback.bak

[Repeat the step above for all files and disk safes in the same situation]
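If a folder contains many rollback files, the rename in step 5.1 can be looped. This is a minimal sketch, assuming the .rollback files sit directly inside the metadata folder as in the example above; the function name backup_rollback_files is hypothetical. Run it only while the R1Soft service is stopped (step 3).

```shell
# Rename every *.rollback file directly inside the given folder to *.rollback.bak.
backup_rollback_files() {
    dir="$1"
    for f in "$dir"/*.rollback; do
        # When the glob matches nothing, $f is the literal pattern; skip it.
        [ -e "$f" ] || continue
        mv -- "$f" "$f.bak" && echo "renamed: $f -> $f.bak"
    done
}
```

Pass the metadata folder of each affected disk safe as the only argument. Files without the .rollback suffix are left untouched, and the renamed files can be restored by removing the .bak suffix again.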

6. Edit the R1Soft Server configuration so that the Backup Manager starts with all disk safes “Offline”:

/usr/sbin/r1soft/conf/server.properties

6.1. Set the following parameter to “true”:

close-all-disk-safes-on-start-up=true
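The same change can be scripted instead of made in an editor. The sketch below assumes GNU sed (for the -i option) and that the key is already present in the file in plain key=value form, uncommented; the function name set_close_on_startup is hypothetical. It keeps a copy of the original file next to it with a .bak suffix.

```shell
# Set close-all-disk-safes-on-start-up to $1 ("true" or "false") in the properties file $2.
set_close_on_startup() {
    value="$1"
    conf="${2:-/usr/sbin/r1soft/conf/server.properties}"
    key="close-all-disk-safes-on-start-up"
    # Refuse to guess if the key is missing; add the line by hand in that case.
    if ! grep -q "^$key=" "$conf"; then
        echo "$key not found in $conf, nothing changed" >&2
        return 1
    fi
    cp -p -- "$conf" "$conf.bak"        # keep a copy of the original
    sed -i "s/^$key=.*/$key=$value/" "$conf"
}
```

Running set_close_on_startup true applies this step; the same function with false reverts the setting once the disk safes have been reopened.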

7. Start the R1Soft Backup Server service:

/etc/init.d/cdp-server start

The Backup Manager (WebUI) should now load properly. You will notice a red banner at the top saying that the server is in maintenance mode; this is expected because of step 6.1 above.

8. Login to the Backup Manager and go to:

Settings > Disk Safes

9. For each disk safe, open the Action drop-down menu and click Open, one disk safe after the other. If the issue was only caused by incomplete rollback actions, the disk safes should open normally. If a disk safe is corrupted and cannot be opened or repaired, you will need to discard it and create a new one.

10. Disable maintenance mode using the link in the red bar at the top.

10.1. Edit the R1Soft Server configuration so that the Backup Manager starts with all disk safes open normally (Online):

/usr/sbin/r1soft/conf/server.properties

And change back the following option:

close-all-disk-safes-on-start-up=false

The issue should now be resolved. If your case was due to corruption rather than pending rollback actions, you have at least regained access to the console and can work on creating new disk safes.

If your issue was due to pending rollback actions, make sure that the service is shut down properly every time to avoid this in the future. If corruption occurred, I highly recommend investigating your storage (faulty drives, controller, backplane or even cables could be the cause).