Checking the FTP cluster

Checking the cluster means comparing the recursive cluster file listing against the recursive file listings of all our nodes. The story is short: list gives us the first listing and listserver the second. Note, however, that listserver might have problems with your FTP servers, so you should read its documentation first.

Comparing the lists is also simple: syscheck uses sort and join from the operating system for this. The comparison gives one of three possible results for each file:

  1. The node file is on the list of our cluster files: This is exactly what we want, no further action is required.

  2. The node file is not on the cluster list: This is an orphaned file. The cluster entry has already been deleted, but for some reason the data file survived (perhaps we lost the corresponding info/link file). The data file should be deleted.

  3. There is no matching node file for a file in the cluster list: The data file was lost and the referring entry should be removed from the info/link file.

syscheck prints one line for each orphaned or missing file together with other information about the cluster nodes.
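
The three outcomes can be illustrated with a minimal Python sketch. This is not how syscheck is implemented (it uses sort and join); the function name compare_listings, the (node, path) pair representation and the printed line format are assumptions made only for this example.

```python
def compare_listings(cluster_entries, node_entries):
    """Classify (node, path) pairs into the three outcomes.

    cluster_entries: pairs the cluster listing expects to exist (hypothetical format).
    node_entries:    pairs actually reported by the node listings (hypothetical format).
    """
    cluster = set(cluster_entries)
    nodes = set(node_entries)
    return {
        "ok": sorted(cluster & nodes),        # case 1: on both lists
        "orphaned": sorted(nodes - cluster),  # case 2: on a node, not in the cluster list
        "missing": sorted(cluster - nodes),   # case 3: in the cluster list, not on the node
    }


cluster_entries = [("node1", "/pub/a.txt"), ("node2", "/pub/b.txt")]
node_entries = [("node1", "/pub/a.txt"), ("node1", "/pub/old.txt")]

result = compare_listings(cluster_entries, node_entries)

# One line per orphaned or missing file; matching files need no report.
for kind in ("orphaned", "missing"):
    for node, path in result[kind]:
        print(kind, node, path)
```

Set differences give the same result as a sort followed by join on the two listings, but keep the example self-contained.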

Repairing

syscheck is only a system checker; it does not repair anything. To repair the cluster you have to feed syscheck's output into sysrepair.

sysrepair creates an info/link file for each orphaned file under lost+found (this directory is created if it doesn't exist) to make these files accessible through the cluster again. For missing files the corresponding node information is removed from the info/link file. If the remaining node count is zero, the file is lost and the info/link file is removed.
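
The repair rules above can be sketched in Python as follows. This is only an illustration: the dictionary that stands in for the info/link files, the (kind, node, path) finding tuples and the lost+found path layout are assumptions, not sysrepair's actual data formats.

```python
def repair(links, findings):
    """Apply the repair rules to an in-memory stand-in for the info/link files.

    links:    dict mapping a cluster path to the set of nodes holding the file
              (hypothetical representation).
    findings: iterable of (kind, node, path) tuples, kind being "orphaned" or
              "missing" (hypothetical report format).
    """
    for kind, node, path in findings:
        if kind == "orphaned":
            # Re-attach the data file under lost+found so it is reachable again.
            rescued = "lost+found/" + path.lstrip("/")
            links.setdefault(rescued, set()).add(node)
        elif kind == "missing":
            holders = links.get(path)
            if holders is None:
                continue  # nothing recorded for this path
            holders.discard(node)  # drop the node that lost the file
            if not holders:
                del links[path]  # no node left: the file is lost
    return links


links = {"/pub/b.txt": {"node2"}, "/pub/c.txt": {"node1", "node2"}}
findings = [
    ("orphaned", "node1", "/pub/old.txt"),
    ("missing", "node2", "/pub/b.txt"),
    ("missing", "node2", "/pub/c.txt"),
]
repaired = repair(links, findings)
```

In this example /pub/b.txt disappears because its only node lost the file, /pub/c.txt keeps node1, and the orphaned file reappears under lost+found.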

Some additional notes

  1. syscheck ignores unmatched files on nodes for which it has no file records in the node listing. If listserver returns nothing for a node, syscheck assumes that listserver is not compatible with that node.

  2. The two-step scheme enables you to inspect the error list manually. You can also remove wrong diagnostics from the list before running sysrepair.

  3. After a complete run of syscheck and sysrepair your cluster should be error free.
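
The rule from note 1 can be sketched as a filter applied before the comparison. Again, this is only an assumption about one way to express it; the function name drop_unlisted_nodes and the (node, path) pair format are hypothetical.

```python
def drop_unlisted_nodes(cluster_entries, node_entries):
    """Keep only cluster entries for nodes that returned at least one file record.

    Nodes without any record are treated as incompatible with the listing
    tool, so their files are not reported as missing.
    """
    listed = {node for node, _ in node_entries}
    kept = [(node, path) for node, path in cluster_entries if node in listed]
    skipped = sorted({node for node, _ in cluster_entries} - listed)
    return kept, skipped


cluster_entries = [("node1", "/pub/a.txt"), ("node3", "/pub/d.txt")]
node_entries = [("node1", "/pub/a.txt")]  # node3 returned nothing

kept, skipped = drop_unlisted_nodes(cluster_entries, node_entries)
```

Here node3 is skipped entirely, so /pub/d.txt is not reported as missing and would not be touched by a later repair run.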