How to Stop, Suspend, and Resume the Server Cluster

Stopping, suspending, and resuming servers in a Cluster can be done programmatically, through the Studio, or by a combination of both.

From the command line (clarify_ctl)

Server Cluster components can be monitored and controlled through the use of the clarify_ctl script, which is located in the utils directory of a Clarify Server installation.

This script (provided as both .bat and .sh) allows for stopping, suspending, resuming, and status checking of remote servers as part of a Server Cluster. This method of server control is often used as part of routine maintenance, update, and backup processes. Using the clarify_ctl script is generally the recommended method to manage servers in a Cluster, because it is more effective at ensuring that processes have completed or stopped gracefully before a server or an entire Cluster stops or suspends.

The script can be used at different levels:
  • At the Cluster level to manage cluster-wide operations; an example may be a graceful shutdown of an entire Server Cluster.
  • At the Queue level to manage operations on a specific queue; an example may be suspending a specific inbound processing Event.
  • At the individual Application level to manage operations on specified processes; an example may be suspending or resuming a specific Active Server Node. This is generally used for troubleshooting, testing, or specific maintenance.

Usage

To use clarify_ctl, open a command prompt or terminal window and change to the utils directory.

Run one of the following scripts per your OS:
  • clarify_ctl.bat (Windows)
  • clarify_ctl.sh (Linux)
Usage example: clarify_ctl -Parameter

Parameter Description
listQueues Returns a list of queues, their status, and current size.
  • Active Processes lists the number of jobs currently processing activities for that queue.
  • Queue Size indicates the number of activities waiting in that queue.
suspendQueue <queueName> Suspends the specified queue. Suspended queues prevent queued activities from being processed.
resumeQueue <queueName> Resumes the specified queue. Allows queued activities in this queue to be performed.
listSchedule Returns a list of scheduled Business Processes, their next scheduled time, and Process Schedule attributes.
clusterStatus Returns the status (suspended or active) of the entire Server Cluster.
suspendCluster <timeout> Attempts to suspend the Server Cluster within the specified timeout. The timeout is an integer number of seconds, after which the suspendCluster command stops waiting for a response and returns the state of the server as is. If there is no response within the timeout, the command returns a non-zero integer. This is a predictable error code that calling scripts, Java code, and so on can use to determine what to do next.
Note: The Cluster will still suspend, even after the timeout expires.
shutdownCluster Performs a hard stop on the Clarify Cluster. Any processes still running will be terminated.
Note: Unlike the previous shutdown command in ebi_ctl, this does not attempt to suspend first. The intent of this command is simply to stop the cluster immediately. If you wish to complete current tasks/jobs first, without losing the information being processed, use the suspendCluster command before shutdownCluster.
resumeCluster Resumes a suspended Server Cluster.
workerStatus Displays the number of active and total workers (a type of internal application).
nodeStatus <host> Returns the status (suspended or active) of all services on specified Server Node.
stopNode <host> <timeout> First suspends, and then stops Clarify from running on the specified Server Node. The timeout works the same as for suspendCluster.
startNode <host> Starts a Clarify service on the specified Server Node.
help Displays script help, showing parameters and their description.
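As an illustration of the queue-level parameters above, the following is a minimal sketch of suspending one queue around a maintenance step and resuming it afterwards. It rests on several assumptions: that the script is run from the utils directory, that each parameter is passed with a leading hyphen (as in the "clarify_ctl -Parameter" usage line; confirm with the help parameter), and that the script exits with a non-zero status when a command fails. The function name, queue name, and maintenance script are hypothetical.

```shell
#!/bin/sh
# Path to the control script; assumes the current directory is utils.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Suspend a queue, run a maintenance command, then resume the queue.
# ASSUMPTION: parameters take a leading hyphen and the script exits
# non-zero on failure. Neither is confirmed by this document.
with_queue_suspended() {
    queue="$1"
    shift
    "$CLARIFY_CTL" -suspendQueue "$queue" || return 1

    # Run the maintenance command and remember its exit status.
    rc=0
    "$@" || rc=$?

    # Always resume, so queued activities are not left blocked.
    "$CLARIFY_CTL" -resumeQueue "$queue" || return 1
    return "$rc"
}

# Example (hypothetical queue name and maintenance script):
#   with_queue_suspended inboundOrders ./archive_logs.sh
```

Resuming the queue even when the maintenance step fails is a design choice of this sketch; it avoids leaving a queue suspended by accident.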

From the Studio

Unlike a single server or local test server, the Server Cluster is best managed using the clarify_ctl script rather than the Server Environment toolbar on the Studio’s Admin Console. However, the toolbar can be used to suspend and resume servers in a Cluster, as long as you understand how suspend differs when done from the Studio: it suspends the Cluster without waiting for current jobs/tasks/processes to complete, whereas the suspendCluster command does wait.

Gracefully shutting down one Server Node in a Cluster

Gracefully shutting down a Server Node means that its Active Server (and its running services) is suspended before Clarify is stopped on that node. This should be done from the command line.

Using clarify_ctl, enter the following commands in this sequence:

stopNode <host> <timeout>

nodeStatus <host>

The stopNode command attempts to suspend the Active Server and then stops Clarify from running on the specified Server Node (host).

The nodeStatus command returns the status (suspended or active) of all services on the specified Server Node. This is how to confirm that the Server Node is no longer running.
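The stopNode and nodeStatus sequence above can be sketched as a small shell function. This is an illustration only, under these assumptions: the script is run from the utils directory, parameters take a leading hyphen (as in the "clarify_ctl -Parameter" usage line), and stopNode reports a timeout through a non-zero exit status in the same way as suspendCluster. The function name and the default timeout of 300 seconds are hypothetical.

```shell
#!/bin/sh
# Path to the control script; assumes the current directory is utils.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Gracefully stop one Server Node, then print its status.
# ASSUMPTION: leading-hyphen parameters and a non-zero exit status
# when the suspend does not finish within the timeout.
stop_node() {
    host="$1"
    timeout="${2:-300}"   # hypothetical default, in seconds

    rc=0
    "$CLARIFY_CTL" -stopNode "$host" "$timeout" || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "stopNode returned $rc for $host" >&2
    fi

    # Show the node status so the operator can confirm the result.
    "$CLARIFY_CTL" -nodeStatus "$host" ||
        echo "nodeStatus could not be retrieved for $host" >&2
    return "$rc"
}

# Example (hypothetical host name):
#   stop_node node1.example.com 120
```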

Gracefully shutting down an entire Server Cluster

Gracefully shutting down an entire Server Cluster means that all Server Nodes in the Cluster are suspended, and with them all Active Servers (and their running services), before the Cluster is stopped. This should be done from the command line, but can also be done from the Studio.

Using clarify_ctl, enter the following commands in this sequence:

suspendCluster <timeout>

shutdownCluster

The suspendCluster command attempts to suspend the Server Cluster within the specified timeout (in seconds). If the suspend is not complete before timeout, a non-zero status will be returned. However, the Cluster will still suspend.

As an alternative to suspendCluster, you can use the Suspend button on the Server Environment toolbar on the Studio’s Admin Console. Be aware, however, that while this does suspend the Cluster, it does not wait for current jobs/tasks/processes to complete (as the suspendCluster command does).

The shutdownCluster command performs a hard stop on the Clarify Cluster. Any processes still running will be terminated.

Note: Unlike the shutdown command in ebi_ctl (used for a single server), this does not attempt to suspend first. The command stops the Server Cluster immediately. If you wish to complete current tasks/jobs first, without losing the information being processed, use the suspendCluster command before shutdownCluster.

As an alternative to shutdownCluster, you can stop the clarify service on each Server Node in the Cluster. Be aware, however, that this performs a hard stop, and is a change in behavior from previous Cluster functionality.
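The suspend-then-shutdown sequence in this section can be sketched as follows. The sketch relies on the documented behavior that suspendCluster returns a non-zero status when the suspend has not completed within the timeout, and it chooses to skip the hard stop in that case so that running processes are not terminated. The leading hyphen on each parameter (taken from the "clarify_ctl -Parameter" usage line), the function name, and the default timeout of 600 seconds are assumptions.

```shell
#!/bin/sh
# Path to the control script; assumes the current directory is utils.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Suspend the Cluster and, only if that completes in time, shut it down.
# ASSUMPTION: parameters take a leading hyphen; confirm with help.
graceful_cluster_shutdown() {
    timeout="${1:-600}"   # hypothetical default, in seconds

    rc=0
    "$CLARIFY_CTL" -suspendCluster "$timeout" || rc=$?
    if [ "$rc" -ne 0 ]; then
        # Non-zero means the suspend did not complete within the timeout.
        # The Cluster continues to suspend, so skip the hard stop for now.
        echo "suspendCluster returned $rc; shutdownCluster skipped" >&2
        return "$rc"
    fi

    # Hard stop; safe at this point because the suspend has completed.
    "$CLARIFY_CTL" -shutdownCluster
}

# Example:
#   graceful_cluster_shutdown 900
```

When the function returns non-zero, a calling script could check clusterStatus later and run shutdownCluster once the Cluster reports that it is suspended.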

Resuming a Server Node

This should be done from the command line.

Using clarify_ctl, enter the following commands in this sequence:

startNode <host>

nodeStatus <host>

The startNode command starts Clarify on the specified Server Node.

The nodeStatus command returns the status (suspended or active) of all services on the specified Server Node. This is how to confirm that the Server Node is now running.
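A minimal sketch of the start-and-confirm sequence above, assuming the script is run from the utils directory, that parameters take a leading hyphen (as in the "clarify_ctl -Parameter" usage line), and that the script exits with a non-zero status on failure. The function name and host name are hypothetical.

```shell
#!/bin/sh
# Path to the control script; assumes the current directory is utils.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Start Clarify on one Server Node, then print the node status.
# ASSUMPTION: leading-hyphen parameters and non-zero exit on failure.
start_node() {
    host="$1"
    # Stop here if the start fails; there is nothing to confirm.
    "$CLARIFY_CTL" -startNode "$host" || return $?
    # Show the status of all services on the node.
    "$CLARIFY_CTL" -nodeStatus "$host"
}

# Example (hypothetical host name):
#   start_node node1.example.com
```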

Resuming an entire Server Cluster

  1. Start the clarify service for each Server Node.

    On Windows, the service will be stopped and must be started again. On Linux, the service may appear to be running, but it still requires a restart.

  2. Resume the Server Cluster, using either method:
    • Using the resumeCluster command from clarify_ctl
    • Clicking the Start/Resume Server button in the Server Environment toolbar on the Studio’s Admin Console
  3. Run the clusterStatus command from clarify_ctl. This returns the status (suspended or active) of the entire Server Cluster. This is how to confirm that the Server Cluster is now running.
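Steps 2 and 3 above, when done from the command line, can be sketched as a short function. It assumes the clarify service has already been started on every Server Node (step 1), that the script is run from the utils directory, and that parameters take a leading hyphen (as in the "clarify_ctl -Parameter" usage line). The function name is hypothetical.

```shell
#!/bin/sh
# Path to the control script; assumes the current directory is utils.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Resume a suspended Server Cluster, then print the Cluster status.
# ASSUMPTION: leading-hyphen parameters and non-zero exit on failure.
resume_cluster() {
    "$CLARIFY_CTL" -resumeCluster || return $?
    # Prints the status (suspended or active) of the entire Cluster.
    "$CLARIFY_CTL" -clusterStatus
}

# Example:
#   resume_cluster
```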