How to Stop, Suspend, and Resume the Server Cluster
Stopping, suspending, and resuming servers in a Cluster can be done programmatically, through the Studio, or by a combination of both.
From the command line (clarify_ctl)
Server Cluster components can be monitored and controlled with the clarify_ctl script, which is located in the utils directory of a Clarify Server installation.
This script (provided as both .bat and .sh) can stop, suspend, resume, and check the status of remote servers that are part of a Server Cluster. This method of server control is often used as part of routine maintenance, update, and backup processes. Using the clarify_ctl script is generally the recommended way to manage servers in a Cluster, because it is more effective at ensuring that processes have completed or stopped gracefully before a server or an entire Cluster stops or suspends. The script can be used at three levels:
- At the Cluster level to manage cluster-wide operations; an example may be a graceful shutdown of an entire Server Cluster.
- At the Queue level to manage operations on a specific queue; an example may be suspending a specific inbound processing Event.
- At the individual Application level to manage operations on specified processes; an example may be suspending or resuming a specific Active Server Node. This is generally used for troubleshooting, testing, or specific maintenance.
Usage
To use clarify_ctl, open a command prompt or terminal window, change to the utils directory, and run the script for your platform:
- clarify_ctl.bat (Windows)
- clarify_ctl.sh (Linux)
Parameter | Description |
---|---|
listQueues | Returns a list of queues, their status, and current size. |
suspendQueue <queueName> | Suspends the specified queue. Suspended queues prevent queued activities from being processed. |
resumeQueue <queueName> | Resumes the specified queue. Allows queued activities in this queue to be performed. |
listSchedule | Returns a list of scheduled Business Processes, their next scheduled time, and Process Schedule attributes. |
clusterStatus | Returns the status (suspended or active) of the entire Server Cluster. |
suspendCluster <timeout> | Attempts to suspend the Server Cluster within the specified timeout. The timeout is an integer number of seconds after which the command stops waiting for a response and returns the state of the server as is. If there is no response by then, the command returns an integer other than 0, a predictable error that calling scripts or Java code can use to determine what to do next. Note: The Cluster will still suspend, even after the timeout expires. |
shutdownCluster | Performs a hard stop on the Clarify Cluster. Any processes still running will be terminated. Note: Unlike the previous shutdown command in ebi_ctl, this does not attempt to suspend first. The intent of this command is to stop the cluster immediately. To complete current tasks/jobs first, without losing the information being processed, use the suspendCluster command before shutdownCluster. |
resumeCluster | Resumes a suspended Server Cluster. |
workerStatus | Displays the number of active and total workers (a type of internal application). |
nodeStatus <host> | Returns the status (suspended or active) of all services on the specified Server Node. |
stopNode <host> <timeout> | First suspends, and then stops Clarify from running on the specified Server Node. The timeout works the same as for suspendCluster. |
startNode <host> | Starts a Clarify service on the specified Server Node. |
help | Displays script help, showing parameters and their description. |
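
As an illustration of the Queue-level commands in the table, the following is a minimal sketch of a helper that suspends a queue, runs a maintenance command, and then resumes the queue. The function name with_queue_suspended and the CLARIFY_CTL variable are hypothetical names introduced here. The sketch also assumes that suspendQueue and resumeQueue exit with a non-zero status on failure, which this page states only for suspendCluster.

```shell
# Hypothetical default; point CLARIFY_CTL at the script in your utils directory.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Suspend a queue, run a maintenance command, then resume the queue.
# Usage: with_queue_suspended <queueName> <command> [args...]
with_queue_suspended() {
    queue="$1"
    shift
    # Assumption: a non-zero exit status means the suspend failed.
    "$CLARIFY_CTL" suspendQueue "$queue" || return 1
    # Run the maintenance command and remember its exit status.
    "$@"
    status=$?
    # Resume the queue whether or not the maintenance command succeeded.
    "$CLARIFY_CTL" resumeQueue "$queue" || return 1
    return "$status"
}
```

A call might look like `with_queue_suspended inboundOrders ./backup.sh`, where both the queue name and the backup script are placeholders. Resuming the queue regardless of the maintenance result keeps a failed job from leaving the queue suspended.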
From the Studio
Unlike a single server or local test server, the Server Cluster is best managed using the clarify_ctl script, instead of the Server Environment toolbar on the Studio’s Admin Console. However, the toolbar can be used to suspend and resume servers in a Cluster, as long as you understand that suspending from the Studio behaves differently: it does not wait for current jobs, tasks, or processes to complete, as the suspendCluster command does.
Gracefully shutting down one Server Node in a Cluster
Gracefully shutting down a Server Node means that its Active Server (and its running services) is suspended before Clarify is stopped. This should be done from the command line.
Using clarify_ctl, enter the following commands in this sequence:
stopNode <host> <timeout>
nodeStatus <host>
The stopNode command attempts to suspend the Active Server and then stops Clarify from running on the specified Server Node (host).
The nodeStatus command returns the status (suspended or active) of all services on the specified Server Node. This is how to confirm that the Server Node is no longer running.
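The two-command sequence above can be wrapped in a small function, sketched here under stated assumptions. The function name stop_node and the CLARIFY_CTL variable are hypothetical, and the sketch assumes that stopNode, like suspendCluster, reports a timeout through a non-zero exit status. Because the exact output format of nodeStatus is not described on this page, the sketch prints it for the operator rather than parsing it.

```shell
# Hypothetical default; point CLARIFY_CTL at the script in your utils directory.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Gracefully stop one Server Node, then show its status.
# Usage: stop_node <host> <timeout-seconds>
stop_node() {
    host="$1"
    timeout="$2"
    # stopNode suspends the Active Server, then stops Clarify on the host.
    # Assumption: a non-zero exit status signals a timeout or other error.
    if ! "$CLARIFY_CTL" stopNode "$host" "$timeout"; then
        echo "stopNode reported a non-zero status for $host" >&2
    fi
    # Print the node status so the operator can confirm the node is stopped.
    "$CLARIFY_CTL" nodeStatus "$host"
}
```

The status check runs even after a non-zero stopNode result, since that is when confirming the node's state matters most.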
Gracefully shutting down an entire Server Cluster
The act of gracefully shutting down an entire Server Cluster means that all Server Nodes in a Server Cluster have been suspended, and thereby all Active Servers (and their running services) have been suspended too. This should be done from the command line, but can also be done from the Studio.
Using clarify_ctl, enter the following commands in this sequence:
suspendCluster <timeout>
shutdownCluster
The suspendCluster command attempts to suspend the Server Cluster within the specified timeout (in seconds). If the suspend is not complete before the timeout expires, a non-zero status is returned. However, the Cluster will still suspend.
As an alternative to suspendCluster, you can use the Suspend button on the Server Environment toolbar on the Studio’s Admin Console. Be aware, however, that while this does suspend the Cluster, it does not wait for current jobs, tasks, or processes to complete (as the suspendCluster command does).
The shutdownCluster command performs a hard stop on the Clarify Cluster. Any processes still running will be terminated!
Note: Unlike the shutdown command in ebi_ctl (used for single server), this does not attempt to suspend first. The command stops the Server Cluster immediately. If you wish to complete current tasks/jobs first - without losing the information being processed - then you should use the suspendCluster command before shutdownCluster.
As an alternative to shutdownCluster, you can stop the clarify service on each Server Node in the Cluster. Be aware, however, that this performs a hard stop, and is a change in behavior from previous Cluster functionality.
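The suspend-then-shutdown sequence can be scripted so that the hard stop only runs once the suspend has completed. The following is a minimal sketch, not a definitive implementation. The function name shutdown_cluster_gracefully and the CLARIFY_CTL variable are hypothetical, and the sketch assumes the non-zero integer returned by suspendCluster is exposed as the script's exit status.

```shell
# Hypothetical default; point CLARIFY_CTL at the script in your utils directory.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Suspend the cluster, and hard-stop it only if the suspend completed in time.
# Usage: shutdown_cluster_gracefully <timeout-seconds>
shutdown_cluster_gracefully() {
    timeout="$1"
    # Assumption: a non-zero exit status means the suspend did not
    # complete within the timeout.
    if ! "$CLARIFY_CTL" suspendCluster "$timeout"; then
        echo "Suspend did not complete within ${timeout}s; not shutting down." >&2
        return 1
    fi
    # Hard stop: any process still running is terminated.
    "$CLARIFY_CTL" shutdownCluster
}
```

Skipping shutdownCluster after a timeout is a deliberate choice in this sketch: the Cluster continues suspending in the background, so the operator can check clusterStatus and run the shutdown later without terminating work in progress.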
Resuming a Server Node
This should be done from the command line.
Using clarify_ctl, enter the following commands in this sequence:
startNode <host>
nodeStatus <host>
The startNode command starts Clarify on the specified Server Node.
The nodeStatus command returns the status (suspended or active) of all services on the specified Server Node. This is how to confirm that the Server Node is now running.
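When several Server Nodes need to be started, the same two commands can be run in a loop. This is a sketch only: the function name start_nodes and the CLARIFY_CTL variable are hypothetical, and it assumes startNode exits with a non-zero status on failure, which this page does not confirm.

```shell
# Hypothetical default; point CLARIFY_CTL at the script in your utils directory.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Start Clarify on each host and print each node's status.
# Usage: start_nodes <host> [<host>...]
start_nodes() {
    failed=0
    for host in "$@"; do
        # Assumption: a non-zero exit status means startNode failed.
        if "$CLARIFY_CTL" startNode "$host"; then
            # Print the status so the operator can confirm the node is running.
            "$CLARIFY_CTL" nodeStatus "$host"
        else
            echo "startNode failed for $host" >&2
            failed=1
        fi
    done
    # Return 1 if any node failed to start, 0 otherwise.
    return "$failed"
}
```

The loop continues past a failed host so that one unreachable node does not block the others from starting.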
Resuming an entire Server Cluster
- Start the clarify service for each Server Node.
  For Windows, the service will be stopped, and requires a restart. For Linux, this service may appear to be running, but still requires a restart.
- Resume the Server Cluster, using either method:
  - Using the resumeCluster command from clarify_ctl
  - Clicking the Start/Resume Server button in the Server Environment toolbar on the Studio’s Admin Console
- Run the clusterStatus command from clarify_ctl. This returns the status (suspended or active) of the entire Server Cluster. This is how to confirm that the Server Cluster is now running.
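Once the clarify service is running on every Server Node, the last two steps can be scripted. The sketch below is hedged accordingly: the function name resume_cluster and the CLARIFY_CTL variable are hypothetical, and it assumes that the clusterStatus output contains the word "suspended" or "active". This page names those two states but does not show the exact output, so adjust the patterns to what your installation prints.

```shell
# Hypothetical default; point CLARIFY_CTL at the script in your utils directory.
CLARIFY_CTL="${CLARIFY_CTL:-./clarify_ctl.sh}"

# Resume the cluster and check the reported status.
# Returns 0 if active, 1 if suspended or resume failed, 2 if unrecognized.
resume_cluster() {
    "$CLARIFY_CTL" resumeCluster || return 1
    status=$("$CLARIFY_CTL" clusterStatus)
    echo "$status"
    # Assumption: the status text contains "suspended" or "active".
    case "$status" in
        *[Ss]uspended*) return 1 ;;
        *[Aa]ctive*)    return 0 ;;
        *)              return 2 ;;
    esac
}
```

The separate return code for unrecognized output keeps the sketch from reporting success when its assumption about the status text does not hold.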