August 2007 Archives

A customer I work with divides responsibilities for their ALUI system between many distinct roles without many overlapping abilities. The DBAs, server administrators, and portal administrators aren't allowed to cross to the other person's zone. So if the portal administrator who has no direct access to the server realizes the search server is hung, then that person needs to coordinate with the server administrator to restart it.

Such division of labor has its strengths, but in our case, the server administrator trusts the judgment of the portal administrator when it comes to restarting services--at least, in the development environment. So we decided to cut out the middle man.

We did this through the portal's external operation feature. The portal tells you "An external operation is used to execute batch files, shell scripts, or other system programs from the portal. Common examples of external operations include running shell scripts which can perform document queries, portal pings, or e-mail snapshot query results to subscribed users. Once created, these actions are scheduled as portal job operations."

Set Proper Cross-Server Security
In the case that your automation service runs on a different machine than the other service that you want to control, you'll need to make sure to set up the security properly. External operations run as whatever credential it is that runs the automation service. By default that runs as the local system account. If you keep this setting, then you need to give the server's machine account rights to access and restart services on the target boxes. If the server is WEB01, then the way to assign it rights on the target machine is through the account name WEB01$ with a dollar sign after its name.You may prefer though to run the automation service as a domain-based service account that has elevated privileges across all servers. In such a case, no additional machine-based security will need to be applied.

Create the Batch File
To start we had to make sure we had the tools that we intended to use: sc.exe and sleep.exe. Sc lets you control services on local and remote servers. Sleep lets you wait between commands. Some of the latest Windows servers install these tools by default. If your server doesn't have them, then install the Windows Server Resource kit.

Then we wrote a batch file named restart.aluidev-03.service.bat.

set TargetService=SearchClusterUI
set ServiceHost=aluidev-03
set PathToSC=c:\WINNT\system32
set PathToSleep=e:\progra~1\Reskit

set LogFile=restart.service.log

echo %date% %time% Entering sequence to restart %TargetService% >> %LogFile%
%PathToSC%\sc.exe \\%ServiceHost% stop %TargetService%
%PathToSleep%\sleep.exe 10
%PathToSC%\sc.exe \\%ServiceHost% start %TargetService%
%PathToSleep%\sleep.exe 30
%PathToSC%\sc.exe \\%ServiceHost% start %TargetService%
%PathToSleep%\sleep.exe 60
%PathToSC%\sc.exe \\%ServiceHost% start %TargetService%

echo %date% %time% Restarted %TargetService% >> %LogFile%

The script restarts the BEA ALI Search Cluster Manager service on the aluidev-03 server. A few comments on it:

  • The sc command requires that you use the service name rather than the display name. So we went to the Services console, looked at the properties of the desired service, and we found the service name is SearchClusterUI.
  • We call the sc and sleep commands using fully qualified paths because the portal external opration won't have the Path environment variables.
  • After stopping, we try starting several times. The reason for this is that some services take longer than others to stop. If the service stops quickly, then our first start command will restart it. But if it stops slowly, we may not be able to start it on the first couple attempts. But after a total of 100 seconds? It should be ready to start.
  • For auditing purposes, we log each time this batch file is used.

We select one of the ALUI automation servers as the host for our script and place the batch file in its %pthome%\ptportal\6.1\scripts directory. After we test the script and see that it works, we're ready to move on.

Create the Portal Objects
In the portal we now create an External Operation. We set the command as follows:

".\restart.aluidev-03.service.bat"

We then schedule this External Operation via a Job, and then we make sure the folder for that Job object is registered in the Automation Service to be run by the automation server with our batch file.

We also make absolutely sure the security on the External Operation and Job are set such that only administrators can use them.

Can the Parameters be Passed to the Batch File?
The server administrator realized that using the above steps, he would need to create a separate batch file for every service on every server. Sounds like a pain, so couldn't he just pass the parameters from the External Operation to the batch file? Technically, yes, this is possible. But it can get risky. You probably don't want the portal administrator to be able to arbitrarily restart any service on those other machines in your rack.

But you might allow them to just restart services on a specific server. This might be reasonable if you have a dedicated development server. Anyway, to do that you would first edit your External Operation to pass in a parameter of the desired service name:

".\restart.aluidev-03.service.bat" "SearchClusterUI"

Then you would edit the previous batch file so that it takes its service name from the parameter. The batch file would then begin with this:

set TargetService=$1

Enjoy, and make sure to use this power judiciously!

Deploying Portlets with the Server API

| | Comments (0)

Portlet developers write code that integrates to the portal with varying depth. Let's not consider the simplest side of the spectrum where a pagelet has no ALUI-specific code and is only a portlet because the portal registered it. Examples of these are the Google Gadgets I previously wrote about.

We'll consider only portlet integrations that use an API to leverage ALUI features. But API, an acronym that stands for Application Programming Interface, needs some disambiguation since ALUI uses it in at least three ways related to portlets.

[1] First, there's the "BEA ALI API Service," a component installed on your ALUI system. Some portlets and other applications send SOAP requests through this service to get information into and out of the portal.

[2] Next is the AquaLogic IDK API. This used to be called the EDK, or just the remote API. A portlet can include the libraries for this API in its bin or lib directory. Such a portlet may run on a network external to the core portal, and can even be hosted by a third-party. These IDK libraries pass information through HTTP headers that only make sense in the request and response context of a portal. Portlets using the IDK can do things like get and set preferences from the portal database. Some IDK callls use the API service for deeper portal interaction such as creating certain types of portal objects. Usually developers require nothing more than the IDK to write their portlets.

A request from the portal to a portlet may include the following headers, which illustrates details available to the IDK:

CSP-Protocol-Version: 1.3
CSP-Can-Set: Gadget-User,User
CSP-Gateway-Specific-Config: PT-User-Name=Administrator,PT-User-ID=1,PT-Stylesheet-URI=http://www.a.com/imageserver/plumtree/common/public/css/mainstyle-en.css,PT-Hostpage-URI=http%u003A%u002F%u002Flocalhost%u002Fportal%u002Fserver%u002Ept%u003Fopen%u003Dspace%u0026name%u003DMyPage%u0026id%u003D9%u0026cached%u003Dtrue%u0026in%u005Fhi%u005Fuserid%u003D1%u0026control%u003DSetPage%u0026PageID%u003D208%u0026,PT-Community-ID=0,PT-Gadget-ID=226,PT-Gateway-Version=2.5,PT-Content-Mode=1,PT-Return-URI=http://localhost/portal/server.pt/gateway/PTARGS_16_0_0_0_43/a?b=c&,PT-Imageserver-URI=http://www.a.com/imageserver/,PT-User-Charset=UTF-8,PT-SOAP-API-URI=http://bbenac01:11905/ptapi/services/QueryInterfaceAPI,PT-Portal-UUID={bcdd12067bd44a26-10ff664aba20},PT-Class-ID=43,PT-Guest-User=0
CSP-Aggregation-Mode: Multiple
CSP-Global-Gadget-Pref: MaxObjectsToExamine=100

A response might have something like the following, sending a command to the portal about how the portlet should display (a very simple example):

CSP-Display-Mode: Hosted

[3] Finally, we have the core server API. This is the full library of DLLs or JARs that installs on a server that hosts components such as the ALUI Portal, Admin Portal, API Service, or Automation Service. The server API runs the core portal features, and developers use it for their portlets when they require more than what the IDK offers. Such portlets might create and delete users, audit portlet objects, or modify experience definitions.

Deploying Server-API Portlets

A portlet or other application that uses the core server API needs can easily be deployed on the same server as an ALUI portal or admin portal server since that box has all the server API libraries. It should work just as well to install the server API portlet on an automation server, but I recently found it didn't. Why not?

It turns out the ALUI 6.1.0.1 installer (that I was testing with) chooses not to drop some important files on the server when you install just the Automation Server component. This is a small annoyance easily overcome. At least on Windows, these are the required steps:

1. Make sure your Automation Server is installed and running properly.
2. Copy plumtree\ptportal\6.0\bin\assemblies from a portal server onto the automation server
3. Edit plumtree\settings\common\serverconfig.xml so that the adonet-license-file-directory value is the path to that assemblies directory.

For your Java portlets that use the server API? There's a good chance that they'll run on an Automation Server without needing any extra steps. The reason is that first, the Automation Server always runs on Java, and it already installs \ptportal\6.1\lib\java, which has the same libraries that the missing assemblies directory has. Second, the serverconfig.xml value for adonet-license-file-directory only applies to ADO connections.

My untested bet is that a machine with just the API service would behave like one with just the Automation Server component. The API service is always in Java.

Deploying Server-API Portlets on a Portlet Server

Some customers want to keep their portlets running on their portlet servers rather than moving them onto the portal server or the automation server. That's a fine idea, and it's accomplished by installing a core component on the portlet server. You can disable the service for that core component, and so essentially, this lets your portlet server run server API portlets without the overhead of that core server component.

Replicating the ALUI Grid Search Index

| | Comments (0)
The ALUI search component underwent a major redesign for the 6.1.0.0 release. In earlier versions, the indexing component of search was a single point of failure. The querying component though could be made highly available by replicating its index to secondary servers. Release 6.1 brought the important improvement of allowing both indexing and querying to be redundant. A customer can install two search servers to act as nodes of the same partition, and the index takes care of itself.

So there is no longer a need to write special scripts to replicate the 6.1 search index, right? Not if you are just trying to make your live index redundant. So the 6.1 product dropped the old "replicate" tool.

But larger customers have a use case that still requires index replication. If the customer has a failover datacenter for use when the primary system is unavailable for one reason or another, then the customer needs to replicate the search index to that site, and how do you this? It used to be very easy. Consider the following that did the job on ALUI 6.0 (in this case on RHEL). It was two easy commands:

    SEARCHHOME=/usr/local/plumtree/ptsearchserver/6.0
    
    echo ----------- copy the master search index to a backup directory
    $SEARCHHOME/bin/replicate -incr_backup aqlsearch 15244 $SEARCHHOME/indexmaster $SEARCHHOME/incr_backup
    
    echo ----------- restore the backup index to the failover search server
    $SEARCHHOME/bin/replicate -restore aqlsearchfail 15244 $SEARCHHOME/index $SEARCHHOME/incr_backup
    
      You can still attain the same result in the ALUI 6.1 releases, but hold onto your hat. It's a wild ride.

      • Copy the %searchcluster%\checkpoints and %searchcluster%\requests folders from the origin server to a temporary directory on the destination server
      • Make sure all search nodes and services are running properly
      • Run the following command to empty the checkpoints folder and set the requests folder to only have an indexQueue.segment:

               %searchhome%\bin\native\cadmin.exe purge --remove

      • Stop all nodes and services
      • Copy the origin checkpoints and requests folders over their corresponding destination folders, including the indexQueue.segement:
      • Replace the cluster.nodes file(s) in the checkpoints folder(s) with the one from destination cluster base. For example, copy %searchcluster%\cluster.nodes over %searchcluster%\checkpoints\0_106_30976\cluster.nodes
      • Start all nodes and services. The nodes will restore from the checkpoint. If you have a large index, the nodes may take a while to start (five minutes for 1.2 million objects). If you have multiple nodes, they may take another several minutes to move from stall/recover.

    Other migration methods may have trouble migrating checkpoints from a system with low-numbered checkpoint folders to a destination system with higher-numbered folders. This method however has no such problem.

    These steps have been tested only on systems that use a single partition. The number of nodes in the partition is not significant; these steps have been tested when restoring the cluster to destination systems with different numbers of nodes than the origin.

    The directories can be copied from the source while all services are up, and this should not cause synchronization issues later during the restore. So if you take a checkpoint, then later make several changes to the index, then later copy the checkpoints and requests folders to the destination, the destination will have all the changes in its restored index, including those made after the checkpoint.

    This process was created with significant input from Dax Farhang, the product manager responsible for the search product. If we're lucky, he'll cook this feature into the 6.5 ALUI product so that this post will become obsolete. In the meantime, enjoy.