Recently in BEA/Oracle Category

WCI Settings Files: rules for construction

| | Comments (0)
rules.jpgThe world is full of rules. I was amused at a local Austin grocery store to find rules against something that seem pretty obvious: food trays are not sleds. Other rules though can be harder to figure out. In case you need to know some of these less obvious rules:

I'm working on an effort to restructure WCI settings files, and a piece of this required understanding the rules for putting together a valid settings file. I hope to later explain the whole project, but until then, here's a subset of what I learned.

The Loose
WCI applications read in everything in the %WCI_HOME%\settings directory on startup. A default system would have these in c:\oracle\wci or some such location. That everything is read means WCI neither cares what your file names are nor what subfolders they may be in. For example, you can move .\settings\configuration.xml to .\settings\do-not-use\disabled.xml, and it will still work just fine. The system treats all information across all files as a single settings definition.

You can also break apart the out-of-the-box XML files into new smaller files, or you can rearrange their content entirely. This explains how it is that systems run WCI 10.3.0.0 equally well for fresh installs versus upgraded installs even though each has differently structured XML files (for example, fresh installs store settings in configuration.xml that upgraded installs keep only in portal\portalconfig.xml and common\serverconfig.xml).

You can add settings in the XML files that are not required and not used by the system. For example, you can have a context or a component defined but never used.

The Strict
Within the config files, however, you'll find tightly linked context, component, and client sections. Some rules are:
  1. A context cannot be defined more than once.
  2. A component name cannot be used more than once.
  3. A component cannot have a subscribed client that is not a defined context.
  4. A client cannot subscribe to two different contexts of the same component type.
An Example
Now is a great time for an example. The following file sits on my system as %WCI_HOME%\settings\example.xml. When the system starts, this file is read into the settings definition, though nothing in it will be used by my applications. The system runs just fine, and it will continue to do so unless I uncomment any of the sections of the config file that are designed to break the four strict rules I previously listed.

Download the file so you can load it in a readable XML parser, load it on your system, or tweak it. You can also try reading it in less readable format below.

Enjoy!

<?xml version="1.0" encoding="UTF-8"?>
<OpenConfig xmlns="http://www.plumtree.com/xmlschemas/config/1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <context name="example-context"/>
<!-- ERROR 1: uncomment the below client to create "context with this name already exists" error -->
<!--
    <context name="example-context"/>
    -->
    
    <!-- include the below context to illustrate that listed contexts need not be used -->
    <context name="example-context-unused"/>
    
    <component name="example-component" type="http://www.plumtree.com/config/component/types/example-type">
        <setting name="sometype:something">
            <value xsi:type="xsd:boolean">true</value>
        </setting>
        <clients>
            <client name="example-context"/>
            <!-- ERROR 2: uncomment the below client to create "context could not be opened" error -->
            <!--
            <client name="undeclared-context-breaks-system"/>
            -->
        </clients>
    </component>
    <!-- include the below component to illustrate that components need not have clients -->
    <component name="example-component-no-clients" type="http://www.plumtree.com/config/component/types/example-type">
        <setting name="sometype:something">
            <value xsi:type="xsd:boolean">true</value>
        </setting>
        <clients>
        </clients>
    </component>
    <!-- ERROR 3: uncomment the below component to create "component with this name already exists" error -->
    <!--
    <component name="example-component-no-clients" type="http://www.plumtree.com/config/component/types/example-type2">
        <setting name="sometype:something">
            <value xsi:type="xsd:boolean">true</value>
        </setting>
        <clients>
        </clients>
    </component>
    -->
    
    <!-- ERROR 4: uncomment the below component to create "context already subscribes to component of type" error -->
    <!--
    <component name="example-component-duplicate-type" type="http://www.plumtree.com/config/component/types/example-type">
        <setting name="sometype:something">
            <value xsi:type="xsd:boolean">true</value>
        </setting>
        <clients>
            <client name="example-context"/>
        </clients>
    </component>
    -->
</OpenConfig>


ALUI/WCI SSO Login Sequence and Log Files

| | Comments (0)
sequence.gifYou can't trust your web server logs to tell you how many pages your portal users view. When logging in, especially under SSO, the login sequence generates several "GET /portal/server.pt " lines. I dug into this today, and the results may be helpful as you look to infer portal usage from log files.

Yesterday I turned to IIS logs to determine some usage patterns in the portals I work with where users can enter through two different SSO systems. I started my search by looking at how many times SSOLogin.aspx occurred for each SSO system (hosted on different servers). When the results appeared material, today I wondered whether the load for the systems are different. Do the users of one SSO system have a more engaged portal session?

First I counted simply "GET /portal/server.pt" in the log files, and I though one set of users had far more pages per session than did the other. However, I then realized that gateway images were returned by my search pattern, so I added a space: "GET /portal/server.pt " This made the traffic look much more similar.

But I still didn't know how many actual pages the user sees. What happens in the login sequence?

What I found was:

* It is hard to identify actual pages per visit because the IIS log sometimes shows 3 and sometimes 4 requests per login.
* A user's login generates three lines in the IIS log with "GET /<virtualdirectory>/server.pt/ "  when the user enters the portal through http(s)://<portalhost>/
* A user's login generates four lines in the IIS log with "GET /<virtualdirectory>/server.pt/ "  when the user enters the portal through http(s)://<portalhost>/<virtualdirectory>/server.pt

The login sequence as found in IIS logs looks similar to this:

1. The unidentified user enters without specifying the <virtualdirectory>/server.pt, then redirects to the SSO login


2. The SSO-authenticated user is redirected to the portal from the WSSO login
/portal/server.pt 

3. The SSO-authenticated user is directed to the portal's SSOLogin sequence to process the SSO token and become portal-authenticated
/portal/sso/SSOLogin.aspx 

4. The portal-authenticated user runs a login sequence to determine the proper home page behavior
/portal/server.pt open=space&name=Login&dljr= 

5. The user lands on the proper home page
/portal/server.pt/community/superstuff/204 

I hope that's helpful.
Here's another workaround.

Download this post, the batch file it refers to, and the wget utility from
CachedPortletContent-Workaround.zip.

Overview
=========
This describes a way to get results similar to the ALUI portal's Cached Portlet Content feature of the ALUI portal. This is useful for users of Oracle's WebCenter Interaction 10gR3, a release that has a bug (No.  8689121) that causes this feature to otherwise be unavailable. As the bug database describes it, "WHEN "RUNNING PORTLETS AS JOBS", THE JOB WILL FAIL."

Cached Portlet Content Feature
=========
You can read about the Cached Portlet Content feature at http://download.oracle.com/docs/cd/E12529_01/ali65/AdministratorGuide_ALI_6-5/tsk_portlets_cachingcontent.html. As that page describes, "You might occasionally want to run a job to cache portlet content (for example, if the portlet takes a couple minutes to render). When the job runs, it creates a snapshot of the portlet content (in the form of a static HTML file) that can be displayed on a web site. The file is stored in the shared files directory (for example, C:\bea\ALUI\ptportal\6.5) in \StagedContent\Portlets\<portletID>\Main.html. You can then create another portlet that simply displays the static HTML."

Workaround
==========
The alternate way to get cached portlet content is to create an external operation that will call the URL of the desired content and then will save it to the automation server's file system. This uses wget.exe, a program that is standard on UNIX environments and that is distributed with this workaround for Windows. The port I use is from http://sourceforge.net/projects/unxutils/.

Installation
==========
1. Put wget.exe into the %WCI_HOME%\ptportal\10.3.0\scripts directory of your automation server (e.g. D:\bea\alui\ptportal\10.3.0\scripts). This application allows you to access web pages from the command line and then to save them to the file system.
2. Put the wget-extop.bat file into the %WCI_HOME%\ptportal\10.3.0\scripts directory of your automation server.
3. Test that it works by opening a command prompt on your automation server to %WCI_HOME%\ptportal\10.3.0\scripts, then running a command like this one:

"wget-extop.bat" http://www.target.com target-homepage

When that command finishes, you should see a success message similar to the following:

20:46:28 (104.98 KB/s) - `..\StagedContent\portlets\target-homepage\Main.html' saved [80621]

4. Make sure logging works properly. You should find a file in %WCI_HOME%\ptportal\10.3.0\scripts named wget-extop.log. Open that file and see that it recorded your recent action.

5. Make sure the action downloaded the webpage. You should find it in a location like %WCI_HOME%\ptportal\10.3.0\StagedContent\portlets\target-homepage\Main.html.

6. Open the portal and create an external operation object. On the main settings page, enter an Operating System Command like this:

"wget-extop.bat" http://www.target.com target-homepage

The command has three parts. First it names the batch file you'll use. Second, it gives the URL to download. Third it gives the identifer for this download that will be the directory in which the downloaded content will be stored. Be careful to use only characters in the identifer name that work as directory names. An identifer like "http://www.target.com" will not work because you cannot have slashes in a directory name. Your command may be this:

"wget-extop.bat" http://www.my-company.com/about.html about-our-company


7. In the portal, create a job that will run your external operation. Schedule it to run at the appropriate interval.

wget-extop.bat
==========
The contents of wget-extop.bat should be as follows:

@REM BEGIN WGET-EXTOP.BAT

set arg1=%1
set arg2=%2

md ..\StagedContent\portlets\%arg2%

echo %date% - %time% --- wget %arg1% -O ..\StagedContent\portlets\%arg2%\Main.html >> wget-extop.log
wget %arg1% -O ..\StagedContent\portlets\%arg2%\Main.html

@REM END WGET-EXTOP.BAT


Limitations
==========
This workaround does not offer all the features that the Cached Portlet Content feature normally has. The main reason for limitations is that this request uses wget rather than the portal engine to request content. The request therefore has no access to portlet preferences and so forth. While this workaround is sufficient in some cases, it does not claim to work in all.

Enjoy.

Bill Benac
October 2009

In software development, we can sometimes have maddening debates about whether something is a feature or a bug. This reminds me of an old Phish song: Windora Bug.

"Is that a wind? Or a bug? It's a Windora bug." In other words, it's both. While troubleshooting your system, you might want to listen to the mp3.

In WCI 10gR3, we find the collision of two reasonable features. I think together they make a bug. Or at least, a badly designed feature. So let's start with the old feature:

Sometimes agents outside the portal need to authenticate in. Users count as agents, and so do remote portlets. To allow the agent to log in without providing a password each time, the portal can send a login token that the agent can use for future portal connections. Two old examples of this are [1] when a person uses the "Remember my Password" feature of the portal login screen (usually valid for many days) and [2] when a remote portlet web service sends a login token to the remote service (usually valid for five minutes). The login token held on the remote tier by the agent can be decrypted by the server using its key. This works fine in both the old use cases I provided because the remote tier is handed this value by the portal server.

For whatever reason, you may decide every once in a while that there is a security issue related to saved passwords. The portal had a great feature in the old days to let you update the login token's root key and thereby invalidate these old login tokens forcing users to reauthenticate. The tool for the reset is in the administrative portal under the Portal Settings utility, and it looks something like this:

update-login-token-key.jpgWhen you click that "Update" button, it connects to the portal database and generates a new login token root key, stored in PTSERVERCONFIG with settingid 65.

The trouble comes in with the new feature. In 10gR3, the portal introduces new applications that encrypt passwords based on the login token root key, but this is done at configuration time in the remote application's Configuration Manager. The problem is that those applications are built apparently assuming that the login token root key will never change. The configuration manager requires that you provide the login token root key to it directly. Applications that do this include but are not limited to the Common Notification Service and Analytics. For example:

update-login-token-analytics.jpgThe upshot of all this is that if you choose to click that button in the Portal Settings utility, then you get a new login token root key that no longer matches the one relied on by your remote applications.

If this part of the portal were reconceived, then perhaps the database would have one login token root seed used for agents with a transient token such as those given to users and through remote web service calls that are used to let the agent come back. Those keys basically say, "you've been here before, and you can come back." Then the database might have a second root seed for applications that need permanent access to the portal. In that case, the update feature would be fine, and it would only apply to the key for transient agents.

Oh well. We have to live with it. So to avoid administrators accidentally breaking remote applications, I suggest you update the portal UI to explain the full effect of this particular feature (if you don't want to go through the headache of an involved UI modification to entirely remove it). I did this and now have the following:

update-login-token-key-new.jpgI got there by modifying this file on the admin servers:
d:\bea\alui\ptportal\10.3.0\i18n\en\ptmsgs_portaladminmsgs.xml
Within it I changed strings 2134, 2135, 2136, and 2964. My file has no other modifications in it from the vanilla 10.3.0 version. You can download it here.

Enjoy.


I've been working with the same technology stack for an amazingly long nine years. This has given me much opportunity to work with the same types of issues over and over, and in doing so, I've refined my approach quite a bit. Thus, here's a post that is essentially an improvement on a two year old post, How to Revive a Failed Search Node. I hope this post will offer both a better description of the problem and a better solution to it.

The WebCenter Interaction search product has two features that can interfere with each other. First, on the search cluster, you can schedule checkpoints to essentially wrap up and archive the search index to give you the ability to later restore it. Second, on the search nodes, at startup the node's index looks to the directories on the search cluster to synchronize in a copy of the latest index.

Customers running both checkpoints and multiple nodes periodically encounter trouble because the checkpoint process removes old search cluster request directories that the nodes want to access. So if you have one of your search nodes go down, but the other node keeps working and checkpoints continue to run on a daily schedule, then after a few days and by the time you realize one node had failed, then it won't start. It fails when it tries to access the numbered directory that had existed last time it had run properly. The errors in your %WCI_HOME%\ptsearchserver\[version]\[node]\logs  may look like these in such a case:

Cannot find queue segment for last committed index request: \\servername\SearchCluster\requests\1_1555_256\indexQueue.segment

Indeed, if you look at the path that was shown in the error, you'll find that the numbered folder no longer exists. Perhaps the latest folder will be SearchCluster\requests\1_1574_256.

The fix is to reset the search node so that it no longer expects that specific folder upon which it had been fixated. I wrote about a way to do this with several manual steps in my prior post. This time, however, and after encountering the problem perhaps tens of times, I'm sharing a batch file that I place on Windows search servers to automate the reset process (and this works on both ALUI 6.1 and WCI 10gR3):

set searchservice1=myserver0201
set search_home=c:\oracle\wci\ptsearchserver\10.3.0
@rem
@rem configure the above two variables
@rem
net stop %searchservice1%
c:
rmdir /s /q %search_home%\%searchservice1%\index\
mkdir %search_home%\%searchservice1%\index\1
echo 1 > %search_home%\%searchservice1%\index\ready
cd %search_home%\%searchservice1%\index\1
..\..\..\bin\native\emptyarchive lexicon archive
..\..\..\bin\native\emptyarchive spell spell
net start %searchservice1%

search-panel.jpgTo find the name of the search service that goes in the first parameter, open your Windows services panel, find your search node, right-click into its properties page, and find the "service name" value. This is not the same as the display name. The service name by default is [machine][node] as far as I can tell. So on my box (bbenac02) as the first node, my service name is bbenac0201. This is different from the display name, which defaults to something like "BEA ALI Search bbenac0201."

Enjoy!
phantom.jpgI'll stick with the Star Wars theme from my past post because today's issue is quite similar (even though I haven't bothered to watch all the movies in the series). Do you occasionally encounter errors that are tied to phantom users?

My customer tried propagating security into a WCI admin folder using the async job option today, but they got an error similar to this in the job log:

Sep 1, 2009 9:56:39 AM- *** Job Operation #1 of 1 with ClassID 20 and ObjectID 334 cannot be run, probably because the operation has been deleted.

In the error log on the automation server, we found something like this:

Error creating operation 20:302
com.plumtree.server.marshalers.PTException: -2147204095 - InternalSession.Connect(): UserID (205) not found.

Indeed, user 205 didn't exist. Where did the portal get the idea it should look for the user? It turns out that at the time the particular admin folder was created (folder ID 302), it was created by user 205. Later, that user was deleted from the system, but just as in my last post, sometimes when an object is deleted, references to that object are left in certain tables of the database. In this case, the deletion of a user does not trigger a removal of that user's ownership of certain objects like admin folders. I ran this query to look for all instances of the problem:

select folderid from ptadminfolders where ownerid not in (select objectid from ptusers)

The fix here is to set the ownership of that particular folder (and all others) to the administrative user:

update ptadminfolders set ownerid=1 where ownerid not in (select objectid from ptusers)

While we're thinking about this class of problem, we can look for others cases where a phantom user remains, since in some of these cases it will become a menace. The following is a list of queries that found phantom users at my current customer:

select folderid from ptadminfolders where ownerid not in (select objectid from ptusers)
select objectid from ptcards where ownerid not in (select objectid from ptusers)
select objectid from ptcrawlers where ownerid not in (select objectid from ptusers)
select objectid from ptcommunities where ownerid not in (select objectid from ptusers)
select objectid from ptdatasources where ownerid not in (select objectid from ptusers)
select objectid from ptdocumenttypes where ownerid not in (select objectid from ptusers)
select objectid from ptfilters where ownerid not in (select objectid from ptusers)
select objectid from ptgadgetbundles where ownerid not in (select objectid from ptusers)
select objectid from ptgadgets where ownerid not in (select objectid from ptusers)
select objectid from ptgcservers where ownerid not in (select objectid from ptusers)
select objectid from ptjobs where ownerid not in (select objectid from ptusers)
select objectid from ptwebservices where ownerid not in (select objectid from ptusers)

I suggest in each of the above cases that you replace the phantom user with the administrator user. This will cause no harm, and in some cases it allows you to avoid errors:

update ptadminfolders set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptcards set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptcrawlers set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptcommunities set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptdatasources set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptdocumenttypes set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptfilters set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptgadgetbundles set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptgadgets set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptgcservers set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptjobs set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptwebservices set ownerid=1 where ownerid not in (select objectid from ptusers)


Again, like the problem I last wrote about with the phantom footer ID, this one with users is a bug. The fix would be to add to the deleteUser() method a command to clean up each of these tables. Since no fix is provided, you might set up a nightly job on your database to run these cleanup queries.

Enjoy!

PS: you might like this sed example that converts the list of select statements from this post (saved as select.txt) into the list of update statements:

sed -r s/.*("objectid|folderid")" from "(.*)("where.*")/"update "\2"set ownerid\=1 "\3/g select.txt


Create your own army (of users for testing)

| | Comments (0)
star_wars_clone_army.jpgRecently a colleague brought up the common question of how one might have sufficient users for load testing. There are many solutions to the problem, but one I put together all the way back in 2004 is a server API csharp application that creates bulk users.

I've updated the application for WCI 10gR3, and you can download it here.

From the readme file:

This is a small web application that can create and delete users in bulk. This may be useful in certain test situations.

To install:

    * Unzip the bulkusers directory on your web server.
    * Configure it as an application. It can be made an application from the properties page of the IIS console.
    * Be sure the new IIS application uses .NET 2.0.

To configure:

    * Create a folder in your portal that you will put these new users in. It is important that this folder only be used for this bulk users.
    * Note the folder id of the new folder you created. You might do this by clicking into the folder then examining the query string.
    * Open web.config for this web application. Put the appropriate values into the appSettings section so the web application will know how to connect, where to create users, group memberships, password, and so forth.

To use:

This web application is quite rudimentary in that all instructions are given through its query string. Examples are shown here:

    * To create 25 users, browse to http://server/bulkusers/index.aspx?action=create&count=25
    * To show all users in the folder, browse to http://server/bulkusers/index.asp?action=show
    * To delete all users in the folder (regardless of how they were created), browse to http://server/bulkusers/index.aspx?action=delete

You should be very aware of the consequence of running the delete command. It deletes all users in the folder you specify in web.config. If you make the mistake of using an existing user folder for these bulk users, then the delete command will delete the pre-existing users who probably shouldn't be deleted.

Bill Benac
Written December, 2004
Updated August, 2009

Clean up footers before running WCI 10gR3

| | Comments (1)
It turns out that ALUI 6.1 was more forgiving than WCI in at least one way: if a community was set to use a custom footer, and then if that custom footer was deleted, the community would continue to display properly. However, in WCI 10gR3, a bug exists such that when the custom footer isn't found, the page displays this unhelpful message under the portal's navigation:

Error - The server has experienced an error. Try again or contact your portal administrator if you continue experiencing problems.

When you open the HTML source of the page, you find this somewhat unhelpful message in all its impenetrable detail:


The problem is easy enough to fix as a one-off: Open the community editor, observe that no custom footer appears to be set, add a custom footer, save the community, edit the community, remove the custom footer, and you're good.

You can quickly check whether any communities in your system are affected by this bug by running this query to find those meeting the condition:

select objectid, name, footerid from ptcommunities where footerid not in (select objectid from ptgadgets) and footerid > 0

And fortunately, you can fix them all in one fell swoop by updating them to discontinue their attempted display of that bogus footer:

update ptcommunities set footerid=0 where  footerid not in (select objectid from ptgadgets) and footerid > 0

Finally, you can rerun the query to check for communities with the error, and you will now find all is well.

How should this be fixed in the product?

1) The portal should forgive communities that use a bad footer ID. A simple try/catch should do the trick.
2) The portal should change its steps when deleting a portlet When a portlet is removed from the portal, part of the cleanup should be to update the ptcommunities table to use footerid 0 in the place of the deleted gadgetid.

We'll see whether this is ever done. In the meantime, you need to get your portal running, and the approach I outlined here should do it.

Enjoy!
Sometimes you just need to change ports. When I was a kid, it was the port of New York:

port-newyork-400.jpg

In search of love, I tried the port of Syndey (from which I supported a Plumtree portal system):

port-syndey-400.jpg

Later I found true love in the port of Seattle (from which I supported a BEA ALUI portal system):

port-seattle-400.jpg

And now the local port is San Francisco (from which I support an Oracle WCI portal system).

port-sanfran-400.jpg

And speaking of those portals and changing ports, sometimes you realize that your Configuration Manager service would suit you better if it ran on a different port. You can change the port by following these steps:

1. Remove the existing configuration manager service:

%PT_HOME%\configmgr\2.0\bin\configmgr.bat remove

2. Backup then edit %PT_HOME%\configmgr\2.0\settings\config\private.xml. Set the "EAS:httpsport" to the desired port.

3. Backup then edit %PT_HOME%\configmgr\2.0\settings\config\wrapper.conf. Edit the wrapper.ntservice.name and wrapper.ntservice.displayname to reflect the desired port.

4. Install the configuration manager service:

%PT_HOME%\configmgr\2.0\bin\configmgr.bat install

5. Start the configuration manager service.

Enjoy!



Tunings for the LDAP IDS Sync

| | Comments (0)
gone_running.jpgAre you worried about your LDAP IDS sync's running time? If your system is relatively small, you may not think much about it. You run them each night, or maybe even several times during the day, and life is good. However, on systems that push beyond several hundred thousand users, the performance of this product may become important. An obscure setting can cut the time in half.

This week I reinstalled the IDS, and after running it with default settings, my sync ran in about eight hours. After some tunings though, the job usually finishes in three and a half hours.

Cached Objects
The most important tuning done in %WCI_HOME%\ptldapaws\2.2\settings\config\ldap\properties.xml. I increased the MAX_CACHED_USER_OBJECTS setting from the meager 20000 to instead 1000000.

Memory
With this increased cache setting, you may also find you need to increase memory allocation. Do this in
%WCI_HOME%\ptldapaws\2.2\settings\config\wrapper.conf:

# Initial Java Heap Size (in MB)
wrapper.java.initmemory=1024

# Maximum Java Heap Size (in MB)
wrapper.java.maxmemory=1024

Session Timeout
Another tuning we use that may be necessary when you run larger synchronization batches is to increase the session timeout period within the ldapws.war file's web.xml. The default session-timeout is 60 minutes, but we run at 600.

Enjoy!