Results tagged “bug” from Bill Benac

Here's another workaround.

Download this post, the batch file it refers to, and the wget utility from
CachedPortletContent-Workaround.zip.

Overview
=========
This describes a way to get results similar to the ALUI portal's Cached Portlet Content feature of the ALUI portal. This is useful for users of Oracle's WebCenter Interaction 10gR3, a release that has a bug (No.  8689121) that causes this feature to otherwise be unavailable. As the bug database describes it, "WHEN "RUNNING PORTLETS AS JOBS", THE JOB WILL FAIL."

Cached Portlet Content Feature
=========
You can read about the Cached Portlet Content feature at http://download.oracle.com/docs/cd/E12529_01/ali65/AdministratorGuide_ALI_6-5/tsk_portlets_cachingcontent.html. As that page describes, "You might occasionally want to run a job to cache portlet content (for example, if the portlet takes a couple minutes to render). When the job runs, it creates a snapshot of the portlet content (in the form of a static HTML file) that can be displayed on a web site. The file is stored in the shared files directory (for example, C:\bea\ALUI\ptportal\6.5) in \StagedContent\Portlets\<portletID>\Main.html. You can then create another portlet that simply displays the static HTML."

Workaround
==========
The alternate way to get cached portlet content is to create an external operation that will call the URL of the desired content and then will save it to the automation server's file system. This uses wget.exe, a program that is standard on UNIX environments and that is distributed with this workaround for Windows. The port I use is from http://sourceforge.net/projects/unxutils/.

Installation
==========
1. Put wget.exe into the %WCI_HOME%\ptportal\10.3.0\scripts directory of your automation server (e.g. D:\bea\alui\ptportal\10.3.0\scripts). This application allows you to access web pages from the command line and then to save them to the file system.
2. Put the wget-extop.bat file into the %WCI_HOME%\ptportal\10.3.0\scripts directory of your automation server.
3. Test that it works by opening a command prompt on your automation server to %WCI_HOME%\ptportal\10.3.0\scripts, then running a command like this one:

"wget-extop.bat" http://www.target.com target-homepage

When that command finishes, you should see a success message similar to the following:

20:46:28 (104.98 KB/s) - `..\StagedContent\portlets\target-homepage\Main.html' saved [80621]

4. Make sure logging works properly. You should find a file in %WCI_HOME%\ptportal\10.3.0\scripts named wget-extop.log. Open that file and see that it recorded your recent action.

5. Make sure the action downloaded the webpage. You should find it in a location like %WCI_HOME%\ptportal\10.3.0\StagedContent\portlets\target-homepage\Main.html.

6. Open the portal and create an external operation object. On the main settings page, enter an Operating System Command like this:

"wget-extop.bat" http://www.target.com target-homepage

The command has three parts. First it names the batch file you'll use. Second, it gives the URL to download. Third it gives the identifer for this download that will be the directory in which the downloaded content will be stored. Be careful to use only characters in the identifer name that work as directory names. An identifer like "http://www.target.com" will not work because you cannot have slashes in a directory name. Your command may be this:

"wget-extop.bat" http://www.my-company.com/about.html about-our-company


7. In the portal, create a job that will run your external operation. Schedule it to run at the appropriate interval.

wget-extop.bat
==========
The contents of wget-extop.bat should be as follows:

@REM BEGIN WGET-EXTOP.BAT

set arg1=%1
set arg2=%2

md ..\StagedContent\portlets\%arg2%

echo %date% - %time% --- wget %arg1% -O ..\StagedContent\portlets\%arg2%\Main.html >> wget-extop.log
wget %arg1% -O ..\StagedContent\portlets\%arg2%\Main.html

@REM END WGET-EXTOP.BAT


Limitations
==========
This workaround does not offer all the features that the Cached Portlet Content feature normally has. The main reason for limitations is that this request uses wget rather than the portal engine to request content. The request therefore has no access to portlet preferences and so forth. While this workaround is sufficient in some cases, it does not claim to work in all.

Enjoy.

Bill Benac
October 2009

Is that a [BLANK] or a Bug? Resetting Login Tokens.

|
In software development, we can sometimes have maddening debates about whether something is a feature or a bug. This reminds me of an old Phish song: Windora Bug.

"Is that a wind? Or a bug? It's a Windora bug." In other words, it's both. While troubleshooting your system, you might want to listen to the mp3.

In WCI 10gR3, we find the collision of two reasonable features. I think together they make a bug. Or at least, a badly designed feature. So let's start with the old feature:

Sometimes agents outside the portal need to authenticate in. Users count as agents, and so do remote portlets. To allow the agent to log in without providing a password each time, the portal can send a login token that the agent can use for future portal connections. Two old examples of this are [1] when a person uses the "Remember my Password" feature of the portal login screen (usually valid for many days) and [2] when a remote portlet web service sends a login token to the remote service (usually valid for five minutes). The login token held on the remote tier by the agent can be decrypted by the server using its key. This works fine in both the old use cases I provided because the remote tier is handed this value by the portal server.

For whatever reason, you may decide every once in a while that there is a security issue related to saved passwords. The portal had a great feature in the old days to let you update the login token's root key and thereby invalidate these old login tokens forcing users to reauthenticate. The tool for the reset is in the administrative portal under the Portal Settings utility, and it looks something like this:

update-login-token-key.jpgWhen you click that "Update" button, it connects to the portal database and generates a new login token root key, stored in PTSERVERCONFIG with settingid 65.

The trouble comes in with the new feature. In 10gR3, the portal introduces new applications that encrypt passwords based on the login token root key, but this is done at configuration time in the remote application's Configuration Manager. The problem is that those applications are built apparently assuming that the login token root key will never change. The configuration manager requires that you provide the login token root key to it directly. Applications that do this include but are not limited to the Common Notification Service and Analytics. For example:

update-login-token-analytics.jpgThe upshot of all this is that if you choose to click that button in the Portal Settings utility, then you get a new login token root key that no longer matches the one relied on by your remote applications.

If this part of the portal were reconceived, then perhaps the database would have one login token root seed used for agents with a transient token such as those given to users and through remote web service calls that are used to let the agent come back. Those keys basically say, "you've been here before, and you can come back." Then the database might have a second root seed for applications that need permanent access to the portal. In that case, the update feature would be fine, and it would only apply to the key for transient agents.

Oh well. We have to live with it. So to avoid administrators accidentally breaking remote applications, I suggest you update the portal UI to explain the full effect of this particular feature (if you don't want to go through the headache of an involved UI modification to entirely remove it). I did this and now have the following:

update-login-token-key-new.jpgI got there by modifying this file on the admin servers:
d:\bea\alui\ptportal\10.3.0\i18n\en\ptmsgs_portaladminmsgs.xml
Within it I changed strings 2134, 2135, 2136, and 2964. My file has no other modifications in it from the vanilla 10.3.0 version. You can download it here.

Enjoy.


How to Better Revive a Failed Search Node (and Why)

|
I've been working with the same technology stack for an amazingly long nine years. This has given me much opportunity to work with the same types of issues over and over, and in doing so, I've refined my approach quite a bit. Thus, here's a post that is essentially an improvement on a two year old post, How to Revive a Failed Search Node. I hope this post will offer both a better description of the problem and a better solution to it.

The WebCenter Interaction search product has two features that can interfere with each other. First, on the search cluster, you can schedule checkpoints to essentially wrap up and archive the search index to give you the ability to later restore it. Second, on the search nodes, at startup the node's index looks to the directories on the search cluster to synchronize in a copy of the latest index.

Customers running both checkpoints and multiple nodes periodically encounter trouble because the checkpoint process removes old search cluster request directories that the nodes want to access. So if you have one of your search nodes go down, but the other node keeps working and checkpoints continue to run on a daily schedule, then after a few days and by the time you realize one node had failed, then it won't start. It fails when it tries to access the numbered directory that had existed last time it had run properly. The errors in your %WCI_HOME%\ptsearchserver\[version]\[node]\logs  may look like these in such a case:

Cannot find queue segment for last committed index request: \\servername\SearchCluster\requests\1_1555_256\indexQueue.segment

Indeed, if you look at the path that was shown in the error, you'll find that the numbered folder no longer exists. Perhaps the latest folder will be SearchCluster\requests\1_1574_256.

The fix is to reset the search node so that it no longer expects that specific folder upon which it had been fixated. I wrote about a way to do this with several manual steps in my prior post. This time, however, and after encountering the problem perhaps tens of times, I'm sharing a batch file that I place on Windows search servers to automate the reset process (and this works on both ALUI 6.1 and WCI 10gR3):

set searchservice1=myserver0201
set search_home=c:\oracle\wci\ptsearchserver\10.3.0
@rem
@rem configure the above two variables
@rem
net stop %searchservice1%
c:
rmdir /s /q %search_home%\%searchservice1%\index\
mkdir %search_home%\%searchservice1%\index\1
echo 1 > %search_home%\%searchservice1%\index\ready
cd %search_home%\%searchservice1%\index\1
..\..\..\bin\native\emptyarchive lexicon archive
..\..\..\bin\native\emptyarchive spell spell
net start %searchservice1%

search-panel.jpgTo find the name of the search service that goes in the first parameter, open your Windows services panel, find your search node, right-click into its properties page, and find the "service name" value. This is not the same as the display name. The service name by default is [machine][node] as far as I can tell. So on my box (bbenac02) as the first node, my service name is bbenac0201. This is different from the display name, which defaults to something like "BEA ALI Search bbenac0201."

Enjoy!

The Phantom Menance: Errors From Obsolete User IDs

|
phantom.jpgI'll stick with the Star Wars theme from my past post because today's issue is quite similar (even though I haven't bothered to watch all the movies in the series). Do you occasionally encounter errors that are tied to phantom users?

My customer tried propagating security into a WCI admin folder using the async job option today, but they got an error similar to this in the job log:

Sep 1, 2009 9:56:39 AM- *** Job Operation #1 of 1 with ClassID 20 and ObjectID 334 cannot be run, probably because the operation has been deleted.

In the error log on the automation server, we found something like this:

Error creating operation 20:302
com.plumtree.server.marshalers.PTException: -2147204095 - InternalSession.Connect(): UserID (205) not found.

Indeed, user 205 didn't exist. Where did the portal get the idea it should look for the user? It turns out that at the time the particular admin folder was created (folder ID 302), it was created by user 205. Later, that user was deleted from the system, but just as in my last post, sometimes when an object is deleted, references to that object are left in certain tables of the database. In this case, the deletion of a user does not trigger a removal of that user's ownership of certain objects like admin folders. I ran this query to look for all instances of the problem:

select folderid from ptadminfolders where ownerid not in (select objectid from ptusers)

The fix here is to set the ownership of that particular folder (and all others) to the administrative user:

update ptadminfolders set ownerid=1 where ownerid not in (select objectid from ptusers)

While we're thinking about this class of problem, we can look for others cases where a phantom user remains, since in some of these cases it will become a menace. The following is a list of queries that found phantom users at my current customer:

select folderid from ptadminfolders where ownerid not in (select objectid from ptusers)
select objectid from ptcards where ownerid not in (select objectid from ptusers)
select objectid from ptcrawlers where ownerid not in (select objectid from ptusers)
select objectid from ptcommunities where ownerid not in (select objectid from ptusers)
select objectid from ptdatasources where ownerid not in (select objectid from ptusers)
select objectid from ptdocumenttypes where ownerid not in (select objectid from ptusers)
select objectid from ptfilters where ownerid not in (select objectid from ptusers)
select objectid from ptgadgetbundles where ownerid not in (select objectid from ptusers)
select objectid from ptgadgets where ownerid not in (select objectid from ptusers)
select objectid from ptgcservers where ownerid not in (select objectid from ptusers)
select objectid from ptjobs where ownerid not in (select objectid from ptusers)
select objectid from ptwebservices where ownerid not in (select objectid from ptusers)

I suggest in each of the above cases that you replace the phantom user with the administrator user. This will cause no harm, and in some cases it allows you to avoid errors:

update ptadminfolders set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptcards set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptcrawlers set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptcommunities set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptdatasources set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptdocumenttypes set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptfilters set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptgadgetbundles set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptgadgets set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptgcservers set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptjobs set ownerid=1 where ownerid not in (select objectid from ptusers)
update ptwebservices set ownerid=1 where ownerid not in (select objectid from ptusers)


Again, like the problem I last wrote about with the phantom footer ID, this one with users is a bug. The fix would be to add to the deleteUser() method a command to clean up each of these tables. Since no fix is provided, you might set up a nightly job on your database to run these cleanup queries.

Enjoy!

PS: you might like this sed example that converts the list of select statements from this post (saved as select.txt) into the list of update statements:

sed -r s/.*("objectid|folderid")" from "(.*)("where.*")/"update "\2"set ownerid\=1 "\3/g select.txt


Today Oracle released  WebCenter Interaction 10gR3, the first Oracle-branded incarnation of the BEA's Aqualogic User Interaction product. I was eager to get started on upgrading a customer's ALUI 6.5 MP1 system to 10gR3. I encountered an "unexpected consideration" in the installer that you might call a bug. In my experience, on Windows the installer fails unless you explicitly allocate virtual memory.

I installed on five lab servers that had been running ALUI 6.5 MP1. The first succeeded, but on the second two servers, the installers failed.

10gr3-setup-fails.jpg



















Hmm. I dug into the logs (and you have to wait a bit after this message before they are all fully written), and I found the following in \installlogs\versionpolicy_deployment.log:

 [trycatch] Caught exception: The config file I:\apps\plumtree\uninstall\ptportal\10.3.0\register\ERROR: Registry key does not exist\ContentsXML\inventory.xml must exist.

And later:

BUILD FAILED
I:\apps\plumtree\uninstall\ptportal\10.3.0\register\register.xml:4979: The following error occurred while executing this line:
I:\apps\plumtree\uninstall\ptportal\10.3.0\register\macrodefs\versionpolicy.xml:690: The following error occurred while executing this line:
I:\apps\plumtree\uninstall\ptportal\10.3.0\register\macrodefs\orainventory.xml:172: Oracle Universal Installer failed to properly register your ORACLE_HOME,
I:\apps\plumtree, under name OraWCIntgHome1.  Make sure (1) that you have proper permissions (on unix you would have needed to run orainstRoot.sh as root user, on windows you need write access to registry and ability to install to %ProgramFiles% directory), (2) that, on unix, you did not run installer as root user.  You can attempt to run OUI yourself with command line "I:\apps\plumtree\uninstall\ptportal\10.3.0\register/../../../oui/cd/Disk1/install/setup.exe" -ignoreSysprereqs -attachHome "ORACLE_HOME=I:\apps\plumtree" ORACLE_HOME_NAME=OraWCIntgHome1.   If that succeeds, then you can run the installer again.


Okay, so I ran the command it suggested in the command line, and it again failed, but this time it left open the Oracle Universal Installer console with this message in it:

Starting Oracle Univeral Installer...
Checking swap space: 0MB available, 500 MB required.


Ahh, so I dug into the virtual memory settings, and I found on one machine that the C:\ drive had no virtual memory assigned, and then a secondary drive had virtual memory set to "system managed size." On the other machine, the C:\ drive had vritual memory assigned, but it used "system managed size."

On my fourth server, I tried setting specific memory settings on the C:\ drive, and that worked. On my fifth server, I tried leaving "system managed size" on the C:\ drive but specific virtual memory size on a secondary drive.  Both of those worked fine.

So the trick seems to be simply, set virtual memory specifically. To do so:

  • WindowsKey-Break to open the System Properties Window
  • Go to the Advanced tab
  • Open the Performance settings
  • Go to the Advanced tab
  • In the Virtual Memory area, select Change
  • Specify "Custom Size" and enter intitial of 2046 and max of 4092
  • Click Set, then OK, then acknowledge you need to restart, then close apps, restart, and run the installer.
Properly set memory could look something like this:

10gr3-virtual-memory.jpg






























Good luck with your 10gR3 installs!

----

Added Nov 14: Joel asked where to find the installer. Follow these steps:

  1. Log in through http://edelivery.oracle.com/
  2. Search "Oracle Fusion Middleware" and "Win32"
  3. Click into Oracle® Application Server 10g Release 3 (10.1.3) Media, Pack for Microsoft Windows (32-bit)
  4. Search the results page for "Interaction" to find the WCI and related products



Find recent content on the main index or look in the archives to find all content.