Sorry, this is a long message, but it’s a difficult issue to explain.
—
We encountered a problem with Groupwise a few days ago that was very worrying.
Various users started reporting that some of their emails had gone ‘blank’. That is to say that the message body was blank and the attachments had gone. The message header was unaffected. Attempting to view the properties of one of these blank emails gave an error – D107.
There didn’t seem to be any connection between the users that were hit, nor which messages they lost. By running and re-running GWCheck we were able to determine that more emails were being ‘hit’ as time went on.
We shut down the Groupwise server and restarted it, which seemed to halt further damage.
Further investigation turned up the fact that all the damaged emails were related to our Communications Log software and as such have a User-Defined-Field against the message.
By using GWCheck with Fix switched on we were able to change the damaged emails into Posted Items and then delete them and restore from the Groupwise backups.
The nature of this problem mimics a virus in that it was propagating across accounts (which are secure and require passwords or Netware authentication). However, the fact that only certain emails were hit makes us think it was a development-caused error. We’ve also never had a Groupwise virus before!
—
To explain some background, and the only possible cause we can think of, here’s a breakdown of what I am developing.
I have been developing a ‘Communications Log’ to help track our incoming and outgoing communications. The system utilises Formativ to watch for events in Groupwise and a VB application to handle the actual logging process (writing to a MySQL server).
To track the emails I add a field to them, called CommsLog. This field holds a unique reference so I can relate the email to the SQL database.
Obviously to use a Field I have to add a Field Definition.
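For anyone not familiar with the idea, here is a rough sketch of the linking scheme in Python. This is not the real Formativ or VB code: the dict standing in for an email, the in-memory SQLite database standing in for MySQL, the table layout and the UUID-style reference format are all assumptions made purely for illustration.

```python
import sqlite3
import uuid


def new_comms_ref():
    # Hypothetical reference format; the real CommsLog value may look different.
    return uuid.uuid4().hex


# In-memory SQLite stands in for the MySQL server (assumption).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comms_log (ref TEXT PRIMARY KEY, subject TEXT)")

# A plain dict stands in for an email and its user-defined fields (assumption).
message = {"subject": "Quote request", "fields": {}}

# Stamp the email with the reference and log the same reference in the database.
ref = new_comms_ref()
message["fields"]["CommsLog"] = ref
db.execute(
    "INSERT INTO comms_log (ref, subject) VALUES (?, ?)",
    (ref, message["subject"]),
)

# Later: go from the email back to its log row via the field value.
row = db.execute(
    "SELECT subject FROM comms_log WHERE ref = ?",
    (message["fields"]["CommsLog"],),
).fetchone()
```

The point is only that the field value is the single link between the email and its log row, which is why the field matters to the logging system even though the email itself does not need it.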
On the day when the problem occurred I ran a test against one of the larger mailboxes to ensure a search for a given Field wouldn’t be adversely affected by the mailbox size. To do this I created the Field Definition ‘CommsLog’, ran the search and then deleted the Field Definition. This was all done using Formativ.
With hindsight, I shouldn’t have deleted the Field Definition, as there were emails in the mailbox holding the field. That said, losing the field information itself wouldn’t have mattered.
The Field Definition’s deletion appears to have caused a cascade across various accounts and emails. My best guess on what happened is this …
Deleting the FieldDef from the current Account caused the system to hunt down emails in the mailbox that were using the field. On finding an email the Field was deleted (damaging the email?). Then any internal recipients (To, CC, BC, etc.) were identified and the deletion request passed to their mailbox(es). The process then went on from there, hunting down emails and following the links to other copies in other mailboxes.
This would explain why the process appeared to be still ‘running’ long after the original Applet had completed.
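To make that guess concrete, here is a toy model of the suspected cascade in Python. It is only a simulation of my theory, not Groupwise or Formativ code, and I have no evidence that the real system works this way. The mailbox layout, the `simulate_cascade` function and all the names in the sample data are made up for illustration.

```python
from collections import deque


def simulate_cascade(mailboxes, start_box, field_name):
    """Model the guessed behaviour: strip the field from every email in a
    mailbox, then follow each affected email's recipients to their mailboxes.

    mailboxes: {box: {message_id: {"fields": set, "recipients": [box, ...]}}}
    Returns the (box, message_id) pairs that were hit, in order.
    """
    damaged = []
    visited = set()
    queue = deque([start_box])
    while queue:
        box = queue.popleft()
        if box in visited:
            continue
        visited.add(box)
        for msg_id, msg in mailboxes[box].items():
            if field_name in msg["fields"]:
                msg["fields"].discard(field_name)  # the 'deletion' step
                damaged.append((box, msg_id))
                # Pass the deletion request on to each recipient's mailbox.
                for recipient in msg["recipients"]:
                    if recipient not in visited:
                        queue.append(recipient)
    return damaged


# Hypothetical sample data: four mailboxes, some sharing copies of an email.
mailboxes = {
    "simon": {
        "m1": {"fields": {"CommsLog"}, "recipients": ["anne"]},
        "m2": {"fields": set(), "recipients": ["bob"]},
    },
    "anne": {
        "m1": {"fields": {"CommsLog"}, "recipients": ["carl"]},
        "m3": {"fields": {"CommsLog"}, "recipients": ["carl"]},
    },
    "bob": {"m4": {"fields": {"CommsLog"}, "recipients": []}},
    "carl": {"m3": {"fields": {"CommsLog"}, "recipients": []}},
}

damaged = simulate_cascade(mailboxes, "simon", "CommsLog")
```

In this model the damage reaches "carl", who never exchanged a logged email with the starting mailbox, while "bob" is untouched because the only email linking to him carries no field. That would fit what we saw: only CommsLog emails were hit, and there was no obvious connection between the affected users.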
Our attempts to recreate this problem on our Test Groupwise database have failed – deleting a Field Definition that is in use just causes any emails that had that Field to lose the Field and the data it contained.
This is what I would have expected, although to be honest I’d prefer the FieldDef delete method to fail if the Field is in use, unless a ‘Force’ parameter is passed.
That said, our test PO is tiny and is never ‘busy’, so maybe the size/throughput of our main PO comes into it (roughly 100 users and 90-odd GB).
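The sort of safeguard I have in mind would look something like the Python sketch below. To be clear, this is not how the Groupwise API behaves and none of these names exist in it: `delete_field_definition`, `FieldInUseError` and the `force` argument are all hypothetical, and emails are modelled as plain dicts.

```python
class FieldInUseError(Exception):
    """Raised when a field definition is still referenced by emails."""


def delete_field_definition(definitions, messages, name, force=False):
    """Remove a field definition, refusing if any email still uses it
    unless force=True. Returns the number of emails that lost the field."""
    in_use = [m for m in messages if name in m["fields"]]
    if in_use and not force:
        raise FieldInUseError(
            f"{name!r} is still set on {len(in_use)} message(s)"
        )
    # Forced (or unused) delete: strip the field data, then the definition.
    for m in in_use:
        del m["fields"][name]
    definitions.discard(name)
    return len(in_use)


# Hypothetical sample data.
definitions = {"CommsLog"}
messages = [{"fields": {"CommsLog": "abc123"}}, {"fields": {}}]

# A plain delete is refused while the field is in use...
try:
    delete_field_definition(definitions, messages, "CommsLog")
    refused = False
except FieldInUseError:
    refused = True

# ...and only an explicit force goes ahead.
removed = delete_field_definition(definitions, messages, "CommsLog", force=True)
```

With a check like this, my test run would have stopped at the delete step and told me the field was still in use, instead of silently going ahead.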
Although I was using Formativ to access the Groupwise API, I really can’t say which of the two may have caused the problem, if either. I can’t ‘try’ again with the main PO.
I’m not even sure the FieldDef deletion caused the problem at all. All I am sure of is that I need to know how to stop this happening again.
I will post a similar message in the Novell discussion forums. So far, searching the various forums, TIDs and the internet in general hasn’t turned up a similar issue.
If this problem has occurred before and anyone has a better suggestion as to what may have caused it I am very open to suggestions.
Thanks
Simon