Recover a legacy #Glashfish Application Server from a disk full situation #Java LOG003: Read failure. Log exception at point 14.

By | July 14, 2021

There is this case when disk is full on a server instance running Glashfish Application Server. This sadly has an adverse effect due to some Glashfish bugs that were never fixed.

When disk is full several things get corrupted because Glashfish by default uses the disk to store some cached information.

First sign you have an issue due to disk full is when suddenly system stops. After a restart usually triggered automatically by some form of cluster or system orchestrator you end up with a stuck application server with the elusive error:

LOG003: Read failure. Log exception at point 14.

See bellow the complete error stack trace.


[#|2021-07-12T12:30:22.900+0300|SEVERE|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.CosTransactions|_ThreadID=148;_ThreadName=Thread-2;|JTS5022: Unexpected exception [{0}] from log.
com.sun.jts.CosTransactions.LogException: LOG003: Read failure. Log exception at point 14.
at com.sun.jts.CosTransactions.LogControl.openFile(LogControl.java:579)
at com.sun.jts.CosTransactions.Log.open(Log.java:223)
at com.sun.jts.CosTransactions.CoordinatorLog.openLog(CoordinatorLog.java:1163)
at com.sun.jts.CosTransactions.CoordinatorLog.formatLogRecords(CoordinatorLog.java:1042)
at com.sun.jts.CosTransactions.CoordinatorLog.write(CoordinatorLog.java:536)
at com.sun.jts.CosTransactions.TransactionState.setState(TransactionState.java:779)
at com.sun.jts.CosTransactions.TopCoordinator.rollback(TopCoordinator.java:2313)
at com.sun.jts.CosTransactions.CoordinatorTerm.rollback(CoordinatorTerm.java:561)
at com.sun.jts.CosTransactions.TerminatorImpl.rollback(TerminatorImpl.java:305)
at com.sun.jts.CosTransactions.CurrentImpl.rollback(CurrentImpl.java:776)
at com.sun.jts.jta.TransactionManagerImpl.rollback(TransactionManagerImpl.java:368)
at com.sun.enterprise.transaction.jts.JavaEETransactionManagerJTSDelegate.rollbackDistributedTransaction(JavaEETransactionManagerJTSDelegate.java:218)
at com.sun.enterprise.transaction.JavaEETransactionManagerSimplified.rollback(JavaEETransactionManagerSimplified.java:899)
at com.sun.ejb.containers.BaseContainer.completeNewTx(BaseContainer.java:5111)
at com.sun.ejb.containers.BaseContainer.postInvokeTx(BaseContainer.java:4901)
at com.sun.ejb.containers.MessageBeanContainer.afterMessageDeliveryInternal(MessageBeanContainer.java:1211)
at com.sun.ejb.containers.MessageBeanContainer.afterMessageDelivery(MessageBeanContainer.java:1186)
at com.sun.ejb.containers.MessageBeanListenerImpl.afterMessageDelivery(MessageBeanListenerImpl.java:86)
at com.sun.enterprise.connectors.inbound.MessageEndpointInvocationHandler.invoke(MessageEndpointInvocationHandler.java:143)
at com.sun.proxy.$Proxy412.afterDelivery(Unknown Source)
at com.sun.messaging.jms.ra.OnMessageRunner.run(OnMessageRunner.java:328)
at com.sun.enterprise.connectors.work.OneWork.doWork(OneWork.java:114)
at com.sun.corba.ee.impl.orbutil.threadpool.ThreadPoolImpl$WorkerThread.performWork(ThreadPoolImpl.java:497)
at com.sun.corba.ee.impl.orbutil.threadpool.ThreadPoolImpl$WorkerThread.run(ThreadPoolImpl.java:540)
|#]

In general to fix this you have to deal with two different issues that are caused by the disk full situation.

Glashfish disk transaction cache is corrupted.

By default in Glashfish the transaction service writes transactional activity into transaction logs so that transactions can be recovered. Because the disk is full, when that event occurs the application server will write some incomplete data on disk.

After the disk full is solved and a restartof the application server is performed, Glashfish will try to recover the transactions and will fail.

Solution is to clear the he transaction recovery cache from disk by deleting all files from under the configured “tx-log-dir” where transaction service dumps its transaction logs.

rm -fr …/glashfish/domains/mydomain/logs/server/tx

To avoid this in the future if your application does not depend on application server transaction recovery mechanism (you have your own internal implemented processing recovery features) turn off transaction logging by setting the disable-distributed-transaction-logging property to true and the automatic-recovery attribute to false.

see details here

Another alternative described in the link above is to store the transaction log in the database.

MQ transactions could not be committed

The embedded IMQ messaging service will also have issues as a result of the disk full.

If a broker fails, it is possible that a distributed transaction could be left in the PREPARED state without ever having been committed.

Until such a transaction is committed, its messages will not be delivered and its acknowledgements will not be processed.

Hence, as an administrator, you might need to monitor such transactions and commit them or roll them back manually.

The signs that you have this issue are very weird.

You may end up with messages in some IMQ destinations that are never consumed by a consumer.

To spot this issue just list the destinations after you solved the transaction log issue and you restarted the application server.

./imqcmd -b localhost:7012 list dst -u admin

List the transactions:

./imqcmd -b localhost:7012 list txn

Username: admin
Password: 
Listing all the transactions on the broker specified by:

-------------------------
Host         Primary Port
-------------------------
localhost    7012

Transactions that are owned by this broker
-------------------------------------------------------------------------------
Transaction ID        State      User name   # Msgs/# Acks   Creation time
-------------------------------------------------------------------------------
4104602709204656385   PREPARED   guest       0/1             7/3/21 7:57:08 AM
4104602709204673024   PREPARED   guest       0/1             7/3/21 7:57:08 AM
4104602709204673025   PREPARED   guest       0/1             7/3/21 7:57:08 AM
8341645759708816640   STARTED    guest       0/0             7/13/21 1:15:48 PM
8341645759713853440   STARTED    admin       0/0             7/13/21 1:16:08 PM
8341645759714126336   STARTED    guest       0/0             7/13/21 1:16:09 PM
8341645759714232576   STARTED    guest       0/0             7/13/21 1:16:09 PM
8341645759714246656   STARTED    guest       0/0             7/13/21 1:16:09 PM
8341645759715785216   STARTED    guest       0/0             7/13/21 1:16:15 PM
8341645759715926016   STARTED    guest       0/0             7/13/21 1:16:16 PM
8341645759726441984   STARTED    guest       0/0             7/13/21 1:16:57 PM
8341645759726698240   STARTED    guest       0/0             7/13/21 1:16:58 PM

   Transactions that involve remote brokers
   --------------------------------------------------
   Transaction ID   State   User name   Creation time
   --------------------------------------------------

Transactions that are owned by a remote broker
-----------------------------------------------
Transaction ID   State   # Acks   Remote broker
-----------------------------------------------

Successfully listed transactions.

Force commit or by hand each transaction in PREPARED state.

./imqcmd -b localhost:7012 commit txn -n 4104602709204656385
Username: admin
Password: 
Committing the transaction where:

-------------------
Transaction ID
-------------------
4104602709204656385

On the broker specified by:

-------------------------
Host         Primary Port
-------------------------
localhost    7012

Are you sure you want to commit this transaction? (y/n)[n] y

Successfully committed the transaction.

Do this commit or rollback for all the transactions in PREPARED state.

./imqcmd -b localhost:7012 rollback txn -n 4104602709204656385
...

After all is done list again the transactions and check there are no entries in PREPARED state.

Also check again if all the messages from available destinations (queues) were consumed.

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.