Showing posts with label wildfly. Show all posts

Thursday, April 21, 2022

Narayana on the Cloud - Part 1

In the last few months, I have been working on how distributed transactions are recovered in WildFly when this Application Server (AS) is deployed in Kubernetes. This blog post is a reflection on how Narayana performs on the cloud and the features it is still missing for it to evolve into a native cloud transaction suite.

Some (very brief) context

Narayana started its journey more than 30 years ago! ArjunaCore was developed in the late 1980s. Even though the concept of cloud computing was introduced by John McCarthy in 1961 [1][2], at the time of ArjunaCore’s development it was still considered only a theoretical possibility. However, in the past two decades, the adoption of cloud computing has grown enormously, dramatically changing the world of technology. As a consequence, Narayana (and its ArjunaCore) needs to step up its game to become a cloud-native transaction suite that can be used in different cloud environments. This is an ongoing conversation that the Narayana team started a long time ago (for a detailed summary of Narayana's Cloud Strategy see [3]).

Narayana was introduced to the cloud through WildFly (note 1) on Kubernetes (K8s). In my recent experience, I worked on WildFly and its K8s operator [4] and I think that the integration between Narayana and WildFly works very smoothly on K8s [5]. On the other hand, when the pod hosting WildFly needs to scale down, the ephemeral nature of K8s does not get along with Narayana very well. In fact, ArjunaCore/Narayana needs stable ground to perform its magic (with or without WildFly). In particular, Narayana needs to have:

  • A stable and durable Object Store where objects’ states are held
  • A stable node identifier to uniquely mark transactions (which are initialised by the Transaction Manager (TM) with the same node identifier) and ensure that the Recovery Manager will only recover those transactions
  • A stable communication channel to allow participants of transactions to communicate with the TM

In all points above, “stable” indicates the ability to survive whatever happens to the host where Narayana is running (e.g., crashes). On the other hand, K8s is an ephemeral environment where pods do not need stable storage and/or particular configurations that survive multiple reboots. To overcome this “incompatibility”, K8s provides StatefulSet [6], through which applications can rely on a stable environment. In relation to Narayana in particular, the use of StatefulSet and the addition of a transaction recovery module to the WildFly K8s Operator [7] enable this AS to fully support transactions on K8s. Unfortunately, this solution is tailor-made for K8s and cannot be easily ported to other cloud environments. Our target, though, is to evolve Narayana into a cloud transaction suite, which means that Narayana should also support other cloud computing infrastructures.

Our take on this

The Narayana team thoroughly discussed the above limitations that prevent Narayana from becoming a native cloud application. A brief summary is presented here:

  • A stable and durable Object Store where objects’ states are held
    Narayana is able to use different kinds of object stores; in particular, it is possible to use a (SQL) database to create the object store [8]. RDBMS databases are widely available on cloud environments: these solutions already cover our stability needs, providing reliable storage that supports replication and is able to scale up on demand. Moreover, using a “centralised” RDBMS database would ease the management of multiple Narayana instances, which can be connected to the same database. This might also become very useful in the future when it comes to evolving Narayana to work with multiple instances behind a load balancer (i.e. in case of replication).
     
  • A stable communication channel to allow participants of transactions to communicate with the TM
    Most cloud providers (and platforms) already offer two options to tackle this problem: a stable IP address and a DNS. Although both methods still need some tweaking for each cloud provider, these solutions should provide a stable endpoint to communicate with Narayana’s TM over multiple reboots
     
  • A stable node identifier to uniquely mark transactions (which are initialised by the Transaction Manager (TM) with the same node identifier) and ensure that the Recovery Manager will only recover those transactions
    This is the actual sticking point this blog post is about. Although it seems straightforward to assign a unique node identifier to the TM, it is indeed the first real logic challenge to solve on the path to turning Narayana into a cloud transaction manager.

We discussed different possible solutions to this last point but we are still trying to figure out how to address this issue. The main problem is that Narayana needs stable storage to save the node identifier and reload it after a reboot. As already said, cloud environments do not provide this option very easily as their ephemeral nature is more inclined to a stateless approach. Our first idea to solve this problem was, “why do we not store the node identifier in the object store? Narayana still needs a stable object store (and this constraint cannot be dropped) and RDBMS databases on the cloud already provide a base to start from”. However, the node identifier is a property of the transaction manager that gets initialised when Narayana/ArjunaCore starts (together with all the other properties). As a consequence, it is not possible to save the node identifier in the object store, as the preferences for the object store are also loaded during the same initialisation process! In other words, if the node identifier is stored in the object store, how can Narayana/ArjunaCore know where the object store is without loading all properties? Which came first: the chicken or the egg? Introducing an order in which properties are loaded might help in this regard (i.e. we force the egg to exist before the chicken). Even so, there is still a problem: what happens if the object store is shared between different instances of Narayana/ArjunaCore? For example, it is quite likely that a Narayana administrator configures multiple Narayana instances to create their object stores in the same database. In this case, every Narayana instance would need a unique identifier to tell which node identifier in the object store is its own. Recursive problems are fun :-)

Even if we solve all these problems, the assignment of the node identifier should not be possible from outside Narayana (e.g. using system properties): it should become an exclusive (internal) operation of Narayana. Fortunately, this is easier than solving our previous “chicken and egg” problem, as there are solutions to generate an (almost) unique distributed identifier locally [9]. As things stand, we should find an alternative solution to port the node identifier to the cloud.
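
To make the idea of a locally generated, (almost) unique identifier more concrete, here is a minimal sketch. It is not Narayana code and not the approach discussed in [9]: the class NodeIdSketch and its method are hypothetical, and the choice of a random UUID encoded as URL-safe Base64 is only an assumption made to keep the identifier short. It also does not solve the persistence problem described above; it only shows that generating the value is the easy part.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

/** Hypothetical helper, not part of Narayana. */
final class NodeIdSketch {

    /** Encodes a random 128-bit UUID as a 22-character URL-safe string. */
    static String newNodeIdentifier() {
        UUID uuid = UUID.randomUUID();
        ByteBuffer buffer = ByteBuffer.allocate(16);
        buffer.putLong(uuid.getMostSignificantBits());
        buffer.putLong(uuid.getLeastSignificantBits());
        // 16 bytes encode to 22 Base64 characters once padding is dropped
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buffer.array());
    }

    public static void main(String[] args) {
        System.out.println(newNodeIdentifier());
    }
}
```

Whether an identifier of this shape and length is acceptable depends on the limits of the Narayana version in use, so treat the encoding as a placeholder.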

Looking at this problem from a different point of view, I wonder if there are more recent solutions to replace and/or remove the node identifier from Narayana. With this in mind, the first question I ask myself is “Why do we need a node identifier?”. Under the hood, Narayana uses a recovery manager to try to recover transactions that have not completed their lifecycle. This comes with a caveat though: it is essential that two different recovery managers do not try to recover the same in-doubt transaction at the same time. That is where the node identifier comes in handy! In fact, thanks to the unique node identifier (that gets embedded in every global transaction identifier), the recovery manager can recognise if it is responsible for the recovery of an in-doubt transaction stored in a remote resource (note 2). This concept is best illustrated by an example. Let’s consider two different Narayana instances that initiate two different transactions that enlist the same resource. In this scenario, both transaction managers store a record in the shared resource. Let’s assume that the first Narayana instance starts its transaction before the second instance. While the first transaction gets to the point where it has sent prepare() to its enlisted resources, it is possible that the recovery manager of the second Narayana instance queries the shared resource for in-doubt records. If Narayana’s recovery manager were not forced to recover only transactions initiated by the same Narayana instance’s TM, this hypothetical scenario would end with an error: the recovery manager of the second Narayana instance would roll back the transaction initiated by the first Narayana instance, assuming that it was one of its own in-doubt transactions!
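
The filtering described above can be sketched as follows. This is a toy model rather than Narayana's recovery code: RecoveryFilterSketch, InDoubtTx and the string identifiers are all hypothetical, and the node identifier is kept as a separate field instead of being embedded in a real global transaction identifier.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Toy model of node-identifier filtering; all names are hypothetical. */
final class RecoveryFilterSketch {

    /** Stand-in for an in-doubt record reported by a shared resource. */
    static final class InDoubtTx {
        final String nodeIdentifier;
        final String transactionId;

        InDoubtTx(String nodeIdentifier, String transactionId) {
            this.nodeIdentifier = nodeIdentifier;
            this.transactionId = transactionId;
        }
    }

    /** Keeps only the records created under a node identifier this recovery manager owns. */
    static List<InDoubtTx> recoverable(List<InDoubtTx> inDoubt, Set<String> ownedNodeIdentifiers) {
        return inDoubt.stream()
                .filter(tx -> ownedNodeIdentifiers.contains(tx.nodeIdentifier))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<InDoubtTx> shared = List.of(
                new InDoubtTx("node-1", "tx-a"),
                new InDoubtTx("node-2", "tx-b"));
        // The recovery manager of node-2 must leave tx-a alone
        for (InDoubtTx tx : recoverable(shared, Set.of("node-2"))) {
            System.out.println(tx.transactionId);
        }
    }
}
```

Passing a set rather than a single identifier mirrors the case in note 2, where one recovery manager is made responsible for several node identifiers.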

Cloud environments are encouraging (all of) us to come up with an innovative solution to reduce the footprint of Narayana/ArjunaCore. In particular, the node identifier is the challenge we are currently facing and the first real step to push Narayana onto the cloud. I will share any updates the Narayana team comes up with…and in the meantime, feel free to reach out to the team through our public channels (for example Gitter or our Google group narayana-users) to propose your ideas or discuss with us your take on this fundamental issue.

Notes

  1. WildFly supports transactions thanks to the integration with Narayana
  2. It is possible to tell the Recovery Manager that it will be responsible for the recovery of in-doubt transactions initiated by different transaction managers (which are identified with different node identifiers). The only caveat here is that two Recovery Managers should not recover the same in-doubt transaction at the same time. To assign the responsibility of multiple node identifiers to the same Recovery Manager, the property xaRecoveryNodes [10] in Narayana’s JTAEnvironmentBean should be used.

Bibliography

[1] J. Surbiryala and C. Rong, "Cloud Computing: History and Overview," 2019 IEEE Cloud Summit, 2019, pp. 1-7, doi: 10.1109/CloudSummit47114.2019.00007.

[2] S. L. Garfinkel and H. Abelson, "Architects of the Information Society: 35 Years of the Laboratory for Computer Science at MIT," 1999.

[3] https://jbossts.blogspot.com/2022/03/narayana-community-priorities.html

[4] https://github.com/wildfly/wildfly-operator

[5] https://issues.redhat.com/browse/EAP7-1394

[6] https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

[7] https://github.com/wildfly/wildfly-operator/

[8] https://www.narayana.io/docs/project/index.html#d0e459

[9] https://groups.google.com/g/narayana-users/c/ttSff9HvXdA

[10] https://www.narayana.io//docs/product/index.html#d0e1032

Friday, October 19, 2018

Narayana integration with Agroal connection pool

Project Agroal defines itself as “The natural database connection pool”. And that’s what it is.

It was developed by Luis Barreiro, who works for WildFly as a performance engineer. This tells you what you can expect: a well performing database connection pool. As Agroal comes from the portfolio of the WildFly projects, it offers smooth integration with WildFly and with Narayana too.

In the previous posts we checked other connection pools that you can use with Narayana: either the transactional driver provided by Narayana or DBCP2, which is nicely integrated for use with Narayana in Apache Tomcat. Another option is IronJacamar, which has a long-standing relationship with Narayana. All those options are nicely documented in our quickstarts.

Agroal joins that party and it is worth checking out, whether you run a standalone application with Narayana or you run on WildFly. Let’s first take a look at how you can use it in a standalone application.

Agroal with Narayana standalone

In case you want to use the Agroal JDBC pooling capabilities with Narayana in your application you need to configure the Agroal datasource to know

  • how to grab the instance of the Narayana transaction manager
  • where to find the synchronization registry
  • how to register resources to Narayana recovery manager

Narayana setup

First we need to obtain all the Narayana objects mentioned above. They are then passed to Agroal, which ensures the integration by calling the Narayana API at the appropriate moments.

import javax.transaction.TransactionManager;
import javax.transaction.TransactionSynchronizationRegistry;
import com.arjuna.ats.arjuna.recovery.RecoveryManager;
import com.arjuna.ats.jbossatx.jta.RecoveryManagerService;

// obtaining the transaction manager and synchronization registry
TransactionManager transactionManager
    = com.arjuna.ats.jta.TransactionManager.transactionManager();
TransactionSynchronizationRegistry transactionSynchronizationRegistry
    = new com.arjuna.ats.internal.jta.transaction.arjunacore.TransactionSynchronizationRegistryImple();

// initialization of the recovery manager
RecoveryManager recoveryManager
    = com.arjuna.ats.arjuna.recovery.RecoveryManager.manager();
recoveryManager.initialize();
// the recovery service provides the binding for hooking the XAResource into the recovery process
RecoveryManagerService recoveryManagerService
    = new com.arjuna.ats.jbossatx.jta.RecoveryManagerService();
recoveryManagerService.create();

Agroal integration

Now we need to pass the Narayana's object instances to Agroal. With that being done we can obtain a JDBC Connection which is backed by the transaction manager.


AgroalDataSourceConfigurationSupplier configurationSupplier
    = new AgroalDataSourceConfigurationSupplier()
  .connectionPoolConfiguration(cp -> cp
    .transactionIntegration(new NarayanaTransactionIntegration(
        transactionManager, transactionSynchronizationRegistry,
        "java:/agroalds1", false, recoveryManagerService))
      .connectionFactoryConfiguration(cf -> cf
        .jdbcUrl("jdbc:h2:mem:test")
        .principal(new NamePrincipal("testuser"))
        .credential(new SimplePassword("testpass"))
        .recoveryPrincipal(new NamePrincipal("testuser"))
        .recoveryCredential(new SimplePassword("testpass"))
        .connectionProviderClassName("org.h2.jdbcx.JdbcDataSource"))
      .maxSize(10)
    );
AgroalDataSource ds1 = AgroalDataSource.from(configurationSupplier);

transactionManager.begin();

java.sql.Connection conn1 = ds1.getConnection();
...

Those are the steps needed for a standalone application to use Narayana and Agroal. A working code example can be seen in the Narayana quickstart at github.com/jbosstm/quickstart#agroal - AgroalDatasource.java

Agroal XA datasource in WildFly

If you want to use the power of Narayana in a WildFly application you need XA participants that Narayana can drive. From the Agroal perspective you need to define an XA datasource which you use (linked via its JNDI name) in your application.

DISCLAIMER: to use the Agroal capabilities integrated with Narayana you need to run WildFly 15 or later. At the time of writing only WildFly 14 is available, so to test this you need to build WildFly from sources yourself. The good news is that it is an easy task; see https://github.com/wildfly/wildfly/#building.

The Agroal datasource subsystem is not available by default in the standalone.xml file, so you need to enable the extension. With the jboss cli you do it like this

cd $JBOSS_HOME
./bin/jboss-cli.sh -c

#  jboss-cli is started, run following command there
/extension=org.wildfly.extension.datasources-agroal:add
/subsystem=datasources-agroal:add()
:reload

From now on you can work with the datasources-agroal subsystem. To create the xa-datasource definition you need a driver which the datasource will use. The driver has to define its XA connection provider.

NOTE: if you want to check what the options for the Agroal configuration are in the jboss cli, read the resource description with the command /subsystem=datasources-agroal:read-resource-description(recursive=true)

The Agroal driver definition works only with drivers deployed as modules. You can’t just copy the driver jar to the $JBOSS_HOME/standalone/deployments directory; you need to create a module under the $JBOSS_HOME/modules directory. You can either write the module.xml yourself or, as the recommended way, use the jboss cli command

module add --name=org.postgresql \
    --resources=/path/to/jdbc/driver.jar --dependencies=javax.api,javax.transaction.api

NOTE: The command uses the name of the module org.postgresql as I will demonstrate adding the xa datasource for the PostgreSQL database.

Once the module is added we can declare the Agroal driver.

/subsystem=datasources-agroal/driver=postgres:add(
    module=org.postgresql, class=org.postgresql.xa.PGXADataSource)

We’ve used the class org.postgresql.xa.PGXADataSource as we want to use it as an XA datasource. When the class is not defined, the standard JDBC driver for PostgreSQL is used (org.postgresql.Driver), as declared in the META-INF/services/java.sql.Driver file.

NOTE: If you declare the driver without specifying the XA datasource class and then try to use it in an XA datasource definition, you will get an error

/subsystem=datasources-agroal/driver=non-xa-postgres:add(module=org.postgresql)
/subsystem=datasources-agroal/xa-datasource=AgroalPostgresql:add(
    connection-factory={driver=non-xa-postgres},...)
{
    "outcome" => "failed",
    "failure-description" => {"WFLYCTL0080: Failed services" => {"org.wildfly.data-source.AgroalPostgresql"
        => "WFLYAG0108: An xa-datasource requires a javax.sql.XADataSource as connection provider. Fix the connection-provider for the driver"}
},
    "rolled-back" => true
}

When the JDBC driver module is defined we can create the Agroal XA datasource. The bare minimum of attributes you have to define is shown in the following command

/subsystem=datasources-agroal/xa-datasource=AgroalPostgresql:add(
    jndi-name=java:/AgroalPostgresql, connection-pool={max-size=10}, connection-factory={
    driver=postgres, username=test, password=test,url=jdbc:postgresql://localhost:5432/test})

NOTE: this is the simplest way to define the credentials for the database connection. If you are looking for a more sophisticated method than a username and password saved as clear-text strings in standalone.xml, take a look at the Elytron capabilities.

To check whether the WildFly Agroal datasource is able to connect to the database you can use the test-connection command

/subsystem=datasources-agroal/xa-datasource=AgroalPostgresql:test-connection()

If you are interested in how the configuration looks as an xml element in the standalone.xml configuration file, the Agroal subsystem with the PostgreSQL XA datasource definition would look like

<subsystem xmlns="urn:jboss:domain:datasources-agroal:1.0">
    <xa-datasource name="AgroalPostgresql" jndi-name="java:/AgroalPostgresql">
        <connection-factory driver="postgres" url="jdbc:postgresql://localhost:5432/test"
            username="test" password="test"/>
        <connection-pool max-size="10"/>
    </xa-datasource>
    <drivers>
        <driver name="postgres" module="org.postgresql" class="org.postgresql.xa.PGXADataSource"/>
    </drivers>
</subsystem>

If you want to use an Agroal non-XA datasource as a commit markable resource (CMR), that is possible too. You need to create a standard datasource and define it as connectable. For more information on what a commit markable resource is and how it works, check our previous blogpost about CMR.

<subsystem xmlns="urn:jboss:domain:datasources-agroal:1.0">
    <datasource name="AgroalPostgresql" connectable="true" jndi-name="java:/AgroalPostgresql"
            statistics-enabled="true">
        <connection-factory driver="postgres" url="jdbc:postgresql://localhost:5432/test"
            username="test" password="test"/>
        <connection-pool max-size="10"/>
    </datasource>
    <drivers>
        <driver name="postgres" module="org.postgresql" class="org.postgresql.Driver"/>
    </drivers>
</subsystem>

NOTE: In addition to this configuration of Agroal datasource you need to enable the CMR in the transaction subsystem too – check the blogpost for detailed info.

Summary

This blogpost showed how to configure the Agroal JDBC pooling library and how to integrate it with Narayana.
The code example is part of the Narayana quickstart and you can check it at https://github.com/jbosstm/quickstart/tree/master/agroal

Thursday, June 28, 2018

Narayana Commit Markable Resource: a faultless LRCO for JDBC datasources

CMR is a neat Narayana feature enabling full XA transaction capability for one non-XA JDBC resource. This gives you a way to engage a database resource in an XA transaction even if the JDBC driver is not fully XA capable (or you just have a design restriction on it) while transaction data consistency is kept.

Last resource commit optimization (aka. LRCO)

Maybe you will say "adding one non-XA resource to a transaction is the well-known LRCO optimization". And you are right, but only partially. The last resource commit optimization (abbreviated as LRCO) provides a way to enlist one non-XA datasource in the global transaction managed by the transaction manager and to process it there. But LRCO contains a pitfall. When a crash of the system (or of the connection) happens at a particular point in time during two-phase commit processing, it causes data inconsistency. Namely, the LRCO resource could be committed while the rest of the resources are rolled back.

Let's elaborate a bit on the LRCO failure. Let's say we have a JMS resource, through which we send a message to a message broker, and a non-XA JDBC datasource, through which we save information to the database.

NOTE: The example refers to the Narayana two-phase commit implementation.

  1. updating the database with INSERT INTO SQL command, enlisting LRCO resource under the transaction
  2. sending a message to the JMS broker, enlisting the JMS resource to the transaction
  3. Narayana starts the two phase commit processing
  4. prepare is called to JMS XA resource, the transaction log is stored at the JMS broker side
  5. prepare phase for the LRCO means to call commit at the non-XA datasource. That call makes the data changes visible to the outer world.
  6. crash of the Narayana JVM occurs before Narayana can persist the commit information to its transaction log store
  7. after Narayana restarts there is no record of the existence of any transaction, thus the prepared JMS resource is rolled back during transaction recovery

Note: the rollback of the JMS resource is caused by the presumed abort strategy applied in Narayana. If a transaction manager does not apply presumed abort, then at best you end up with the transaction in a heuristic state.

LRCO processing works by ordering the LRCO resource last during the transaction manager's 2PC prepare phase. At the point where the transaction manager would normally call prepare on an XAResource, it instead calls commit on the LRCO's underlying non-XA resource.
Then, during the transaction manager's commit phase, nothing is called for the LRCO.
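
The failure window from the numbered steps can be replayed with a small toy model. This is not Narayana code: LrcoCrashSketch, Outcome and Result are hypothetical names, and the whole two-phase commit is reduced to the single question of whether the JVM survives long enough to write the transaction log.

```java
/** Toy replay of the LRCO steps; all names here are hypothetical. */
final class LrcoCrashSketch {

    enum Outcome { COMMITTED, ROLLED_BACK }

    /** Final state of the two resources from the example. */
    static final class Result {
        final Outcome database;
        final Outcome jms;

        Result(Outcome database, Outcome jms) {
            this.database = database;
            this.jms = jms;
        }
    }

    static Result run(boolean crashBeforeLogWrite) {
        // Prepare phase, LRCO step: commit is called on the non-XA datasource,
        // so the database changes are visible from this point on.
        Outcome database = Outcome.COMMITTED;

        // The commit decision reaches the transaction log only if the JVM survives.
        boolean decisionLogged = !crashBeforeLogWrite;

        // Presumed abort: a prepared XA branch without a log record is rolled back.
        Outcome jms = decisionLogged ? Outcome.COMMITTED : Outcome.ROLLED_BACK;
        return new Result(database, jms);
    }

    public static void main(String[] args) {
        Result crashed = run(true);
        System.out.println("database=" + crashed.database + ", jms=" + crashed.jms);
    }
}
```

With the crash flag set, the database ends up committed while the JMS branch is rolled back, which is exactly the inconsistency described above.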

Commit markable resource (aka. CMR)

The Commit Markable Resource, abbreviated as CMR, is an enhancement of the last resource commit optimization applicable to JDBC resources. The CMR approach achieves capabilities similar to XA by requiring a special database table (normally named xids) which the transaction manager can write to and read from via the configured CMR datasource.

Let's demonstrate the CMR behavior with an example (reusing the setup from the previous one).

  1. updating the database with INSERT INTO SQL command, enlisting the CMR resource under the transaction
  2. sending a message to the JMS broker, enlisting the JMS resource to the transaction
  3. Narayana starts the two phase commit processing
  4. prepare on CMR saves information about prepare to the xids table
  5. prepare is called to JMS XA resource, the transaction log is stored at the JMS broker side
  6. commit on CMR means calling commit on the underlying non-XA datasource
  7. commit on JMS XA resource means commit on the XA JMS resource and thus the message being visible at the queue, the proper transaction log is removed at the JMS broker side
  8. Narayana two phase commit processing ends

As you can see, the difference from the LRCO example is that the CMR resource is not ordered last in the resource processing but first. The CMR prepare does not commit the work, as it does in the case of LRCO; instead it saves to the database xids table the information that the CMR is considered prepared.
As the CMR is ordered as the first resource for processing, it is also taken first during the commit phase. The commit call then means calling commit on the underlying database connection. The xids table is not cleaned up in that phase; it is normally the responsibility of CommitMarkableResourceRecordRecoveryModule to garbage collect the records in the xids table (see more below).

The main fact to understand is that the CMR resource is considered fully prepared only after the commit is processed (meaning commit on the underlying non-XA JDBC datasource). Until that time the transaction is considered not prepared and will be rolled back by the transaction recovery.

NOTE: the term fully prepared refers to standard XA two-phase commit processing. If the transaction manager finishes the prepare phase, i.e. prepare has been called on all transaction participants, the transaction counts as prepared and commit is expected to be called on each participant.

It's important to note that the correct processing of failures in transactions which contain CMR resources is the responsibility of the special periodic recovery module CommitMarkableResourceRecordRecoveryModule. It has to be configured as the first in the recovery module list, as it needs to check and eventually process all the XA resources belonging to the transaction which contains the CMR resource (the recovery modules are processed in the order they were configured). You can check here how this is set up in WildFly.
The CMR recovery module knows about the existence of the CMR resource from the record saved in the xids table. From that record it is able to pair up all the resources belonging to the same transaction in which the CMR was involved.

xids: database table to save CMR processing data

As said, Narayana needs a special database table (usually named xids) to save the information that the CMR was prepared. You may wonder what the content of that table is.
The table consists of three columns.

  • xid : id of the transaction branch belonging to the CMR resource
  • transactionManagerID : id of the transaction manager, which serves to distinguish multiple transaction managers (WildFly servers) working with the same database. There is a strict rule that each transaction manager must be defined with a unique node identifier (see the description of the node-identifier).
  • actionuid : global transaction id which unites all the resources belonging to the one particular transaction

LRCO failure case with CMR

In the example that we presented as problematic for LRCO, the container crashed just before the prepare phase finished. In such a case, the CMR is not committed yet. The other transaction participants are then rolled back as the transaction was not fully prepared. The CMR thus brings a consistent rollback outcome for all the resources.
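
The recovery decision behind this outcome can be sketched in a few lines. This is a toy model, not the code of CommitMarkableResourceRecordRecoveryModule: CmrRecoverySketch, Decision and the in-memory set standing in for the xids table are hypothetical. It also assumes that a row in the xids table becomes visible only together with the commit of the CMR's local transaction, which is what makes the row usable as a commit marker.

```java
import java.util.Set;

/** Toy model of the CMR recovery decision; all names are hypothetical. */
final class CmrRecoverySketch {

    enum Decision { COMMIT, ROLLBACK }

    /**
     * Decides what to do with the other (XA) participants of a transaction.
     *
     * @param visibleXidsRows xids currently readable from the xids table
     *                        (an in-memory stand-in for a SELECT on that table)
     * @param cmrBranchXid    xid of the CMR branch of the transaction being recovered
     */
    static Decision decideForOtherParticipants(Set<String> visibleXidsRows, String cmrBranchXid) {
        // Assumption: the row is visible only if the CMR local transaction committed.
        return visibleXidsRows.contains(cmrBranchXid) ? Decision.COMMIT : Decision.ROLLBACK;
    }

    public static void main(String[] args) {
        // Crash before the CMR commit: no row is visible, everything is rolled back
        System.out.println(decideForOtherParticipants(Set.of(), "xid-1"));
        // Crash after the CMR commit: the row is visible, the rest is committed
        System.out.println(decideForOtherParticipants(Set.of("xid-1"), "xid-1"));
    }
}
```

Either way all participants end up with the same outcome, in contrast to the LRCO crash case.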

Commit markable resource configured in WildFly

We have sketched the principle of the CMR and now it's time to check how to configure it for your application running at the WildFly application server.
The configuration consists of three steps.

  1. The JDBC datasource needs to be marked as connectable
  2. The database that the connectable datasource points to has to be extended with the xids table where Narayana can save the data about CMR processing
  3. Transaction subsystem needs to be configured to be aware of the CMR capable resource

In our example, I use the H2 database as it's good for a showcase. You can also find it in the quickstart I prepared. Check out https://github.com/jbosstm/quickstart/tree/master/wildfly/commit-markable-resource.

Mark JDBC datasource as connectable

You mark the resource as connectable by using the attribute connectable="true" in your datasource declaration in the standalone*.xml configuration file. When you use the jboss cli for the app server configuration, use the commands

/subsystem=datasources/data-source=jdbc-cmr:write-attribute(name=connectable, value=true)
:reload

The whole datasource configuration then looks like

<datasource jndi-name="java:jboss/datasources/jdbc-cmr" pool-name="jdbc-cmr-datasource"
          enabled="true" use-java-context="true" connectable="true">
  <connection-url>jdbc:h2:mem:cmrdatasource</connection-url>
  <driver>h2</driver>
  <security>
      <user-name>sa</user-name>
      <password>sa</password>
  </security>
</datasource>

When a datasource is marked as connectable, IronJacamar (the JCA layer of WildFly) creates the datasource instance as an implementation of org.jboss.tm.ConnectableResource (defined in the jboss-transaction-spi project). This interface declares the method getConnection() throws Throwable. That's how the transaction manager is able to obtain a connection to the database and work with the xids table inside it.

Xids database table creation

The database configured as connectable has to contain the xids table before the transaction manager starts. As described above, the xids table is used to save the crucial information about the non-XA datasource during prepare. The shape of the SQL command depends on the SQL syntax of the database you use. Examples of the table creation commands follow (see more commands under this link)

-- Oracle
CREATE TABLE xids (
  xid RAW(144), transactionManagerID VARCHAR(64), actionuid RAW(28)
);
CREATE UNIQUE INDEX index_xid ON xids (xid);

-- PostgreSQL
CREATE TABLE xids (
  xid bytea, transactionManagerID varchar(64), actionuid bytea
);
CREATE UNIQUE INDEX index_xid ON xids (xid);

-- H2
CREATE TABLE xids (
  xid VARBINARY(144), transactionManagerID VARCHAR(64), actionuid VARBINARY(28)
);
CREATE UNIQUE INDEX index_xid ON xids (xid);

I addressed the need of the table definition in the CMR quickstart by adding the JPA schema generation create script which contains the SQL to initialize the database.

Transaction manager CMR configuration

The last part is to configure the CMR in the transaction subsystem. The declaration puts the datasource into the list JTAEnvironmentBean#commitMarkableResourceJNDINames, which is then used in the code of TransactionImple#createResource.
The xml element used in the transaction subsystem and the jboss cli commands look like

<commit-markable-resources>
  <commit-markable-resource jndi-name="java:jboss/datasources/jdbc-cmr"/>
</commit-markable-resources>

/subsystem=transactions/commit-markable-resource="java:jboss/datasources/jdbc-cmr":add()
:reload

CMR configuration options

In addition to this simple CMR declaration, the CMR can be configured with the following parameters:

  • jndi-name : as seen above, the jndi-name points to the datasource which we mark as CMR ready.
  • name : defines the name of the table used for storing the CMR state during prepare, which is then read during recovery.
    The default value (and we've referred to it this way above) is xids.
  • immediate-cleanup : if set to true, a synchronization is registered which removes the corresponding entry from the xids table immediately after the transaction is committed.
    When the synchronization is not set up, cleaning up the xids table is the responsibility of recovery, through the code in CommitMarkableResourceRecordRecoveryModule. It checks for finished xids and removes those which are free for garbage collection.
    The default value is false (using only recovery garbage collection).
  • batch-size : this parameter influences the garbage collection process described above. The garbage collection takes finished xids and runs a DELETE SQL command. The DELETE contains a WHERE xid in (...) clause with at most batch-size entries. If some finished xids are still left after the deletion, another SQL command is assembled, again with at most batch-size entries.
    The default value is 100.
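
To illustrate the batch-size behaviour, here is a minimal Java sketch of how a set of finished xids could be split into several DELETE statements. It is not the Narayana code: the class and method names are hypothetical, and the use of `?` placeholders is an assumption about how such a statement might be assembled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: models splitting finished xids into batched DELETE statements. */
final class XidCleanupSketch {

    /**
     * Builds one parameterised DELETE per batch of at most batchSize entries.
     * tableName corresponds to the "name" option (default "xids").
     */
    static List<String> buildDeleteStatements(String tableName, int finishedXids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> statements = new ArrayList<>();
        int remaining = finishedXids;
        while (remaining > 0) {
            int inThisBatch = Math.min(batchSize, remaining);
            // One "?" placeholder per xid in this batch, e.g. "?,?,?"
            String placeholders = String.join(",", Collections.nCopies(inThisBatch, "?"));
            statements.add("DELETE FROM " + tableName + " WHERE xid IN (" + placeholders + ")");
            remaining -= inThisBatch;
        }
        return statements;
    }
}
```

With 25 finished xids and a batch-size of 10, this sketch produces three statements holding 10, 10 and 5 entries. A smaller batch-size means more round trips to the database, while a larger one means longer IN clauses.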

The commit-markable-resource XML element configured with all the parameters looks like this:

<subsystem xmlns="urn:jboss:domain:transactions:4.0">
  <core-environment>
      <process-id>
          <uuid/>
      </process-id>
  </core-environment>
  <recovery-environment socket-binding="txn-recovery-environment" status-socket-binding="txn-status-manager"/>
  <object-store path="tx-object-store" relative-to="jboss.server.data.dir"/>
  <commit-markable-resources>
      <commit-markable-resource jndi-name="java:jboss/datasources/jdbc-cmr">
          <xid-location name="myxidstable" batch-size="10" immediate-cleanup="true"/>
      </commit-markable-resource>
  </commit-markable-resources>
</subsystem>

And the jboss cli commands for the same are

/subsystem=transactions/commit-markable-resource="java:jboss/datasources/jdbc-cmr"\
  :write-attribute(name=name, value=myxidstable)
/subsystem=transactions/commit-markable-resource="java:jboss/datasources/jdbc-cmr"\
  :write-attribute(name=immediate-cleanup, value=true)
/subsystem=transactions/commit-markable-resource="java:jboss/datasources/jdbc-cmr"\
  :write-attribute(name=batch-size, value=10)
:reload

NOTE: the JBoss EAP documentation on CMR configuration can be found in the section About the LRCO Optimization for Single-phase Commit (1PC).

Conclusion

The article explained what the Narayana Commit Markable Resource (CMR) is, compared it with LRCO and presented its advantages. The latter part of the article showed how to configure the CMR for an application deployed on the WildFly application server.
If you would like to run an application using the commit markable resource feature, check our Narayana quickstart at https://github.com/jbosstm/quickstart/tree/master/wildfly/commit-markable-resource.

Monday, October 17, 2016

Achieving Consistency in a Microservices Architecture

Microservices are loosely coupled, independently deployable services. Although a well-designed service will not directly operate on shared data, it may still need to ensure that that data will ultimately remain consistent. For example, the requirement to debit an account to pay for an online purchase creates a dependency between the customer and supplier account balances and the stock database. Historically, a distributed transaction has been used to maintain this consistency, which in turn employs some flavour of distributed locking during the data update phase. This dependency introduces tight coupling, higher latencies and greater lock contention, especially when failures occur, because the locks cannot be released until all services involved in the transaction become available again. Whilst the user of the system may be satisfied with this state of affairs, it should not be the only possible interaction pattern. A more common approach employs the notion of eventual consistency, where the data may sometimes be in an inconsistent state but will eventually come back into the desired state: in our example the stock level will be reduced, the payment processed and the item delivered.

I have, from time to time, seen blogs and articles that recognise this problem and suggest solutions, but they seem to mandate either that service calls naturally map onto a single data update or that the service writer picks one of the services to do the coordination, taking on the responsibility of ensuring that all services involved in the interaction will eventually reach their target consistent state (see for example a quote from the article Distributed Transactions: The Icebergs of Microservices: "you have to pick one of the services to be the primary handler for the event. It will handle the original event with a single commit, and then take responsibility for asynchronously communicating the secondary effects to other services"). This sounds feasible, but now you have to start thinking about how to provide reliability guarantees in the presence of failures, how to orchestrate services, and how to store extra state with every persistent update so that the activity coordinator can continue the interaction after failures have been resolved. In other words, whilst this is a workable approach, it hides much of the complexity involved in reliably recovering from system and network failures, which at scale will surely happen. A more robust design for microservice architectures is to delegate the coordination component of the workflow to a specialised service explicitly designed for this kind of task.

We have been working in this area for many years and one set of ideas and protocols that we believe are particularly suited to microservices architectures is the use of compensatable units of work to achieve eventual consistency guarantees in this kind of loosely coupled service based environment. I produced a write up of the approach and accompanying protocol for use in REST based systems back in 2009 (Compensating RESTful Transactions) based on earlier work done by Mark Little et al. Mark also wrote some interesting blogs in 2011 (When ACID is too strong and Slightly alkaline transactions if you please ...) about alternatives to ACID when various constraints are loosened and his summary is relevant to the problems facing microservice architects.

The use of compensations, coordinated by a dedicated service, will give all the benefits suggested in Graham Lea's article referred to earlier, but with the additional guarantees of consistency, reliability, manageability, reduced complexity etc in the presence of failures. The essence of the idea is that the prepare step is skipped and instead the services involved in the interaction register compensation actions with a dedicated coordinator:

  1. The client creates a coordination resource (identified via a resource url)
  2. The client makes service invocations passing the coordinator url by some (unspecified) mechanism
  3. The service registers its compensate logic with the coordinator and performs the service request as normal
  4. When the client is done it tells the coordinator to complete or cancel the interaction
    • in the complete case the coordinator has nothing to do (except clean up actions)
    • in the cancel case the coordinator initiates the undo logic. Services are not allowed to fail this step. If they are not available or cannot compensate for the activity immediately the coordinator will keep on trying until all services have compensated (and only then will it clean up)
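
The steps above can be sketched in a few lines of Java. This is only an in-memory illustration with hypothetical names, not the JDI protocol or the RTS implementation: a real coordinator would be a remote service addressed by URL and would persist its state. Running compensations newest-first is also an assumption, since the steps do not prescribe an order.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.BooleanSupplier;

/** Hypothetical in-memory coordinator; a real one would be a remote, durable service. */
final class CompensationCoordinatorSketch {

    // Outstanding compensations; newest at the head (the ordering is an assumption).
    private final Deque<BooleanSupplier> pending = new ArrayDeque<>();

    /** Step 3: a service registers its undo logic; the supplier returns true once compensated. */
    void register(BooleanSupplier compensation) {
        pending.push(compensation);
    }

    /** Step 4, complete case: nothing to undo, just clean up. */
    void complete() {
        pending.clear();
    }

    /**
     * Step 4, cancel case: one pass over the outstanding compensations.
     * Returns true when all have succeeded; otherwise the caller retries later.
     */
    boolean cancelAttempt() {
        int toTry = pending.size();
        for (int i = 0; i < toTry; i++) {
            BooleanSupplier compensation = pending.poll();
            if (!compensation.getAsBoolean()) {
                // Service unavailable or not ready: keep it for the next attempt.
                pending.addLast(compensation);
            }
        }
        return pending.isEmpty();
    }
}
```

A caller would invoke cancelAttempt() repeatedly (for example on a timer) until it returns true, which mirrors the rule that the coordinator keeps trying until every service has compensated. Compensations that already succeeded are not run again.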

We do not have an implementation of this (JDI) protocol but we do have an implementation of an ACID variant of it (called RTS) which has had extensive exposure in the field (and this can/will serve as the basis for the implementation of the JDI protocol). The documentation for RTS is available at our project web site. The nice thing about this work is that it can integrate seamlessly into Java EE environments and additionally is available as a WildFly subsystem. This latter feature means that it can be packaged as a WildFly Swarm microservice using the WildFly Swarm Project Generator. In this way if your microservices are using REST for API calls then they can make immediate use of this feature.

We also have a working prototype framework for how to do compensations in a Java SE environment. The API is available at github where we also provide a number of quickstarts showing how to use it.

Finally, we have a solution where we allow the compensation data to be stored at the same time as the data updates in a single (one phase) transaction, thus ensuring that the coordinator will have access to the compensation data. This technique works particularly well with document oriented databases such as MongoDB.

Monday, December 28, 2015

Software Transactional Memory with WildFly-Swarm

A long time ago (not in a Galaxy Far Far Away!) I wrote about the STM implementation we were adding to Narayana. Over the intervening years this implementation was added to Vert.x and even the RaspberryPi made an appearance! Now whilst the implementation has been available in WildFly for many years, we tend not to mention it because, well, it's not Java EE compliant. However, with the advent of WildFly-Swarm things may be about to change.

Sure, when you're looking at using Swarm it's likely that at least initially you'll be coming at a problem from the perspective of Java EE, but the more you look to decompose your application into constituent (micro) services the more chances there are that you'll also start to look at functionality and frameworks that aren't necessarily just about Java EE. As we've mentioned before, STM is compatible with JTA and JTS transactions as well, as long as you understand what it means to mix-and-match them. Therefore, we've added an example of STM usage within WildFly-Swarm, which hopefully will become part of the mainline Swarm examples eventually. Take a look and give us any feedback, either in the Swarm group/IRC/Twitter or the usual Narayana routes.

Tuesday, September 17, 2013

Narayana is now JTA 1.2 Compliant

I'm very proud to announce that with Narayana 5.0.0.M3 we are now JTA 1.2 compliant! What's more, with the release of JCA 1.7 in WildFly 8.0.0.Beta1 (WFLY-510), the new application server features are now available.

I'd like to say a big "thank you" to the community members who provided feedback during the design and implementation of this specification. Like most of our larger features, this was discussed over at the Narayana developer forums, giving the community the opportunity to shape the development of these features.

So what's new in JTA 1.2?

JTA 1.2 is considered a 'maintenance release', so the change list is quite small. However, that's not to say that the changes aren't exciting. Here's an overview of the three improvements:

  • @Transactional. This is the headline feature and brings the ability to place transaction annotations on any CDI managed bean. Prior to this feature, you had to make your class an EJB in order to use transactional annotations.
  • @TransactionScoped. This feature allows you to associate CDI beans with the scope of a transaction.
  • The third change clarifies when the container should call 'delistResource' on the transactional resource. This is a minor change and is of less interest to an application developer, so I'm not going to discuss it further here.

Tell me more about @Transactional and @TransactionScoped


The remainder of this post will tell you what you need to know about these new features and provide some examples, showing how they can be used.

@Transactional

This new annotation provides an alternative to javax.ejb.TransactionAttribute that can be placed on CDI managed bean classes and their methods. Prior to this feature, many developers were using EJBs just so that they could use declarative transactions. Now you can make an architectural decision about whether EJB or CDI is the right approach for your application, knowing that you will be able to use declarative transactions with either approach. The remainder of this section will show an example using @Transactional and highlight the differences between @Transactional and the EJB @TransactionAttribute.


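import javax.transaction.Transactional;

// Class-level default: every method must be called within an existing JTA transaction.
@Transactional(Transactional.TxType.MANDATORY)
public class MyCDIBean {

    // Overrides the class-level MANDATORY for this method only.
    @Transactional(Transactional.TxType.NEVER)
    public void doSomethingWithoutATransaction() throws Exception {
        // Do something that must be done outside of a transaction
    }

    // MyNonCriticalRuntimeException and TestException are application-defined exception types.
    @Transactional(
        dontRollbackOn = MyNonCriticalRuntimeException.class,
        rollbackOn = TestException.class)
    public void doSomething() throws Exception {
        // Do something that must be done inside a transaction
    }
}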

The "@Transactional(Transactional.TxType.MANDATORY)" annotation on the class states that, by default, all methods must be invoked within the scope of a JTA transaction. This can be overridden on a per-method basis.

The 'doSomethingWithoutATransaction' method overrides the 'MANDATORY' type with "NEVER". Therefore, this method will fail if it is invoked in the scope of a JTA transaction.

@Transactional also differs in how you specify which exceptions should cause the transaction to roll back. With EJB's @TransactionAttribute, the developer had to declare on the actual Exception class (via @ApplicationException) whether an exception of that type, thrown from a method annotated with @TransactionAttribute, should cause the active transaction to be marked for 'rollback only'. The problem with this approach was that it could not (easily) be applied to third-party exception implementations, and it also applied to all usages of that exception. @Transactional fixes these issues by allowing the developer to specify, on a per-method basis, which Exception types (and their sub-classes) should cause a rollback. This is done through the 'rollbackOn' and 'dontRollbackOn' attributes. By default, RuntimeException and its subclasses will cause a rollback.
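
As a rough model of the per-method rollback rules, the following Java sketch decides whether a thrown exception would mark the transaction for rollback. It is a simplification and not Narayana's interceptor code. The helper name is hypothetical, the two exception classes are stand-ins for the ones used in the example, and checking 'dontRollbackOn' first follows the precedence described in the Transactional javadoc.

```java
/** Stand-ins for the application-defined exceptions used in the example. */
class MyNonCriticalRuntimeException extends RuntimeException { }

class TestException extends Exception { }

/** Simplified model of the rollback decision; not Narayana's interceptor code. */
final class RollbackRuleSketch {

    static boolean shouldRollback(Throwable thrown,
                                  Class<?>[] rollbackOn,
                                  Class<?>[] dontRollbackOn) {
        // dontRollbackOn is checked first because it takes precedence.
        for (Class<?> type : dontRollbackOn) {
            if (type.isInstance(thrown)) {
                return false;
            }
        }
        // isInstance also matches sub-classes of the listed types.
        for (Class<?> type : rollbackOn) {
            if (type.isInstance(thrown)) {
                return true;
            }
        }
        // Default: RuntimeException and its subclasses roll back, checked exceptions do not.
        return thrown instanceof RuntimeException;
    }
}
```

With the attributes from the doSomething method, a TestException (checked) would trigger a rollback, a MyNonCriticalRuntimeException would not, and any other RuntimeException would still fall through to the default and roll back.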

Another difference to watch out for is the default behaviour when no transaction annotation is provided. With an EJB, the default TransactionAttribute is REQUIRED. This means that a new JTA transaction is begun (when calling the method) if one doesn't already exist. This works for EJB as the developer has already opted in through the use of an EJB annotation (@Stateless, @Stateful, etc). With CDI, a managed bean can have no annotations, thus it is difficult to differentiate between it and a regular POJO. Therefore, a CDI bean method with no @Transactional annotation behaves as it would under TxType.SUPPORTS: the method will run within a transaction if one exists; otherwise it will run without one. This is essentially how transactions are handled with Java POJOs. Note that when @Transactional is present but no type is given, its value defaults to TxType.REQUIRED.
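
The transaction types mentioned in this section can be summarised as a small decision table. The sketch below is plain Java with hypothetical names. Its TxType enum is a local stand-in for four of the values of javax.transaction.Transactional.TxType, and the Outcome values are labels invented for illustration rather than anything a container exposes.

```java
/** Simplified decision table for four of the TxType values; not container code. */
final class TxTypeSketch {

    /** Local stand-in for part of javax.transaction.Transactional.TxType. */
    enum TxType { REQUIRED, SUPPORTS, MANDATORY, NEVER }

    /** Hypothetical labels for what happens when a method is invoked. */
    enum Outcome { JOIN_EXISTING, BEGIN_NEW, RUN_WITHOUT_TX, REJECT }

    static Outcome onInvoke(TxType type, boolean txActive) {
        switch (type) {
            case REQUIRED:
                // Use the caller's transaction, or begin one if none exists.
                return txActive ? Outcome.JOIN_EXISTING : Outcome.BEGIN_NEW;
            case SUPPORTS:
                // Run with whatever the caller has, transaction or not.
                return txActive ? Outcome.JOIN_EXISTING : Outcome.RUN_WITHOUT_TX;
            case MANDATORY:
                // The caller must already be in a transaction.
                return txActive ? Outcome.JOIN_EXISTING : Outcome.REJECT;
            case NEVER:
                // The caller must not be in a transaction.
                return txActive ? Outcome.REJECT : Outcome.RUN_WITHOUT_TX;
            default:
                throw new IllegalArgumentException("unhandled type: " + type);
        }
    }
}
```

In this model the only difference between REQUIRED and SUPPORTS is what happens when no transaction is active, which is exactly the difference between the EJB default and an unannotated CDI bean.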


@TransactionScoped

@TransactionScoped brings an additional CDI context to accompany @SessionScoped, @RequestScoped and @ApplicationScoped. Annotating a bean with @SessionScoped ensures that the same instance of the bean is made available for all usages of it within the scope of an HTTP session. This allows the developer to easily share state between multiple requests within a session, whilst also isolating it from other requests in a different session.

@TransactionScoped allows data to be shared between all usages of the bean within a particular transaction, whilst isolating the instance from other accesses of the bean within a different transaction. The TransactionScoped bean instance's lifecycle matches that of the transaction.

The following code shows an example of this in use:

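import java.io.Serializable;
import javax.transaction.TransactionScoped;

@TransactionScoped
public class MyCDITransactionScopeBean implements Serializable {

    private int value = 0;

    public int getValue() { return value; }

    public void setValue(int value) { this.value = value; }
}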

MyCDITransactionScopeBean represents the data to be associated with the active transaction. It is a simple POJO annotated with @TransactionScoped and also marked as Serializable.

public class SomeClass {

    @Inject
    MyCDITransactionScopeBean myBean;
...
}

MyCDITransactionScopeBean can then be injected into other classes that need to use it.
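
To make the scoping rule concrete outside a container, here is a toy Java model of a transaction-scoped context. It is only a sketch under stated assumptions: the class name and the string transaction ids are hypothetical, and a real container manages this inside CDI and keys instances on the active JTA transaction rather than on a string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Toy model of a transaction-scoped context; real containers do this inside CDI. */
final class TransactionScopeSketch<T> {

    // One bean instance per (hypothetical) transaction id.
    private final Map<String, T> instances = new HashMap<>();
    private final Supplier<T> factory;

    TransactionScopeSketch(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the instance bound to the given transaction, creating it on first use. */
    T get(String txId) {
        return instances.computeIfAbsent(txId, id -> factory.get());
    }

    /** Called when the transaction ends: the instance's lifecycle ends with it. */
    void transactionCompleted(String txId) {
        instances.remove(txId);
    }
}
```

Every lookup made under the same transaction id sees the same instance, lookups under a different id see a separate one, and once a transaction has completed a later lookup starts from a fresh instance.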

Go and give it a try!

Just download the latest version of WildFly and start deploying code. We don't yet have any quickstarts for these features, but our Arquillian tests provide a complete example of how to use the functionality.