Grails and databases, easy right?
Having Grails and GORM makes a programmers life really easy when a simple database is your target storage. Grails generates a bunch of things for you, the conventions allow for almost zero configuration and presto: there’s your working application running on spring boot.
Things change though when you need to, for instance, white-label your application. A third party might actually host and/or maintain your application database and data. These kinds of setups are like renting out parts of a building, this goes by the term “multi tenant”.
Approaches to multi tenant data
- Separate databases: different customers need to get connected to different (virtual) hosts
- Separate schemas: customer data is kept separate in the database by defining schemas for database users
- Partitioned data: same database & schema, but all data is always tagged with a customer identification
The first two approaches allow for a very generic setup and clear security boundaries, which can be convenient.
The third option requires quite some ‘intelligence’ in the application, and can become very tricky if data security is high priority. Customer data must be kept separate, and only the (client) application knows how.
Multiple approaches may apply
Our legacy application has all the approaches at once. Tenants are defined by schemas, but not all schemas are on the same host and within a tenant, the data is again spread over multiple sub-tenants. How we handle the sub-tenants is not part of this story, it is basically the ‘partitioned data’ approach (3) for which some tooling exists.
How Grails handles things
Grails 3 has no clear guidelines on how to handle multi tenant data. It assumes by default that only 1 data source is used. GORM does allow specifying on which data source a table is located, but of course that is not multi tenancy, rather just a different way to spread data over multiple databases.
For Grails2 (also assumes one data source by default) there were a few ready made options in the form of plugins which made GORM have support for multi tenancy.
Hibernate has explicit support for multi tenant databases, but only starting from version 4 (and partitioned data from ~maybe~ version 6). How this interacts with GORM however is something else. The latest GORM (not the default for Grails) promises actual support. However, since we also want to be able to use the lower level groovy Sql approach in a seamless way, we would like to avoid fiddling with Hibernate configuration magic.
To the drawing board
We need to connect to all the relevant databases and schemas and always pick the one that our customer needs.
The ideal way for us is to have the ‘normal’ Grails data sources, normal GORM, normal Hibernate and still have all the flexibility.
A solution using Spring
Spring provides a so-called switching data source, somewhat like a facade to the real data sources.
Our solution relies on the fact that the tenant can be safely assumed to stay the same during the processing of a request to the server. In other words, as long as the same request is still being handled, only 1 tenant is relevant.
A consequence of this approach is that things like second level cache are non trivial since that would mean that data from multiple request are cached in that same ‘bucket’. This would mean that data from multiple tenants would get mixed. This in turn can be prevented with another filter at the cache level, but…. -things get complicated here- (and we don’t need it). Disabling fancy caching is fine for now.
The details
Ingredients:
- an interceptor that identifies the customer and relates it to a tenant
- a thread local construct to store the tenant id during the request processing
- a switching data source, Spring provides one jay!
- some configuration for the various data sources and how they relate to tenants
1) The Interceptor
Grails interceptors are very powerful and easy. They come with dependency injection, access to configuration, etc.
This is the place to get to know which tenant is relevant for a request, before the request is actually handled. Combining with logging using MDC seems convenient, you might want to know which tenant is logging something (or have different log files for instance).
The interceptor should also make very sure that the current Thread is clean when the request handling is finished. Threads are reused, so consecutive requests could potentially interfere!
Example interceptor:
/** * Provides tenant specific resources: * - an MDC value for logback. The tenantId variable can be used in the logging pattern. * - the dataSource id value for the switching data source. */ @Slf4j class ExampleInterceptor implements grails.artefact.Interceptor { public SessionInterceptor() { matchAll() } boolean before() { MDC.put "tenantId", "${params.tenantId}" // for logback SelectedTenantHolder.setCurrentTenant(params.tenantId) // if not set, the default data source will apply true } boolean after() { true } void afterView() { MDC.clear() SelectedTenantHolder.clear() } }
2) A Thread local to go
We need a place to store the ‘current’ tenant, a Thread local variable is the simplest way to achieve this.
It is good practice to encapsulate. Watch out with unit tests, a Thread local might ‘stick’ and tests could again (like requests) influence each other.
Thread local Tenant holder:
/** * Wrapper for a thread local storage of the selected tenant. */ class SelectedTenantHolder { private static final ThreadLocal<String> contextHolder = new ThreadLocal<String>() static void setCurrentTenant(String tenantId) { contextHolder.set(tenantId) } static Map getCurrentTenant() { return contextHolder.get() } static void clear() { contextHolder.remove() } }
Note that you may put anything in this Thread local, you are not limited to Strings or primitives. Having a configuration specific to the tenant ready here might be convenient.
3) A switching data source
Spring provides a ready made abstract class that just needs a little logic to somehow determine what data source to select.
It expects a list of data sources to be configured elsewhere. This implementation logs some of the internals when in trace mode for debugging.
A warning about multiple data sources (switching or not): Grails 3 needs the default data source to be named ‘dataSource’. There is no way around this!
Switching data source:
import org.apache.commons.lang.reflect.FieldUtils import org.springframework.jdbc.datasource.lookup.AbstractRoutingDataSource /** * Resolves to the real dataSource bean by looking for the id in the application context. * * Inspired from example: * https://github.com/dewarim/cinnamon-humulus/blob/master/src/groovy/humulus/SwitchableDataSource.groovy */ @Slf4j class ExampleSwitchingDataSource extends AbstractRoutingDataSource { @Override protected Object determineCurrentLookupKey() { def currentId = SelectedTenantHolder.currentTenant log.trace "Current dataSource id: ${currentId}" if (log.traceEnabled) { def targetDataSources = FieldUtils.readField(this, "targetDataSources", true); log.trace "Current targets: ${targetDataSources}" def resolvedDataSources = FieldUtils.readField(this, "resolvedDataSources", true); log.trace "Current resolved: ${resolvedDataSources}" } return currentId } }
4) Some configuration
Linking tenants to specific data sources and providing the actual data sources as a list to the switching data source (the facade).
In the example code the tenantId parameter matches the actual data source name, this is of course a simplification and not something you might want in a production setup.
application.groovy (does not exist by default):
// example for oracle def myDialect = "org.hibernate.dialect.Oracle10gDialect" def myJdbcClass = "oracle.jdbc.OracleDriver" def defaultProperties = { jmxEnabled = false testOnBorrow = true validationQuery = "SELECT 1 FROM DUAL" initialSize = 3 maxActive = 3 maxIdle = 1 minIdle = 0 removeAbandonedTimeout = 120 removeAbandoned = true } def defaultHbm2ddl = "" // prevent from doing any ddl in runtime! dataSources { dataSource { // (=default) pooled = true dialect = myDialect driverClassName = myJdbcClass url = jdbcConnectUrlGoesHere username = "myname" password = "mypassword" properties defaultProperties dbCreate = defaultHbm2ddl } // provided by jndi or direct connection pools might even be mixed... fr { jndiName = "java:comp/env/jdbc/frExampleDataSource" } it { jndiName = "java:comp/env/jdbc/itExampleDataSource" } }
resources.groovy:
beans = { // Collect the configured dataSources that apply to the environment // and provide them to the switching dataSource. // (the actual dataSources are configured in 'application.groovy') Map envDataSources = [ dataSource_default:ref("dataSource"), dataSource_d:ref("dataSource_fr"), dataSource_f:ref("dataSource_it"), ] // special datasource that switches between the delegate ones, needs to be linked to domain classes dataSource_switching(ExampleSwitchingDataSource) { defaultTargetDataSource = ref("dataSource") targetDataSources = envDataSources } }
Done. Now let’s see what using all this actually looks like.
Using groovy Sql
class ExampleController { // just inject the switching data source (by name) def dataSource_switching def index() { new Sql(dataSource_switching).eachRow("SELECT 'Hello World!'") { row -> log.info "${row}" } } }
Using Domain classes
- Edit happened here -
It seems that only regular groovy Sql is supported correctly. Where Grails2 had no problems with a switching default data source, Grails3 unfortunately is not as forgiving. For real GORM (domain classes) support, see the information on the recently introduced GORM multi-tenancy support.
Seeing everything together
- A request arrives at the server.
- The interceptor sets, based on the request, a marker on a thread local variable.
- Whenever the ‘database’ is requested for querying, the switching data source will choose the correct source based on the thread local variable and the configuration.
- All the following queries are made on the correct ‘real’ data source.
- When the request ends the thread local is cleared, the data source is left as-is and ready to be reused.
Further reading
Hibernate on multi tenancy:
https://docs.jboss.org/hibernate/orm/4.3/devguide/en-US/html/ch16.html
Somewhat dates but still relevant information from spring:
https://spring.io/blog/2007/01/23/dynamic-datasource-routing/
An implementaton in java using some AOP:
http://fedulov.website/2015/10/14/dynamic-datasource-routing-with-spring/