
Vertical Scaling with OpenShift

If you run your application on an OpenShift environment, one of the advantages you will have is that you can scale the environment quite easily. If traffic to your website continues to increase, at some point a single small gear simply won’t cut it anymore. In the next few blog posts I’ll explore the available options you have for scaling out and for upgrading your existing application.

Vertical scaling

First of all, the easiest way of scaling is vertical scaling. This is the traditional way of coping with increased traffic: simply using faster hardware. In OpenShift terminology this means upgrading your gear to a bigger size (OpenShift Online offers small, small-highcpu, medium and large gears, but other providers might have different sizes).

We could also split our application and divide different functions over different gears. The most common practice is to have the database running on its own gear, separate from the application. If you mark an application with an application server cartridge (such as Tomcat or JBoss) as scalable, OpenShift automatically assumes that you want to deploy the database on a separate gear. This means that when you add a database cartridge to such an application, the database takes up a gear of its own instead of being deployed on the same gear as the application server.

In this blog post I’ll assume a traditional web application and a database as two distinct pieces of the entire application. Of course, you could define more domains or services if desired (depending on what architectural patterns are currently the hype) to distribute over gears, but that will not be covered in this post.

Upgrading a non-scalable app

If you created an application on OpenShift without taking scalability into account, you most likely created it as a non-scalable application, since that is the default. The command probably looked something like this, without the -s parameter that marks the application as scalable:
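As a sketch, such a command could look like the following. The application name mywebsite is the one used later in this post; the cartridge names jbossews-2.0 and postgresql-9.2 and the wrapper function create_app are assumptions, so check which cartridges your provider offers before copying it.

```shell
# Hypothetical wrapper; nothing is created until you call it.
# Cartridge names are assumptions: list the real ones with `rhc cartridge list`.
create_app() {
  # No -s flag, so the application is non-scalable and the
  # application server and PostgreSQL share a single gear.
  rhc app create "$1" jbossews-2.0 postgresql-9.2
}
```

Calling create_app mywebsite would then create the application.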

 

If you investigate this setup with rhc show-app, you’ll notice that PostgreSQL is installed on the same gear:

 

For testing purposes, I’ll create some test data in the database so we can verify that the upgrade procedure preserves the data:

 

To upgrade the gear your application runs on, you have to create a snapshot of the current state, delete the application and recreate it on a new, bigger gear.

First, we’ll create the snapshot and delete the running website:
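A minimal sketch of this step, assuming the rhc client tools are installed and you are logged in. The wrapper function snapshot_and_delete is hypothetical; the file name follows the mywebsite.tar.gz used further on in this post.

```shell
# Hypothetical wrapper around the two rhc calls for this step.
snapshot_and_delete() {
  app="$1"
  # Save the snapshot first; && makes sure the application is only
  # deleted when the snapshot was written successfully.
  rhc snapshot save -a "$app" -f "$app.tar.gz" &&
    rhc app delete -a "$app" --confirm
}
```

Chaining the two calls with && is a deliberate choice here: if saving the snapshot fails for any reason, the running website is left untouched.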

 

Next, we recreate the application but this time we run it on a large gear and make it scalable using the added parameter -s:
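As a sketch, recreating the application could look like this. The cartridge names jbossews-2.0 and postgresql-9.2 and the wrapper function create_scalable_app are assumptions; -s marks the application as scalable and -g selects the gear size.

```shell
# Hypothetical wrapper; cartridge names are assumptions.
create_scalable_app() {
  # -s makes the application scalable, so the database cartridge
  # is placed on its own gear; -g large asks for large gears.
  rhc app create "$1" jbossews-2.0 postgresql-9.2 -s -g large
}
```

Calling create_scalable_app mywebsite would then recreate the application under its old name.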

 

This time we can see that PostgreSQL is running on its own gear. We can also see that HAProxy is running on the same gear as Tomcat (JBoss EWS); we’ll expand on that in a future blog post. Now we have to restore the previously created snapshot into this new environment. (Note: when you restore the snapshot, the command line tool might complain that there already is a checkout of the mywebsite repository. You can remove that checkout to fix this, or simply ignore the warning.)

 

Restoring the snapshot

One would hope that executing the command rhc snapshot restore -a mywebsite -f mywebsite.tar.gz would be sufficient to restore the snapshot, but unfortunately it is not. Since snapshots of scalable applications have a very different structure than those of non-scalable ones, the current tools do not support this upgrade. To make it work, you will have to change the tarball’s structure to match what a scalable application expects.

The difference between the two is that a snapshot of a scalable application contains nested snapshots for all the embedded cartridges (like PostgreSQL), while a snapshot of a non-scalable application is simply one flat structure. To adapt the non-scalable snapshot into a scalable version, we’ll have to recreate this nested snapshot structure.

We’ll take the original snapshot and create two copies of it, one as a master and one for the embedded PostgreSQL snapshot. Note that the GUID will be different for your application so change the following commands accordingly. (The GUID will be replaced when restoring the snapshot so it doesn’t actually matter what it is, as long as it looks like a GUID).

So, first create both snapshots, one for PostgreSQL and one for the master gear:

Next, we’ll strip the PostgreSQL snapshot down to just PostgreSQL and create the tarball:

And now we can place the PostgreSQL snapshot in the master snapshot and restore it:

Note that the leading ./ is required when recreating both tarballs; otherwise the snapshots will not work.
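The three steps above can be sketched end to end as follows. To stay self-contained, the sketch first builds a tiny stand-in for the flat snapshot. The directory names inside it (app-root, postgresql), the placeholder GUID, the location of the nested tarball and the file name mywebsite-scalable.tar.gz are all assumptions, so inspect your own snapshot with tar tzf before adapting the paths.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the flat, non-scalable snapshot. The layout is hypothetical;
# with a real snapshot you would start from your own mywebsite.tar.gz.
guid=0123456789abcdef01234567   # placeholder, only needs to look like a GUID
mkdir -p "flat/$guid/app-root/data" "flat/$guid/postgresql/data"
echo "app"  > "flat/$guid/app-root/data/app.txt"
echo "dump" > "flat/$guid/postgresql/data/dump.sql"
tar czf mywebsite.tar.gz -C flat ./

# 1. Unpack two copies: one for the master gear, one for PostgreSQL.
mkdir master postgresql
tar xzf mywebsite.tar.gz -C master
tar xzf mywebsite.tar.gz -C postgresql

# 2. Strip the PostgreSQL copy down to the database and repack it.
#    Archiving "./" keeps the leading ./ on every entry.
rm -rf "postgresql/$guid/app-root"
tar czf postgresql.tar.gz -C postgresql ./

# 3. Nest the PostgreSQL tarball in the master copy and repack that.
#    Where the nested tarball belongs is an assumption in this sketch.
mv postgresql.tar.gz "master/$guid/"
tar czf mywebsite-scalable.tar.gz -C master ./

# Every entry should start with ./
tar tzf mywebsite-scalable.tar.gz
```

With a real snapshot you would skip the stand-in and then restore the result with rhc snapshot restore -a mywebsite -f mywebsite-scalable.tar.gz.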

If we now log in to the main gear, we’ll see that the database is correctly restored:

 

We can also see that PostgreSQL is actually running on another gear and check how much disk space is left:
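One way to check this from your own machine, as a sketch: rhc app show can list the gears of an application together with their disk usage. The wrapper function show_gear_usage is hypothetical, and the exact output depends on your version of the client tools.

```shell
# Hypothetical wrapper; prints one row per gear with its cartridges
# and the disk space used against the quota.
show_gear_usage() {
  rhc app show "$1" --gears quota
}
```

Calling show_gear_usage mywebsite should list the PostgreSQL cartridge on a different gear than the application server.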

Summary

The first step in making an application scalable is actually marking it as scalable and upgrading your gears. In most cases it’s best to do this from day zero, but even if you want to start out on a single small gear, you can still scale up later. If you require more than this for your application, or if you need to be more flexible in computing power (scaling up and down as needed), stay tuned for the next blog post on horizontal scaling.

 
