Vertical Scaling with OpenShift

If you run your application on OpenShift, one of the advantages is that you can scale the environment quite easily. If traffic to your website keeps increasing, at some point a single small gear simply won’t cut it anymore. In the next few blog posts I’ll explore the options you have for scaling out and for upgrading your existing application.

Vertical scaling

First of all, the easiest way of scaling is vertical scaling. This is the traditional way of coping with increased traffic: simply using faster hardware. In OpenShift terminology this means upgrading your gear to a bigger size (OpenShift Online gives you small, small-highcpu, medium and large gears, but other providers might have different sizes).

We could also split our application and divide different functions over different gears. The most common practice is to have the database running on its own gear, separate from the application. If you mark an application with an application server cartridge (such as Tomcat or JBoss) as scalable, OpenShift automatically assumes that you want to deploy the database on a separate gear. This means that when you add the database cartridge to the application, the database takes up another gear instead of being deployed on the same gear as the application server.

In this blog post I’ll assume a traditional web application and a database as two distinct pieces of the entire application. Of course, you could define more domains or services if desired (depending on what architectural patterns are currently the hype) to distribute over gears, but that will not be covered in this post.

Upgrading a non-scalable app

If you created an application on OpenShift without taking scalability into account, you most likely created it as a non-scalable application, since that is the default. Probably something like this, without the -s parameter that marks it as scalable:

$ rhc app create -a mywebsite -t tomcat-7 -g small
$ rhc cartridge add postgresql-9.2 -a mywebsite

If you investigate this setup with rhc show-app, you’ll notice that PostgreSQL is installed on the same gear:

$ rhc show-app mywebsite
mywebsite @ http://mywebsite-first8.rhcloud.com/ 
    (uuid: 55473b535973ca4e5d00006d)
-------------------------------------------------
  Domain:     first8
  Created:    11:26 AM
  Gears:      1 (defaults to small)
  Git URL:    ssh://....
  SSH:        55473b535973ca4e5d00006d@mywebsite
  Deployment: auto (on git push)

  jbossews-2.0 (Tomcat 7 (JBoss EWS 2.0))
  ---------------------------------------
    Gears: Located with postgresql-9.2

  postgresql-9.2 (PostgreSQL 9.2)
  -------------------------------
    Gears:          Located with jbossews-2.0
    Connection URL: postgresql://$OPENSHIFT_POSTGRESQL_DB_HOST:$OPENSHIFT_POSTGRESQL_DB_PORT
    Database Name:  mywebsite
    Password:       -----
    Username:       admin-----

For testing purposes, I’ll create some test data in the database so we can verify that the upgrade procedure preserves the data:

$ rhc ssh mywebsite
Connecting to 55473b535973ca4e5d00006d@mywebsite-first8.rhcloud.com ...
[mywebsite-first8.rhcloud.com 55473b535973ca4e5d00006d]> psql
psql (9.2.10)
Type "help" for help.

mywebsite=# create table developers (id serial, name varchar);
NOTICE:  CREATE TABLE will create implicit sequence "developers_id_seq" 
  for serial column "developers.id"
CREATE TABLE
mywebsite=# insert into developers (name) values ('arjan');
INSERT 0 1
mywebsite=# insert into developers (name) values ('bas');
INSERT 0 1
mywebsite=# select * from developers;
 id | name  
----+-------
  1 | arjan
  2 | bas
(2 rows)

To upgrade the gear your application uses, you have to create a snapshot of the current state, delete the application and recreate it on a new, bigger gear.
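As an overview, the whole procedure can be sketched as a small shell script. This is only a sketch under a few assumptions, not an official tool: the run helper and the DRY_RUN variable are hypothetical names introduced here, while the rhc commands are the ones used in this post. By default it only prints the commands, so nothing is deleted by accident.

```shell
#!/usr/bin/env bash
# Sketch of the upgrade sequence; prints the commands unless DRY_RUN=0.

APP="mywebsite"          # application name used in this post
DRY_RUN="${DRY_RUN:-1}"  # hypothetical switch: 1 = only print the commands

# run: hypothetical helper that prints a command and, outside dry-run, executes it.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

upgrade_plan() {
  run rhc snapshot save -a "$APP"                        # 1. save the current state
  run rhc app delete -a "$APP"                           # 2. remove the old application
  run rhc app create -a "$APP" -t tomcat-7 -g large -s   # 3. recreate it, bigger and scalable
  run rhc cartridge add postgresql-9.2 -a "$APP"         # 4. add the database again
  # 5. restore; the snapshot first has to be converted by hand,
  #    see "Restoring the snapshot" further down in this post
  run rhc snapshot restore -a "$APP" -f "$APP-scalable.tar.gz"
}

upgrade_plan
```

Running it with DRY_RUN=0 would execute the commands for real, but the manual snapshot conversion between steps 4 and 5 is still needed, so the dry-run output is best read as a checklist.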

First, we’ll create the snapshot and delete the running website:

$ rhc snapshot save -a mywebsite
Pulling down a snapshot of application 'mywebsite' 
  to mywebsite.tar.gz ... done
$ rhc app delete -a mywebsite
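Since deleting the application cannot be undone, it may be worth checking that the snapshot is actually readable before running the delete command. The following is a minimal sketch; verify_snapshot is a hypothetical helper introduced here, and the demo runs on a synthetic archive rather than a real OpenShift snapshot.

```shell
#!/usr/bin/env bash
# verify_snapshot FILE: hypothetical helper; succeeds only if FILE is a
# non-empty, readable gzip-compressed tar archive.
verify_snapshot() {
  local file="$1"
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  gzip -t "$file" 2>/dev/null || { echo "not valid gzip: $file" >&2; return 1; }
  tar -tzf "$file" >/dev/null 2>&1 || { echo "not a valid tar archive: $file" >&2; return 1; }
  echo "ok: $file"
}

# Demo on synthetic files (stand-ins for mywebsite.tar.gz).
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/app/app-root/data"
echo "hello" > "$demo_dir/app/app-root/data/file.txt"
tar -czf "$demo_dir/good.tar.gz" -C "$demo_dir" ./app
echo "this is not an archive" > "$demo_dir/bad.tar.gz"

verify_snapshot "$demo_dir/good.tar.gz"
verify_snapshot "$demo_dir/bad.tar.gz" || echo "refusing to delete the application"
```

In practice this could be combined as verify_snapshot mywebsite.tar.gz && rhc app delete -a mywebsite, so the delete only runs when the archive can be read.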

Next, we recreate the application but this time we run it on a large gear and make it scalable using the added parameter -s:

$ rhc app create -a mywebsite -t tomcat-7 -g large -s
...
$ rhc cartridge add postgresql-9.2 -a mywebsite
...
$ rhc show-app mywebsite
mywebsite @ http://mywebsite-first8.rhcloud.com/ 
    (uuid: 554742be4382ec93270000d6)
------------------------------------------------
  Domain:     first8
  Created:    11:58 AM
  Gears:      2 (defaults to large)
  Git URL:    ssh://...
  SSH:        554742be4382ec93270000d6@mywebsite
  Deployment: auto (on git push)

  haproxy-1.4 (Web Load Balancer)
  -------------------------------
    Gears: Located with jbossews-2.0

  jbossews-2.0 (Tomcat 7 (JBoss EWS 2.0))
  ---------------------------------------
    Scaling: x1 (minimum: 1, maximum: available) on large gears

  postgresql-9.2 (PostgreSQL 9.2)
  -------------------------------
    Gears:          1 large
    Connection URL: postgresql://$OPENSHIFT_POSTGRESQL_DB_HOST:$OPENSHIFT_POSTGRESQL_DB_PORT
    Database Name:  mywebsite
    Password:       -----
    Username:       admin----

This time we can see that PostgreSQL is running on its own gear. We can also see that HAProxy is running on the same gear as Tomcat (JBoss EWS); we’ll expand on that in a future blog post. Now we have to restore the previously created snapshot into this new environment. (Note: when you restore the snapshot, the command line tool might complain that there already is a checkout of the mywebsite repository. You can remove that checkout to fix this, or simply ignore the warning.)

Restoring the snapshot

One would hope that executing the command rhc snapshot restore -a mywebsite -f mywebsite.tar.gz would be sufficient to restore the snapshot, but unfortunately it is not. Since snapshots of scalable applications have a very different structure than those of non-scalable ones, the current tools do not support this upgrade. To make it work, you have to change the tarball’s structure to meet the requirements of a scalable application.

The difference between the two is that the snapshot of a scalable application contains nested snapshots for all the embedded cartridges (like PostgreSQL), while the snapshot of a non-scalable application is simply one flat structure. To adapt the non-scalable snapshot into a scalable version, we’ll have to recreate this nested snapshot structure.
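To see which kind of snapshot you have, you can list the tarball and look for nested archives. The sketch below assumes, as in the steps later in this post, that a nested snapshot sits in app-root/data/ as a .tar.gz file; list_nested_snapshots is a hypothetical helper, the check is only a heuristic, and the two demo archives are synthetic.

```shell
#!/usr/bin/env bash
# list_nested_snapshots FILE: hypothetical helper that prints the members of a
# snapshot tarball that look like nested cartridge snapshots, i.e.
# ./<uuid>/app-root/data/<name>.tar.gz (prints nothing for a flat snapshot).
list_nested_snapshots() {
  tar -tzf "$1" | grep -E '^\./[^/]+/app-root/data/[^/]+\.tar\.gz$' || true
}

# Demo with two synthetic snapshots; the directory names are made up.
work="$(mktemp -d)"

# Flat layout: cartridge data lives directly in the top-level directory.
mkdir -p "$work/flat/0123456789abcdef/app-root/data" \
         "$work/flat/0123456789abcdef/postgresql"
tar -czf "$work/flat.tar.gz" -C "$work/flat" ./0123456789abcdef

# Nested layout: an (empty placeholder) cartridge archive inside app-root/data.
mkdir -p "$work/nested/0123456789abcdef/app-root/data"
: > "$work/nested/0123456789abcdef/app-root/data/postgresql-9.2.tar.gz"
tar -czf "$work/nested.tar.gz" -C "$work/nested" ./0123456789abcdef

echo "flat snapshot:";   list_nested_snapshots "$work/flat.tar.gz"
echo "nested snapshot:"; list_nested_snapshots "$work/nested.tar.gz"
```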

We’ll take the original snapshot and create two copies of it, one as the master and one for the embedded PostgreSQL snapshot. Note that the UUID will be different for your application, so change the following commands accordingly. (The UUID is replaced when the snapshot is restored, so it doesn’t actually matter what it is, as long as it looks like a UUID.)

So, first create both snapshots, one for PostgreSQL and one for the master gear:

$ tar xvfz mywebsite.tar.gz
./55473b535973ca4e5d00006d/
./55473b535973ca4e5d00006d/.vimrc
./55473b535973ca4e5d00006d/.pgpass
....
$ mv 55473b535973ca4e5d00006d/ postgresql-9.2
$ tar xvfz mywebsite.tar.gz
./55473b535973ca4e5d00006d/
./55473b535973ca4e5d00006d/.vimrc
./55473b535973ca4e5d00006d/.pgpass
....

Next, we’ll strip the PostgreSQL snapshot down to just PostgreSQL and create the tarball:

$ cd postgresql-9.2/
$ rm -rf git/ jbossews/
$ cd ..
$ tar cvfz postgresql-9.2.tar.gz ./postgresql-9.2

And now we can place the PostgreSQL snapshot in the master snapshot and restore it:

$ cp postgresql-9.2.tar.gz 55473b535973ca4e5d00006d/app-root/data/
$ tar cvfz mywebsite-scalable.tar.gz ./55473b535973ca4e5d00006d
$ rhc snapshot restore -a mywebsite -f mywebsite-scalable.tar.gz

Note that the leading ./ is required when recreating both tarballs, otherwise the snapshots will not work.
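The manual steps above can be collected in one function. This is a minimal sketch, not a supported tool: make_scalable_snapshot and its arguments are hypothetical names, the directories to strip (git and jbossews) are taken from this post, and the demo runs on a synthetic snapshot whose layout is only a stand-in for a real one.

```shell
#!/usr/bin/env bash
# make_scalable_snapshot SNAPSHOT CARTRIDGE OUTPUT [DIR_TO_STRIP...]
# Hypothetical helper repeating the manual steps:
#   1. unpack the flat snapshot twice (master copy and cartridge copy),
#   2. rename the cartridge copy and strip the application directories,
#   3. pack the cartridge copy into <uuid>/app-root/data/ of the master,
#   4. pack the master again, keeping the leading ./ in member names.
make_scalable_snapshot() {
  local snapshot="$1" cartridge="$2" output="$3" work uuid dir
  shift 3
  work="$(mktemp -d)"
  mkdir "$work/master" "$work/cart"
  tar -xzf "$snapshot" -C "$work/master" || return 1
  tar -xzf "$snapshot" -C "$work/cart" || return 1
  uuid="$(ls -A "$work/master" | head -n 1)"   # the single top-level directory
  mv "$work/cart/$uuid" "$work/cart/$cartridge"
  for dir in "$@"; do
    rm -rf "$work/cart/$cartridge/$dir"
  done
  tar -czf "$work/master/$uuid/app-root/data/$cartridge.tar.gz" \
      -C "$work/cart" "./$cartridge" || return 1
  tar -czf "$output" -C "$work/master" "./$uuid" || return 1
  rm -rf "$work"
}

# Demo on a synthetic flat snapshot.
demo="$(mktemp -d)"
src="$demo/src/55473b535973ca4e5d00006d"
mkdir -p "$src/app-root/data" "$src/git" "$src/jbossews" "$src/postgresql"
echo "dump" > "$src/postgresql/data.sql"
tar -czf "$demo/mywebsite.tar.gz" -C "$demo/src" ./55473b535973ca4e5d00006d

make_scalable_snapshot "$demo/mywebsite.tar.gz" postgresql-9.2 \
    "$demo/mywebsite-scalable.tar.gz" git jbossews
tar -tzf "$demo/mywebsite-scalable.tar.gz" | grep 'app-root/data/'
```

For the application in this post the call would be make_scalable_snapshot mywebsite.tar.gz postgresql-9.2 mywebsite-scalable.tar.gz git jbossews, followed by the rhc snapshot restore command shown above.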

If we now log in to the main gear, we’ll see that the database is correctly restored:

$ rhc ssh mywebsite

Connecting to 554742be4382ec93270000d6@mywebsite-first8.rhcloud.com ...
[mywebsite-first8.rhcloud.com 554742be4382ec93270000d6]> psql
psql (9.2.10)
Type "help" for help.
mywebsite=# select * from developers;
 id | name  
----+-------
  1 | arjan
  2 | bas
(2 rows)

We can also see that PostgreSQL is actually running on another gear and check how much disk space is left:

[mywebsite-first8.rhcloud.com 554742be4382ec93270000d6]> 
  echo $OPENSHIFT_POSTGRESQL_DB_URL
postgresql://adminq5iabx8:dI8pE94Ip7RH@554743a64382ecbadf00018d-first8.rhcloud.com:57866/
[mywebsite-first8.rhcloud.com 554742be4382ec93270000d6]> 
  ssh ${OPENSHIFT_POSTGRESQL_DB_GEAR_UUID}@${OPENSHIFT_POSTGRESQL_DB_HOST}
[554743a64382ecbadf00018d-first8.rhcloud.com 554743a64382ecbadf00018d]> 
  df
Filesystem           1K-blocks    Used Available Use% Mounted on
/dev/mapper/rootvg-rootvol
                       8125880 6075316   1631136  79% /
tmpfs                     5120       0      5120   0% /dev/shm
/dev/xvda1              243823   69732    161291  31% /boot
/dev/mapper/rootvg-var
                       8378368 1271320   7107048  16% /var
/dev/mapper/EBSStore01-user_home01
                     131061760 6142352 124919408   5% /var/lib/openshift
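The plain df output wraps long device names onto a second line, which makes it awkward to read in scripts. As a small sketch, the POSIX -P option keeps every filesystem on one line, so the free space for a path can be picked out with awk; available_kb is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# available_kb PATH: hypothetical helper printing the free space, in 1K blocks,
# of the filesystem that holds PATH. -P forces one line per filesystem,
# -k reports sizes in 1024-byte blocks; column 4 is "Available".
available_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

available_kb .
```

On the database gear, available_kb /var/lib/openshift would report the free space of the filesystem shown in the last line of the df output above.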

Summary

The first step in making an application scalable is actually marking it as scalable and upgrading your gears. In most cases it’s best to do this from day zero, but even if you want to start out on a single small gear, you can still scale up later. If you require more than this for your application, or if you need to be more flexible in computing power (scaling up and down as needed), stay tuned for the next blog post on horizontal scaling.

https://forums.openshift.com/recreate-an-existing-app-so-it-is-scaleable#comment-33240

Arjan Lamers
