Git tells me that I put my first Oracle database inside a docker container in early 2016.
The issue was simple – we were working with a large, old, legacy database and we wanted to be able to run our integration tests against the database. Our solution was to build a Docker image based on the legacy database, spin up a container from it for each test run, and then trash that container so that we could properly control the database inputs to our integration tests. (The database included triggers and stored procedures, so integrating with an in-memory database like H2 would not prove that our code would work with the real database.)
The standard use-case for a docker container with a database is for the database data to be in a separate volume. However, in our case we wanted to be able to reset the data after each build. Docker does not provide volume versioning. Also, in order to put the data in a volume you need to run the container. We wanted the data to be in the database after the build step so that each run would provide a clean database for testing.
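To illustrate the constraint: in the standard pattern the volume is only populated once a container actually runs – there is no way to fill it during `docker build`. A dry-run sketch that just prints the commands (the image name and mount path are illustrative assumptions):

```shell
#!/bin/sh
# Dry run: print the standard volume-based commands rather than run them.
# The data volume only gets populated when a container runs from the
# image -- the build step alone cannot put data into it.
CMDS="docker volume create oracle-data
docker run -d --name db -v oracle-data:/u01/app/oracle/oradata some-oracle-image"
echo "$CMDS"
```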
I struggled for a while to install a full size Oracle DB inside a docker image. All of my problems can be summed up with the statement, “Oracle does not like being inside a Docker container”. Fortunately for me, someone else had created the wnameless/oracle-xe-11g image and this allowed me to progress, albeit with the XE version.
I extended this image so that, during the build, it would start up the database and use impdp to import the structure of the production database from a backup file. I would’ve liked to create the schema from scratch but, like I said, I was working with a legacy schema and my team did not have exclusive control of it so it could change in live at any time. (To be clear, this is far from ideal and we would have loved to have taken sole ownership of this schema but you have to live in the world you live in.)
My Dockerfile ended up looking something like this:
```dockerfile
FROM wnameless/oracle-xe-11g

ENV ORACLE_HOME=/u01/app/oracle/product/11.2.0/xe \
    ORACLE_SID=XE
# TNS_ADMIN needs a separate ENV instruction so it can see ORACLE_HOME
ENV TNS_ADMIN=$ORACLE_HOME/network/admin

ADD prep_db.sql /home/oracle/prep_db.sql
RUN service oracle-xe start && \
    su -c "$ORACLE_HOME/bin/sqlplus '/ as sysdba' @/home/oracle/prep_db.sql" oracle

USER oracle
ADD import_database.sh /home/oracle/import_database.sh
ADD data.dmp /home/oracle/data.dmp
RUN chmod u+rwx /home/oracle/import_database.sh && \
    /home/oracle/import_database.sh
```
Where prep_db.sql contained some initial setup and import_database.sh had the gymnastic impdp calls necessary to import the schema successfully from data.dmp.
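As a rough idea of what that script did – a hedged sketch, not the actual script. The schema name, credentials and directory object are assumptions, and it only prints the impdp command rather than running it:

```shell
#!/bin/sh
# Hypothetical sketch of import_database.sh. SCHEMA, the system password
# and the directory object name are assumptions, not the real values.

SCHEMA=LEGACY_APP   # assumed name of the schema inside data.dmp

# impdp reads the dump via an Oracle "directory object"; prep_db.sql
# would plausibly contain something like:
#   CREATE DIRECTORY import_dir AS '/home/oracle';

build_impdp_cmd() {
    # CONTENT=METADATA_ONLY imports just the structure (tables, triggers,
    # stored procedures) without any production data rows.
    echo "impdp system/oracle directory=import_dir dumpfile=data.dmp schemas=$SCHEMA content=METADATA_ONLY"
}

build_impdp_cmd   # dry run: print the command instead of executing it
```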
This meant that a nightly build could rebuild the Docker image, and each software build could simply spin up a container to run its integration tests against (after waiting 37 seconds for the database to start up, of course :’-) ). The database in the running container would contain all of the structures the tests needed. After the tests had completed, the build process could destroy the container. Nice and clean.
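That per-build lifecycle can be sketched as follows – a dry run that only prints the commands, with the image name and the test entry point as illustrative assumptions:

```shell
#!/bin/sh
# Dry-run sketch of the per-build container lifecycle. The image name,
# port and test command are assumptions; the commands are printed, not run.
NAME=it-db-1
IMAGE=legacy-oracle:nightly

LIFECYCLE="docker run -d --name $NAME -p 1521:1521 $IMAGE
# poll until the listener is up (the ~37 s startup), then:
./run-integration-tests.sh
docker rm -f $NAME"
echo "$LIFECYCLE"
```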
This setup saved us many times. On many occasions, someone’s test would blow up a trigger or stored procedure in the database. On one memorable occasion, an unexpected change in the live database changed the structure of the database and meant that our software was no longer compatible. We caught that one too.
…and then it all went wrong (sort of)
As I understand it from the internets, Oracle made a request to have the wnameless image taken down on copyright grounds in February 2019. The first I heard of this was when I couldn’t download the dependency while building my image on a new host a couple of weeks ago.
After some research, it seemed to me that using the wnameless image was no longer an option, so I looked around for alternatives. Oracle have provided a collection of Dockerfiles at https://github.com/oracle/docker-images, so I started using one of those. The repo includes a README with instructions for building the image.
First impressions were frustrating. In order to build a docker image using the Oracle 11G Dockerfile, you need to download the Oracle 11G installer, which seems to necessitate a manual sign-in to Oracle’s website. I wouldn’t dispute anyone’s right to assert their copyright, but right out of the gate it seems impossible to automate this build entirely without infringing it.
Once I had successfully built an instance of oracle/database:11.2.0.2-xe, it was very simple to get it going. My Dockerfile now looks like this:
```dockerfile
FROM oracle/database:11.2.0.2-xe
ADD ./customscripts/* /u01/app/oracle/scripts/setup/
```
Where ./customscripts contains an ordered set of scripts (usually numbered, e.g. 01_setup.sql, 02_copy.sh and so on). Conveniently, the parent image will detect the kind of script and run it accordingly, so you can mix and match shell scripts with SQL scripts as needed.
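A minimal sketch of such a layout (the file names are assumptions) – lexical ordering of the names is what guarantees the execution order:

```shell
#!/bin/sh
# Create an illustrative ./customscripts layout in a temp dir. The parent
# image picks scripts up in lexical order, running .sql files via sqlplus
# and .sh files via the shell, so numeric prefixes fix the order.
dir=$(mktemp -d)/customscripts
mkdir -p "$dir"
touch "$dir/01_setup.sql" "$dir/02_import.sh" "$dir/03_grants.sql"
ls "$dir"   # lists the files in the order they will run
```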
When you first spin up a container from this image, Oracle will complete a number of setup actions on the database, one of which will be to run your custom scripts. The next time the container starts, if the database has already been configured it will just start the existing database.
Unlike the wnameless image, there is no database at build time, only at runtime. This means that my ideal of building an image that can then be launched quickly and scrapped with every build is, sadly, dead. Dead, that is, unless I’m prepared to wait a while for the image to build and set up the database every time.
I could, of course, craft a slightly different docker build process that starts an existing container and mounts a volume to capture the data the first time, so that subsequent runs can mount the same volume, but this still leaves me with the problem of being unable to version the volume. I would therefore need a separate process to back up the volume and restore it for each build. It’s not ideal but I’ll try that next. The build itself is too valuable – it has saved our skins too many times – to be abandoned now.
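One way to sketch that backup/restore process is to use a throwaway container to tar the volume’s contents – again a dry run that only prints the commands, with the volume and archive names as assumptions:

```shell
#!/bin/sh
# Dry-run sketch of hand-rolled volume "versioning": archive the volume
# with a throwaway alpine container after the one-off setup run, then
# restore the archive into the volume before each build. Names are
# assumptions; the commands are printed, not executed.
VOL=oracle-data
SNAP=oracle-data-snapshot.tar.gz

BACKUP="docker run --rm -v $VOL:/data -v \$PWD:/backup alpine tar czf /backup/$SNAP -C /data ."
RESTORE="docker run --rm -v $VOL:/data -v \$PWD:/backup alpine tar xzf /backup/$SNAP -C /data"
echo "$BACKUP"
echo "$RESTORE"
```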
Just so it doesn’t seem like I’m bashing Oracle, who have provided a lot of free stuff to help build docker containers, it looks like the docker images for Postgres work in a somewhat similar way.
Maybe my use case – using Docker to provide a disposable and repeatable database for builds – is just a bit too niche to work out of the box.