Acropolis

Linked Open Data processing stack

Introduction

Acropolis is the software stack that powers the Research & Education Space. The Research & Education Space is a platform being jointly delivered by Jisc, the British Universities Film & Video Council (BUFVC), and the BBC with the aim of bringing as much as possible of the UK’s publicly-held archives, and more besides, to learners and teachers across the UK.

You can read more about how the platform works in our book, Inside Acropolis.

Components

The stack consists of a number of different projects, including utility libraries and daemons. Each can be checked out and built individually (as would typically be the case in a real deployment), or the whole stack can be built together, which can be useful in development.

Unless otherwise noted in each component’s own repository, each is released under the terms of the Apache License, Version 2.0.

Anansi
A parallel web crawler with modules for processing RDF and Linked Open Data
Twine
An engine for implementing RDF processing workflows
Quilt
An application for publishing RDF on the web, both as HTML and as machine-readable data
Spindle
Plug-in modules for Twine and Quilt which provide the core co-reference aggregation capabilities of the Research & Education Space
liblod
A library for fetching and processing Linked Open Data published on the Web
liburi
A library for parsing and manipulating URIs
libsql
An abstraction layer for working with SQL databases
libsparqlclient
A client library for the SPARQL protocol used by most RDF graph databases
libmq
An abstraction library for communicating with message brokers
libcluster
A library designed to make it easier to build applications which have multiple instances working in parallel-processing clusters

Pre-requisites

To compile and run the stack successfully, a number of pre-requisites must first be installed. See the Dependencies document for further information. If you're just going to clone the repositories to look through the code locally, you won’t need to install any of these except for a Git client.

Getting Acropolis

The bbcarchdev/acropolis GitHub repository hosts the Acropolis meta-project, which exists primarily to make it easier to build the various sub-projects together.

You can download the repository, and all of the linked sub-projects, using git:

$  git clone git://github.com/bbcarchdev/acropolis.git
remote: Counting objects: 120, done.
remote: Compressing objects: 100% (112/112), done.
remote: Total 120 (delta 56), reused 0 (delta 0)
Receiving objects: 100% (120/120), 24.16 KiB | 0 bytes/s, done.
Resolving deltas: 100% (56/56), done.
Checking connectivity... done.
$  cd acropolis
$  git submodule update --init --recursive
Submodule 'anansi' (git://github.com/bbcarchdev/anansi.git) registered for path 'anansi'
Submodule 'libcluster' (git://github.com/bbcarchdev/libcluster.git) registered for path 'libcluster'
Submodule 'liblod' (git://github.com/bbcarchdev/liblod.git) registered for path 'liblod'
Submodule 'libmq' (git://github.com/bbcarchdev/libmq.git) registered for path 'libmq'
Submodule 'libsparqlclient' (git://github.com/bbcarchdev/libsparqlclient.git) registered for path 'libsparqlclient'
Submodule 'libsql' (git://github.com/bbcarchdev/libsql.git) registered for path 'libsql'
Submodule 'liburi' (git://github.com/bbcarchdev/liburi.git) registered for path 'liburi'
Submodule 'm4' (git://github.com/bbcarchdev/m4.git) registered for path 'm4'
Submodule 'quilt' (git://github.com/bbcarchdev/quilt.git) registered for path 'quilt'
Submodule 'spindle' (git://github.com/bbcarchdev/spindle.git) registered for path 'spindle'
Submodule 'twine' (git://github.com/bbcarchdev/twine.git) registered for path 'twine'
Cloning into 'anansi'...
[remainder of output skipped]

About the build system

The Acropolis stack uses GNU Autotools and pkg-config to build consistently across platforms and manage dependencies. When building from a Git clone (or if you modify any part of the build logic), you will need to have the autoconf, automake, and GNU libtool packages for your operating system installed in order to re-generate the configure scripts and Makefile.in files.
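Before regenerating the build files, it can help to confirm that the tools are actually on your PATH. The following is a minimal sketch rather than part of the Acropolis build system: missing_tools is a hypothetical helper name, and the command names in the usage comment are assumptions about how the packages are installed (on some systems libtoolize is installed as glibtoolize, for example).

```shell
# Hypothetical helper: print the name of each listed command that cannot be
# found on PATH. Prints nothing if every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example (command names are assumptions; adjust for your platform):
#   missing_tools autoreconf automake libtoolize pkg-config
```

If the function prints any names, install the corresponding packages before running autoreconf.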

The projects make extensive use of Git submodules. Using submodules means that a particular project will reference a specific commit in a sub-project, and recursively checking out the project will always cause that reference to point to the same commit until a change to the parent is committed.

The up-side is that subsequent commits can be made to the sub-project without the risk of breaking the parent projects which rely upon it. The down-side is that if you’re going to do development work on a project, you need to make sure that you have a branch checked out, and that it is up to date, before you begin making changes. Use git status to tell you: if it reports Not currently on any branch, the current working copy is a checkout of a specific commit rather than of a branch.
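The branch check described above can also be scripted, which is handy when there are many submodules to inspect. This is a sketch only: branch_or_detached is a hypothetical helper name, and it relies on git symbolic-ref, which fails when HEAD points at a specific commit instead of a branch.

```shell
# Hypothetical helper: print the branch checked out in the directory given
# as $1, or the word "detached" if the working copy is a checkout of a
# specific commit.
branch_or_detached() {
    git -C "$1" symbolic-ref -q --short HEAD || echo "detached"
}

# Example, run from the top of the acropolis checkout:
#   for dir in anansi twine quilt spindle; do
#       printf '%s: %s\n' "$dir" "$(branch_or_detached "$dir")"
#   done
```

Any sub-project reported as detached needs a branch checked out (for example with git checkout followed by the branch name) before you start committing changes to it.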

Building Acropolis

Use autoreconf to generate the configure scripts:

$  autoreconf -i

Now, configure the stack, and all of the sub-projects:

$  ./configure

By default, the installation prefix will be /opt/res (this is to allow this project to be safely built and installed without conflicting with any individually-installed components of the stack), but if you'd prefer to install to somewhere else, such as /usr/local, you can pass an option to configure to specify that:

$  ./configure --prefix=/usr/local

If you want to build debugging versions of components (i.e., compiler optimisations are reduced and debugging symbols are not stripped from compiled binaries), pass the --enable-debug option:

$  ./configure --prefix=/usr/local --enable-debug

If all goes well, all of the components will be configured and ready to build. To compile them, simply run make:

$  make

Assuming no errors occur, you can now install it with:

$  sudo make install
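If you rebuild often, the steps above can be chained so that each one runs only if the previous one succeeded. The following is a sketch under stated assumptions, not a script shipped with Acropolis: run, build_stack, and the DRY_RUN variable are all hypothetical names. The first argument is the installation prefix (defaulting to /opt/res, as configure does), and any remaining arguments are passed through to configure.

```shell
# Hypothetical helper: echo the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: run the build steps in order, stopping at the first
# failure. $1 is the prefix; further arguments go to ./configure.
build_stack() {
    prefix="${1:-/opt/res}"
    [ $# -gt 0 ] && shift
    run autoreconf -i &&
    run ./configure --prefix="$prefix" "$@" &&
    run make &&
    run sudo make install
}

# Example: preview the commands for a debug build under /usr/local
#   DRY_RUN=1 build_stack /usr/local --enable-debug
```

Setting DRY_RUN only prints the commands, which makes it easy to check the options before committing to a full build.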