Main introduction to LGI
The Leiden Grid Infrastructure (LGI) is an easy-to-use, scalable grid
middleware designed specifically for application-oriented research
groups.
LGI is based on a Linux-Apache-MySQL-PHP (LAMP) stack on the so-called
project-server, together with x509 client- and server-certificates. The
project-server Remote Procedure Calls (RPC) Application Programming
Interface (API) routines have been implemented in PHP.
Users submit jobs to an LGI project-server only for specifically
installed applications using either the general web-interface, the
command-line-interface or the Python class interface from within a
Python script. The web-interface is automatically included
and active on the project-server and the command-line-interface is easily
compiled on any POSIX system using a C++ compiler with the Standard
Template Library (STL) and libcurl. A link to the basic web-interface for
this project-server can be found below.
Other project and application specific interfaces can make use of the
RPC-APIs LGI has to offer (see the documentation below).
Resources within the LGI poll an LGI project-server and request work
for applications that have been installed on a resource by configuring
and running a resource-daemon. The resource-daemon runs as a normal user
on the resource and can run behind a firewall and/or a Network Address
Translation (NAT) router. If the resource-daemon runs as root, each
individual job will be sandboxed automatically and run as a non-root
user. The resource-daemon can handle any local back-end (like Torque/PBS
or LoadLeveler) through local scripts specified in the resource-daemon XML
configuration file (see the documentation below for examples).
The resource-daemon has been made especially resilient to
crashes and caches all information into files on disk. Also, jobs being
resubmitted to other resources by the project-server, perhaps because of
a lost heart-beat, are taken care of gracefully by the resource-daemon.
The daemon can thus be successfully used in a non-stable (grid)
environment.
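The idea of caching all state in files on disk, so that a restarted daemon can pick up where it left off, can be illustrated with a minimal Python sketch. This is not the resource-daemon's actual code (the daemon is written in C++); the function names, the one-JSON-file-per-job layout and the state fields are assumptions made for this example only.

```python
import json
import os
import tempfile


def save_job_state(directory, job_id, state):
    """Atomically write one job's state to <directory>/<job_id>.json (hypothetical layout)."""
    os.makedirs(directory, exist_ok=True)
    final_path = os.path.join(directory, f"{job_id}.json")
    # Write to a temporary file in the same directory first ...
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(state, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # ... then rename it into place, so a crash never leaves a
        # half-written state file behind.
        os.replace(tmp_path, final_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return final_path


def load_job_states(directory):
    """Reload every cached job state, for example after a daemon restart."""
    states = {}
    if not os.path.isdir(directory):
        return states
    for name in sorted(os.listdir(directory)):
        if name.endswith(".json"):
            with open(os.path.join(directory, name)) as handle:
                states[name[: -len(".json")]] = json.load(handle)
    return states
```

Writing to a temporary file and renaming it means a reader sees either the old or the new state of a job, never a partial one.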
Both the user <-> project-server and the resource <->
project-server communication are encrypted and authenticated using
x509 certificates.
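As a rough illustration of certificate-based authentication in both directions, the sketch below builds a TLS client context with Python's standard `ssl` module. It is not taken from the LGI tools, which use libcurl; the function name and the file paths in the usage note are hypothetical.

```python
import ssl


def make_client_context(cert_file=None, key_file=None, ca_file=None):
    """Build a TLS context that verifies the server and can present a client certificate."""
    # Verify the server certificate against ca_file (or the system default
    # store when ca_file is None); host name checking is enabled by default.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    if cert_file is not None:
        # Present our own x509 certificate so the server can authenticate us.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context
```

A caller would pass its own files, for example `make_client_context("user.crt", "user.key", "ca.crt")` (hypothetical file names), and hand the resulting context to an HTTPS client such as `http.client.HTTPSConnection`.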
Currently the project-server schedules jobs to resources on a
first-come-first-served basis and in first-in-first-out job priority order. If
projects want to use other types of scheduling or a quality of service,
they can be implemented on the project-server side where a special
event-queue is implemented and a hook in the scheduler loop is available.
Several of these project-server based schedulers can run concurrently on
the same project-server if required. It is also possible to have several
slave project-servers connected to your master project-server.
Resource-daemons transparently take care of that: a resource-daemon
requests work from other project-servers if no work was found on the
configured project-server (see the documentation below).
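The default scheduling order and the fallback to other project-servers can be pictured with a small in-memory Python model. It is only a sketch: the class and function names are invented for this example, real resources talk to the project-server through its RPC API, and the interpretation that higher-priority jobs go first with ties served first-in-first-out is an assumption.

```python
import heapq
import itertools


class ProjectServer:
    """Toy in-memory stand-in for a project-server job queue (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self._queue = []
        self._arrival = itertools.count()

    def submit(self, application, job_id, priority=0):
        # Assumption: higher priority is served first; the arrival counter
        # keeps first-in-first-out order among jobs of equal priority.
        entry = (-priority, next(self._arrival), application, job_id)
        heapq.heappush(self._queue, entry)

    def request_work(self, installed_applications):
        """Return the first queued job the resource can run, or None."""
        skipped = []
        found = None
        while self._queue:
            entry = heapq.heappop(self._queue)
            if entry[2] in installed_applications:
                found = entry[3]
                break
            skipped.append(entry)
        # Jobs for applications this resource lacks stay queued for others.
        for entry in skipped:
            heapq.heappush(self._queue, entry)
        return found


def poll(configured, others, installed_applications):
    """Ask the configured server first, then the other project-servers."""
    for server in [configured, *others]:
        job_id = server.request_work(installed_applications)
        if job_id is not None:
            return server.name, job_id
    return None
```

In this model a resource with only the application `hello` installed never receives jobs for other applications, and it only turns to a slave server once the configured server has nothing it can run.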
User management and resource management on the project-servers (master and
slaves) are made easy by a specialized ManageDB script (see the
documentation below). Slave servers regularly
synchronize to the master project-server and updates are propagated to all
project-servers automatically. Several limits can be enforced per
application, group or user. If required, several project-servers can use a
single MySQL database located perhaps on yet another host or each
project-server can make use of a separate MySQL project database perhaps
on separate hosts. It is also possible to set up several web-servers, using
DNS round-robin load balancing, as a single master project-server using a
single MySQL host (with only one project database) and perhaps with
separate job repository storage hosts (local to each host, or through a
global filesystem available across the web-servers being balanced over).
With all these possibilities, LGI offers a user-friendly and very scalable
infrastructure for application oriented research groups.
The LGI software has been licensed under the GNU General Public License
version 3.
Screenshots of the interfaces
Here are some screenshots of the basic LGI web-interface and the LGI
command-line-interface on Linux:
An example Python script using the LGI Python client class interface can
be downloaded below.
Documentation and example configuration files on LGI
The general design document of LGI can be found here (pdf).
The documentation on how to set up and maintain an LGI project-server or
an LGI resource can be found here (ascii/text). For RHEL/Rocky/Alma 8 and
9 based systems, RPMs can be built to ease the install.
The MySQL database structure can be found here (ascii/text).
An example resource-daemon configuration file can be found here (ascii/text).
Example resource-daemon back-end scripts for SLURM can be found here (ascii/text).
An example script using the Python LGI_Client class interface can
be found here.
The latest ChangeLog.txt can be found here.
All these documents are also part of the LGI middleware distribution
you can download below.
Frequently asked questions on LGI
Q: Why was LGI developed?
A: When the Theoretical Chemistry group of Leiden University was
looking for a way to connect all its clusters and the Dutch
supercomputers to a single easy-to-use interface, it found that
well-known grid middleware solutions pose several problems (check out
this tutorial): 1) installing grid middleware on computers not
administered by group members is impossible without root access,
2) firewalls and NAT routers pose problems, 3) user interfaces dealing
with proxy-certificates and job submission description languages are
hard to use, 4) maintenance, administration and deployment of the
middleware is hard for standard UNIX admins and 5) some of the
applications are licensed to specific users on specific computers and
are distributed as binaries only. Moreover, looking at the programs used
by the group and how they are used, it was found that the group uses
about five applications at most and that typical calculations take days
to weeks instead of minutes. Also, the number of calculations was found
to be small compared to the numbers for which other grid middleware
software was designed. It appeared that a simple-to-use middleware that
is easy to deploy and administer could be built in-house to circumvent
the above issues. In short, most grid middlewares are designed for high
throughput computing rather than high performance computing and ease of
use. LGI can, however, offer both types of computing, with ease of use
and ease of management.
Q: Why was LGI developed on a Linux-Apache-MySQL-PHP stack?
A: The so-called LAMP-stack is well known to Linux / UNIX
administrators. Installing such a system is well documented on the web
and easily done using the package-manager of your favourite
distribution, and the stack runs on any type of hardware.
Q: Why were the resource-daemon and the command-line-interface tools
written in C++ using the Standard Template Library?
A: C++ and the STL, together with libcurl, are well supported on any
system. The standard C++ code is therefore very portable and
installation only requires a single 'make' command to be issued. You
can, however, also use the LGI_Client Python class interface to talk to
an LGI project-server from your Python scripts.
Q: Is LGI a scalable infrastructure?
A: Yes, LGI is very scalable on all ends. Several resource-daemons can
run concurrently on the same resource and several slave project-servers
can be added to the project, each using a separate MySQL back-end.
With the user management options within LGI, users can also be
load-balanced over the project-servers. There are, however, several
other scaling options available. For more details on those, please
check out the documentation above.
Download of LGI middleware
The LGI middleware source code and documentation can be downloaded from
here (.tar.gz).
Basic interfaces of project "LGI"
The basic web-interface of this LGI project-server can be reached
through this link. You can only use the basic web-interface if you have
a valid personal x509 certificate signed by the LGI Certificate
Authority of this project "LGI".
To use the basic command-line-interface, you need to download
the LGI software and compile it on your favourite POSIX system. Be sure
you have installed libcurl too (get it from http://curl.haxx.se/libcurl/).
Check out the documentation above on how
to configure the tools and to see some examples of how to use them.
Using the Python LGI_Client class is easy. Just look at the example
above. The LGI_Client class uses the
same default configuration files as the command-line-interface.
Support
The development of LGI has been supported by the Theoretical Chemistry
group of the Leiden Institute of Chemistry of Leiden University.