 | Abstract |  |  |
 | Code Librarian is a generic tool for keeping track of updates to CVS repositories and presenting them to site visitors in a
friendly CVS browser web environment, conceptually similar to tools such as Bonsai or ViewCVS. It also presents
commit graphs, such as those shown on this site.
The main goals of the project are scalability, usability, extensibility and configurability, and in trying to achieve
those goals, much of the dirty work is delegated to MySQL and Roxen WebServer.
|
 | Feel the source, Luke! |  |  |
 | At present, the repository is hosted by the Lysator Academic Computer Society
and can be downloaded via anonymous CVS from cvs.lysator.liu.se. More information
on setting it all up will be made available here on the home page and in the
INSTALLING file, which has not been written yet. For now, you can browse its source code
on Lysator via the tool itself, though the layout template in use there is not
as thoroughly worked through as the one on this site. Stay tuned for updates.
Those of you who want to check out the code and its docs to try it out can
issue these two commands:
|
 | cvs -d:pserver:anonymous@cvs.lysator.liu.se:/cvsroot/code_librarian login
cvs -d:pserver:anonymous@cvs.lysator.liu.se:/cvsroot/code_librarian co code_librarian
|
 | Design - a db oriented briefing |  |  |
 | The system design is fairly straightforward; the core of the Code
Librarian is its database, where all data about monitored repositories
resides. This database is shared among the different components, which
may run on a single machine or across multiple machines simultaneously. Where
needed, synchronization is implemented using the MySQL cooperative
locking scheme (GET_LOCK() / RELEASE_LOCK()).
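
As a rough illustration of the cooperative locking mentioned above, the following Python sketch wraps GET_LOCK() / RELEASE_LOCK() in a context manager. It is not taken from Code Librarian: the helper name advisory_lock, the LockTimeout exception, the lock name and the timeout are all made up for the example, and the cursor is assumed to be a DB-API style cursor on a MySQL connection that uses the %s parameter style.

```python
from contextlib import contextmanager


class LockTimeout(Exception):
    """Raised when the named lock could not be acquired."""


@contextmanager
def advisory_lock(cursor, name, timeout=10):
    """Hold the MySQL advisory lock `name` for the duration of a with-block.

    `cursor` is assumed to be a DB-API cursor on a MySQL connection.
    The lock is cooperative: it only excludes other clients that ask
    for a lock with the same name.
    """
    cursor.execute("SELECT GET_LOCK(%s, %s)", (name, timeout))
    row = cursor.fetchone()
    # GET_LOCK() yields 1 on success, 0 on timeout and NULL on error.
    if row is None or row[0] != 1:
        raise LockTimeout(name)
    try:
        yield
    finally:
        # Always hand the lock back, even if the guarded block failed.
        cursor.execute("SELECT RELEASE_LOCK(%s)", (name,))
        cursor.fetchone()
```

A component would then guard a critical section with something like `with advisory_lock(cursor, "some_lock_name"): ...`, where the lock name is whatever the cooperating components have agreed on.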
|
 | The database is initialized by the Code Librarian indexer, whose
primary function is to produce the main contents of the database. Once
set up, the database keeps track of which repositories are monitored,
their host machine, cvs access scheme, cvsroot and how they should be
referred to in the user interface. The bulk of the database, however,
consists of the data collected about these repositories.
|
 | The indexer |  |  |
 | The Code Librarian indexer is the main producer of content for the
database. For one Code Librarian setup, you may run multiple indexers, typically
one per repository host. Each indexer may monitor multiple
repositories, provided it has filesystem (read) access to their cvsroot. The
indexer can run in two modes (interchangeably; you can switch between them
at any time without them interfering with one another), depending on
the nature of your setup and your personal preference.
|
 | - Trigger mode |  |  |
 | The recommended (most responsive, least resource-intensive) mode of
operation is the incremental (notify, trigger, daemon) mode. This mode
requires setting up commitinfo and taginfo hooks in the
cvs repositories; the hooks notify the indexer of repository events as they
happen, through an event log directory on the same machine. Then start
the indexer in daemon mode (supply the -l flag and the path
of the event log directory). It will perform an initial sweep of all
of its repositories at start-up, and then continuously monitor the log
directory for updates (new commits and tags). This way, repository
changes take effect and show up in the UI almost instantly, with a
minimum of strain on the system.
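
To make the daemon loop described above more concrete, here is a small Python sketch of one polling pass over an event log directory. It is not the indexer's actual code: the one-file-per-event layout, the function name drain_event_dir and the choice to delete an event file once it has been handled are assumptions made for the example, since the real event log format is not described here.

```python
import os


def drain_event_dir(log_dir, handle):
    """Process pending event files in `log_dir`, oldest first.

    Assumes (hypothetically) that each hook invocation drops one file
    into the directory. Returns the number of events handled.
    """
    entries = []
    with os.scandir(log_dir) as it:
        for entry in it:
            if entry.is_file():
                # Sort key: modification time, then name as a tie-breaker.
                entries.append((entry.stat().st_mtime, entry.name, entry.path))
    handled = 0
    for _mtime, _name, path in sorted(entries):
        with open(path, encoding="utf-8", errors="replace") as f:
            handle(f.read())
        os.remove(path)  # an event is consumed once it has been handled
        handled += 1
    return handled
```

A daemon built this way would call the function in a loop with a short sleep between passes, so an idle repository costs one directory listing per pass rather than a full traversal.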
|
 | - Full sweep mode |  |  |
 | If, for some reason, incremental mode is not a viable option, there
is also the cruder full sweep mode, suited to be run from crontab,
for instance. Full sweep mode only performs one sweep through all its
repositories and then exits back to the shell. The advantages over
daemon mode are that it takes a minimum of setup and that it only runs
when you choose to (whether this is good or bad is of course a matter of
taste). The disadvantages of this mode are:
|
 | - the unavoidable latency: The system is less responsive to repository changes, and the UI
will lag behind more the less often you run the indexer.
- excessive disk activity: Since each invocation has to traverse all directories of
all repositories, your disk arrays could take on a massive load
of stat() / get_dir() calls if you monitor
reasonably big repositories.
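
The disk cost of a full sweep can be pictured with a short Python sketch that walks a cvsroot and stats every RCS history file (CVS keeps file histories in files whose names end in ,v). This is only an illustration, not the indexer's algorithm; the function name full_sweep and its return value are invented for the example.

```python
import os


def full_sweep(cvsroot):
    """Walk every directory under `cvsroot` and stat each RCS file.

    Returns a dict mapping RCS file path to modification time. The point
    of the sketch is that every directory and file is visited on each
    run, whether or not anything has changed.
    """
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(cvsroot):
        for name in filenames:
            if name.endswith(",v"):  # RCS history files used by CVS
                path = os.path.join(dirpath, name)
                mtimes[path] = os.stat(path).st_mtime
    return mtimes
```

The work done here grows with the total number of directories and files in the repositories, not with the number of changes since the last run, which is why frequent sweeps of large repositories get expensive.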
|
 | The web interface |  |  |
 | The web frontend is provided by a set of Roxen modules (okay,
currently only one) and a set of RXML templates. The look and feel is
entirely up to these templates, since the modules only provide the
tools for extracting the relevant data from the database in a
convenient manner that stays consistent over time (the database design is more likely
to change than the RXML access methods). The web server (or servers;
you may run the frontend on multiple machines, should you want to)
need not run on the same machine(s) as your repositories. It does need read and
write access to the database, though so far write access is strictly
only needed for caching the results of some operations (the colorized view of file
contents, for instance).
|