Optical Tablature Recognition Toolkit for Gamera
Optical Tablature Recognition
The aim of optical tablature recognition (OTR) is to extract
a machine-readable tablature code from a bitmap image of a historic
lute tablature print.
Although the tablature image is better suited for a human reader, the
seemingly more cryptic tablature encoding offers a number of advantages:
- music and audio/MIDI output can be generated automatically, thus
making the music accessible to people who cannot read lute tablature
- nice-looking modern editions can be generated
- the code can be stored in music databases
What is Lute Tablature?
Lute tablature is a historic music notation which is specific to fretted
string instruments like lute, guitar or viol. Unlike common music notation,
it does not describe the sound of the music, but where and when the strings of
the instrument are stopped. It was in common use from ca. 1500 until 1750
and hundreds of historic tablature prints and manuscripts have survived from
that period.
No single lute tablature notation was in common use throughout
Europe; instead, different regions used different encoding schemes:
- French Tablature uses letters for the frets and the top line
for the highest course (see the example above)
- Italian Tablature uses numbers for the frets and the bottom line
for the highest course
- German Tablature uses a unique letter for each fret/string
combination and thus does not need staff lines to represent the different
courses
Besides these differences, notations also varied widely even
between tablature prints of the same tablature type, e.g. with respect
to the rhythmic notation or the fret letter font.
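To make the difference between the staff-based schemes concrete, here is a minimal sketch that maps a French or Italian tablature symbol to a (course, fret) pair. It is an illustration only, not part of the toolkit: the function names are hypothetical, and it assumes a six-line staff and the common French convention that 'a' denotes the open string and the letter 'j' is skipped.

```python
# Assumed French fret letters: 'a' is the open string, 'j' is skipped.
FRENCH_FRET_LETTERS = "abcdefghiklmn"


def french_to_course_fret(line_from_top, letter):
    """Map a French tablature symbol to (course, fret).

    line_from_top is 0 for the top staff line, which holds the
    highest course. Course 1 is the highest course.
    """
    return line_from_top + 1, FRENCH_FRET_LETTERS.index(letter)


def italian_to_course_fret(line_from_top, number, num_lines=6):
    """Map an Italian tablature symbol to (course, fret).

    The bottom line holds the highest course, so the line order is
    flipped relative to French tablature. num_lines=6 is an assumption.
    """
    return num_lines - line_from_top, number


# The same note (highest course, third fret) in both schemes:
french_note = french_to_course_fret(0, "d")
italian_note = italian_to_course_fret(5, 3)
```

German tablature is not covered here because each symbol encodes the fret/string combination directly and would need a lookup table specific to the print.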
How is OTR accomplished?
Like many pattern recognition systems, OTR requires the following steps:
- Preprocessing: includes image enhancement like smoothing and staff line
removal.
- Segmentation: the isolation of the individual symbols.
- Classification: the identification of the individual symbols, e.g. as
a quarter rhythm flag or as the letter 'a'.
- Postprocessing: includes a semantic interpretation of the individual
symbols and the generation of a tablature code.
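As a rough illustration of the first two steps, the following toy sketch works on a binary image given as a list of rows (1 = black pixel). It does not use Gamera or the MusicStaves toolkit; the function names are hypothetical, and the staff removal is deliberately naive: it blanks every row that is mostly black, which would also erase symbol pixels that cross a staff line.

```python
def remove_staff_lines(img, threshold=0.8):
    """Preprocessing: blank out rows whose black fraction reaches threshold."""
    width = len(img[0])
    return [[0] * width if sum(row) >= threshold * width else list(row)
            for row in img]


def connected_components(img):
    """Segmentation: return bounding boxes (x0, y0, x1, y1) of all
    4-connected groups of black pixels, found by flood fill."""
    height, width = len(img), len(img[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not img[y][x] or seen[y][x]:
                continue
            stack, xs, ys = [(y, x)], [], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                ys.append(cy)
                xs.append(cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and img[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


# Two small "symbols" above one staff line.
image = [
    [0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 0],
]
symbols = connected_components(remove_staff_lines(image))
```

Each resulting bounding box would then be handed to the classification step; a real system needs a staff removal method that preserves the symbols touching the lines.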
For the classification step, it is essential that the recognition system can
adapt to different tablature variants. In other words, it must be easy to
train the system for a particular tablature print.
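A nearest-neighbour classifier is one simple way to obtain this kind of trainability, because "training" just means collecting labelled examples from the print at hand. The sketch below is a generic k-nearest-neighbour vote in plain Python, not Gamera's classifier API; the two-element feature vectors and the label names are made-up placeholders.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_classify(features, training, k=3):
    """Return the majority label among the k training samples
    closest to the given feature vector."""
    nearest = sorted(training,
                     key=lambda sample: squared_distance(sample[0], features))
    votes = Counter(label for _, label in nearest[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: (feature vector, label) pairs
# collected from one particular tablature print.
training = [
    ((0.50, 0.30), "letter.a"),
    ((0.55, 0.35), "letter.a"),
    ((0.52, 0.31), "letter.a"),
    ((1.80, 0.20), "flag.quarter"),
    ((1.90, 0.25), "flag.quarter"),
]

label_one = knn_classify((0.50, 0.32), training)
label_two = knn_classify((1.85, 0.22), training)
```

Adapting such a classifier to a new print only requires replacing the training list, which is why this family of methods suits material with many font and notation variants.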
We have decided to use the
Gamera framework
for document image analysis for a number of reasons:
- Gamera already provides functions for image segmentation (projections,
connected component analysis), classification (kNN) and a classifier
training interface
- Gamera methods can be combined flexibly because they are provided as
Python modules
- Gamera is platform independent: it runs on MacOS X, Linux and Win32
- Gamera is open source with a small but pleasant developer community
Hence our system is distributed as a toolkit for Gamera.
What is this toolkit?
The aim of the Lute Tablature Recognition Toolkit is to help build
tablature recognition systems. It makes it easy to create recognition
systems for a wide variety of tablature prints.
This toolkit provides
- Python library functions for building your own tablature recognition
applications
- ready-to-use tablature recognition scripts for training tablature
symbols and for the actual recognition based on the trained data
It is based on and requires the
Gamera
document image analysis framework.
Documentation and References
For a comprehensive overview of the recognition system for staffline based
lute tablatures, see
For a detailed description of the OTR toolkit, you can browse the docs
here online.
This documentation describes how to install, use and extend the toolkit.
The same HTML documentation is also included in the doc/html
subdirectory of the OTR toolkit source distribution.
Authors
Software
The source code of our software is available under the terms of the GNU
General Public License. File releases of stable versions are available below.
Moreover, you can get a development snapshot via CVS access from the
OTR SourceForge site.
In short, you obtain the source code with the following two commands:
cvs -d:pserver:anonymous@otr4gamera.cvs.sourceforge.net:/cvsroot/otr4gamera login
cvs -z3 -d:pserver:anonymous@otr4gamera.cvs.sourceforge.net:/cvsroot/otr4gamera co -P otr4gamera
When asked for a password by the first command (login), just press
ENTER. Note that the login command does not perform a login and
start a session. Instead, it stores the login information in your home
directory (by default in the file .cvspass) and returns immediately.
Future checkouts then no longer require the login command.
Prerequisites
The recognition system requires a working installation of the Gamera framework
for document analysis and recognition. See the
Gamera homepage
for information on how to obtain and install Gamera. The GUI part of the toolkit
requires wxPython 2.5 or later (preferably 2.6 or 2.8).
The OTR toolkit additionally relies on the MusicStaves toolkit for
staff line detection and removal. The MusicStaves toolkit is freely
available from the MusicStaves toolkit homepage. See
there for additional information on its installation.
Downloads
On MacOS X and Linux, download the source code package and install it with
python setup.py build && sudo python setup.py install. See the
documentation for more information on building
and installing this toolkit from the sources. On Windows, you can either
use the source package or the binary installer.
Source code:
- otr-2.0.2.tar.gz
(Feb 16 2010)
Binary installer for Windows:
- otr-2.0.2.win32-py2.5.exe
for Python 2.5 and Gamera ≥ 3.2.4 (Feb 16 2010)
The documentation is included in the source package, but not in the binary
installer. If you use the binary installer, you can
download the documentation separately as a .tar.gz archive.