by Jacquelijn Ringersma and Paul Trilsbeek
Language documentation is a field of linguistics that has gone through a “technology-driven” change over the last 10 to 15 years. Linguists have been going into the field for decades to make sound recordings of languages and linguistic events. However, the miniaturization of recording equipment has made it much easier to make large quantities of high-quality audio recordings. In addition, the arrival of affordable, high-quality video equipment extended documentation work from audio to the visual dimension. Video made it possible to document languages within their natural and cultural context, which triggered the establishment of a branch of linguistics whose main goal is the creation of rich multimedia corpora for languages that are threatened with extinction. In addition to collecting large amounts of primary audio and video recordings, documentation projects produce numerous derived resources: annotations and transcriptions, lexica, grammars, field notes, etc.
The DoBeS (Dokumentation Bedrohter Sprachen/Documentation of Endangered Languages) programme, which started about 10 years ago, was among the first funding initiatives for endangered-language documentation projects. An important aspect of this programme was the establishment of a central, specialized archive to take care of the long-term preservation of the valuable material collected by the documentation projects. The central archive, which is based at the Max Planck Institute for Psycholinguistics, was made an essential part of the programme because it had become clear that large amounts of recordings of languages and cultures were in danger of being lost forever. Old tapes and films that are not stored in specialized climate-controlled rooms degrade rapidly over time, but the situation is even worse for modern digital storage media such as DVDs and hard disks. Even if the media survive, technology changes so fast that it is very unlikely that equipment to read today’s storage media will still be around 20 years from now. A specialized digital archive continuously migrates the stored material to the latest storage technology and converts the stored files to newer formats should the current ones become obsolete.
Some researchers have doubts about storing their resources in an online archive. The arguments presented to us take forms such as: (1) once my material is in there, I will not be able to get it out; or (2) other researchers will use my material without giving me credit and do all kinds of nice things with it. However, when you store material in the MPI archive, you retain full control over access to the data through an online access management system (AMS). You are the owner of the data, and you will remain the owner of the data. You decide to whom you grant access. This also opens up opportunities to give access to members of the speech communities or to the relatives of those recorded.
The MPI archive accepts deposits from linguists who have no affiliation with the MPI or DoBeS. Storing your data in the MPI archive has the advantage that the data is stored in an organized manner and that you can use online tools to search through your data and to visualize it in an attractive manner. Most importantly, we will safeguard your data by making multiple backup copies in the Netherlands and Germany, by always using state-of-the-art storage technology, and by migrating to newer file formats should the current ones become obsolete.