“The Language Archive” (TLA) was founded in September 2011 as a broad multilateral undertaking to build the foundations for a sustainable archive for language materials of all sorts, but especially records of small and endangered languages. The archive now hosts priceless materials from many endangered languages, notably the outcome of the VW DOBES project and many projects from the MPI Nijmegen, but also from other depositors. The original brief of TLA was to build the infrastructure for the long-term preservation of these and similar materials.
TLA was funded for a five-year period with broad and generous support from three institutions: two academies, BBAW and KNAW, and the Max Planck Gesellschaft (MPG). It was agreed that the arrangement would be reviewed mid-way to reconsider the best way forward to ensure the long-term preservation of, and access to, the archive. Developments now suggest that, on the one hand, the present engagement of the three funding institutions may not be sustained into a next funding round, and on the other hand, that new opportunities have arisen. These include the establishment of a new MPG centre (Max Planck Computing and Data Centre, MCDC), where much of the eHumanities expertise now in TLA is likely to be further enhanced, and the growing capability of public-domain off-the-shelf archival frameworks (like Fedora Commons) and their wide adoption in public institutions, which promise to lower the requirements for an independent unit like TLA.
In the light of these developments, our current planning is that TLA will be transformed after October 2016 into a joint operation between the MPI Nijmegen and the new MPG Computing and Data Centre (MCDC) in Garching, with a more focused Archive, run directly by the technical group of the host institution MPI Nijmegen, and the broader eHumanities development work coordinated by the MCDC. Between the two centres, the intention is to maintain for the long term the essential goals of TLA, and proper access to all the current archival materials, to upgrade and update the upload and retrieval software, and to continue to support such crucial exploitation tools as ELAN. We also hope that synergies with other major archives (such as shared back-up and maintenance solutions) can be maintained as they are now. Naturally, the original guarantees to all our depositors will be honoured under the new arrangements. Meanwhile, in the remaining two years of funding, TLA will concentrate on streamlining and improving the usability of all the important existing software, in order to construct a sustainable resource far into the future.
What depositors of datasets in the Archive can expect
Depositors can rest assured that the data will be preserved not only in the sense of bit-stream preservation (the MPG has guaranteed this for DOBES and several other data sets for at least 50 years), but also in terms of long-term accessibility. Persistent identifiers will continue to be valid and to resolve to the actual data, and data will continue to be accessible and downloadable for those authorized, and protected from those not authorized. TLA’s top priorities are reliable and (where possible) direct and easy access, with an emphasis on visualization (ANNEX). These functionalities are being maintained while the underlying archival software undergoes major changes, so we cannot at this stage be sure exactly which functionalities will be enhanced or restricted.
What new projects and people with legacy data can expect
TLA will give priority to existing collaborations and projects. New deposits will be handled as available human resources allow.
What partners in collaboration projects can expect
It goes without saying that TLA will continue to participate actively in all ongoing projects and will attempt to meet all project obligations. Project collaborations without funding where TLA participates on a voluntary basis will continue as time and staffing allow.
What partners interested in future collaborations can expect
The Language Archive after 2016 will be organized as a joint undertaking between the MPI in Nijmegen and the Max Planck Computing and Data Centre (MCDC) in Garching. Future collaborations on joint development projects will need to be undertaken in consultation with the MCDC.
What archives using TLA’s ‘LAT’-software can expect
Fifteen institutions worldwide make use of TLA’s archiving software “Language Archiving Technology” (LAT), and several depend on at least some TLA support in order to maintain their archives. TLA is planning a major redesign of the archiving software in which a widely used open source repository system will form the core of the setup. The goal of this move is to create an archiving solution that will require less maintenance in the long run. TLA will extend the open source repository system so that it offers suitable access and deposit methods for the kinds of data we host. This new LAT software, named ‘easyLAT’, will be made available to all our partners, but we cannot guarantee extensive support after 2016.
What users of our stand-alone-software can expect
TLA has developed a number of tools. TLA cannot guarantee the maintenance and further development of most of them after September 2016, except for ELAN, which will continue to be developed. The source code of the other tools will be made publicly available on platforms like GitHub. It is hoped that a larger community will take up further development.
What those concerned with infrastructures for the Humanities can expect
TLA has begun to organize a process in which many of its advanced activities and central services will gradually be taken over by centers other than Nijmegen, including the Max Planck Computing and Data Centre in Garching. The MPI has asserted its interest and ambition to continue to be a CLARIN-Center. The institutional setting and funding of the CLARIN infrastructure after 2016 have not yet been determined. However, even if, after September 2016, TLA no longer acts as the strong CLARIN center it used to be, this is in line with the vision of the project: right from its start CLARIN was designed as a distributed network of centers in which some centers grow and others may shrink without putting the infrastructure as a whole at risk.