Frisian Language Corpus

The language corpora of the Frisian Academy


Overview

We are working on the construction of four language corpora.

  • The corpus New Frisian
  • The corpus Nineteenth-century Frisian 
  • The corpus Middle Frisian
  • The corpus Old Frisian

All corpora will be joined together to form one large corpus of Frisian spanning roughly 1200-2100.


The corpus New Frisian

The corpus New Frisian is a representative sample of 20th-century Frisian, covering various sorts of texts such as newspaper articles, novels, poetry, and technical literature. It contains some 25 million words and can be searched digitally as annotated text with bibliographical information. In the future, differing spellings of words will be linked to each other and paradigms will be subsumed under one entry, in the same way as has been done for the corpus Middle Frisian.


The corpus Nineteenth-century Frisian

Around a million words have been scanned and corrected, and some handwritten manuscripts have been transcribed by volunteers. A beta version of this corpus is currently being tested. In the future, differing spellings of words will be linked to each other and paradigms will be subsumed under one entry, in the same way as has been done for the corpus Middle Frisian.


The corpus Middle Frisian

This corpus contains all Middle Frisian texts, that is, all texts written between 1550 and 1800. All spelling variants of the same word have been identified, members of one and the same paradigm have been connected to one entry, and morphological and grammatical information has been added to every word. This corpus serves as a model for the other corpora. We are currently improving the presentation and user-friendliness of this corpus.
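
As a rough illustration of what linking spelling variants and paradigm members to one entry could look like, here is a minimal Python sketch. It is not the Frisian Academy's actual data model or software: the `Token` fields, the function names, and the sample word forms are all invented placeholders chosen only to show the grouping idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One annotated word occurrence (hypothetical structure)."""
    spelling: str   # form as it appears in the text
    lemma: str      # entry under which the whole paradigm is grouped
    pos: str        # part of speech
    features: str   # morphological information


# Invented placeholder annotations, NOT taken from the corpus.
TOKENS = [
    Token("hws", "hûs", "noun", "singular"),
    Token("huws", "hûs", "noun", "singular"),
    Token("huwzen", "hûs", "noun", "plural"),
    Token("huzen", "hûs", "noun", "plural"),
]


def build_index(tokens):
    """Group attested spellings by lemma, then by grammatical form."""
    index = defaultdict(lambda: defaultdict(set))
    for t in tokens:
        index[t.lemma][(t.pos, t.features)].add(t.spelling)
    return index


def spellings_for(index, lemma):
    """Return every spelling linked to a lemma, in sorted order."""
    forms = index.get(lemma, {})
    return sorted(s for variants in forms.values() for s in variants)


if __name__ == "__main__":
    index = build_index(TOKENS)
    # A search on the entry finds all variants, whatever their spelling.
    print(spellings_for(index, "hûs"))
```

With an index of this kind, a search for one entry returns every spelling and every inflected form linked to it, which is the behaviour the paragraph above describes for the corpus Middle Frisian.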


Contact

For questions about the language corpora, please contact:


