On Thursday, 12 November 2015, Václav Cvrček, director of the Institute of the Czech National Corpus at Charles University in Prague, gave a lecture titled Introducing Czech National Corpus in the Blue Room (Modra soba) of the Faculty of Arts, University of Ljubljana. The lecture was organized by the Faculty of Arts of the University of Ljubljana (programme group P60215 “Slovenski jezik: bazične, kontrastivne in aplikativne raziskave”, i.e. “Slovene Language: Basic, Contrastive, and Applied Research”) and the Centre for Applied Linguistics of the Trojina institute.
Abstract
Introducing Czech National Corpus, Václav Cvrček
The Czech National Corpus (CNC, see www.korpus.cz) is an academic project that strives to map the Czech language continuously in all possible dimensions. In 2011 it was acknowledged as a research infrastructure for empirical, language-oriented research in the social sciences and humanities. Since its foundation in 1994, the CNC has been systematically collecting, processing, and providing access to large language corpora of Czech and of other languages for contrastive research. In my talk I would like to introduce the CNC project, its current activities as well as its development outlook, with respect to the following topics: data collection (current data coverage and plans for the future), data processing (linguistic and structural annotation), and the tools and applications for corpus-based research developed within the CNC.