Online service for dependency (sub)tree extraction and analysis.
STARK is a versatile tool for exploring the syntactic structure of language in linguistically annotated corpora, known as treebanks.
The current version of the STARK toolkit is 3.1.0.
Date of last update: 24.05.2025
The source code for the STARK tool can be accessed via the CLARIN.SI repository under the following license: Apache 2.0.
Online interface at orodja.cjvt.si
CJVT Tools
Ljubljana, 2025
This work is licensed under the Apache 2.0 open-source license.
STARK-demo interface development
Luka Krsnik
Kaja Dobrovoljc
STARK toolkit development
Luka Krsnik
Kaja Dobrovoljc
Marko Robnik Šikonja
Published by
Centre for Language Resources and Technologies, University of Ljubljana
Citation
TBA
STARK is a versatile tool for exploring the syntactic structure of language in linguistically annotated corpora, known as treebanks. It identifies and extracts a wide range of syntactic structures, or "trees", to reveal which patterns actually occur in a language and how prominent they are with respect to various statistical metrics.
STARK is primarily aimed at processing treebanks based on the Universal Dependencies annotation scheme, but it also accepts any other dependency treebank in the CoNLL-U format as input. Essentially, the tool produces a table listing all tree structures that match user-defined criteria, along with their frequencies and other corpus-linguistic statistics. Its flexible settings support a wide range of investigations on both lexicalized and delexicalized data, from broad, bottom-up analyses (e.g. identifying all noun-headed structures) to more targeted, top-down queries (e.g. finding all verbs that take two objects).
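To give a rough idea of what such a frequency table contains, the following minimal Python sketch counts the simplest possible delexicalized trees (one head and one dependent) in a CoNLL-U string. It is an illustration only and not STARK's implementation: the function names `read_sentences` and `count_head_dependent_trees` and the two-sentence sample are hypothetical, and the sketch ignores larger trees, word order, and all of the statistics STARK reports.

```python
from collections import Counter

# Hand-made CoNLL-U sample (hypothetical, not taken from any real treebank).
SAMPLE = "\n".join([
    "# text = She reads books",
    "1\tShe\tshe\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\treads\tread\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tbooks\tbook\tNOUN\t_\t_\t2\tobj\t_\t_",
    "",
    "# text = He writes letters",
    "1\tHe\the\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\twrites\twrite\tVERB\t_\t_\t0\troot\t_\t_",
    "3\tletters\tletter\tNOUN\t_\t_\t2\tobj\t_\t_",
])


def read_sentences(text):
    """Yield each sentence as a list of 10-column CoNLL-U token rows."""
    sentence = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
        elif line.startswith("#"):
            continue  # sentence-level comment / metadata
        else:
            cols = line.split("\t")
            # Keep plain tokens only; skip multiword ranges (1-2) and empty nodes (1.1).
            if cols[0].isdigit():
                sentence.append(cols)
    if sentence:
        yield sentence


def count_head_dependent_trees(text):
    """Count two-node trees as (head UPOS, relation, dependent UPOS)."""
    counts = Counter()
    for sentence in read_sentences(text):
        upos_by_id = {cols[0]: cols[3] for cols in sentence}
        for cols in sentence:
            head, deprel = cols[6], cols[7]
            if head not in upos_by_id:
                continue  # root (head 0) or a head missing from the sentence
            counts[(upos_by_id[head], deprel, cols[3])] += 1
    return counts


counts = count_head_dependent_trees(SAMPLE)
for (head_upos, deprel, dep_upos), freq in counts.most_common():
    print(f"{head_upos} >{deprel} {dep_upos}\t{freq}")
```

On this sample, the verb-subject and verb-object patterns each occur twice. The arrow notation in the printed rows is an ad hoc choice for this sketch, not STARK's query or output syntax.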
STARK was developed by Kaja Dobrovoljc, Luka Krsnik and Marko Robnik Šikonja as part of the research project SPOT: A Treebank-Driven Approach to the Study of Spoken Slovenian (ARIS grant no. Z6-4617) and the CLARIN.SI Resource and Service Development grants (2019, 2024). This online interface was created with support from CJVT UL to make STARK’s core functionality accessible to a broader audience. It provides a simplified set of options compared to the full-featured command-line version, which is available at https://github.com/clarinsi/STARK.
Version
STARK 3.1.0
Date of last update of the tool: 17.05.2025
Date of last update of the interface: 24.05.2025