Commit 468ae5ea authored by Oliver Hellwig

readmes

parent 778de393
# Merged annotations
This directory contains a single CSV file in which the verb-argument annotations (H. Hettrich) and the morpho-lexical annotations (O. Hellwig) are presented in a merged format.
The layout of the file resembles that of the file in the folder *morpho-lexical*, but $ is used as the field separator.
## Fields with different names (morpho-lexical level)
* **lex_id** = morpho-lexical$lemma_id
* **boundary** = morpho-lexical$sentence_boundary
* **udpos** = morpho-lexical$coarse_pos
* **cas**(e)
* **num**(ber)
* **gen**(der)
* **per**(son)
* **tem** = morpho-lexical$tense_mode
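The name correspondences listed above can be written down as a small lookup table. The following is a minimal sketch; the helper name `to_morpho_names` and the example row are hypothetical and only serve as illustration.

```python
# Merged-file field name -> morpho-lexical field name, as listed above.
MERGED_TO_MORPHO = {
    "lex_id": "lemma_id",
    "boundary": "sentence_boundary",
    "udpos": "coarse_pos",
    "cas": "case",
    "num": "number",
    "gen": "gender",
    "per": "person",
    "tem": "tense_mode",
}


def to_morpho_names(row):
    """Return a copy of row with merged-file keys renamed; other keys are kept."""
    return {MERGED_TO_MORPHO.get(key, key): value for key, value in row.items()}


# Hypothetical row with placeholder values, not taken from the real file.
example = {"lex_id": "101", "cas": "c1", "instance_id": "1"}
renamed = to_morpho_names(example)
print(renamed)
```

Fields that exist only on the verb-argument level, such as instance_id, pass through unchanged.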
## Fields describing the verb-argument level
* **instance_id**: unique identifier of the verb-argument structure. All instances (lines) that share a number belong to the same verb-argument construction.
* **kasus**: as annotated by H. Hettrich
* **semantik**: word semantic class as annotated by H. Hettrich
* **funktion**: case semantic function as annotated by H. Hettrich
* **annotation_mode**: How was this argument annotated?
    * def: By direct match between Hettrich and Hellwig data
    * heu: Using a heuristic (refer to the paper)
    * oh: Later annotation by O. Hellwig (use with care!)
* **annotation_problems**: If no direct match between Hettrich and Hellwig was possible, this field indicates possible sources of problems
* **num_anno_problems**: Number of such problems (integer)
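The fields above can be used to collect all lines of one verb-argument construction and to leave out the later annotations marked oh. The following is a minimal sketch using Python's standard csv module. It assumes that the merged file starts with a header line and that fields are not quoted; the column subset and all values in the sample are placeholders, not rows from the real file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample: column subset and values are placeholders only.
SAMPLE = (
    "lex_id$instance_id$kasus$annotation_mode$num_anno_problems\n"
    "101$1$K1$def$0\n"
    "102$1$K2$heu$1\n"
    "103$2$K1$oh$0\n"
    "104$2$K3$def$0\n"
)


def group_by_instance(handle, skip_modes=("oh",)):
    """Group rows by instance_id, dropping rows whose annotation_mode is skipped."""
    groups = defaultdict(list)
    # Assumption: $ separates fields and no quoting is used in the file.
    reader = csv.DictReader(handle, delimiter="$", quoting=csv.QUOTE_NONE)
    for row in reader:
        if row["annotation_mode"] in skip_modes:
            continue
        groups[row["instance_id"]].append(row)
    return dict(groups)


groups = group_by_instance(io.StringIO(SAMPLE))
for instance_id, rows in groups.items():
    print(instance_id, [r["lex_id"] for r in rows])
```

For the real file, the in-memory sample would be replaced by a handle opened with `encoding="utf-8"` and `newline=""`. Passing `skip_modes=()` keeps all annotation modes.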
# Morpho-lexical annotations
The file 'rigveda.csv' contains the morpho-lexical annotations that were generated with the SanskritTagger tool and manually validated by O. Hellwig.
The file is a plain text file in UTF-8 encoding that uses # as the field separator.
@@ -19,4 +19,10 @@ The first line of the file contains the headline. Explanation of the individual
* **lemma_id**: unique lemma id
* **id_tea**: unique id of the lemma at this position (internal use only)
* **sentence_boundary**
coarse_pos#case#number#gender#person#tense_mode#synsets
* **coarse_pos**: Approximate POS tag (may be incorrect for pronominal classes, doesn't distinguish between common nouns and named entities)
* **case** of a noun or adjective
* **number** of a noun, adjective or verb
* **gender** of a noun or adjective
* **person** of a finite verbal form
* **tense_mode** of a finite verbal form
* **synsets**: (internal use only)
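Because the first line of the file holds the field names and # separates the fields, the file can be read with Python's standard csv module. The following is a minimal sketch that counts coarse POS tags. It assumes that fields are not quoted; the sample shows only a subset of the columns, and its values are placeholders, not real annotations.

```python
import csv
import io
from collections import Counter

# Hypothetical sample: column subset and values are placeholders only.
SAMPLE = (
    "lemma_id#coarse_pos#case#number\n"
    "11#NOUN#c1#n1\n"
    "12#VERB##n1\n"
    "13#NOUN#c2#n2\n"
)


def count_pos(handle):
    """Count how often each coarse_pos value occurs."""
    # Assumption: # separates fields and no quoting is used in the file.
    reader = csv.DictReader(handle, delimiter="#", quoting=csv.QUOTE_NONE)
    return Counter(row["coarse_pos"] for row in reader)


pos_counts = count_pos(io.StringIO(SAMPLE))
print(pos_counts.most_common())
```

To run this on the real data, the sample handle would be replaced by `open("rigveda.csv", encoding="utf-8", newline="")`.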
# Verb-argument annotations
This directory contains the **original**, unchanged verb-argument annotations performed by H. Hettrich.