Difference between revisions of "Norwegian Valency Corpus"


Latest revision as of 14:55, 23 September 2021

Version 1.0 - trial version

--Typecraft (talk) 10:03, 18 July 2017 (CEST)

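The corpus consists of 22,000 sentences imported from the Leipzig Corpus Collection (http://corpora.uni-leipzig.de/en?corpusId=deu_newscrawl_2011). All sentences carry the standard TypeCraft IGT (interlinear glossed text) annotation, together with valency information for each verb occurrence. The valency information has the following form, exemplified here for a ditransitive verb: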

SAS: NP+NP+NP
FCT: ditransitive
SIT: ternaryRel
ConstructionLabel: v-ditr 

Here 'SAS' stands for 'syntactic argument structure', 'FCT' for 'functional characterization', 'SIT' for 'situation structure', and 'ConstructionLabel' for a code described at Verbconstructions cross-linguistically - Introduction. The valency information is stated relative to the ACTIVE form of the verb, even if the verb occurs in the passive in the example at hand.

When searching, you can use any of these label types. The options available within each type are explained and exemplified on the following pages:

SAS at Valency label 'SAS'
FCT at Valency label 'FCT'
SIT at [[]] 
ConstructionLabel at Valence Profile Norwegian (for illustrations using English, see Valence Profile English).

Joint illustrations of all four label types are given in Valency code illustrations.
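
As an illustration of how the four labels fit together, the ditransitive annotation given earlier can be read into a small record. This is only a sketch: the names `ValencyAnnotation` and `parse_valency_block` are hypothetical and not part of TypeCraft, and the sketch assumes that each label sits on its own line in the form `KEY: value`, as in the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValencyAnnotation:
    """One valency annotation as shown on this page (hypothetical container)."""
    sas: str                 # syntactic argument structure, e.g. "NP+NP+NP"
    fct: str                 # functional characterization, e.g. "ditransitive"
    sit: str                 # situation structure, e.g. "ternaryRel"
    construction_label: str  # construction code, e.g. "v-ditr"


def parse_valency_block(text: str) -> ValencyAnnotation:
    """Parse lines of the assumed form 'KEY: value' into a ValencyAnnotation."""
    fields = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or unrelated lines
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return ValencyAnnotation(
        sas=fields["SAS"],
        fct=fields["FCT"],
        sit=fields["SIT"],
        construction_label=fields["ConstructionLabel"],
    )


example = """\
SAS: NP+NP+NP
FCT: ditransitive
SIT: ternaryRel
ConstructionLabel: v-ditr
"""

annotation = parse_valency_block(example)
# The SAS label lists the syntactic arguments separated by '+'.
print(annotation.sas.split("+"))  # ['NP', 'NP', 'NP']
```

Any one of the four values (for instance `annotation.construction_label`) is what would be entered as a search label.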

You can search by valency type in general, or for a given verb in particular, where the verb is specified either by its citation form or by the form in which it actually occurs. The search interface is the standard one for TypeCraft:

TypeCraft Tools (in upper left corner) -> TypeCraft Search -> Phrase search.

On this page, choose 'Norwegian Bokmål' from the Language menu; at 'Phrase level', type (or paste) the valency label into the slot 'Phrase description'. To restrict the search to a particular verb as well, enter the exact form of the verb under 'Word level - Exact form'. (The slot for its citation form is 'Morpheme level - Exact base form'; however, this search option is temporarily disabled. The same holds for any other search for morphological properties when combined with 'Phrase description'.)

A verb lexicon with valence types in the ConstructionLabel format is available in Valence lexicon. Each entry is specific to one frame and has the form exemplified below:

yppe_tr-refl_vlxm := v-tr-obRefl 

where the part

v-tr-obRefl 

is the valence type. You can paste this into the 'Phrase description' field in the Search interface to get all sentences realizing this frame. If you additionally specify "yppe" in 'Word level - Exact form', you get all sentences in which "yppe" is used with this frame.
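
The lookup just described can be sketched in a few lines: split a lexicon entry at `:=`, take the right-hand side as the valence type, and collect the values to enter in the search form. The function names are hypothetical. The sketch assumes that the lemma is the part of the entry identifier before the first underscore, which is a guess based on the single example above, and the resulting dictionary only lists what to type into the form by hand; it is not a TypeCraft API.

```python
from typing import NamedTuple, Optional


class LexiconEntry(NamedTuple):
    entry_id: str      # e.g. "yppe_tr-refl_vlxm"
    lemma: str         # e.g. "yppe" (assumed: text before the first underscore)
    valence_type: str  # e.g. "v-tr-obRefl"


def parse_lexicon_entry(line: str) -> LexiconEntry:
    """Split an entry of the form '<entry id> := <valence type>'."""
    entry_id, sep, valence_type = line.partition(":=")
    if not sep:
        raise ValueError(f"not a lexicon entry: {line!r}")
    entry_id = entry_id.strip()
    lemma = entry_id.split("_", 1)[0]  # assumption based on the example entry
    return LexiconEntry(entry_id, lemma, valence_type.strip())


def search_fields(entry: LexiconEntry, exact_form: Optional[str] = None) -> dict:
    """List the search-form slots to fill in manually for this entry."""
    fields = {
        "Language": "Norwegian Bokmål",
        "Phrase description": entry.valence_type,
    }
    if exact_form is not None:
        # Restrict the search to one actually occurring form of the verb.
        fields["Word level - Exact form"] = exact_form
    return fields


entry = parse_lexicon_entry("yppe_tr-refl_vlxm := v-tr-obRefl")
print(entry.valence_type)
print(search_fields(entry, exact_form="yppe"))
```

Since 'Word level - Exact form' matches the form as it occurs in the sentence, other inflected forms of the same verb would need separate searches.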

See NorVal resources for the general valence resource reflected in the annotations.

The present version is a trial of a methodology, described in a reference still marked 'to appear', that potentially allows for a rapid increase in corpus size. It contains clear errors, which are to be corrected in the next stage.