NorVal resources
Lars Hellan
Revision as of 12:31, 6 September 2021
This page describes resources related to the Norwegian Valence Catalogue NorVal.
An exhaustive description is given in
Hellan, Lars, 2021. A valence catalogue for Norwegian. In Loukanova (ed.), Natural Language Processing in Artificial Intelligence - NLPinAI 2021. Springer. (To appear)
with exemplifying files in Hellan, Lars, 2021. Supplementary Data for 'A valence catalogue for Norwegian': https://doi.org/10.18710/8U3L2U.
A presentational talk was given at the Societas Linguistica Europaea (SLE) meeting 2021.
The valence resource NorVal represents the valence frames of 6300 Norwegian verbs. About 300 frame types are used in their classification. These types encode: valence profiles such as intransitive, transitive, ditransitive and copular; grammatical functions such as ‘subject’ and ‘object’; whether an argument is noun-headed or an embedded clause (declarative, interrogative, or infinitival, in canonical or ‘extraposed’ position); whether an argument relates syntactically and semantically to the same predicate or not; the occurrence of particles; and all basic structures of ‘Logical Form’ as far as argument structure is concerned. The lexical descriptions do not include senses (definitions, paraphrases or synonyms), so NorVal is not a dictionary in the normal sense; because of its stripped-down formalism, we instead call it a valence catalogue.
Given that a verb can have more than one valence frame, a valence resource needs two kinds of entries, viz. entries for verb lemmas, and entries for a verb with a given frame; the latter we refer to as a ‘lexically instantiated valence frame’, for short ‘lexval’. The format of a lexval entry is illustrated by one of the frame environments for the verb lemma huske ‘remember’ in (2), instantiating the general coding pattern in (1):
(1) Lemma – selectedItem (if any) __ FrameType
(2) huske-på__intrObl-oblDECL
(2) reads as an entry with the lemma huske, the selected item på (‘on’), and the frame type ‘intransitive with oblique’, where the oblique argument consists of the preposition på and a declarative clause, as in (3):
(3) Han husket på at det var søndag ‘he remembered that it was Sunday’
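The coding pattern in (1) can be read mechanically. The following is a minimal sketch, assuming that the double underscore always separates the lemma part from the frame type and that a lemma never contains a hyphen itself; the names Lexval and parse_lexval are illustrative and not part of NorVal.

```python
from typing import NamedTuple, Optional


class Lexval(NamedTuple):
    """One lexically instantiated valence frame (hypothetical container)."""
    lemma: str
    selected_item: Optional[str]
    frame_type: str


def parse_lexval(entry: str) -> Lexval:
    """Split 'Lemma-selectedItem__FrameType' into its three parts.

    Assumption: '__' separates the lemma part from the frame type, and
    the first '-' before '__' (if any) introduces the selected item.
    """
    head, sep, frame_type = entry.partition("__")
    if not sep or not head or not frame_type:
        raise ValueError(f"not a lexval entry: {entry!r}")
    lemma, _, selected = head.partition("-")
    return Lexval(lemma, selected or None, frame_type)


print(parse_lexval("huske-på__intrObl-oblDECL"))
# Lexval(lemma='huske', selected_item='på', frame_type='intrObl-oblDECL')
```

An entry without a selected item, such as huske__tr, yields None in the second field.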
The format of a multivalent verb lemma entry is illustrated in (4), where each constituting lexval is represented with ‘V’ as placeholder for the lemma (‘EqSuInf’ stands for ‘infinitive equi-controlled by subject’); such a structure we call a valpod:
(4) huske:V__intr & V-på__intrObl-oblDECL & V-på__intrObl-oblEqSuInf & V-på__intrObl-oblINTERR & V-på__intrObl-oblN & V__tr & V__tr-obDECL & V__tr-obEqSuInf & V__tr-obINTERR
The structure of a valpod is essentially a set, although represented with ordering conventions among its members (e.g., intransitives before transitives). There are as many valpods as there are lemmas, i.e., 6300; the number of multi-membered valpods (like (4)) is about 3360.
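A valpod such as (4) can be expanded back into its constituent lexvals by substituting the lemma for the placeholder ‘V’. The sketch below assumes the layout shown in (4), that is, a colon after the lemma and ‘&’ between members; the function name parse_valpod is illustrative only.

```python
from typing import List, Tuple


def parse_valpod(entry: str) -> Tuple[str, List[str]]:
    """Return the lemma and the list of full lexval strings of a valpod.

    Assumption: the entry has the shape 'lemma:V... & V... & ...',
    where each member starts with the placeholder 'V'.
    """
    lemma, sep, body = entry.partition(":")
    if not sep:
        raise ValueError(f"no ':' in valpod entry: {entry!r}")
    lexvals = []
    for member in body.split("&"):
        member = member.strip()
        if not member.startswith("V"):
            raise ValueError(f"member lacks placeholder 'V': {member!r}")
        # Replace the leading placeholder with the actual lemma.
        lexvals.append(lemma + member[1:])
    return lemma, lexvals


valpod = ("huske:V__intr & V-på__intrObl-oblDECL & V-på__intrObl-oblEqSuInf & "
          "V-på__intrObl-oblINTERR & V-på__intrObl-oblN & V__tr & V__tr-obDECL & "
          "V__tr-obEqSuInf & V__tr-obINTERR")
lemma, lexvals = parse_valpod(valpod)
print(lemma, len(lexvals))  # huske 9
print(lexvals[1])           # huske-på__intrObl-oblDECL
```

The members are kept in a list so that the ordering conventions of the valpod are preserved, even though the structure is essentially a set.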
The notation for frame types uses the system Construction Labeling (‘CL’) (cf. Hellan and Dakubu 2010, Dakubu and Hellan 2017, and Hellan 2019), which characterizes verb-headed constructions and verb valence frames through strings of symbols built up in the following way:
Head-POS – GlobalLabel – ArgumentLabel1- ArgumentLabel2 - …
An ArgumentLabel describes a constituent of the construction or frame, while a GlobalLabel represents a categorization of the construction or frame as a whole. Thus, for the description of the object of a clause or frame we use an ArgumentLabel, while in characterizing the clause or frame as a whole (as transitive), we use a GlobalLabel.
ArgumentLabels are composed of a prefix indicating the grammatical function (GF) of the constituent, followed by one or more parts indicating inherent properties of the constituent. For instance, obDECL is an ArgumentLabel with ob as GF-indicating prefix and DECL indicating that the constituent is a declarative clause. GlobalLabels consist minimally of a symbol for overall valence, such as tr for ‘transitive’, in many cases with additional symbols indicating further structure. Thus, the frame representation intrObl-oblDECL in (2) has intrObl as GlobalLabel. The number of ArgumentLabels and GlobalLabels is fairly large, and it is essential that the meaning they carry is in each case concise and transparent. (Moreover, to be better navigable and understandable, entries using the notations ought to be accompanied by illustrations, such as short exemplifying sentences.)
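The composition of a frame type from a GlobalLabel and ArgumentLabels can be illustrated with a small parser. This is a heuristic sketch, not a definition of CL: it assumes that the parts are separated by hyphens, that the first part is the GlobalLabel, and that the GF prefix of an ArgumentLabel is its leading run of lowercase letters (ob in obDECL, obl in oblDECL). The function name parse_frame_type is illustrative.

```python
import re
from typing import List, Tuple


def parse_frame_type(frame_type: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split a frame type into (GlobalLabel, [(GF prefix, properties), ...]).

    Heuristic assumption: the GF prefix is the leading lowercase run of
    each ArgumentLabel; parts written entirely in capitals get an empty
    prefix.
    """
    global_label, *argument_labels = frame_type.split("-")
    parsed = []
    for label in argument_labels:
        match = re.match(r"([a-z]*)(.*)", label)  # always matches
        parsed.append((match.group(1), match.group(2)))
    return global_label, parsed


print(parse_frame_type("intrObl-oblDECL"))
# ('intrObl', [('obl', 'DECL')])
print(parse_frame_type("tr-obEqSuInf"))
# ('tr', [('ob', 'EqSuInf')])
```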
The resource concretely consists of five files, as follows:
lexvals-pure: a list of currently 15,753 lexval entries in the style of (2) above;
lexvals-exemplified: the same entries as in ‘lexvals-pure’, each with a short sentence instantiating the argument structure of the frame;
multivals: a list of all multi-membered valpods (defined relative to 3,360 lemmas);
univals: a list of all uni-membered valpods (defined relative to 2,970 lemmas);
valtypes_exemplified: a list of the 300 frame types, with a lexval entry for each type, and a short exemplifying sentence, with English translation.
They are all text files; the last one is also available in Excel (xl) format.
The files are as yet not accessible through any web interface.
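The relation between the lexval lists and the two valpod lists can be sketched as a grouping of lexvals by lemma, followed by a split on the number of members. The code works on an in-memory list of strings and makes no claim about the actual layout of the files; the function names are illustrative, and the lemma eksempelverb is a made-up placeholder, not a NorVal entry.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def build_valpods(lexvals: List[str]) -> Dict[str, List[str]]:
    """Group lexval strings by lemma, writing members with placeholder 'V'."""
    pods: Dict[str, List[str]] = defaultdict(list)
    for entry in lexvals:
        head, _, frame_type = entry.partition("__")
        lemma, _, selected = head.partition("-")
        member = "V" + (f"-{selected}" if selected else "") + "__" + frame_type
        pods[lemma].append(member)
    return dict(pods)


def split_by_size(pods: Dict[str, List[str]]) -> Tuple[dict, dict]:
    """Return (uni-membered valpods, multi-membered valpods)."""
    uni = {lemma: m for lemma, m in pods.items() if len(m) == 1}
    multi = {lemma: m for lemma, m in pods.items() if len(m) > 1}
    return uni, multi


sample = [
    "huske__intr",
    "huske-på__intrObl-oblDECL",
    "huske__tr",
    "eksempelverb__intr",  # hypothetical placeholder lemma
]
uni, multi = split_by_size(build_valpods(sample))
for lemma, members in multi.items():
    print(f"{lemma}:{' & '.join(members)}")
# huske:V__intr & V-på__intrObl-oblDECL & V__tr
```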
One resource, to be described later, allows one to see grammatical feature structures associated with the frame/construction types listed below. These types classify the entries in NorVal, called lexvals. The download site is:
http://regdili.hf.ntnu.no:8081/typegramusers/menu
The procedure and the facility will be available by late August 2021.
Frame types defined in NorVal
copAdj copAdj-suDECL copAdj-suINTERR copAdv copExpnAdj copExpnN-expnAbsinf copExpnN-expnDECL copExpnN-expnINTERRwh copExpnN-expnINTERRyn copExpnPP-expnDECL copExpnPP-expnINTERRyn copIdAbsinf copIdAbsinf-suAbsinf copIdDECL copIdN copIdINTERRwh copIdINTERRyn copImpersAdjLoc copN copN-suAbsinf copN-suDECL copN-suINTERRwh copN-suINTERRyn copPP copPP-suDECL copPredprtcl copToFind copToughAdj ditr ditrExpnSu-obMeas-expnEqIobInf ditr-iobRefl ditr-iobRefl-obDECL ditr-iobRefl-obEqIobInf ditr-iobRefl-obINTERR ditr-obDECL ditr-obEqIobBareinf ditr-obEqIobInf ditr-obEqSuInf ditr-obINTERR ditr-obINTERRwh ditr-iobRefl-obINTERRwh ditr-obINTERRyn ditr-iobRefl-obINTERRyn ditrObl-oblPRTOFiob ditr-suAbsinf ditr-suDECL ditr-suDECL-obDECL ditr-suDECL-obINTERR ditr-suINTERR ditr-suINTERR-obINTERR impers impersObl-oblN impersPrtcl intr intrAdv intrAdvExpn-expnDECL intrAdvPresnt intrAdv-suDECL intrAuxmodScpr-scSuNrg-scBareinf intrAuxpassScpr-scSuNrg-scPass intrAuxperfScpr-scSuNrg-scPerf intrComp-compINTERR intrComp-suDECL-compINTERR intrExpn-expnAbsinf intrExpn-expnDECL intrExpn-expnINTERR intrLghtScpr-scAdj intrLghtScpr-scPredprtcl intrLghtScpr-scSuNrg-scPredprtcl intrObl-oblN-oblAbsinf intrObl-oblN-oblDECL intrObl-oblN-oblEqOblInf intrObl-oblN-oblEqSuInf intrObl-oblN-oblINTERR intrObl-oblN-oblN intrOblExlnk-oblExlnkAbsinf intrOblExlnk-oblExlnkDECL intrOblExlnk-oblExlnkINTERR intrOblExpn-expnDECL intrOblExpn-expnINTERRwh intrOblExpn-expnINTERRyn intrOblExpn-oblINTERRwh-expnINTERRwh intrOblExpn-oblINTERRwh-expnINTERRyn intrOblExpn-oblINTERRyn-expnINTERRwh intrOblExpn-oblINTERRyn-expnINTERRyn intrObl-oblAbsinf intrObl-oblDECL intrObl-oblEqSuInf intrObl-oblINTERR intrObl-oblINTERRwh intrObl-oblLoc intrObl-oblN intrObl-oblN-ACTIVITY intrObl-oblPRTOFsu intrObl-oblRefl intrOblRais-oblRaisInf intrObl-suAbsinf intrObl-suDECL-oblDECL intrObl-suDECL-oblN intrObl-suINTERR-oblINTERR intrObl-suINTERR-oblN intrPath-suDir-PUREORIENTATION intrPresnt intrPresntDir intrPresntLoc intrPresntObl-oblDECL 
intrPresntObl-oblINTERR intrPresntObl-oblN intrPrtcl intrPrtclExpn-expnAbsinf intrPrtclExpn-expnDECL intrPrtclOblExpn-expnDECL intrPrtclOblExpn-expnINTERRwh intrPrtclOblExpn-expnINTERRyn intrPrtclOblExpn-oblINTERRwh-expnINTERRwh intrPrtclOblExpn-oblINTERRwh-expnINTERRyn intrPrtclOblExpn-oblINTERRyn-expnINTERRwh intrPrtclOblExpn-oblINTERRyn-expnINTERRyn intrPrtclObl-oblAbsinf intrPrtclObl-oblDECL intrPrtclObl-oblEqSuInf intrPrtclObl-oblINTERR intrPrtclObl-oblINTERRwh intrPrtclObl-oblINTERRyn intrPrtclObl-oblLoc intrPrtclObl-oblN intrPrtclObl-oblPRTOFob intrPrtclOblRais-oblRaisInf intrPrtclObl-suDECL-oblN intrPrtclObl-suINTERR-oblINTERR intrPrtclObl-suINTERR-oblN intrPrtclScpr-scSuNrg-scPredprtclN intrPrtclScpr-scSuNrg-scPredprtclS intrPrtcl-SUSTAINEDACTIVITY intr-RESULT intrScprExpn-scAdj-expnAbsinf intrScprExpn-scAdj-expnDECL intrScprExpn-scAdj-expnINTERR intrScprPrtcl-scSuNrg-scAdj intrScpr-scPredprtcl intrScpr-scSuNrgCsd-scAdv intrScpr-scSuNrgCsd-scPred intrScpr-scSuNrg-scDir intrScpr-scSuNrg-scInf intrScpr-scSuNrg-scN intrScpr-scSuNrg-scPred intrScpr-scSuNrg-scPredprtcl intrScpr-scSuNrg-scPredprtclN intrScpr-scSuNrg-scPredprtclS intr-suAbsinf intr-suDECL intr-suDir intr-suDirTemp intr-suINTERR intrPresnt-RESULT tr trAdv trAdv-obRefl trExpnOb-expnAbsinf trExpnOb-expnCOND trExpnOb-expnDECL trExpnSu-expnAbsinf trExpnSu-expnCOND trExpnSu-expnDECL trExpnSu-expnEqObInf trExpnSu-expnINTERR trExpnSu-expnINTERRwh trExpnSu-expnINTERRyn trExpnSu-obMeas-expnAbsinf trExpnSu-obRefl-expnAbsinf trExpnSu-obRefl-expnDECL trExpnSu-obRefl-expnINTERRwh trExpnSu-obRefl-expnINTERRyn trImpers-obRefl tr-obAbsinf tr-obDECL tr-obDECL-obV tr-obDir tr-obEqBareinf tr-obEqSuInf tr-obEventunit tr-obINTERR trObl-oblN-oblDECL trObl-oblN-oblEqObInf trObl-oblN-oblINTERR trObl-oblN-oblN trObl-obRefl-oblN-oblAbsinf trObl-obRefl-oblN-oblINTERR trObl-obRefl-oblN-oblN trOblExpnOb-expnAbsinf trOblExpnSu-oblEqObInf-expnDECL trOblExpnSu-oblEqObInf-expnEqObInf trOblExpnSu-oblN-expnDECL 
trOblExpnSu-oblN-expnEqObInf trObl-obAbsinf-oblAbsinf trObl-obAbsinf-oblN trObl-obDECL-oblN trObl-obEqOblInf-oblN trObl-obEqSuInf-oblEqSuInf trObl-obEqSuInf-oblN trObl-obEqSuInf-oblRefl trObl-obINTERR-oblN trObl-oblAbsinf trObl-oblDECL trObl-oblEqObInf trObl-oblEqSuInf trObl-oblINTERR trObl-oblLoc trObl-oblN trObl-oblPRTOFob trObl-oblRefl trObl-obRefl-oblAbsinf trObl-obRefl-oblDECL trObl-obRefl-oblEqObInf trObl-obRefl-oblINTERR trObl-obRefl-oblLoc trObl-obRefl-oblN trObl-obRefl-oblPRTOFob trOblRais-oblRaisObInf trObl-suDECL trObl-suDECL-oblDECL trObl-suDECL-oblEqObInf trObl-suDECL-oblINTERR trObl-suDECL-oblN trObl-suEqObInf-oblEqObInf trObl-suEqObInf-oblN tr-obRefl tr-obRefl-obDir trPath-obDir-ORIENTING trPath-obRefl-obDir trPresnt trPresntDir-obRefl trPresntLoc-obRefl trPresnt-obRefl trPrtcl trPrtclExpnOb-expnDECL trPrtcl-obDECL trPrtcl-obEqSuInf trPrtcl-obINTERR trPrtclObl-obINTERR trPrtclObl-oblEqObInf trPrtclObl-oblINTERR trPrtclObl-oblN trPrtclObl-oblPRTOFob trPrtclObl-obRefl-oblDECL trPrtclObl-obRefl-oblEqObInf trPrtclObl-obRefl-oblINTERR trPrtclObl-obRefl-oblPRTOFob trPrtclObl-obRefl-oblN trPrtcl-obRefl trPrtclScpr-obRefl-scObNrg-scPredprtcl trScprExpnOb-scObNrg-scAdj-expnAbsinf trScprExpnOb-scObNrg-scAdj-expnDECL trScprExpnOb-scObNrg-scAdj-expnINTERR trScprExpnOb-scObNrg-scPredprtclAdj-expnAbsinf trScprExpnOb-scObNrg-scPredprtclAdj-expnDECL trScprExpnOb-scObNrg-scPredprtclAdj-expnINTERR trScprExpnOb-scObNrg-scPredprtclInf-expnAbsinf trScprExpnOb-scObNrg-scPredprtclInf-expnDECL trScprExpnOb-scObNrg-scPredprtclInf-expnINTERR trScprExpnOb-scObNrg-scPredprtclN-expnAbsinf trScprExpnOb-scObNrg-scPredprtclN-expnDECL trScprExpnOb-scObNrg-scPredprtclN-expnINTERR trScpr-obDECL-scPPrefl trScpr-obEqSuInf-scPPrefl trScpr-obRefl-scObLoc trScpr-obRefl-scObDir trScpr-obRefl-scObNrgCsd-scPred trScpr-obRefl-scObNrg-scBareinf trScpr-obRefl-scObNrg-scN trScpr-obRefl-scObNrg-scPred trScpr-obRefl-scPP trScpr-obRefl-scPred trScpr-obRefl-scPredprtcl trScpr-obRefl-scPredprtclInf 
trScpr-obRefl-scSuNrg-scBareinf-suRAISsuMob trScpr-obRefl-scSuNrg-scInf trScpr-obRefl-scSuNrg-scPred trScpr-scObCsd trScpr-scObLoc trScpr-scObNrgCsd-scPred trScpr-scObNrg-scBareinf trScpr-scObNrg-scBareinf-obRAISsuMob trScpr-scObNrg-scInf trScpr-scObNrg-scN trScpr-scObNrg-scPred trScpr-scPasscmplx trScpr-scPPrefl trScpr-scPredprtcl trScpr-scPredprtclInf trScpr-scSuNrg-scInf trScpr-scSuNrg-scN trScpr-scSuNrg-scPred trScpr-scSuNrg-scPredprtcl tr-suAbsinf tr-suAbsinf-obAbsinf tr-suDECL tr-suDECL-obDECL tr-suDir tr-suDir-obLengthunit tr-suDir-obRefl tr-suEqObInf tr-suINTERR tr-suINTERR-obINTERR
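To get an overview of a list like the one above, the frame types can be grouped by the valence symbol that opens their GlobalLabel (cop, ditr, impers, intr, tr). The sketch below assumes that this symbol is the leading lowercase run of the frame type, which is a heuristic reading of the labels and not an official CL rule; the function name valence_profile is illustrative, and the sample is a small selection from the list.

```python
import re
from collections import Counter


def valence_profile(frame_type: str) -> str:
    """Return the leading lowercase run of a frame type, e.g. 'intr'.

    Heuristic assumption: the overall valence symbol is written in
    lowercase at the start of the GlobalLabel.
    """
    return re.match(r"[a-z]*", frame_type).group(0)


sample_types = [
    "copAdj", "ditr-obDECL", "impers", "intrObl-oblDECL",
    "intr", "tr", "tr-obDECL", "trObl-oblN",
]
counts = Counter(valence_profile(t) for t in sample_types)
print(counts["tr"], counts["intr"], counts["cop"])  # 3 2 1
```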