
Difference between revisions of "Classroom:LING2208 - Annotating Norwegian Bokmål"

Line 106: Line 106:
 
====Reflexive pronouns in Norwegian====
 
====Reflexive pronouns in Norwegian====
  
We find two different reflexive pronoun forms in Norwegian; one with a pronoun, such as "hans" (english: his) or "hennes" (eng: hers), and one without such a pronoun, as in "sin".  
+
We find two different reflexive pronoun forms in Norwegian; one with a pronoun, such as "hans" (english: his) or "hennes" (her), and one without, as in "sin".  
We assume here that the form "hans" can be broken down into the masculine pronoun "han" and the reflexive marker "sin" (which has become cliticized as "s"). This pattern is found in the feminine form "hennes" (henne sin) and the neuter forms "dens/dets" (den sin/det sin), as well as the plural form "deres" (dere sin; eng: their).  
+
We assume that the form "hans" consists of the masculine pronoun "han", and the reflexive marker "sin" (which has become cliticized as "s"). This pattern is found in the feminine form "hennes" and the neuter forms "dens/dets", as well as the plural form "deres" (their).  
  
 
In terms of agreement, the reflexive marker "sin" takes its values of GENDER and NUMBER from its object, as seen in the examples below:
 
In terms of agreement, the reflexive marker "sin" takes its values of GENDER and NUMBER from its object, as seen in the examples below:
Line 118: Line 118:
 
"Mannen sine vinduer" (the man's windows)
 
"Mannen sine vinduer" (the man's windows)
 
<Phrase>41911</Phrase>
 
<Phrase>41911</Phrase>
When the controller "vinduer" is in the plural, then so is the reflexive "sine".  
+
When the object "vinduer" is in the plural, then so is the reflexive "sine".  
  
"Mennene sitt vindu"
+
"Mennene sine vinduer" (the mens' windows)
<Phrase>41912</Phrase>
+
The reflexive "sitt" is in the singular to agree with its object, even though the subject is in the plural. This indicates that it is the object rather than the subject which is the controller that spreads its values of the features GENDER and NUMBER.
The reflexive "sitt" is in the singular to agree with its values controller of the feature NUMBER. This example illustrates nicely that it is "vindu" that it is the controller rather than "mennene" for the features GENDER and NUMBER.
+
  
The same control relation also goes for the reflexive determiner "egen" (own):      
+
The same control relation also goes for the reflexive determiner "egen" (own).      
  
 
"Mannen sitt eget vindu." (the man's own window)
 
"Mannen sitt eget vindu." (the man's own window)
<Phrase>41914</Phrase>
+
"Window" spreads its values NEUTER and SINGULAR to "eget", so that they agree.  
"Vindu" spreads its values NEUTER and SINGULAR to "eget", so that they agree.  
+
  
 
"Mannen sine egne vinduer" (the man's own windows)
 
"Mannen sine egne vinduer" (the man's own windows)
<Phrase>41914</Phrase>
+
When the object "vinduer" is in the plural, then so is the reflexive "egne".  
When "vinduer" is in the plural, then so is the reflexive "egne", even if "mannen" is in the singular.
+
  
"Mennene sitt eget vindu" (the mens' own window)
+
"Mennene sitt eget vinduer" (the mens' own windows)
<Phrase>41916</Phrase>
+
 
In this case the values for NUMBER are reversed, so that "sitt" and "vindu" are in agreement of the value singular, whereas "mennene" are in the plural.
+
This is a purely syntactical analysis for the properties of ''sin'' and ''eget''. Within this, we try to show that agreement only occurs within the noun phrase.
 +
 
 +
===Agreement statistics===
 +
 
 +
The following table describes the distribution of marked gender as glossed on adjectives, and the total distribution of tags for Norwegian Bokmål in TypeCraft. This is compared to the distribution of genders among nouns in the [http://www.hf.uio.no/iln/om/organisasjon/tekstlab/prosjekter/nowac/index.html NoWaC corpus]. The percentages in the first columns represent the ratio of each tag to the total for each count, (i.e: 56% of all nouns are tagged in NoWaC as masculine). The final column contains the compound ratio of the ratio of each gender in entries tagged with ADJ in TypeCraft and the ratio of each gender in entries tagged as nouns in NoWaC. This gives us an indication of whether some genders are more frequently glossed for adjectives than they naturally occur.
 +
 
 +
{| class="wikitable"
 +
|-
 +
! Gender
 +
! Adjectives
 +
! Total for all tags in TypeCraft
 +
! Total for nouns in NoWaC
 +
! Ratio for ADJ to NoWaC
 +
|-
 +
| ''FEM''
 +
| 0 (0%)
 +
| 33 (6.33%)
 +
| 20358360 (16.47%)
 +
| 0%
 +
|-
 +
| ''MASC''
 +
| 13 (21%)
 +
| 302 (58%)
 +
| 69209955 (56%)
 +
| 37.5%
 +
|-
 +
| ''NEUT''
 +
| 49 (79%)
 +
| 186 (35.7%)
 +
| 34026414 (27.53%)
 +
| 286.96%
 +
|-
 +
| Total:
 +
| 62 (100%)
 +
| 521 (100%)
 +
| 123594729 (100%)
 +
| ''N/A''
 +
|}
 +
 
 +
From this data we can see that infinitival gender is overrepresented for adjectives. This is due to feminine and neuter genders (which appear to be equally underrepresented) not being indicated morphologically in adjectives, but rather indicated by their un-inflected base form, infinitival adjectives are inflected with a morpheme. This reflects a tagging convention that is morphologically oriented.

Revision as of 15:31, 18 February 2014

Agreement

The following phrase contains agreement between the noun kjøttbein and the adjectives fint and saftig:

Han så en slakterbutikk, og gikk raskt inn og stjal et fint, saftig kjøttbein fra hyllen.
“He spotted a butcher's shop, and quickly went in and stole a nice, juicy bone from the shelf.”
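{| class="wikitable"
|-
! Word !! Morph !! Gloss !! POS
|-
| Han || Han || He1SG || PN
|-
| så || så || seePAST || Vtr
|-
| en || en || 3SGMASCINDEF || ART
|-
| slakterbutikk || slakterbutikk || butcher.shop || N
|-
| og || og || and || CONJ
|-
| gikk || gikk || walkSGPAST || V
|-
| raskt || raskt || quickly || ADVm
|-
| inn || inn || in || ADVplc
|-
| og || og || and || CONJ
|-
| stjal || stjal || PAST || V
|-
| et || et || aINDEFNEUTSG || DET
|-
| fint || fint || niceSGINDEFNEUT || ADJ
|-
| saftig || saftig || juicySGINDEF || ADJ
|-
| kjøttbein || kjøttbein || meat.boneNEUT || N
|-
| fra || fra || fromSRC || PREP
|-
| hyllen || hyllen || shelfDEF || N
|}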

[[1]]

Both adjectives are tagged as singular and indefinite, and fint is additionally tagged as neuter (adjectives ending in -ig, such as saftig, take no neuter suffix -t). These values correspond to the head of the NP in which the adjectives are embedded: et fint, saftig kjøttbein. Although kjøttbein itself is only tagged as neuter, its indefiniteness is given by the determiner et, which also agrees with the noun.

The corpus for Norwegian Bokmål available on TypeCraft contains 182 entries tagged as adjectives, 60 of which carry gender markings like those on the adjectives discussed above.

Clause Linkage

The phrase mentioned above is also a complex clause, consisting of the simple clauses Han så en slakterbutikk and [og gikk raskt inn [og stjal et fint, saftig kjøttbein fra hyllen]]. The complex clause is an adjoined clause: the second simple clause contains a conjunction (og) and is coordinated with the first clause. The syntagms are not in a relation of dependency, since neither occupies a grammatical slot in the other. The second syntagm is therefore not embedded; if it were, it would fill a grammatical slot.

The second syntagm may itself be divided into two coordinate clauses, which in turn form a coordinate clause. All of the clauses in the sentence constitute syndetic parataxis.

The syntagms describe a series of events in temporal order. The first clause contains the head of the sentence (), and would be grammatical without the rest of the coordinated clauses. Grammatically, the clauses are all linked by tense (past), and the grammaticality would be questionable if they were in different tenses. This may be because all of the clauses share the same subject.

--Are Ormberg 13:31, 17 February 2014 (UTC)



AGREEMENT

den innså sin egen dårskap for sent og gikk avsted sulten og trist men kanskje litt klokere
“it realized its own folly too late and walked off, hungry and sad, but perhaps a little wiser”
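{| class="wikitable"
|-
! Word !! Morph !! Gloss !! POS
|-
| den || den || 3SGCOMMSBJ || PN
|-
| innså || inn || seeVstemPRET || V
|-
| sin || sin || REFL3PCOMM || TRUNC
|-
| egen || egen || REFLSGCOMM || DET
|-
| dårskap || dårskap || foolishNstemnessN>NCOMM || N
|-
| for || for || tooDEG || ADVm
|-
| sent || sent || lateADJstemNEUT || ADJ
|-
| og || og || || CONJ
|-
| gikk || gikk || walkVstemPRET || V
|-
| avsted || avsted || aPARTwayN>ADV || ADVplc
|-
| sulten || sulten || hungryN>ADJSGCOMM || ADJ
|-
| og || og || || CONJ
|-
| trist || trist || sadCOMMSG || ADJ
|-
| men || men || || CONJ
|-
| kanskje || kanskje || maybeV>ADVV>ADV || ADV
|-
| litt || litt || a.littleDEG || ADVm
|-
| klokere || klokere || wiseADJstemCMPR || ADJ
|}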

"Den innså sin egen dårskap for sent og gikk avsted sulten og trist men kanskje litt klokere" [[2]]

The pronoun "den" is an anaphor that picks up its antecedent "hunden". The antecedent is specified for the value COMMON GENDER of the feature GENDER, and its suffix "-en" is specified for the value SINGULAR of the feature NUMBER, as well as 3RD PERSON of the feature PERSON. The values that spread from the pronoun "den" to the reflexive determiner "egen" and to the adjectives "sulten" and "trist" are SINGULAR and COMMON GENDER. The reflexive pronoun "sin" agrees in these values and also in the value 3RD PERSON.


CLAUSE LINKAGE

den innså sin egen dårskap for sent og gikk avsted sulten og trist men kanskje litt klokere
“it realized its own folly too late and walked off, hungry and sad, but perhaps a little wiser”

"Den innså sin egen dårskap for sent og gikk avsted sulten og trist men kanskje litt klokere" [[3]]

The complex clause above consists of two simple clauses:

1: "Den innså sin egen dårskap for sent" 
2: "Den gikk avsted sulten og trist men kanskje litt klokere"

These two simple clauses are connected by the conjunction "og" (and), which is often used to coordinate two or more clauses. In other words, we are dealing here with an example of parataxis, in which the clauses are independent of each other (even though they share the same subject). Signs of this are that the verb in each clause is inflected, and that the clauses are quite autonomous, as shown in the breakdown into the separate clauses 1 and 2 above. However, they agree in tense (both are in the preterite), which suggests that they are linked temporally. From the semantic content it may seem that the clauses are linked causally, which would imply subordination, or hypotaxis: "Because <Den innså sin egen dårskap for sent>, <gikk den avsted sulten og trist men kanskje litt klokere>". In my view, however, this complex clause is an example of coordination rather than subordination.

--Eirik Zahl 19:16, 16 February 2014 (UTC)


Agreement

In the course of the story we find two cases of agreement that differ with respect to a single feature. They show, quite neatly, how agreement works in Norwegian and how it affects syntactic composition in Norwegian. In sentence 6 we find this noun phrase [4]:

Han så en annen hund nøyaktig lik ham som holdt et bein i munnen sin.
“He saw another dog exactly like him holding a bone in his mouth.”
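{| class="wikitable"
|-
! Word !! Morph !! Gloss !! POS
|-
| Han || han || he3SGMASC || PN
|-
| så || så || sawPAST || V
|-
| en || en || INDEFMASCSG || DET
|-
| annen || annen || otherMASC || ADJ
|-
| hund || hund || dogMASC || N
|-
| nøyaktig || nøyaktig || exactly || ADV
|-
| lik || lik || likeCMPR || PRT
|-
| ham || ham || himMASC3SGACC || PN
|-
| som || som || which || CONJS
|-
| holdt || holdt || holdVstemPAST || V
|-
| et || et || INDEFNEUTSG || DET
|-
| bein || bein || boneNEUT || N
|-
| i || i || || PREP
|-
| munnen || munnen || mouthMASCDEFMASCSG || N
|-
| sin || sin || MASCSG || PNposs
|}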

En annen hund - Another dog (eng)


In sentence 7, however, we find this noun phrase:

Den grådige hunden bestemte seg for at han ville ha dét beinet óg så han knurret i håpet om at den andre hunden i elva skulle miste beinet ut av frykt.
“The greedy dog decided that he wanted that bone too, so he growled in the hope that the other dog would drop the bone out of fear.”
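{| class="wikitable"
|-
! Word !! Morph !! Gloss !! POS
|-
| den || den || DEFMASCSG || DET
|-
| grådige || grådige || greedyAGRMASCSG || ADJ
|-
| hunden || hunden || dogMASCDEFMASCSG || N
|-
| bestemte || bestemte || decideVstemPAST || V
|-
| seg || seg || self3SGREFL || PNrefl
|-
| for || for || || PRTv
|-
| at || at || that || COMP
|-
| han || han || he3SGMASC || PN
|-
| ville || ville || wouldVstemPAST || AUX
|-
| ha || ha || haveINF || V
|-
| dét || dét || thatNEUTSG || DEM
|-
| beinet || beinet || boneNEUTDEFNEUTSG || N
|-
| óg || óg || || ADV
|-
| så || så || || 
|-
| Han || han || he3SGMASC || PN
|-
| knurret || knurret || growlVstemPAST || V
|-
| i || i || in || PREP
|-
| håpet || håpet || hopeNEUTDEFNEUTSG || N
|-
| om || om || || PREP
|-
| at || at || that || CONJ
|-
| den || den || DEFMASCSG || DET
|-
| andre || andre || otherDEF || ADJ
|-
| hunden || hunden || dogMASCDEFMASCSG || N
|-
| i || i || in || PREP
|-
| elva || elva || riverFEMDEFFEMSG || N
|-
| skulle || skulle || shouldPAST || AUX
|-
| miste || miste || dropINF || V
|-
| beinet || beinet || boneNEUTDEFNEUTSG || N
|-
| ut || ut || || PREP
|-
| av || av || || PREP
|-
| frykt || frykt || fearMASC || N
|}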

Den andre hunden - The other dog (eng)


It should be relatively clear that the only difference between the two noun phrases is one of definiteness. In both cases the controller is the word hund, which means dog and is the head of the phrase. The noun phrase, accordingly, is the domain of agreement. The word hund in itself carries only the feature of masculine (MASC), and definiteness is impossible to determine through this word alone. However, an indefinite article has been chosen, namely en, and thus renders the noun indefinite. En becomes a target for the controller and agrees with the feature MASC. Therefore it carries the two features MASC and indefinite (INDEF). The adjective annen, which means other in English, is also a target for the controller and therefore has to agree in both the features MASC and INDEF.

This can be seen by comparing it to the other noun phrase in sentence 7. Here the word hund has gained the additional morpheme -en. This is the definite article in Norwegian, and so the word now holds two features in itself, namely MASC and DEF. An interesting point is that there is still a preceding article den which also marks definiteness, irrespective of the presence of the definite suffix. This is called double definiteness, and it surfaces when the noun is modified by an adjective. Regardless, this den is affected by the controller and gains the feature MASC. The adjective is also affected by the controller and gathers the features of MASC and DEF. Because of this, it changes form from annen to andre, which is a definite form of the word.
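
The controller and target relation described above can be sketched as a small lookup. This is only an illustration: the dictionaries and the function name `noun_phrase` are hypothetical, and the mini-lexicon covers nothing beyond the forms discussed in this section (en/den, annen/andre, hund/hunden).

```python
# Hypothetical mini-lexicon: only the forms discussed in this section.
ARTICLE = {("MASC", "INDEF"): "en", ("MASC", "DEF"): "den"}
ANNEN = {("MASC", "INDEF"): "annen", ("MASC", "DEF"): "andre"}
NOUN = {("hund", "INDEF"): "hund", ("hund", "DEF"): "hunden"}


def noun_phrase(noun, gender, definiteness):
    """Build article + adjective + noun.

    The head noun is the controller: its GENDER value and the chosen
    definiteness select the forms of the two targets (article, adjective).
    """
    return " ".join([
        ARTICLE[(gender, definiteness)],   # target 1: the article
        ANNEN[(gender, definiteness)],     # target 2: the adjective
        NOUN[(noun, definiteness)],        # controller, with -en when definite
    ])


indefinite_np = noun_phrase("hund", "MASC", "INDEF")
definite_np = noun_phrase("hund", "MASC", "DEF")
print(indefinite_np)
print(definite_np)
```

In the definite case both the preposed article den and the suffix -en appear, which is the double definiteness mentioned above.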

--Anders Lynghaug Haugen 21:49, 16 February 2014 (UTC)

Clause Linkage

There are a number of different forms of clause linkage that can be found throughout the story. Let us first look at sentence number 2 [5]:

Han spanet et slakterhus og smatt raskt inn og stjal et stort, fint, saftig bein fra hyllen.
“He spied a slaughterhouse and snuck quickly in and stole a big, nice, juicy bone from the shelf.”
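{| class="wikitable"
|-
! Word !! Morph !! Gloss !! POS
|-
| han || han || he3SGMASC || PN
|-
| spanet || spanet || spyVstemPAST || V
|-
| et || et || INDEFNEUTSG || DET
|-
| slakterhus || slakterhus || slaughterhouseNEUT || N
|-
| og || og || || CONJC
|-
| smatt || smatt || snuckPAST || V
|-
| raskt || raskt || quickADJ>ADV || ADV
|-
| inn || inn || || PREP
|-
| og || og || || CONJC
|-
| stjal || stjal || stole || V
|-
| et || et || INDEFNEUTSG || DET
|-
| stort || stort || bigAGRNEUTSG || ADJ
|-
| fint || fint || niceAGRNEUTSG || ADJ
|-
| saftig || saftig || juicy || ADJ
|-
| bein || bein || boneNEUT || N
|-
| fra || fra || || PREP
|-
| hyllen || hyllen || shelfMASCDEFSG || N
|}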

[Han spanet et slakterhus] og [smatt raskt inn] og [stjal et stort, fint, saftig bein fra hyllen]


The brackets here mark the boundaries of the clauses, whether independent or embedded in the sentence. The bracketing shows that the sentence contains three clauses. They need to be in this order because of temporal and causal restrictions, but syntactically speaking they are independent of one another. In this sense they are parallel to one another, and this is marked through the use of og, which acts as a coordinating conjunction. This phenomenon is called parataxis.
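
The bracketing of sentence 2 can be reproduced with a very naive sketch that cuts the sentence at every og. The function names are hypothetical, and the approach is an assumption that only holds when og links clauses, as it does here; it would wrongly split a phrase-level coordination such as sulten og trist.

```python
def split_coordinated(sentence, conjunction="og"):
    """Split a sentence into clauses at each occurrence of the conjunction."""
    clauses, current = [], []
    for word in sentence.rstrip(".").split():
        if word == conjunction and current:
            # The conjunction closes the clause collected so far.
            clauses.append(" ".join(current))
            current = []
        else:
            current.append(word)
    if current:
        clauses.append(" ".join(current))
    return clauses


def bracket(clauses, conjunction="og"):
    """Render the clauses in the bracket notation used in this section."""
    return f" {conjunction} ".join(f"[{clause}]" for clause in clauses)


sentence = ("Han spanet et slakterhus og smatt raskt inn og "
            "stjal et stort, fint, saftig bein fra hyllen.")
clauses = split_coordinated(sentence)
print(bracket(clauses))
```

Each resulting clause keeps its own finite verb (spanet, smatt, stjal), which is what makes the three of them syntactically parallel.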

For another form of clause linkage, let's again look at sentence 6 [6]. Here we find this noun phrase:

En annen hund [[nøyaktig lik ham] som holdt et bein i munnen sin]

The linked clause is the part between the brackets. The noun phrase is complex because en annen hund works fine on its own; the part in the brackets is thus a modifying element. Nøyaktig lik ham is an adjectival expression which can be overlooked for this purpose. However, the part som holdt et bein i munnen sin is rather important, because it is introduced by the subordinating conjunction som. This results in a downgrading which causes the clause to lack an overt subject, because its subject is taken to be the head of the noun phrase to which it is subordinated.

Yet another form of clause linkage is found in sentence 3 [7]:

[Mens han tygget lykkelig på beinet] sprang han inn i skogen.

In this example, the clause within the brackets is completely adverbial. It is not needed by the main verb, which is sprang, and is therefore not embedded in the sentence. In spite of this, it is introduced by a conjunction, mens (while), which functions as a temporal adverb, because the clause is subordinate to the main event. This would be a type of clause linkage that could be said to be halfway between parataxis and embedding.

Finally we have sentence 7 [8]. Here we find this portion:

Den grådige hunden bestemte seg for [at han ville ha det beinet óg...]

In this complex clause, the part within the brackets is completely embedded within the sentence. This is because the clause within the brackets is absolutely necessary to fulfill the valency of the main verb bestemte seg for, i.e. it fills a grammatical slot predicated by the main verb. The clause acts as a complement to the verb and is therefore an embedded clause and totally dependent on the main verb in this sentence. It is the subordinating conjunction at that introduces the embedded element.

--Anders Lynghaug Haugen 21:49, 16 February 2014 (UTC)




Reflexive pronouns in Norwegian

We find two different reflexive pronoun forms in Norwegian: one with a pronoun, such as "hans" (English: his) or "hennes" (her), and one without, as in "sin". We assume that the form "hans" consists of the masculine pronoun "han" and the reflexive marker "sin" (which has become cliticized as "s"). This pattern is also found in the feminine form "hennes" and the neuter forms "dens/dets", as well as the plural form "deres" (their).

In terms of agreement, the reflexive marker "sin" takes its values of GENDER and NUMBER from its object, as seen in the examples below:

"Mannen sitt vindu." (the man's window)

[[9]] "Window" spreads its values NEUTER and SINGULAR to "sitt", so that they agree.

"Mannen sine vinduer" (the man's windows)

When the object "vinduer" is in the plural, then so is the reflexive "sine".

"Mennene sitt vindu" (the men's window) The reflexive "sitt" is in the singular to agree with its object (the possessed noun "vindu"), even though the subject (the possessor "mennene") is in the plural. This indicates that it is the object rather than the subject which is the controller that spreads its values of the features GENDER and NUMBER.

The same control relation also holds for the reflexive determiner "egen" (own).

"Mannen sitt eget vindu." (the man's own window) "Window" spreads its values NEUTER and SINGULAR to "eget", so that they agree.

"Mannen sine egne vinduer" (the man's own windows) When the object "vinduer" is in the plural, then so is the reflexive "egne".

"Mennene sitt eget vindu" (the men's own window)

This is a purely syntactic analysis of the properties of sin and egen. With it, we try to show that agreement only occurs within the noun phrase.
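
The control relation can be written out as a small sketch in which only the features of the possessed noun are consulted, never those of the possessor. The names `SIN_FORMS`, `EGEN_FORMS`, `agree` and `possessive_phrase` are hypothetical, and the paradigms are the standard Bokmål forms, entered by hand as an assumption rather than taken from TypeCraft.

```python
# Singular forms by gender; the plural form is the same for all genders.
SIN_FORMS = {"MASC": "sin", "FEM": "si", "NEUT": "sitt"}
EGEN_FORMS = {"MASC": "egen", "FEM": "egen", "NEUT": "eget"}


def agree(singular_forms, plural_form, gender, number):
    """Pick the form that matches the controller's GENDER and NUMBER."""
    return plural_form if number == "PL" else singular_forms[gender]


def possessive_phrase(possessor, noun, gender, number, own=False):
    """Build possessor + sin (+ egen) + noun.

    gender and number belong to the possessed noun (the controller);
    the possessor's own number plays no role in choosing the forms.
    """
    words = [possessor, agree(SIN_FORMS, "sine", gender, number)]
    if own:
        words.append(agree(EGEN_FORMS, "egne", gender, number))
    words.append(noun)
    return " ".join(words)


print(possessive_phrase("Mannen", "vindu", "NEUT", "SG"))
print(possessive_phrase("Mannen", "vinduer", "NEUT", "PL"))
print(possessive_phrase("Mennene", "vindu", "NEUT", "SG", own=True))
```

The last call has a plural possessor but a singular neuter noun, and the forms still come out as sitt and eget.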

Agreement statistics

The following table describes the distribution of marked gender as glossed on adjectives, and the total distribution of gender tags for Norwegian Bokmål in TypeCraft. This is compared to the distribution of genders among nouns in the NoWaC corpus. The percentages in the first three data columns give the share of each tag in the total for that count (e.g. 56% of all nouns in NoWaC are tagged as masculine). The final column divides the share of each gender among entries tagged ADJ in TypeCraft by the share of that gender among nouns in NoWaC. This gives us an indication of whether some genders are glossed on adjectives more frequently than they naturally occur.

{| class="wikitable"
|-
! Gender
! Adjectives
! Total for all tags in TypeCraft
! Total for nouns in NoWaC
! Ratio for ADJ to NoWaC
|-
| ''FEM''
| 0 (0%)
| 33 (6.33%)
| 20358360 (16.47%)
| 0%
|-
| ''MASC''
| 13 (21%)
| 302 (58%)
| 69209955 (56%)
| 37.5%
|-
| ''NEUT''
| 49 (79%)
| 186 (35.7%)
| 34026414 (27.53%)
| 286.96%
|-
| Total:
| 62 (100%)
| 521 (100%)
| 123594729 (100%)
| ''N/A''
|}

From this data we can see that neuter gender is overrepresented for adjectives. This is because the masculine and feminine genders (both of which are underrepresented) are not indicated morphologically on adjectives but by the uninflected base form, whereas neuter adjectives are inflected with a morpheme. This reflects a tagging convention that is morphologically oriented.
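
The final column of the table can be recomputed from the raw counts, as in the sketch below. The variable and function names are hypothetical. Note that the table's figures appear to be derived from the rounded percentages (21% / 56% = 37.5%, 79% / 27.53% = 286.96%), so working from the unrounded counts gives slightly different values.

```python
# Raw counts copied from the table above.
adj_counts = {"FEM": 0, "MASC": 13, "NEUT": 49}  # TypeCraft entries tagged ADJ
nowac_counts = {"FEM": 20358360, "MASC": 69209955, "NEUT": 34026414}  # NoWaC nouns


def shares(counts):
    """Return each gender's share of the total, as a fraction."""
    total = sum(counts.values())
    return {gender: n / total for gender, n in counts.items()}


def adj_to_nowac_ratio(adj, nowac):
    """Divide the ADJ share of each gender by its share among NoWaC nouns."""
    adj_share, nowac_share = shares(adj), shares(nowac)
    return {gender: adj_share[gender] / nowac_share[gender] for gender in adj}


ratios = adj_to_nowac_ratio(adj_counts, nowac_counts)
for gender, ratio in ratios.items():
    # A value above 100% means the gender is glossed on adjectives
    # more often than it occurs among NoWaC nouns.
    print(f"{gender}: {ratio:.1%}")
```

A ratio of 100% would mean that a gender is glossed on adjectives exactly as often as it occurs among nouns in NoWaC.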