Difference between revisions of "Parallel Annotation of Speech and Text"
Line 1: | Line 1: | ||
− | = | + | <span style="color:red"> '''This page is under construction'''</span> |
− | '' | + | |
− | Sentences 1 to 3 | + | == Project Description== |
+ | ===Description of the material=== | ||
+ | reference to "Sound to Sense" | ||
+ | |||
+ | Sentences 1 to 3 | ||
+ | |||
+ | Speaker dialect: Bergen | ||
<Phrase>10903</Phrase> | <Phrase>10903</Phrase> | ||
− | <flashmp3>PSTA01.mp3</flashmp3 | + | <flashmp3>PSTA01.mp3</flashmp3> |
− | Download files for viewing in the Praat application | + | |
− | + | Download files for viewing in the Praat application:[[Media:PSTA01.mp3|Sound]], [[Media:PSTA01.txt| TextGrid]] | |
+ | |||
+ | |||
<Phrase>10904</Phrase> | <Phrase>10904</Phrase> | ||
− | <flashmp3>PSTA02.mp3</flashmp3 | + | <flashmp3>PSTA02.mp3</flashmp3> |
− | Download files for viewing in the Praat application | + | |
− | + | Download files for viewing in the Praat application: [[Media:PSTA02.mp3|Sound]], [[Media:PSTA02.txt|TextGrid]] | |
+ | |||
+ | |||
<Phrase>10905</Phrase> | <Phrase>10905</Phrase> | ||
− | <flashmp3>PSTA03.mp3</flashmp3> | + | |
− | Download files for viewing in the Praat application | + | <flashmp3>PSTA03.mp3</flashmp3> |
+ | |||
+ | Download files for viewing in the Praat application:[[Media:PSTA03.mp3|Sound]], [[Media:PSTA03.txt|TextGrid]] | ||
+ | |||
+ | ==Speaker Dialect: Trondheim== | ||
+ | |||
+ | [[Parallel Processing of Speech and Text Data - Part 2]] | ||
+ | |||
+ | ==Speaker Dialect: == | ||
[[Parallel Processing of Speech and Text Data - Part 3]] | [[Parallel Processing of Speech and Text Data - Part 3]] | ||
− | + | ==About the TextGrid files== | |
The TextGrid files are opened together with the matching sound files for viewing in the Praat application. The TextGrid files consist of three tiers, 'Word' (rendered in Bokmål orthography) 'Phoneme' (shows underlying segments) and 'Note' (shows surface realisation with IPA symbols, and other notes). | The TextGrid files are opened together with the matching sound files for viewing in the Praat application. The TextGrid files consist of three tiers, 'Word' (rendered in Bokmål orthography) 'Phoneme' (shows underlying segments) and 'Note' (shows surface realisation with IPA symbols, and other notes). |
Revision as of 20:21, 18 March 2010
This page is under construction
Project Description
Description of the material
reference to "Sound to Sense"
Sentences 1 to 3
Speaker dialect: Bergen
Sentence 1
Word | Morphemes | Gloss | POS
Jeg | e | 1SG | PN
ser | se:-r | see-PRES | V
bildet | bild-e | picture-DEFSG | N
kan | kan: | canPRES | V
du | ʉ | 2SG | CL
si | si: | sayINF | V
litt | lit: | a.little | ADVm
på | po | onDIR | PREP
skrått | skro:-t | diagonal-ADJ>ADV | ADVm
ned | ned | downDIR | ADVm
ovenifra | ovenifra | from.aboveDIRSRC | ADVm

Sentence 2
Word | Morphemes | Gloss | POS
Det | de | 3SGNEUT | PN
dekker | dek:-er | cover-PRES | V
omtrent | umtrent | approximately | ADVm
hele | he:l-e | whole-DEF | ADJ
det | de | DEFSGNEUT | ART
venstre | venstre | left | ADVm
mest | mest | mostSUP | ADJ
altså | aso | that.isDM | ADVm
venstreste | venstre-st-e | left-SUPMU-DEF | ADJ
kortsiden | kort-sid-en | short-side-DEFSG | N

Sentence 3
Word | Morphemes | Gloss | POS
Hun | hun | 3SGFEM | PN
står | sto:-r | stand-PRES | V
med | med | withMNR | PREP
ryggen | ryɡ:-en | back-DEFSG | N
mot | mut | againstDIR | PREP
veggen | veɡ:-en | wall-DEFSG | N
opp | up | upDIRMU | PREP
og | o | and | CONJC
ser | se:-r | see-PRES | V
på | po | atDIR | PREP
han | han | 3SGMASC | PN
som | som | | PNrel
skal | skal: | shallPRES | V
kaste | kast-e | throw-INF | V
ballen | bal:-en | ball-DEFSG | N
som | som | | PNrel
står | sto:-r | stand-PRES | V
utenfor | ʉtenfor | outside | ADVm
og | o | and | CONJC
peker | pe:k-er | point-PRES | V
på | po | atDIR | PREP
boksene | boks-ene | box-DEFPL | N
Speaker Dialect: Trondheim
Parallel Processing of Speech and Text Data - Part 2
Speaker Dialect:
Parallel Processing of Speech and Text Data - Part 3
About the TextGrid files
Open each TextGrid file together with its matching sound file in the Praat application. Each TextGrid file consists of three tiers: 'Word' (rendered in Bokmål orthography), 'Phoneme' (underlying segments), and 'Note' (surface realisation in IPA symbols, plus other notes).
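For readers who want to check the tier layout outside Praat, the following is a minimal Python sketch that lists the tier names in a TextGrid. It assumes the file is saved in Praat's long text format; the embedded `SAMPLE_TEXTGRID` is a hypothetical miniature example, not the content of the PSTA files, and the helper name `tier_names` is likewise illustrative.

```python
import re

# Hypothetical miniature TextGrid in Praat's long text format.
# The real PSTA files may differ in layout, encoding, and content.
SAMPLE_TEXTGRID = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 1
tiers? <exists>
size = 3
item []:
    item [1]:
        class = "IntervalTier"
        name = "Word"
        xmin = 0
        xmax = 1
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 1
            text = "ser"
    item [2]:
        class = "IntervalTier"
        name = "Phoneme"
        xmin = 0
        xmax = 1
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 1
            text = "s"
    item [3]:
        class = "IntervalTier"
        name = "Note"
        xmin = 0
        xmax = 1
        intervals: size = 1
        intervals [1]:
            xmin = 0
            xmax = 1
            text = ""
'''


def tier_names(textgrid_text):
    """Return the tier names in the order they appear in the file."""
    # In the long text format each tier carries one line of the form: name = "..."
    return re.findall(r'^\s*name = "([^"]*)"\s*$', textgrid_text, flags=re.MULTILINE)


print(tier_names(SAMPLE_TEXTGRID))
```

To try this on a downloaded file, read its text into a string first; the encoding to use depends on how the file was saved from Praat.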
Here is a list of glosses used in the 'Note' tier:
Phonology/Phonetics:
BrV = Segment realised with breathy voice
CrV = Segment realised with creaky voice
DV = Underlying voiced segment realised devoiced
EPN = Epenthesis
RD = Reduction of segment (e.g. corner vowel realised as schwa, or plosive as fricative)
V = Underlying voiceless segment realised voiced
Morphophonology/Syntax:
CL = Clitic
Other:
ERR = The speaker errs and corrects himself
HES = (Audible) hesitation from speaker
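
The gloss list can also be kept as a small lookup table when processing 'Note' tier labels by script. The Python sketch below is one possible way to do this; it assumes that the glosses appear in a label as whitespace-separated tokens, which this page does not state, and the names `NOTE_GLOSSES` and `expand_note` are hypothetical.

```python
# Gloss abbreviations for the 'Note' tier, as listed on this page.
NOTE_GLOSSES = {
    "BrV": "Segment realised with breathy voice",
    "CrV": "Segment realised with creaky voice",
    "DV": "Underlying voiced segment realised devoiced",
    "EPN": "Epenthesis",
    "RD": "Reduction of segment",
    "V": "Underlying voiceless segment realised voiced",
    "CL": "Clitic",
    "ERR": "The speaker errs and corrects himself",
    "HES": "(Audible) hesitation from speaker",
}


def expand_note(label):
    """Return the descriptions of all known gloss tokens in a Note-tier label.

    Assumption: glosses are separated by whitespace. Tokens that are not in
    the table (for example IPA symbols) are skipped.
    """
    return [NOTE_GLOSSES[token] for token in label.split() if token in NOTE_GLOSSES]


print(expand_note("DV HES"))
```

Note that 'V' in this table is the phonetic gloss (voiced realisation), not the part-of-speech label 'V' used in the annotated sentences.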