Typecraft v2.5

Difference between revisions of "Help:QuickStart"


Revision as of 20:06, 12 October 2014


This is the TypeCraft QuickStart page

For additional information about linguistic online editing with TypeCraft please consult the following TypeCraft wiki pages:

Help with the TC Editors

Annotation using the TypeCraft 2.0 editor

You enter the TC2 editor by clicking on New text on the TypeCraft navigation bar. The editor opens a dialog window with the following message:

Your text will use the new and better TC editor.
If for some reason you want to use the old editor choose  
*No* below. 

The word "text" in the phrase "Your text" above refers to any piece of digital writing that has not yet been linguistically annotated. To annotate this text in the new editor, press "Yes".

The editor text area, with no text loaded yet

The editor's text area opens (see screenshot to the left). The Metadata matrix is accessible to the right. We offer a default Metadata template and a template for the Norwegian Centre for Writing and Writing Research. (You find a *Change Metadata set* button at the end of the template.)

You enter your text by copying and pasting it from a file or an online site into the text area. Next, select the text's language in the metadata template and provide the rest of the metadata. You can always come back later and fill in missing information.

Back in the text area, you now define what we call Phrases for annotation. TypeCraft phrases may be sentences or fragments. Text does not need to be annotated sentence by sentence, although that is also possible by .............. ............... .

You select an element for annotation by highlighting it. You then press the New Phrase button, which puts the selected element into the Phrase list. In the text area this phrase now appears in green.

Now it is time to start the core annotation. You can do that in two ways: double-click on one of the instantiated elements in the text area (the sentences shown in green), or open the Phrase list by clicking on *View Phrase list*.

When the phrase opens, a dialogue window pops up with the following message:

 TypeCraft wants to know 
 which of the following options you prefer:
 1. to insert full forms into the table directly (recommended).
      Choosing this option allows you to separate affixes from word stems
      in the input mask below. Insert hyphens "-" or spaces " " to indicate morph 
      or word boundaries and then click OK.
 2. to manually insert words from your phrase into the table.
      For this option, click *Cancel* and an empty table will appear.
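
The effect of option 1 can be pictured with a small Python sketch. It is not TypeCraft's own code. It only assumes, as the dialogue says, that spaces mark word boundaries and hyphens mark morph boundaries; the function name `split_phrase` is hypothetical.

```python
def split_phrase(mask: str) -> list[list[str]]:
    """Split an input mask into words (on spaces) and morphs (on hyphens).

    Hypothetical helper for illustration only; the real editor may treat
    punctuation and repeated separators differently.
    """
    return [
        [morph for morph in word.split("-") if morph]  # drop empty pieces
        for word in mask.split()
    ]


# Each inner list is one word; its items are the morphs of that word.
table = split_phrase("She goe-s home")
print(table)
```

Each inner list corresponds to one column group of the pre-filled table: one entry on the WORD tier and one or more entries on the MORPH tier.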

The tabular annotation editor

After you have decided whether you would like to work with a pre-filled table that reflects your choice of morph boundaries or to start with a clean table, the tabular editor opens, pre-filled or empty accordingly.

You annotate by navigating through the table. We recommend that you add annotations to the tiers vertically by making use of the space bar. This method is in our experience the fastest.

 To learn more about the tiers and annotation tags and levels go to
         Multi-level linguistic annotation with TypeCraft

The *WORD* and the *MORPH* tiers feature a menu bar which allows you to modify existing entries. The menu bar appears when you activate the field you want to change. From the menu you can also add words or change the word's segmentation. *Gloss* and *POS* tags are chosen from a predefined list. You find an overview of all Gloss and POS tags on your navigation bar. These lists are auto-generated and can be ordered by category at your convenience. The lists also provide short definitions for each tag.

The annotation table is supplemented by a large Note field. The content of the Note field can also be searched, so if you use a designated marker to flag sentences, you can easily retrieve them later. For example, if the annotation of a phrase is questionable, you could add a question mark to the Note field. A search for "?" in the Note field will then return only sentences with questionable annotations.

Annotation for Discourse Senses

The new TypeCraft editor allows its users to annotate text fragments in the context of a text, and to annotate words or word sequences for discourse senses (here called for short Senses). For the QuickStart, we describe how you can define text fragments of different length, assign discourse senses, indicate their scope and view them in a text.

   To learn more about the discourse senses in TypeCraft go to
         Multi-level linguistic annotation with TypeCraft


SenseAnno3.jpg
In order to add discourse senses to your annotations, go to your tabular editor. At the bottom of the annotation table you find the function *Add Discourse Sense*. Clicking the function will add a new tier to your annotation table. In this way, the editor can be extended by three additional tiers for sense annotation. At present we offer an experimental set of annotation tags for discourse annotation which can be accessed from the TypeCraft navigation bar.

After you have added a sense tier you can assign a sense tag. In the screenshot to the left, in the leftmost field, the sense tier appears in yellow when you activate it, which you do by clicking on it. Sense tags must be chosen from a list of predefined tags (see sense tag list in the navigation bar). Available tags appear in the drop-down menu, which is activated as soon as you start typing into the field, as indicated in the screenshot. Here you see the letters "As" and, below them, the drop-down list with possible completions: ASSdesc and ASSnar are sense tags that start with "As".

A sense may span a single word or several words. Press the left mouse button and drag to mark the scope of the sense in its tier. The scope is marked by colour. In the screenshot above you see a blue tier. It indicates the scope of the "ASSERTION" sense which the user was about to choose when we made the screenshot. The red tier is the scope of a KEYTERM, while the green line indicates the scope of a SPEC(ification). (The two latter sense tags are not visible in the screenshot because of the active drop-down menu for the ASSERTION.)


View (Discourse) Senses
ViewSenses.jpg



This function allows you to view the sense annotations of a text by means of colour coding.

Sense annotations consist of a sense tag and a coloured bar that indicates the scope of the sense.

View senses reflects the scope of a discourse sense through coloured lines. Colours are not tied to specific senses but are assigned freely. To identify a sense, point to a word or text fragment that instantiates it; the sense name then becomes visible in the upper left corner of the *View senses* window.

This is illustrated in the screenshot to the right.





Annotation for Valence

Drop-down window for the attribute Syntactic argument structure

For the annotation of valence, TypeCraft provides a Valence description template. You enter the valence annotation mode from the tabular editor by pressing the *Change* button to the right of the label Valence, which you find above the word- and morph-level annotation table.

An additional annotation window, as shown in the screenshot below, appears and allows you to specify valence attributes using a predefined vocabulary.

Valence Annotation Schema

While the valence annotation schema is still under development, we currently allow the input of the following attributes:


Syntactic Argument Structure
Situation Type
Diathesis
Adjunct of Interest
Salient Sentence Pattern
Force & Eventuality
Modality
Sentence Aspect


Each of these attributes has a set of possible values. Some of the values for the attribute Syntactic Argument Structure are shown to the right as they appear in the drop-down menu.


More about Valence annotation in TypeCraft can be found under:

Multi-level linguistic annotation with TypeCraft


The Valence annotation is highlighted in yellow

When you have finished annotating valence, you return to your tabular annotation editor, where the valence values now appear as a hyphenated string showing the valence specifications that you have chosen for the phrase under annotation. This is illustrated in the screenshot to the left; the Valence annotations are highlighted in yellow.









Annotation using the TypeCraft 1.0 editor

We recommend the use of the new and improved TypeCraft 2.0 editor.


Click *My Text* in the navigation bar. The TC Editor opens. You may now enter or copy-and-paste a text into the left part of the editor window. Do not add morph boundaries at this point. Before you start to tokenize your text, set the language using the *CHANGE* button. TC uses ISO 639-3 codes for languages. Please use the drop-down window to select one of the ISO languages for your text.


Tokenization: You can tokenise your text into sentences. This generally works quite well. At this point TypeCraft still has problems with semicolons and with periods in titles such as Mr. or Dr. In order to tokenise your text or collection of sentences, press *CREATE PHRASES*; this will initiate the tokenization. Inspect the result before you choose *Yes* from the dialogue box. If you have not highlighted parts of your text, TC will ask you whether you would like to tokenize the whole text. Say *Yes*. The tokenization can be repeated several times until you are content with the result.
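
To see why titles such as Mr. or Dr. are difficult, here is a minimal Python sketch of sentence splitting with an abbreviation list. It is not the TypeCraft tokenizer. The function name `split_sentences` and the contents of `ABBREVIATIONS` are assumptions made for illustration.

```python
# Assumed list of tokens whose final period does not end a sentence.
ABBREVIATIONS = {"Mr.", "Dr."}


def split_sentences(text: str) -> list[str]:
    """Split text into sentences at '.', '!' or '?', skipping known titles."""
    sentences: list[str] = []
    current: list[str] = []
    for token in text.split():
        current.append(token)
        # A token closes the sentence only if it ends in sentence punctuation
        # and is not one of the listed abbreviations.
        if token[-1] in ".!?" and token not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # trailing text without final punctuation
        sentences.append(" ".join(current))
    return sentences


print(split_sentences("Dr. Smith met Mr. Jones. They talked."))
```

Without the abbreviation check, the first sentence would be cut after "Dr." and again after "Mr.", which is the kind of result you should look for when you inspect the tokenization before choosing *Yes*.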

Morph break-up: Select a sentence from the set of tokenised sentences that now appear on the right-hand side of your editor window by clicking on it. This opens a dialogue box. Follow the instructions in the dialogue box to insert morph boundaries into the annotation table.

Annotation Table: Navigate through the annotation levels vertically by making use of the space bar. The *WORD* and the *MORPH* tiers feature a menu bar that appears when you click on the field you would like to change and that allows you to delete words or morphs. From the menu you can also add words or change the word segmentation. *Gloss* and *POS* tags are chosen from a predefined list. You find an overview of all Gloss and POS tags on your navigation bar. These lists are auto-generated and can be ordered by category at your convenience.

Help with Search for Interlinear Glossed Texts in the TypeCraft database

Search

Search is operated from the navigation bar of the TypeCraft wiki. The second information box from the top, labeled *typecraft search*, gives you access to Text search and Phrase search.


Text search

Text search allows you to find Interlinear Glossed Texts in the TypeCraft IGT database.

Search1.jpg

Since TypeCraft data is structured throughout, you can use many different search criteria to find the type of text you are looking for. You can also decide whether to search only your own data or all TypeCraft data.

Using Metadata information as search terms, you can for example ask for the name of the text owner, when the text was last modified, and of course for the language. Strings or sub-strings of the text title or the title translation can be used directly as search terms.

Valence and Sense annotations as well as Gloss and Part of Speech tags can be used to select texts that contain them. One or several tags in combination can be specified as search terms, and their scope can be defined.

Strings or sub-strings contained in the Note field can also be used to search for texts.

Go to the Text search on your navigation bar to look for the other search options.

The screenshot above shows a partial result for a search of texts that contain thematic annotations; the GLOSS tags BEN(eficiary) and GOAL were used as search terms, and 38 texts were found with 127 instances for the search term GOAL and 154 instances for the search term BEN(eficiary).

Phrase search

Phrase search is as fine-grained as text search. In addition to specifying textual, phrasal, word and morpheme properties to inform your search, you can define the scope of your search.

For example: When defining two Gloss tags as search terms, you can choose the search scope such that you only look for glosses that specify the same morpheme, as is the case for the 3SG and PRES gloss tags relative to the English verb suffix -s in the word goe-s.

You might instead define that certain search terms should occur on the same word, or occur in the same phrase.
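
The three scopes can be illustrated with a short Python sketch. It does not reflect the TypeCraft database schema or its search interface. The classes `Morph`, `Word` and `Phrase`, the function `matches`, and the scope names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Morph:
    form: str
    glosses: set[str] = field(default_factory=set)


@dataclass
class Word:
    morphs: list[Morph]


@dataclass
class Phrase:
    words: list[Word]


def matches(phrase: Phrase, tags: list[str], scope: str) -> bool:
    """Return True if all tags co-occur within the given scope."""
    wanted = set(tags)
    if scope == "morpheme":  # all tags on one and the same morph
        return any(wanted <= m.glosses for w in phrase.words for m in w.morphs)
    if scope == "word":  # all tags somewhere inside one word
        return any(
            wanted <= set().union(*(m.glosses for m in w.morphs))
            for w in phrase.words
        )
    if scope == "phrase":  # all tags anywhere in the phrase
        all_glosses = set().union(
            *(m.glosses for w in phrase.words for m in w.morphs)
        )
        return wanted <= all_glosses
    raise ValueError(f"unknown scope: {scope}")


goes = Word([Morph("goe"), Morph("s", {"3SG", "PRES"})])
phrase = Phrase([Word([Morph("she")]), goes])
print(matches(phrase, ["3SG", "PRES"], "morpheme"))
```

A narrower scope is a stricter condition: a phrase that matches at morpheme scope also matches at word and phrase scope, but not the other way round.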

As with text search, the result of a phrase search shows the number of phrases found and the number of instances found for each of the search terms.


Help with Search in the TypeCraft wiki

The TypeCraft wiki has, like any MediaWiki, a search box, which you find on the navigation bar. Type search terms into the search box and choose *SELECT*; a search results page will open.

If there is no result or the result is not useful, you can still use the area of the Search page called "Advanced" to specify the search domain. MediaWiki calls collections of wiki pages namespaces. The advanced option allows you to specify more precisely in which namespaces you want the wiki to search. You can also go to your user preferences page to tell the TypeCraft wiki in which namespaces you would like to search when using the wiki search box.


Search results page

The intent of the search results page is to let you use the newly placed search box to refine a list of results. You do this by writing query commands into the search box. For example, if you want to see pages that match any one of several terms, join the terms with "OR", and if you want to see all pages with the terms "language" and "annotation", write "language AND annotation".

Syntax

The TypeCraft wiki supports the "-" character for "logical not", AND, OR, and parentheses for grouping. Logical OR can be specified by spelling it out in capital letters.

The AND operator is assumed for all terms separated by spaces. Therefore, if you would like to look for the phrase Brazilian Portuguese, you have to use double quotes. You write: "Brazilian Portuguese". If you use Brazilian Portuguese without quotes instead, it is equivalent to Brazilian AND Portuguese.

Put differently: double quotes define a single search term that contains spaces. For example, in "noun classes" the space between "noun" and "classes" counts as a character and not as a logical AND.


Exclusion

Terms can be excluded by prefixing a hyphen or dash (-), which, as mentioned above, is "logical not". For example: Brazilian Portuguese -"Brazilian Portuguese" finds all articles with "Brazilian" and "Portuguese" except those with the phrase "Brazilian Portuguese".
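
The interplay of implicit AND, quoted phrases and exclusion can be tried out with the following Python sketch. It only models the semantics described here and is not how MediaWiki implements search. OR and parentheses are left out, and the function names `parse_query`, `contains_sequence` and `page_matches` are hypothetical.

```python
import re


def parse_query(query: str) -> list[tuple[bool, list[str]]]:
    """Return (excluded, words) pairs; a quoted phrase yields several words."""
    terms = []
    for minus, phrase, word in re.findall(r'(-?)(?:"([^"]+)"|(\S+))', query):
        terms.append((minus == "-", (phrase or word).lower().split()))
    return terms


def contains_sequence(words: list[str], seq: list[str]) -> bool:
    """True if seq occurs in words as a run of adjacent items."""
    n = len(seq)
    return any(words[i:i + n] == seq for i in range(len(words) - n + 1))


def page_matches(text: str, query: str) -> bool:
    """All plain terms must occur (implicit AND); excluded terms must not."""
    words = re.findall(r"\w+", text.lower())
    return all(
        contains_sequence(words, seq) != excluded
        for excluded, seq in parse_query(query)
    )


page_a = "Brazilian Portuguese is spoken in Brazil."
page_b = "Portuguese settlers met Brazilian traders."
print(page_matches(page_b, 'Brazilian Portuguese -"Brazilian Portuguese"'))
```

In this model both pages match the unquoted query Brazilian Portuguese, only `page_a` matches the quoted phrase, and only `page_b` survives once the phrase is excluded.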

Wildcard search

A wildcard character *, standing for a character string of any length, can prefix or suffix a word or string. For example, the query *stan lists articles containing words like Kazakhstan and Afghanistan, while Anno* lists pages with words like "annotation", "annotating" and "annotator".
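
Wildcard matching of this kind can be imitated in Python with the standard `fnmatch` module, as in the sketch below. This is an illustration, not the wiki's search engine: the helper name `wildcard_hits` is hypothetical, matching is made case-insensitive by lowercasing, and `fnmatch` also gives special meaning to `?` and `[...]`, which the wiki search may not share.

```python
import re
from fnmatch import fnmatchcase


def wildcard_hits(text: str, pattern: str) -> list[str]:
    """Return the lowercased words of text that match a *-wildcard pattern."""
    words = re.findall(r"\w+", text.lower())
    return [w for w in words if fnmatchcase(w, pattern.lower())]


print(wildcard_hits("Kazakhstan borders Afghanistan", "*stan"))
print(wildcard_hits("Annotation helps every annotator", "Anno*"))
```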


Help with using the TypeCraft converter