Difference between revisions of "Computational Lexicography"
Revision as of 15:58, 25 November 2009
Tools
- Toolbox: http://www.sil.org/computing/download
- LexiquePro: http://www.lexiquepro.com/
- WeSay: http://www.wesay.org (help files: http://wesay.org/wiki/Help_And_Contact)
- ATP3
All four programs use a file with MDF (Multi-Dictionary Formatter) markup to encode the dictionary information: http://www.sil.org/computing/shoebox/MDF.html
The e-book http://www.sil.org/computing/shoebox/MDF_2000.pdf gives lexicographic principles and explains the markup. http://wiki.lingtransoft.info/tutorials/mdf summarizes the use of individual markers.
The order of the markers is not fully fixed in the MDF convention. LIFT is an XML format into which MDF-formatted dictionaries can be converted after some fixes.
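To make the point about marker order concrete, here is a minimal sketch of reading backslash-coded MDF text in Python. It assumes the common MDF markers \lx (lexeme), \ps (part of speech), \ge (English gloss) and \de (English definition); the function name parse_mdf and the sample entries are invented for illustration, and real files have many more markers and continuation lines that this sketch ignores.

```python
def parse_mdf(text):
    """Split backslash-coded text into records; each \\lx starts a new record.

    Fields are kept as an ordered list of (marker, value) pairs, because the
    order of markers inside a record is not fully fixed.
    """
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # Simplification: lines that do not start with a marker are skipped.
        if not line.startswith("\\"):
            continue
        marker, _, value = line[1:].partition(" ")
        if marker == "lx":
            current = []
            records.append(current)
        if current is not None:
            current.append((marker, value.strip()))
    return records


# Invented sample data: note that \ps and \ge come in a different order
# in the second record.
sample = r"""
\lx kata
\ps n
\ge word
\de a word; speech

\lx lari
\ge run
\ps v
"""

records = parse_mdf(sample)
for record in records:
    print(record)
```

Keeping the fields as ordered pairs rather than a dictionary means a later conversion step can still see where the source file deviates from the order it expects.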
LIFT (Lexicon Interchange FormaT) is an XML format for storing lexical information, as used in the creation of dictionaries. It's not necessarily the format for your lexicon. That can be tied to whatever program you're using. But LIFT allows you to move that data between programs (hence the term 'interchange').
LIFT is also a decent archiving option. Not because it will be around in 50 years, but because people will still be able to read it with any text editor and easily make use of it, even then. (You think that's true of your non-SOLID Standard Format file? We should have a chat.)
LIFT has been designed to have a long life but also to be relatively easy to convert to and from existing lexicon formats, particularly Multi-Dictionary Formatter (MDF) and FieldWorks Language Explorer.
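As a rough illustration of what an entry looks like once it is in LIFT, the sketch below builds one entry with Python's standard xml.etree.ElementTree module and reads it back. The element names (entry, lexical-unit, form, text, sense, grammatical-info, gloss) follow the LIFT schema as I understand it; the version number 0.13, the language code "x-vern", the helper name entry_to_lift and the sample word are assumptions for illustration, so check the LIFT specification before relying on any of them.

```python
import xml.etree.ElementTree as ET


def entry_to_lift(parent, lexeme, pos, gloss, vern_lang="x-vern", gloss_lang="en"):
    """Append a simplified LIFT-style entry to parent (hypothetical helper)."""
    entry = ET.SubElement(parent, "entry", id=lexeme)
    # Headword: entry/lexical-unit/form/text
    unit = ET.SubElement(entry, "lexical-unit")
    form = ET.SubElement(unit, "form", lang=vern_lang)
    ET.SubElement(form, "text").text = lexeme
    # One sense carrying the part of speech and a gloss
    sense = ET.SubElement(entry, "sense")
    ET.SubElement(sense, "grammatical-info", value=pos)
    gloss_el = ET.SubElement(sense, "gloss", lang=gloss_lang)
    ET.SubElement(gloss_el, "text").text = gloss
    return entry


root = ET.Element("lift", version="0.13")  # version number is an assumption
entry_to_lift(root, "kata", "n", "word")   # invented sample entry
lift_xml = ET.tostring(root, encoding="unicode")
print(lift_xml)

# Reading it back needs nothing more than a generic XML parser.
parsed = ET.fromstring(lift_xml)
headword = parsed.find("entry/lexical-unit/form/text").text
pos = parsed.find("entry/sense/grammatical-info").get("value")
print(headword, pos)
```

The result is plain XML text, which is the property that makes the format usable for interchange and for archiving.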
File formats
Trippel et al. (2008): Lexicon Schemas and Related Data Models: when Standards Meet Users. Conference on Language Resources and Evaluation (LREC 2008).
http://www.lrec-conf.org/proceedings/lrec2008/slides/812.pdf
Conclusion:
- All schemas are implementations of LMF (Lexical Markup Framework)
- Interchange results in loss of implied information
- Tools lack support for interchange