Knows common markup languages for tagging lexical data
Lexicography (the creation of dictionaries) is a vast field with many specialties. Lexicons are, by nature, structured documents. Various markup standards have been proposed, used, and abandoned over the years. A specialist needs to know the current standards used in the entity, so that broken data can be brought back into line with them. Higher competency covers converting between different and competing standards.
Is able to teach the markup standards for translation data and lexicons.
The following activities have been identified to achieve competency in Lexical Markup, at levels from Learner to Expert.
Identify the markup commonly used in language data for your entity (XML, SFM, LIFT, etc.)
Identify the specific varieties of these markup standards used in language data for your entity (standard vs. alternate MDF, MXB SFM, PLB SFM, etc.) and understand the key differences.
Use an SFM cleanup tool (e.g. SOLID) to normalize or restructure lexical data.
Teach practitioners about the SFM structures and cleanup tools.
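To make the idea of SFM structure concrete, here is a minimal Python sketch of how backslash-coded lexical data can be read into records and checked for unexpected markers. It is an illustration only, not how SOLID or any other cleanup tool works. It assumes an MDF-style layout in which `\lx` starts each record, and the small `KNOWN_MARKERS` set, the function names, and the sample entries are all invented for this example.

```python
import re

# Assumed marker inventory for this illustration (a small MDF-style subset).
KNOWN_MARKERS = {"lx", "ps", "ge", "de", "sn"}

# A field line is a backslash, a marker name, then an optional value.
MARKER_RE = re.compile(r"^\\(\S+)\s*(.*)$")


def parse_sfm(text, record_marker="lx"):
    """Split SFM text into records, each a list of (marker, value) pairs."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = MARKER_RE.match(line)
        if match is None:
            # No marker: treat as a continuation of the previous field.
            if current:
                marker, value = current[-1]
                current[-1] = (marker, (value + " " + line).strip())
            continue
        marker, value = match.group(1), match.group(2).strip()
        if marker == record_marker:
            current = []
            records.append(current)
        if current is None:
            # Fields before the first record marker are skipped.
            continue
        current.append((marker, value))
    return records


def unknown_markers(records, known=KNOWN_MARKERS):
    """Return the sorted markers that are not in the expected inventory."""
    return sorted({m for rec in records for m, _ in rec if m not in known})


# Invented sample data; the second entry has a mistyped gloss marker.
SAMPLE = r"""
\lx kapa
\ps n
\ge house
\de a dwelling
built of wood

\lx tiki
\ps v
\gx run
"""

records = parse_sfm(SAMPLE)
print(len(records), "records")
print("unexpected markers:", unknown_markers(records))
```

A check like this is only a first step. Real cleanup also has to consider marker order and hierarchy, which is where the differences between SFM varieties matter.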
Tasks relevant to Lexical Markup
- Create and edit a lexical database
- Create and edit a corpus of natural texts
- Find text corpus words in the lexicon
- Run a concordance search on a text corpus
- Transcribe audio/video with time-aligned annotations
- Interlinearize natural texts
- Test a text corpus against the grammar rules
- Publish linguistic data in a structured way