TBX, or TermBase eXchange, is the international standard for representing and exchanging information about terms, words, and other lexical data. It defines a family of formats that share a common structure and a limited range of information types. Each member of the family is called a dialect of TBX. The main purpose of TBX is to ensure that your lexical data can be used in different software applications. This separation between data and software provides benefits to authoring and translation activities, including protection, consistency, and interoperability.
TBX was developed by the Localization Industry Standards Association (LISA) and published in 2008 by both LISA and the International Organization for Standardization (ISO), as ISO 30042. It is available at no charge from TTT.org or the GALA website, and can also be purchased from the ISO website.
- Software mobility. If you have a terminology database, or are thinking about having one, then you need TBX as an export/import format. It gives you the freedom to move your data to whatever terminology management system you want. Having software choices is in your interest.
- Authoring. Do you want to drive consistency of words and terms at the authoring stage? Do you also translate your content? Then you will need to move your terminology data between your authoring and translation tools. TBX is the interchange format.
- Translation. Today, translation usually involves the use of computer-assisted translation (CAT) tools. If you want to be able to integrate your terminology data into any CAT tool, you need TBX as an import/export format. Without TBX, you could be “locked” into one specific CAT tool, which could restrict your choice of language service providers (LSPs).
- Data Mining. If your database can be exported to TBX format, then it can be analyzed by a wide range of applications using standard XML processors. This makes it straightforward to generate custom reports, such as the number of terms per language.
- Peace of mind. TBX is an ISO standard. The value and importance of ISO standards are well documented.
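As a rough illustration of the Data Mining point above, the following Python sketch counts terms per language using only the standard library's XML parser. The sample document is hypothetical: it is modeled on the TBX-Basic element layout (`martif`, `termEntry`, `langSet`, `tig`, `term`) and leaves out the header that a real TBX file would carry. The function name `terms_per_language` is likewise an assumption made for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample modeled on the TBX-Basic layout (header omitted).
SAMPLE_TBX = """
<martif type="TBX-Basic" xml:lang="en">
  <text>
    <body>
      <termEntry id="c1">
        <langSet xml:lang="en">
          <tig><term>termbase</term></tig>
          <tig><term>terminology database</term></tig>
        </langSet>
        <langSet xml:lang="fr">
          <tig><term>base terminologique</term></tig>
        </langSet>
      </termEntry>
    </body>
  </text>
</martif>
"""


def terms_per_language(tbx_text):
    """Return a dict mapping each language code to its number of terms."""
    root = ET.fromstring(tbx_text)
    counts = Counter()
    for lang_set in root.iter("langSet"):
        lang = lang_set.get(XML_LANG, "und")  # "und" = language not declared
        counts[lang] += len(list(lang_set.iter("term")))
    return dict(counts)


print(terms_per_language(SAMPLE_TBX))  # {'en': 2, 'fr': 1}
```

The same pattern extends to other reports, for example listing entries that lack a term in a given language.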
TBX is an exchange format
TBX is an XML-based terminology exchange format, designed to make terminology databases easier and safer to maintain, distribute, and use.
TBX separates data from software
The TBX format is not dependent on any particular software application. TBX ensures that your termbase can be equally accessible via any software you prefer to use to access, display, update, or process your terminology.
Because TBX does not use a proprietary format, if you want to start using different termbase software, you can easily migrate your terminology. Any software with TBX support that you use will be able to access your termbase, leaving you free to change or update software while safeguarding your valuable termbases.
The TBX format is based on XML and encoded in Unicode, so it can even be opened in a plain text editor.
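To make the "plain text" point concrete, here is a minimal Python sketch that builds a small TBX-style document and prints it as ordinary Unicode text. It is an illustration under stated assumptions, not a complete exporter: the element names follow the TBX-Basic layout, the required header is omitted for brevity, and the helper name `build_tbx` and its input format are hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tbx(entries):
    """Serialize concept entries to a TBX-style string.

    `entries` is a list of dicts mapping a language code to a list of terms.
    The header of a real TBX file is left out of this sketch.
    """
    martif = ET.Element("martif", {"type": "TBX-Basic", XML_LANG: "en"})
    body = ET.SubElement(ET.SubElement(martif, "text"), "body")
    for number, entry in enumerate(entries, start=1):
        term_entry = ET.SubElement(body, "termEntry", {"id": f"c{number}"})
        for lang, terms in entry.items():
            lang_set = ET.SubElement(term_entry, "langSet", {XML_LANG: lang})
            for term in terms:
                # Each term gets its own <tig> wrapper.
                tig = ET.SubElement(lang_set, "tig")
                ET.SubElement(tig, "term").text = term
    return ET.tostring(martif, encoding="unicode")


tbx_text = build_tbx([
    {
        "en": ["terminology database"],
        "de": ["Terminologiedatenbank"],
        "fr": ["base de données terminologique"],
    }
])
print(tbx_text)
```

The printed result is readable markup with the accented characters intact, which is what allows a text editor or any XML viewer to display it.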
TBX protects data assets
Proprietary termbase file types can be lost if the associated software stops working because of technical or licensing issues. TBX doesn't have this risk: because TBX is an open, standardized file type, it can easily be read by any compatible tools or any number of open-source utilities.
TBX facilitates software interoperability
The work of translation can require different software for different tasks. You may use one program to view, add and edit terms. Contracted translators may need different software to use the termbase as a reference or integrate the termbase into an automatic terminology lookup tool.
Because TBX is an open standard, many termbase software packages support the format. TBX serves as a good medium for storing terminology in a way that can be accessed by whatever tools a task requires.
TBX allows terminological consistency
Translation requires consistent terminology usage for technical accuracy. Whether your translations are outsourced or done in-house, TBX allows for consistency across multiple projects because of its ease of use.
Using TBX, you can distribute necessary terminology for use by software for authoring, translation or quality control. All compatible programs can access the same data and changes can be synchronized, minimizing risk of inconsistency in translation.
Projects involving multiple translators will benefit from TBX, which allows them to share current terminology in a standardized way. TBX can help ensure accurate and consistent translation across the entire project.
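One way a quality-control tool might use a shared termbase is sketched below in Python. This is a deliberately naive illustration: it uses case-insensitive substring matching, ignores inflection and word boundaries, and works on a hypothetical sample modeled on the TBX-Basic layout. The function names `load_glossary` and `unmatched_source_terms` are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical sample modeled on the TBX-Basic layout (header omitted).
SAMPLE_TBX = """
<martif type="TBX-Basic" xml:lang="en">
  <text>
    <body>
      <termEntry id="c1">
        <langSet xml:lang="en"><tig><term>termbase</term></tig></langSet>
        <langSet xml:lang="fr"><tig><term>base terminologique</term></tig></langSet>
      </termEntry>
      <termEntry id="c2">
        <langSet xml:lang="en"><tig><term>language service provider</term></tig></langSet>
        <langSet xml:lang="fr"><tig><term>prestataire de services linguistiques</term></tig></langSet>
      </termEntry>
    </body>
  </text>
</martif>
"""


def load_glossary(tbx_text, source_lang, target_lang):
    """Return (source_terms, target_terms) pairs, one per concept entry."""
    glossary = []
    for entry in ET.fromstring(tbx_text).iter("termEntry"):
        by_lang = {}
        for lang_set in entry.iter("langSet"):
            terms = [t.text.strip() for t in lang_set.iter("term") if t.text]
            by_lang[lang_set.get(XML_LANG)] = terms
        if by_lang.get(source_lang) and by_lang.get(target_lang):
            glossary.append((by_lang[source_lang], by_lang[target_lang]))
    return glossary


def unmatched_source_terms(source, target, glossary):
    """List source terms whose approved translation is absent from target."""
    source_lower, target_lower = source.lower(), target.lower()
    problems = []
    for source_terms, target_terms in glossary:
        used = [t for t in source_terms if t.lower() in source_lower]
        if used and not any(t.lower() in target_lower for t in target_terms):
            problems.extend(used)
    return problems


glossary = load_glossary(SAMPLE_TBX, "en", "fr")
source = "Export the termbase and send it to your language service provider."
target = "Exportez la base terminologique et envoyez-la à votre fournisseur."
print(unmatched_source_terms(source, target, glossary))
```

Here the second sentence translates "termbase" with the approved term but not "language service provider", so only the latter is reported.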
Ten Design Requirements
We propose ten design requirements for an effective terminology exchange format. They are the result of many years of discussion of theoretical and practical concerns, and we designed TBX to follow each one as closely as possible. To be an effective terminology exchange format, TBX must be:
At the very least, any exchange format should be viewable in a simple text editor. By default, TBX uses UTF-8 encoding and can be viewed in any text editor or XML viewer.
- Platform Independent
TBX is not tied to any particular software package or operating system. It is based on XML, so anything from compatible termbase software to handwritten custom scripts can access the termbase as needed.
- Compliant with the Terminology Markup Framework (ISO 16642:2003)
This compliance ensures the standardization necessary for the termbase to be easily transferable between compatible software. TBX draws its fundamental data elements from elements in the TMF meta-model.
- Tied to ISOcat
This international standard, the Data Category Registry for terminology and other language resources, encourages alignment between platforms for the type of data represented. TBX requires that all data categories be taken from ISO 12620.
- Consistent with Data Elementarity
This means that each category in the termbase should contain one piece of information. For example, a full term and its corresponding acronym should have separate, linked entries so that both can have distinct clarifying notes. TBX provides for each data category to have separate elements, while keeping these related concepts easily connected.
- Term Autonomy
This means that every term should have its own section and not be included as a note to another term. Synonyms, acronyms, and similar data should be included as a linked term, again, to allow each to contain all necessary clarifying data in the termbase. TBX supports synonyms and such related words as terms with their own XML elements.
The TBX standard framework allows for different dialects which can be suited to the needs of the individual user. These dialects are defined in XCS files which contain information about the data categories used and the information they contain. Because TBX allows for dialects, it can be modified to fit any needs, and because the TBX framework states how dialects should be formed, it preserves interchange between termbases.
- Capable of supporting automatic standard compliance checking
Because TBX is based on XML, any general-purpose validating XML parser can check and validate TBX form. DTD files can be used to check compliance of the termbases to the TBX dialect specifications. The ease of checking compliance facilitates acquisition and integration of TBX termbases.
- Capable of nearly blind exchange
Terminology exchange formats should allow terminology to be exported and imported with as little outside information as possible. TBX can define exactly what kind of data is expected in each category, and the more specific a dialect of TBX is, the easier it is to exchange between parties. The TBX-Basic dialect allows for relatively blind exchange.
- Recognized by relevant standards bodies
TBX is an ISO standard (ISO 30042).
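The compliance-checking and term-autonomy requirements above can be illustrated with a short Python sketch. A real check would run a validating XML parser against the dialect's schema files. Python's standard library parser checks well-formedness only, so the structural rules below are hand-written, illustrative approximations rather than the normative specification, and the function name `check_tbx_structure` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def check_tbx_structure(tbx_text):
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(tbx_text)
    except ET.ParseError as error:
        return [f"not well-formed XML: {error}"]
    problems = []
    if root.tag != "martif":
        problems.append(f"root element is <{root.tag}>, expected <martif>")
    entries = list(root.iter("termEntry"))
    if not entries:
        problems.append("no <termEntry> elements found")
    for entry in entries:
        entry_id = entry.get("id", "(no id)")
        lang_sets = entry.findall("langSet")
        if not lang_sets:
            problems.append(f"{entry_id}: no <langSet>")
        for lang_set in lang_sets:
            if not lang_set.get(XML_LANG):
                problems.append(f"{entry_id}: <langSet> without xml:lang")
            for tig in lang_set.findall("tig"):
                # Term autonomy: one term per <tig>, never two bundled together.
                if len(tig.findall("term")) != 1:
                    problems.append(f"{entry_id}: <tig> must hold exactly one <term>")
    return problems


# Hypothetical input: missing xml:lang, and a full form bundled with its acronym.
BAD_TBX = """
<martif type="TBX-Basic" xml:lang="en"><text><body>
  <termEntry id="c1">
    <langSet>
      <tig><term>terminology management system</term><term>TMS</term></tig>
    </langSet>
  </termEntry>
</body></text></martif>
"""

for problem in check_tbx_structure(BAD_TBX):
    print(problem)
```

Splitting the full form and the acronym into two `<tig>` elements and adding `xml:lang` to the `<langSet>` would clear both reported problems.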
This page is an adaptation of "TBX: A terminology exchange format" by Dr. Alan K. Melby, 2015.
Last updated: June 13, 2017 at 1:22 pm
© 2017 LTAC Global