TinyTM - Protocol
The protocol is the core of TinyTM. It sits between the clients and
the server and defines a common language for all participating applications.
A Simple Example
TinyTM V0.1 uses plain SQL to access the server-side data. The
following code shows how to retrieve a list of fuzzy matches from the TinyTM server:
SELECT *
FROM tinytm_get_fuzzy_matches(
'en', 'de',
'THE EUROPEAN ECONOMIC COMMUNITY',
'', ''
);
As you can see, retrieving fuzzy matches is not exactly rocket science.
We have taken great care when designing TinyTM to keep access to the TinyTM
server as simple as possible.
The following lines represent the result that comes back from the database
as a reply to the initial "tinytm_get_fuzzy_matches" request.
You can see that the TinyTM server returns a list of (original) source
segments, plus their matching score and their translation.
 score |              source_text               |                     target_text
-------+----------------------------------------+-----------------------------------------------------
  96.9 | THE EUROPEAN ECONOMIC COMMUNITY,       | DIE EUROPAEISCHE ATOMGEMEINSCHAFT:
  96.9 | THE EUROPEAN ECONOMIC COMMUNITY,       | DIE EUROPAEISCHE GEMEINSCHAFT:
  62.5 | THE EUROPEAN ATOMIC ENERGY COMMUNITY:  | DIE EUROPAEISCHE ATOMGEMEINSCHAFT:
  59.4 | THE EUROPEAN COAL AND STEEL COMMUNITY: | DIE EUROPAEISCHE GEMEINSCHAFT FUER KOHLE UND STAHL:
(4 rows)
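Because the call returns a set of records, the result can presumably be filtered and sorted like an ordinary table. The following is a minimal sketch, assuming the three-argument short form listed under "Functional Protocol V0.1 Calls" below. The threshold of 75 is an arbitrary illustrative value, not a TinyTM default.

```sql
-- Sketch only: keep matches scoring at least 75, best match first.
-- Assumes the short form
--   tinytm_get_fuzzy_matches(source_lang, target_lang, source_text).
SELECT score, source_text, target_text
FROM tinytm_get_fuzzy_matches(
    'en', 'de',
    'THE EUROPEAN ECONOMIC COMMUNITY'
)
WHERE score >= 75.0      -- illustrative threshold, chosen by the client
ORDER BY score DESC;
```

Applied to the sample result above, such a filter would keep the two rows scoring 96.9 and drop the other two.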
Protocol Layers
Looking one step deeper into the TinyTM protocol, we can identify two
different "layers":
- Transport Layer
This layer deals with the question: How is data sent over "the wire"
from the client to the server?
At the time of writing (2008-04-10), only one transport layer exists:
ODBC.
ODBC (plain database connection) is easy to use and available for almost
all common IT platforms. In particular, ODBC allows us to create TinyTM
prototypes very quickly.
In later phases we will complement ODBC with other transport layers such
as XML-RPC, SOAP etc.
- Functional Layer
The functional layer deals with the question: What actions should be executed
on the server side, and what should be the results?
The functional layer is independent of the transport layer and is designed
to change very slowly over time, providing the backbone for the various
TinyTM components to work together.
Functional Protocol V0.1 Calls
The TinyTM protocol V0.1 consists of the following main calls:
- tinytm_new_segment(source_lang, target_lang, source_text, target_text)
Adds a new TM segment to the database.
This is the "short" version of the procedure. It requires the
following arguments:
- source_lang and target_lang: The source and target language in a
2- or 5-letter format such as "en", "en_US" or
"EN-UK". TinyTM does not distinguish between lower and upper
case or between dash ("-") and underscore ("_").
- source_text and target_text: The original text and its translation
in UTF-8 format.
- tinytm_new_segment(segment_key, parent_key, creation_ip, customer_name,
segment_type, text_type, document_key, source_lang, target_lang, source_segment,
target_segment, tag_string)
Adds a new TM segment to the database.
This is the "long" version of the procedure. It performs the
same operation as the short version but adds the following arguments:
- segment_key: A textual description (max 100 characters) of the segment.
- parent_key: Marks this segment as an edited variant
of another segment. The parent_key is the segment_key of the original
segment.
- creation_ip: The IP address of the client that created the segment.
This makes it possible to trace the creators of rogue segments.
- customer_name: The name of the customer who paid for this translation.
- segment_type: The type of the segment such as "Paragraph",
"Word", "Sentence", ... Default is "Paragraph".
- text_type: The MIME type of the text. Default is "text/plain".
- document_key: A description (max. 1000 characters) of the document
containing the source_text.
- tinytm_get_fuzzy_matches(source_lang, target_lang, source_text)
-> setof tinytm_fuzzy_search_result(score, source_text, target_text)
Retrieves the 20 best matching translations for the given source_text.
The call arguments (source_lang, target_lang and source_text) have the
same meaning as above. The call returns a "set of" records. Each record
consists of:
- score: The fuzzy score (0.0% .. 100.0%)
- source_text: The source text of the returned segment
- target_text: The desired translation.
- tinytm_get_fuzzy_matches(source_lang, target_lang, source_text,
max_results, tag_string, penalties) -> setof tinytm_fuzzy_search_result(score,
source_text, target_text)
This is the long version of the previous call. In addition, it takes the
following arguments:
- max_results: The number of returned segments. Default is 20.
- tag_string: A string of comma-separated tags. Tags are a kind of
lightweight semantic markup, also known as "folksonomy".
Please see the explanation of tags in the "Fuzzy Matching"
page.
- penalty_string: A textual description of penalties to apply. Please
see the "Fuzzy Matching" page for details.
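The calls above can be combined into a small round trip: store one segment, then look it up again. This is a sketch under assumptions, not a reference session. The German translation is illustrative data, the return value of tinytm_new_segment is not documented here and is simply ignored, and passing empty strings for tag_string and for the penalties argument is assumed to mean "no tags, no penalties".

```sql
-- Store a new segment using the short form of tinytm_new_segment.
-- Per the description above, 'EN' / 'DE' should be treated like 'en' / 'de'.
SELECT tinytm_new_segment(
    'en', 'de',
    'THE EUROPEAN ECONOMIC COMMUNITY',
    'DIE EUROPAEISCHE WIRTSCHAFTSGEMEINSCHAFT'   -- illustrative translation
);

-- Short form: returns the best matches for the given source text.
SELECT score, source_text, target_text
FROM tinytm_get_fuzzy_matches(
    'en', 'de',
    'THE EUROPEAN ECONOMIC COMMUNITY'
);

-- Long form: same lookup, limited to the 5 best matches.
SELECT score, source_text, target_text
FROM tinytm_get_fuzzy_matches(
    'en', 'de',
    'THE EUROPEAN ECONOMIC COMMUNITY',
    5,     -- max_results
    '',    -- tag_string (assumption: empty means no tags)
    ''     -- penalties  (assumption: empty means no penalties)
);
```

The long form is worth using whenever the client only displays a handful of suggestions, since max_results caps the number of returned segments.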
Other Functional Protocol V0.1 Elements
For Protocol V0.1 we have decided to skip additional API calls for auxiliary
tables such as languages, groups, segment types etc. Instead, the following
tables are readable for the client:
- tinytm_languages(language_id, language_name)
Contains the list of source and target languages.
- tinytm_segment_types(segment_type_id, segment_type_name)
Contains the list of segment types
- tinytm_groups(group_id, group_name, join_policy)
Contains the list of groups (currently not used)
- tinytm_customers(customer_id, customer_name, not)
Contains the list of customers visible to the authenticated user.
- tinytm_tags(tag_id, tag_name, description)
Contains the list of valid tags.
In the next protocol version, we will replace direct access to these
tables with a number of PL/SQL API calls.
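While direct table access is still in place, reading the auxiliary tables should need nothing more than a plain SELECT. The following sketch uses only the table and column names listed above. The ORDER BY clauses are a presentation choice, and it is assumed that the connected user has read permission on these tables.

```sql
-- List the languages available as source or target language.
SELECT language_id, language_name
FROM tinytm_languages
ORDER BY language_name;

-- List the segment types (e.g. for a drop-down in a client).
SELECT segment_type_id, segment_type_name
FROM tinytm_segment_types
ORDER BY segment_type_name;

-- List the valid tags together with their descriptions.
SELECT tag_id, tag_name, description
FROM tinytm_tags
ORDER BY tag_name;
```

A client that caches these lists at startup would only need to adapt this one place once the tables are replaced by API calls.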