27 September 2005

Internet-based Sharing of Translation Memory

New solutions that are changing translation and localization workflows

GARRY LEVITT
(Garry Levitt is a localization project manager at LinguaPoint. He can be reached at garry_levitt@linguapoint.de )

Internet-based translation memory (TM) sharing will constitute the next big cost-saving improvement to localization workflow since the introduction of translation memory itself. This article explains the extent of the changes to current localization workflow that will be brought about by the widespread introduction of this new feature.

TM Sharing Explained

TM sharing enables two or more translators to work on the files of the same project while using the same TM to retrieve previously translated segments. Retrieval covers both identical segments (100% matches) and merely similar ones, a technique known as "fuzzy matching." We can distinguish between two types of TM sharing. First, a Master TM can be shared internally over the vendor's own local area network (LAN), which would typically be the case if several in-house resources were working on the same project. This setup is supported by various translation-tool packages such as TRADOS Team Edition 6 or TM Server.
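As a rough illustration of what a TM lookup does, the sketch below scores a new segment against stored source segments and returns the best hit above a threshold. It is only a minimal sketch: the similarity measure (Python's `difflib`), the 75% threshold, the function names and the sample segments are all assumptions made for this example, and commercial TM tools use their own scoring methods.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score between two source segments."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def lookup(tm: dict, segment: str, threshold: float = 75.0):
    """Return (score, stored_source, stored_target) for the best hit, or None."""
    best = None
    for source, target in tm.items():
        score = similarity(segment, source)
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


# Hypothetical English-German TM with two entries.
tm = {
    "Click the Save button.": "Klicken Sie auf die Schaltfläche Speichern.",
    "Close the file.": "Schließen Sie die Datei.",
}

exact = lookup(tm, "Click the Save button.")      # identical segment: 100% match
fuzzy = lookup(tm, "Click the Open button.")      # similar segment: fuzzy match
miss = lookup(tm, "Restart your computer now.")   # nothing similar enough: None
```

A fuzzy hit still has to be edited by the translator; the TM only proposes the stored translation of the closest source segment.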

But with tools such as Logoport and T-Remote Memory, several translators in different locations can also share a single TM. According to the developers of T-Remote Memory, Telelingua Software, their application is not another TM software package but an add-on that can be used in conjunction with any existing TM solution. Translators only require an Internet connection in order to connect to a "communication server," which transmits and collects all data provided by the connected TMs. This second type of TM sharing, also referred to as telesharing, represents the latest innovation in translation tool development. A single TM (and in the case of T-Remote Memory, one or more TMs) is fed with the translations of different translators. As the TM grows, translators can not only call upon their own translated segments in the TM, but also upon the translations of the other translators involved on the project — in real time.

TM Sharing or Multi-user Functionality?

In reality, Internet-based TM sharing had been around for quite some time before Logoport and Telelingua decided to market their solutions, albeit in a slightly different form. ForeignDesk is a translation tool that has provided support for TCP/IP-based connection of translators for a number of years through its "multi-user functionality." The difference with ForeignDesk, which has recently been released by Lionbridge as open source, lies in the basic functionality of the tool. It does not use TMs but "projects" that already contain legacy material leveraged from previous projects. Yet the basic principle remains the same. Translators can connect their project with the projects of the other translators via TCP/IP and can thereby share their translations. The main drawback of this type of telesharing is that all translators involved have to be on-line and working at the same time in order to achieve the greatest benefit. This almost rules out a situation in which translators from totally different parts of the world work together on one project, unless, of course, they pay a flat rate for their Internet connection and can leave their computers on overnight, when their night coincides with someone else's working day in another time zone. It could be argued that this particular tool is better suited to regional teams of translators than to global teams.

TM Sharing vs. TM Exchange

The benefits of TM sharing can best be illustrated by first explaining the workflow of a localization project without the use of this technology. The increasingly short turnaround times for localization projects (necessary if we are ever to reach anything close to real simultaneous shipment of localized products) require a language services provider (LSP) to divide a project up among several freelance translators, assuming the necessary resources are not available in house. If they are available in house, the first type of TM sharing, a network TM, can be used instead. In order to limit the risk of inconsistencies creeping into the localized texts, the freelance translators are asked to exchange their individual TMs or translated bilingual files.

The process of TM exchange includes the need for translators to then import these memories into their own TM as they would with TRADOS. If they are exchanging translated bilingual files, all they need to do is update their TMs with these translations by "cleaning" the files into their translation memories. This memory-synchronization step can be especially cumbersome if you are dealing with very large TMs. Furthermore, translators or editors are often faced with the task of changing their translation a number of times over in order to be consistent with the translation that other translators have used.

TM exchange can go a long way in ensuring a certain degree of consistency when several translators are involved on one project. TM sharing, on the other hand, can really come to grips with the issue of consistency throughout a project, as inconsistencies are unlikely. If another translator has already translated a similar or identical segment, the Master TM will provide this translation, thereby doing away with the need for "manual" TM exchange.

Client-side Benefits

There is an increasing demand for the deployment of network TMs to enable further cost reductions and to allow for faster time-to-market, especially where large projects are concerned. In the current economic downturn, clients are constantly seeking to reduce localization costs as budgets dwindle. Many refuse to pay for proofing of 100% matches and repetitions or demand reduced rates for certain fuzzy-match brackets. Against this background, TM sharing can be seen as yet another hole in the belt that is slowly being tightened around localization. As clients become more and more aware of the benefits of TM sharing and the use of network TM functionality in several current tools, they will also come to expect their localization partners to adopt the use of this technology. This should not be a problem for most multilanguage vendors (MLVs), but survival is becoming increasingly hard for some of the middlemen of the cascaded localization production chain, namely, the single-language vendors (SLVs) and smaller agencies, as the MLVs pass on this requirement to their subcontractors.

Currently, an MLV may require an SLV to call on the services of several translators when carrying out a large localization project. The division of files among several translators can result in a loss of fuzzy matches, as every translator will only be translating a part of the total project with his or her own stand-alone TM. Depending on how large the loss of fuzzy matches is, an MLV may agree to foot the bill for this loss as until now there has been no other alternative than to choose a conventional multi-translator approach without TM sharing.
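The loss of matches described above can be made concrete with a small simulation. The sketch below is an illustration under simplifying assumptions, not a model of any real tool: exact repetition stands in for fuzzy matching, translators are assumed to work in lockstep (one segment each per round), and the function name and sample segments are invented for the example.

```python
def count_unmatched(files, shared):
    """Count segments that have to be translated from scratch.

    files:  one list of source segments per translator.
    shared: True for a single Master TM, False for one stand-alone TM each.
    """
    master = set()
    own = [set() for _ in files]
    unmatched = 0
    rounds = max(len(segments) for segments in files)
    for i in range(rounds):
        for t, segments in enumerate(files):
            if i >= len(segments):
                continue
            seg = segments[i]
            tm = master if shared else own[t]
            if seg not in tm:
                unmatched += 1   # no earlier translation available
            tm.add(seg)          # the new translation feeds the TM
    return unmatched


# Two translators, four segments each, with repetitions across both files.
files = [
    ["Click OK.", "Save the file.", "Click Cancel.", "Click OK."],
    ["Save the file.", "Click OK.", "Close the file.", "Click Cancel."],
]

standalone = count_unmatched(files, shared=False)  # 7 segments from scratch
shared = count_unmatched(files, shared=True)       # 4 segments from scratch
lost = standalone - shared                         # 3 matches lost by splitting
```

In this toy project, three of the eight segments are translated twice when each translator works with a stand-alone TM, and that difference is the loss somebody currently has to pay for.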

TM sharing will soon become the standard, as MLVs will refuse to subcontract projects to SLVs that either don't have TM sharing incorporated into their localization workflow or cannot allocate the necessary in-house resources to the project. This will not only be because of the risk to the overall consistency of the project and the longer turnaround time, but also because MLVs and clients will no longer be willing to pay for the loss of fuzzy matching that results from the use of stand-alone TMs. SLVs will have the choice between two courses of action. They can decide to integrate TM sharing into their localization workflow and show themselves to operate on the cutting edge of localization. Or they could choose to ignore the developing trend and view the investment in new tools and software licenses as unnecessary or postponable. The latter, however, is not a viable option, as they risk missing the boat altogether when MLVs decide to look for other subcontractors. And if there is a market, someone else will soon be willing to oblige.

Share and Share Alike

But convincing translators to accept TM sharing as the right way forward may take time. Before the advent of TM some years ago, translators were paid a fixed word rate. Understandably, it took them some time to adapt to the idea of being paid a reduced rate for repetitions, 100% matches or fuzzy matches. However, translators have gradually fallen into line with the use of TM as the benefits of this technology became clear. So how will translators react to TM sharing? Some translators today already draw the line at a staggered delivery of their translated files with intermediate synching of TMs. They want to be paid in accordance with the initial analysis of the translatable files and do not want to see their workload and, consequently, revenue reduced by an exchange of TMs. For the sake of consistency, most agencies will agree to pay the translators for the initial word count even if the real word count is less through the sharing of TMs.

Other important psychological factors will play a role in determining a translator's acceptance of TM sharing. An element of mutual trust and professionalism is essential for the success of these projects. If, for example, we are dealing with a highly repetitive set of files, one translator may decide to delay work on the project until another translator has translated the first set of files, so that the translator who waited benefits from the likely additional fuzzy matching. Even if the project text is less repetitive in nature, that translator will probably have saved himself or herself quite a bit of terminology research, as key terms will have already been translated.

Match Made in Heaven?

Furthermore, translators will most likely not approve of having to base their translations on the "bad translations" of other translators that will crop up as fuzzy matches. They will either have to grin and bear the translations of colleagues in their own files for the sake of consistency or totally rewrite fuzzy matches that came from other people's translations. In the first scenario, translators might be risking their reputations because in-house editors will not know who made the original mistake. In the second scenario, translators are likely to need more time to complete their work if they are to correct the possible mistakes of their colleagues, for the TM will contain all the translations from various translators before the files have been proofed and edited. This is an important issue, as the translation turnaround time may have been shortened at the expense of the actual translation quality. One mistake by a single translator could have repercussions in numerous other files that are being processed by other translators.

Calculating Payment

It still remains to be seen whether vendors and translators will get paid for a fixed word count as described in the analysis of their files. One thing that we can be sure of is that the advantage will shift more in the direction of the vendor and the client as we slowly move from mere TM exchange to full-fledged TM sharing. However, an important condition for this shift will be the capability to indicate objectively and in concrete figures the net contribution of every translator working on the project.

A way may soon be created to monitor a translator's work and progress during a given project, including the number of words and type of matches that have been translated. In most translation tools, each translated segment already receives a unique identification of the person that created the translation. On project completion, the Master TM could perhaps be filtered on the basis of a translator's identification, and a personal log could be created by means of a logging system. These logs would accurately reflect the real number of translated words per translator and could be the basis for payment.
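The logging idea can be sketched in a few lines. The following is a hypothetical illustration only: it assumes that each Master TM entry stores, besides the translator's ID, the match band the segment fell into when it was translated, which is not something the tools discussed here are known to record. The field names, the rate factors and the sample data are likewise invented.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TMEntry:
    source: str       # source segment
    created_by: str   # translator ID stored with the segment
    match_band: str   # assumed field: band at the time of translation


# Hypothetical share of the full word rate paid for each band.
RATE_FACTOR = {"no match": 1.0, "fuzzy": 0.5, "100%": 0.25}


def translator_logs(entries):
    """Filter the Master TM by translator: {translator: {band: word count}}."""
    logs = defaultdict(lambda: defaultdict(int))
    for entry in entries:
        logs[entry.created_by][entry.match_band] += len(entry.source.split())
    return {translator: dict(bands) for translator, bands in logs.items()}


def payable_words(log):
    """Weight the word count of each band by its rate factor."""
    return sum(words * RATE_FACTOR[band] for band, words in log.items())


master_tm = [
    TMEntry("Click the Save button.", "anna", "no match"),
    TMEntry("Click the Open button.", "ben", "fuzzy"),
    TMEntry("Close the file.", "anna", "no match"),
    TMEntry("Close the window.", "ben", "fuzzy"),
    TMEntry("Restart your computer.", "ben", "no match"),
]

logs = translator_logs(master_tm)
anna_payable = payable_words(logs["anna"])   # 7 new words at full rate
ben_payable = payable_words(logs["ben"])     # 7 fuzzy words at half rate + 3 new
```

Counting words by splitting on whitespace is a simplification; real word counts follow the rules of the analysis tool in use.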

This solution, however, would mean that translators would be continuously confronted with the problem of not knowing how long it will take them to complete the job and how much they stand to earn. This would depend on many factors such as the number of assigned translators, their individual speeds and the number, timing and length of the breaks that they take. If one translator takes a break, the Master TM will have grown considerably by the time he or she returns. This influences the remaining workload for all translators, but especially for the one who took the break, who will find that his or her allocated chunk has shrunk in the meantime. Furthermore, translators would no longer be able to divide up their own time, as the progress of the other translators would constantly be influencing everyone's workload and the required time until completion. This could certainly be a problem in texts with a high degree of potential fuzzy matching. Frequent progress reports will also have to be made available to the translators, in order to prevent them from having to turn down projects as their planning becomes increasingly difficult.

How far can we afford to go in our bid to make translation more cost effective? Increased uncertainty is a high price to pay for translation cost reduction and may not be conducive to higher quality — consistency or no consistency.

Control and Security

The issue of increased control and security through TM sharing is interesting and requires further attention, as it may be a double-edged sword. On the one hand, translators will no longer have a copy of any valuable TM that they have been feeding their translations into. Translation capital is protected as the TM resides on a local server to which translators have only limited remote access. On the other hand, if this Master TM were ever to become corrupt outside of the LSP's office hours, translators — possibly in a different time zone — would not have a copy of the TM available to them to carry on working, as would have been the case with a stand-alone TM. Neither would they be able to try any function, such as the TRADOS "reorganize" function, which can often solve the most frequently occurring TRADOS TM problems. With TM sharing, this option would probably not be open to translators. They would have to stop working for the rest of the day until an engineer on the LSP's side could be contacted.

Also, with all of the agency's TMs residing on a server, server downtime could spell disaster for a large number of projects. On the other hand, there would no longer be any need for translators to send their translated files back to the vendor, or to resend them if they became corrupt or otherwise problematic. The LSP could simply recreate the translation automatically by pre-translating the source files with the updated Master TM.
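Recreating a translation from the Master TM amounts to looking up every source segment again. The sketch below shows the idea with exact matches only; the function name, the data layout and the sample segments are assumptions for illustration, and real tools work on bilingual file formats rather than plain lists.

```python
def pretranslate(source_segments, master_tm):
    """Rebuild a target file from the Master TM.

    Returns (target_segments, missing), where missing lists the source
    segments that have no exact entry in the TM and stay untranslated.
    """
    target, missing = [], []
    for seg in source_segments:
        if seg in master_tm:
            target.append(master_tm[seg])
        else:
            target.append(seg)    # leave the source text in place
            missing.append(seg)
    return target, missing


# Hypothetical Master TM after the translators have finished their work.
master_tm = {
    "Click OK.": "Klicken Sie auf OK.",
    "Close the file.": "Schließen Sie die Datei.",
}

source_file = ["Click OK.", "Close the file.", "Click OK."]
target_file, missing = pretranslate(source_file, master_tm)
```

An empty `missing` list means the file could be rebuilt completely; any entries in it point to segments that were never stored in the Master TM.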

On some projects, translators are required to use special settings for their stand-alone TMs and are sent a list of instructions. With TM sharing, any special settings can be applied by the LSPs themselves without any intervention by the translator, thereby avoiding potential mistakes.

Key Solution Providers

Among the key providers of TM solutions that offer support for simultaneous TM access over LAN are STAR with its Transit XV, TRADOS with TRADOS 6 LSP and TM Server, Atril with Déjà Vu X and SDL with SDLX Enterprise Server 2003 and SDLX for UNIX.

While the various TRADOS 6 versions (Freelance, LSP and Power LSP) are targeted mainly at freelance translators and language service providers respectively, the TRADOS TM Server, the next-generation TM server technology and flagship version, is more suited to the needs of global corporations, although it is also used by service providers requiring the extra benefits of this client/server version. TRADOS also plans to introduce a version for the Internet by the end of 2003.

Déjà Vu X is the latest release of Atril's translation tool, which includes Editor, Standard, Professional and Workgroup versions. The differences among these versions include the number of translation and terminology databases that each can access simultaneously over a LAN. Atril is also working on a Web TM server (TM Remote Server) to enable Internet-based TM sharing. It is expected to be available in the second half of 2003.

In addition to the companies providing solutions with support for TM access over LAN, Logoport Software's and Telelingua's products also support database access via an Internet connection.

Telelingua's T-Remote Memory recently became commercially available. It allows users situated anywhere in the world to simultaneously share TMs, terminology databases and MT systems, provided that these can be queried remotely via APIs or Web services. T-Remote Memory is available in Standard and Enterprise versions as well as in a Leased version that enables companies to handle unexpected peak work flexibly.
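Whether leasing or buying makes more sense depends on how long the peak lasts. Using the T-Remote Memory prices listed in the comparison at the end of this article (Leased version: €1710 for 10 users for 3 months; Enterprise version: €8550 for 10 users), the sketch below works out the break-even point. It assumes that a lease can be renewed in whole 3-month periods at the same price and ignores maintenance, upgrades and discounts, none of which are stated in the price list.

```python
import math

LEASE_PRICE = 1710        # euros, 10 users, one 3-month period
LEASE_MONTHS = 3          # minimum lease period in months
ENTERPRISE_PRICE = 8550   # euros, 10 users, one-off purchase


def lease_cost(months: int) -> int:
    """Cost of leasing for the given duration, billed in whole periods."""
    periods = math.ceil(months / LEASE_MONTHS)
    return periods * LEASE_PRICE


def break_even_months() -> int:
    """First duration at which leasing costs at least as much as buying."""
    months = LEASE_MONTHS
    while lease_cost(months) < ENTERPRISE_PRICE:
        months += LEASE_MONTHS
    return months


one_peak = lease_cost(3)          # 1710 euros for a single 3-month peak
break_even = break_even_months()  # 15 months, i.e. five lease periods
```

Under these assumptions, leasing is the cheaper route for any need shorter than 15 months in total.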

Logoport is a fully Web-based TM solution developed by the German company Logoport Software GmbH. Rather than directly purchasing the software, translators, agencies and other companies can lease an amount of time on the Logoport system at short notice. The leased capacity can be adjusted with the current volume of work, and users are charged for the net time that they have been using the system. A company can also acquire licenses to install the Logoport server on its internal network.

Conclusion

TM sharing has the potential to turn the translation industry upside down. We have witnessed the introduction of TM at a time when translators did not avail themselves of more than a computer and a word processing program and have now reached a point where translators in the localization industry cannot afford to work without a TM tool. The widespread introduction of TM sharing could reverse this trend completely. Many freelance translators will no longer need to own a TM tool themselves but will log on to their client's system and hitch a ride on the TM solution that the client has implemented.

All clients are interested in cost efficiency, quality and time-to-market. TM sharing can help clients and vendors realize improvements in all three areas. The localization workflow paradigm is set to change. Clients, vendors and eventually translators will welcome this change as it provides benefits for all and creates room for the language industry to expand even further. Translation is likely to be targeted for further cost reductions for quite some time to come. TM sharing can help relieve some of the tension surrounding price arrangements, which are slowly developing into a bone of contention among clients, MLVs and SLVs. Progress is inevitable, and TM sharing is as progressive as it gets.

Comparison of Key Features and Benefits
=========================================

Déjà Vu X Standard, Professional and Workgroup
-------------------------------------------------
Nature of the product: CAT tool
Simultaneous database access over Internet: No
Simultaneous database access over LAN: Yes
Increased consistency when working with several translators: Yes, but only in-house translators
Centralized management of project configuration:
Wider choice of translators: Yes, through DVX Editor.
Increased control over intellectual property: No
Shorter turnaround times when working with translators in different locations: N/A
Software purchase required: Yes
Price (in euros): Déjà Vu X Workgroup: Server license and first workstation: €2250; Additional workstations: €1490 each
Editor: Native editor
Additional features: Depending on the version, simultaneous access to several translation memories is possible. The Workgroup version includes the TM Builder, a programmable API, and enables the creation of satellite projects that can be sent to freelance translators and edited with DVX Editor (free).
Further information: www.atril.com

T-Remote Memory
------------------
Nature of the product: Add-on for existing CAT tool
Simultaneous database access over Internet: Yes (TM system must support queries via APIs or Web services)
Simultaneous database access over LAN: Yes
Increased consistency when working with several translators: Yes, translators can be situated anywhere
Centralized management of project configuration: Yes
Wider choice of translators: Yes
Increased control over intellectual property: Yes
Shorter turnaround times when working with translators in different locations: Yes
Software purchase required: Yes. Leasing possible.
Price (in euros): Standard version: €2100 (for 3 users, minimum); Enterprise version: €8550 (for 10 users), available from July 1, 2003; Leased version: €1710 for 10 users for 3 months (minimum period)
Editor: MS Word
Additional features: Share several translation tools at the same time (translation memories, terminology databases, machine translation systems)
Further information: www.telelingua.com

Logoport
-----------
Nature of the product: CAT tool
Simultaneous database access over Internet: Yes
Simultaneous database access over LAN: Yes
Increased consistency when working with several translators: Yes, translators can be situated anywhere
Centralized management of project configuration: Yes
Wider choice of translators: Yes
Increased control over intellectual property: Yes
Shorter turnaround times when working with translators in different locations: Yes
Software purchase required: No. Lease access time.
Price (in euros): €0.70/hour per user (discounts available for multiple users). A Logoport server can also be installed on a company network; prices remain the same as for connecting to the remote server.
Editor: MS Word
Additional features: Context Matching, Logoport Messenger, terminology management, file format converters
Further information: www.logoport.net

TRADOS 6 Power LSP, TRADOS TM Server
-------------------------------------
Nature of the product: CAT tool
Simultaneous database access over Internet: No
Simultaneous database access over LAN: Yes
Increased consistency when working with several translators: Yes, but only in-house translators
Centralized management of project configuration: TRADOS 6 LSP: No; TRADOS TM Server: Yes
Wider choice of translators: No
Increased control over intellectual property: TRADOS 6 LSP: No; TRADOS TM Server: N/A
Shorter turnaround times when working with translators in different locations: N/A
Software purchase required: Yes
Price (in euros): TRADOS 6 Power LSP: €2595; TRADOS TM Server: no price indication provided by TRADOS
Editor: MS Word, plus other native editors
Additional features: TRADOS 6 LSP: Data mining tool, front-ends, terminology management, file-format converters; TRADOS TM Server: ConteXT TM optional
Further information: www.trados.com

SDL WorkFlow, SDLX Enterprise Server
-------------------------------------
Nature of the product: Multilingual content management system, CAT tool
Simultaneous database access over Internet: SDL WorkFlow: Yes; SDLX Enterprise Server: Yes, via an Internet/Intranet Virtual Private Network (VPN) connection. Implementation as a Web service is planned for the end of this year.
Simultaneous database access over LAN: Yes
Increased consistency when working with several translators: Yes
Centralized management of project configuration: Yes
Wider choice of translators: Yes
Increased control over intellectual property: Yes
Shorter turnaround times when working with translators in different locations: Yes
Software purchase required: Yes
Price (in euros): SDLX Enterprise: $1,000-1,500 per user (quoted in dollars rather than euros; price is tiered and based on the number of users)
Editor: Native editor
Additional features: SDLX Enterprise Server: SDL Align, SDL Project Wizard, SDL Termbase, SDL Maintain, SDL Apply, SDL Analyse, SDLX AutoTrans and others.
Further information: www.sdlintl.com