An overview of currently available solutions

GARRY LEVITT
(Garry Levitt is a localization project manager at LinguaPoint GmbH. He can be reached at garry_levitt@linguapoint.de)

In “Internet-based Sharing of Translation Memory” (MultiLingual Computing & Technology #57 Volume 14 Issue 5, http://www.multilingual.com/levitt57), the concept of sharing translation memory (TM) over the Internet was discussed, and the benefits, drawbacks and obstacles of this particular type of workflow were addressed, as well as the potential impact on clients, agencies and freelance translators. At the time, a number of translation tool developers had new solutions in the pipeline, which have since been delivered to market. The aim of this article is to take a closer look at these new solutions and provide an update on existing and planned solutions in an attempt to find out whether internet-based TM sharing is really changing the paradigm of translation workflow.

TRADOS TM Server

In the second quarter of 2003, TRADOS announced the availability of the internet version of TM Server, its first TM solution with support for internet-based sharing of TM. Prior to this release, TMs could only be shared via a local area network (LAN), which in effect meant that translators needed to be more or less in the same location in order to feed their translations into a master TM and benefit from the work of other translators, who would be simultaneously populating the same TM with their translations. Apart from the obvious drawback of this workflow, which requires translators to either already be on site or travel to the location where the team of translators is working, there were also limits to the performance and productivity that could be achieved. Previous releases of TRADOS Translator’s Workbench (TWB) were desktop based and allowed translators to open TMs on a local hard disk or on a shared network in the so-called file-sharing mode. This latter option has severe limitations in terms of network traffic, as it usually involves a heavy network load and is designed to work with a limited number of users only. With the availability of TM Server, a client/server architecture guarantees very low traffic, particularly in multiuser scenarios, and offers superior support for simultaneous access to TMs via LAN. Furthermore, TM Server also offers internet-based sharing of TM when used in combination with TM Anywhere, which adds internet capability to TM Server. The server-enabled version of TWB (used, for example, by freelance translators) now allows translators to connect to and simultaneously access TMs residing on a server from their client version of TRADOS while remaining in the comfort of their own homes or offices.

The internet-enabled client/server architecture of TM Server and TM Anywhere is targeted in particular at users who need simultaneous multiuser access to medium, large or very large TMs — from around 100,000 translation units (TUs) to millions of TUs — using intranet or internet environments. TM Server addresses the need for a higher productivity rate when processing very large translation volumes. Translators can remotely access any TM that is placed on the TM Server via intranet (LAN) or internet using TWB. This means that translators who are familiar with TRADOS do not need to acquaint themselves with new tools that have new functions and unfamiliar user interfaces.

TM Server is a high-performance, back-end database server for TM and is based on a scalable client/server architecture that supports both Microsoft SQL Server and Oracle databases under Windows and Unix. The underlying architecture can best be described as consisting of three tiers. First, a client layer comprises the translation client (TWB), used by the translator to process the translatable files, and the TM Server Manager client, used by administrators to configure TM Server, manage users and define TM access rights within the system. Second, a middleware server layer manages the communication between client and server in both intranet and internet scenarios. This layer includes the TM Server and TM Anywhere server components. The latter supports both the HTTP and TCP protocols. Third, a database server layer hosts the database back end, stores the TUs in the supported databases (called containers) and provides business logic such as fuzzy matching. Conventional file-based TMs comprising a set of five files have to be adapted for use with TM Server by migrating the contents from the native format to an SQL database. This migration process is simple and is done by exporting file-based TMs and importing the data into the desired database. Import and export formats are the same for both server- and file-based TMs, so TM data can be migrated in both directions.
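To make the idea of fuzzy matching in the database layer more concrete, the following is a minimal sketch in Python. It is not the TRADOS algorithm, which is not described here. It simply scores each stored TU against a new segment with the standard library's difflib.SequenceMatcher and returns the candidates above a threshold. The sample TM contents, the function name fuzzy_lookup and the 0.75 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory TM: source segment -> target segment.
TM = {
    "Click the Save button.": "Klicken Sie auf die Schaltfläche Speichern.",
    "Click the Cancel button.": "Klicken Sie auf die Schaltfläche Abbrechen.",
    "Restart the computer.": "Starten Sie den Computer neu.",
}


def fuzzy_lookup(segment, tm, threshold=0.75):
    """Return (score, source, target) tuples at or above threshold, best first.

    The score is difflib's similarity ratio (0.0 to 1.0), used here as a
    stand-in for a real TM tool's match percentage (an assumption).
    """
    matches = []
    for source, target in tm.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            matches.append((round(score, 2), source, target))
    # Highest-scoring candidates first.
    return sorted(matches, reverse=True)


results = fuzzy_lookup("Click the Save button now.", TM)
for score, source, target in results:
    print(score, source, "->", target)
```

In a client/server setup of the kind described above, a lookup like this would run on the server side, so that only the query segment and a handful of candidate TUs travel over the network.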

TM Server Manager, the administration client, includes two separate modules. The User Manager module is for managing users and setting access rights within the TM Server system. The Resource Manager module is used to configure the system, create server-based TMs and migrate TM data as described above. Once the administrator has created a server-based TM and has defined access rights that are expressed in terms of hierarchical TM roles, each representing a set of user rights (including, for example, read-write or read-only access to TM data), translators can access the TM from a server-enabled version of TWB using the Connect command. The TM selected is then made available for use during translation. In this scenario, all translatable files are handed off to the translator by e-mail or FTP.
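Hierarchical roles of the kind described above can be sketched as an ordered set of levels, where each level includes the rights of the levels below it. The role names, user names and TM name in this Python sketch are hypothetical and are not taken from TM Server Manager. The sketch only illustrates the general principle of a role-based access check.

```python
from enum import IntEnum


class TMRole(IntEnum):
    """Hypothetical roles; a higher value includes all rights of lower ones."""
    READ_ONLY = 1
    READ_WRITE = 2
    ADMINISTRATOR = 3


# Hypothetical access table set up by an administrator:
# (user, TM name) -> role granted for that TM.
ACCESS = {
    ("anna", "manuals_en_de"): TMRole.READ_WRITE,
    ("ben", "manuals_en_de"): TMRole.READ_ONLY,
}


def is_allowed(user, tm_name, required):
    """True if the user's role for this TM is at least the required role."""
    role = ACCESS.get((user, tm_name))
    return role is not None and role >= required


print(is_allowed("anna", "manuals_en_de", TMRole.READ_WRITE))  # anna may add TUs
print(is_allowed("ben", "manuals_en_de", TMRole.READ_WRITE))   # ben may only read
```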

TRADOS TeamWorks

Translation process management in an internet scenario can be further optimized with TRADOS TeamWorks, which is designed to support the collaboration of local and distributed translation teams around centralized translation assets. The TeamWorks module consists of an additional TeamWorks Server and a database. All team members use the TeamWorks client, which is configurable to the user’s role, to interface with the server. TeamWorks provides each member of the translation team with a host of additional capabilities. For the project manager (PM), who sets up the actual projects, this includes full control over tool/project settings, such as penalties and filters, for all project participants, which is not possible when using only TM Server/TM Anywhere. In addition, real-time project status tracking and progress monitoring are available. For example, a “Calculate translation progress” report allows the PM to see how much content of each file has been translated.

Translators, on the other hand, can share progress information with the PM, automatically connect to remote project TMs or synchronize to work locally. This last option allows translators to work offline by copying all files, the project TM and the termbase to the local machine. TMs remain in sync through the use of a special feature that automatically uploads the offline user’s newly created TUs to the central TM, while downloading new TUs from the central TM to the translator’s local project TM. The TeamWorks client also enables translators to seamlessly upload translated files and download translatable files into TWB, so there is no need to exchange files by e-mail or FTP. Furthermore, the quality of the TM data is safeguarded by a workflow engine, which ensures that translations are initially stored only in a virtual project TM before being transferred to the central TM after a final quality check by an editor.
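The two-way synchronization described above can be illustrated with a small Python sketch. It is a simplified model and not TeamWorks code: TMs are plain dictionaries, the function name synchronize is hypothetical, and the conflict rule (the central entry wins) is an assumption. The quality-check step via a virtual project TM is left out for brevity.

```python
def synchronize(local_tm, central_tm, pending):
    """Two-way sync between a translator's local TM and the central TM.

    local_tm, central_tm: dicts mapping source segment -> target segment.
    pending: set of source segments translated while offline.
    Returns (uploaded, downloaded) counts.
    """
    uploaded = 0
    for source in sorted(pending):
        # Assumption: if the central TM already has this segment, it wins.
        if source not in central_tm:
            central_tm[source] = local_tm[source]
            uploaded += 1
    pending.clear()

    downloaded = 0
    for source, target in central_tm.items():
        if source not in local_tm:
            local_tm[source] = target
            downloaded += 1
    return uploaded, downloaded


local = {
    "Open the file.": "Öffnen Sie die Datei.",
    "Close the window.": "Schließen Sie das Fenster.",  # translated offline
}
offline_work = {"Close the window."}
central = {
    "Open the file.": "Öffnen Sie die Datei.",
    "Print the page.": "Drucken Sie die Seite.",  # added by a colleague
}

counts = synchronize(local, central, offline_work)
print(counts, local == central)
```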

Both the TM Server/TM Anywhere and the TeamWorks solutions are suitable for agencies and client-side companies. When asked about the use in practice, product manager Nick In ‘t Ven said that TRADOS expects a significant shift from the use of standalone TMs to internet-enabled TMs as workflows can be simplified and data synchronization problems are avoided. “TeamWorks, as a comprehensive project management solution, will increase this trend,” he says.

SDLX and SDLWorkFlow

SDL International recently launched two new releases of products with support for web-enabled TM functionality. The first — SDLX 2004 from SDL Desktop Products, the localization tools division of SDL International — is available in different versions, which all support simultaneous access over a LAN with the exception of the Standard and Lite editions, which are targeted at standalone use. In addition, SDLX 2004 Enterprise Server also supports real-time online access over an internet/intranet virtual private network connection. In this scenario, SDLX is used in client/server mode to connect to a shared TM server, thereby allowing translators in different locations to access the same TM. With SDLX Enterprise Server the back end of SDLX runs on SQL or Oracle. Although this product addresses the needs of the vast majority of the enterprise market, a more customized solution for high-end customer environments with “ultra-large” TM requirements is also available in a UNIX version.

According to the developers, the second solution, SDLWorkFlow 2004, provides the most comprehensive translation management system — also known as globalization management system (GMS) — on the market today with more than 40 live installations. It offers improved translation quality and consistency through centralized TM, terminology, and dictionaries that can all be shared efficiently by in-house or agency resources through web-based access. Project management visibility is increased, and processes can therefore be improved through real-time reporting of translation performance metrics. These reports include information regarding key metrics such as TM leverage, productivity of linguistic resources and job status. Custom reports can be defined and output into HTML or Excel formats for further analysis. Reports can also be scheduled and circulated to the appropriate project members. This feature can, for example, be used to keep the project team informed about the progress of the translation, which can be extremely useful when the leverage of translations from the work of multiple translators is affecting the remaining workload of each individual translator.

The SDLWorkFlow solution, which combines the entire range of GMS functionality in a unified web-based server architecture, is a distributed environment that consists of three layers. First is the customer’s multilingual content repository, which can consist of different components such as a content management server, file server, FTP server, web server and database server. Second, the SDLWorkFlow 2004 environment includes a web server, workflow engine, TM, MT (optional) and terminology (optional). The third layer consists of the individual users responsible for carrying out the localization tasks and includes the customer, PM, translator, reviewer and engineer. Communication among these layers occurs over the internet, whereby data is transferred in and out of SDLWorkFlow through a secure XML-based messaging system.

The centralized TM repository of SDLWorkFlow can be accessed from remote SDLX desktops via LAN or by using online translation editing and review. For online access, SDLWorkFlow integrates the SDLX web-editing environment, leveraging TM and terminology in a single web screen. This provides the user with a browser-based, SDLX-like interface with access to centralized TM over the internet. Any client with a web browser can reach the SDLWorkFlow portal, but will require his or her own login and password to gain access to the system. SDLWorkFlow supports role-based security so that new users can be easily set up with a unique login and password together with the appropriate user role. In addition, SDLWorkFlow allows linguists to carry out editing tasks offline by letting them download content, TM and terminology, while keeping track of content location and the time taken for the task. Both SDLWorkFlow and SDLX also allow project TMs to be created from a master TM so that work can be handed off to offline translators without the need to send them the entire TM.

The workflow process of SDLWorkFlow allows for many project steps to be automated. For example, in a preprocessing step, a Master TM can be applied to the translatable files. A working TM can then be used during the actual translation and review steps, after which the master TM is automatically updated upon final approval, thus guaranteeing the final quality of the TM data.

Web-enabled TM functionality is just one of the benefits of SDLWorkFlow 2004, and it is difficult to say how many users actually take advantage of the web-based editing and review functionality that allows them to share TM data via the internet. SDLWorkFlow 2004 is currently used by a number of client-side companies, including Canon, PeopleSoft, Dell and Sun Microsystems.

When asked whether he was expecting the translation workflow paradigm to shift substantially from standalone TM use to synchronized web-based TM use, Terry Lawlor, vice president of marketing at SDL International, said: “My view of the future is that TMs will remain closely guarded assets and for this reason access will always be through some kind of secure connection. Managing TMs and terminology centrally is the way enterprise customers will go so that maximum leverage can be obtained. More of the editing and review will be done with web-based functionality for ease of deployment and maintenance. Desktop editing and review via a client/server connection will also increase, but offline editing and review with downloaded TMs will remain the most popular method for a long time.” According to Lawlor, the first groups to fully embrace this functionality will be those farthest up the value chain. “The driving force for change,” he says, “will be large enterprises looking to drive cost out of every aspect of the localization process and large service providers who need to be more competitive. Freelancers will follow, not lead, this change.”

Wordfast

Developed by Yves Champollion, Wordfast started out as a free translation tool and quickly acquired a budding user community. Although it currently supports only LAN operation, support for internet-based TM sharing is planned for the end of 2004. This will allow workgroups of translators, all working on the same project, to share TMs and glossaries with no limit to the number of translators.

Currently, there are plans for both a solution for groups of freelance translators and a client/server architecture targeting corporations and agencies. The first, however, will be a peer-to-peer solution enabling translators to create informal workgroups. The underlying architecture will make TUs available to the other translators in the workgroup every time a segment is translated so that all TMs remain in sync. In this first implementation, Wordfast will use a chat-like layer that circulates TUs among users of the same group. “It’s like working on a big, central TM, but with much less chance of being left out in the dark, as all TMs are local and remain in sync,” says Champollion, “which is in keeping with Wordfast’s overall philosophy to be a freelancer’s tool rather than an agency tool.”

The second implementation will take the shape of a more classic client/server architecture, which will use one large, remote, centralized TM (residing on a Windows server running a server version of Wordfast) and clients equipped with a regular version of Wordfast and broadband internet access. Queries will be sent to the central TM with a one- or two-TU buffer for smoothness. “Of course, any internet hiccup will be felt. All grand schemes of sharing TM through the net are only as good as the net anyway,” Champollion adds.

In Champollion’s view, reliability is key. “I can safely predict,” he says, “that client-server architectures will inevitably lead to extravagant hype and to no less frustration from translators actually working with such tools. The advantage of a chat-like connection is that a temporary internet connection lapse will not affect translation. Translators will carry on working with their local TMs and will barely even notice. Wordfast simply makes up for unsent/unreceived TUs at the first reconnection and moves forward,” he concludes.
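The behavior Champollion describes, where local TMs keep working during a connection lapse and catch up at the first reconnection, can be sketched as follows. This Python sketch is a simplified model under stated assumptions and not Wordfast's implementation: the Peer and Workgroup classes are hypothetical, and the Workgroup object stands in for the chat-like layer that circulates TUs.

```python
class Peer:
    """One translator with a local TM (hypothetical model)."""

    def __init__(self, name):
        self.name = name
        self.tm = {}          # local TM: source -> target
        self.online = True
        self.unsent = []      # TUs created but not yet circulated


class Workgroup:
    """Stand-in for the chat-like layer that circulates TUs in a group."""

    def __init__(self, *peers):
        self.peers = list(peers)
        # TUs waiting to be received by each member.
        self.undelivered = {p.name: [] for p in peers}

    def translate(self, peer, source, target):
        # The local TM is always updated, whether or not the peer is online.
        peer.tm[source] = target
        peer.unsent.append((source, target))
        if peer.online:
            self._exchange(peer)

    def reconnect(self, peer):
        peer.online = True
        self._exchange(peer)

    def _exchange(self, peer):
        # Send: queue this peer's unsent TUs for every other member.
        for tu in peer.unsent:
            for other in self.peers:
                if other is not peer:
                    self.undelivered[other.name].append(tu)
        peer.unsent.clear()
        # Deliver: hand queued TUs to every member currently online.
        for other in self.peers:
            if other.online:
                for source, target in self.undelivered[other.name]:
                    other.tm.setdefault(source, target)
                self.undelivered[other.name].clear()


a, b = Peer("a"), Peer("b")
group = Workgroup(a, b)
b.online = False                                    # b loses the connection
group.translate(a, "Hello", "Hallo")                # queued for b
group.translate(b, "Goodbye", "Auf Wiedersehen")    # b keeps working locally
in_sync_before = a.tm == b.tm
group.reconnect(b)                                  # both sides catch up
in_sync_after = a.tm == b.tm
print(in_sync_before, in_sync_after)
```

The design point is that a translation never waits for the network: every TU goes into the local TM first, and circulation is a separate step that can be deferred.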

STAR Ireland

Besides internet reliability, connection speed is another aspect that users and solution providers alike are concerned about. Damian Scattergood, managing director of STAR Ireland, sees two main schools of thought on TM sharing. The first is centralized TM and centralized working, whereby everything sits at the heart of the system and team members log in. The second is centralized TM with project-level downloads, whereby all data is held at the core and downloaded with each new project. STAR focuses on giving translators the choice, supporting both options now with Transit XV and STAR James.

According to Scattergood, the internet is still not fast enough for all translators to use it. “The solution is project-level downloading,” he says, “where team members download translation kits. This option is much faster as kits are small, whereby the actual data remains centralized.” Agencies can manage the number of users in one go with a single install for all the licenses; for translators, the central management of software will mean that they always get the latest version of the tool. “The fundamental issue is that we will be required to work via the web in the future. How the actual TM is shared is a moot point. Most of the translators with whom I have talked emphasize the importance of actually being able to work with such a system. They need to have the ability to work either completely online or a combination of online/offline working with the same software setup, depending on their internet speed. The benefit of STAR’s solutions is that they have the choice,” Scattergood concludes.

Multilizer

Multilizer 6.0 is a software localization tool with its own TM, which is designed especially for translating software user interfaces and system messages. XML and database contents can also be translated. Although the latest version, Multilizer 6.0, does not feature web-based TM, its TM can be shared in a multiuser environment on a local network. The developers are currently working on extending the current multiuser support to a web service for internet and intranet, which is expected to be available soon. In a first step towards internet-based TM, the company has centralized the TM on a single database and has provided tools to manage not only its content but also the users, profiles and access rights. Multilizer’s first server-based TM was introduced in early 2002 and provides scalability for demanding localization projects. The TMs are installed on database servers such as Oracle or MS SQL Server.

“In our current implementation, the behavior of the system is rather similar to an internet-based TM except for visibility, which is currently restricted to the network boundaries,” says Jari Sarras, CEO of Multilizer. The company expects the benefits of internet-based TM sharing to be tangible for all links of the localization workflow chain, including clients, vendors and freelancers, who will be able to use the system flexibly (that is, only when necessary) at a reduced cost, due to a flexible licensing model. “We expect the workflow to shift towards a centralized and synchronized TM at least in the big segments of the market.”

Other Solutions

One of the first solutions to offer support for internet-based TM sharing was Logoport, which was developed by Logoport Software GmbH and has been available since 2001. One aspect that sets it apart from most other solutions is the fact that it requires no upfront investment in licenses or server infrastructure. Users lease access time on the system. Logoport is continually being developed. New features include support for Chinese, Japanese and Korean and client convenience measures such as integration of the various file filters into the client and enabling distribution of the system over multiple servers. Project setup and management features have been enhanced by allowing for project settings to be exported and automatically applied to a given project and through additional project reporting functionality aimed at empowering PMs. Prices for using the system have also been reduced. Logoport is currently used by both client-side companies and agencies, including Lionbridge, but also by freelance translators.

Telelingua’s T-Remote Memory (TRM), which is used as an add-on to an existing translation tool, allows translators situated anywhere in the world to access TMs, which can likewise be located anywhere, via the internet. Translators are not required to purchase new tools. Companies can let their translators remotely use the licenses that the company has acquired for a particular TM tool via a proprietary interface. As an alternative, Telelingua has also developed its own TM server system, which works independently of third-party tools. TRM is no longer commercially available as a product, and Telelingua is currently focusing on consultancy and on carrying out larger projects with TRM.

Throughput Performance

Factors such as TM size will often influence the throughput and performance of solutions for internet-based TM sharing. For example, when used with a TM containing 300,000 TUs and a machine configuration as described in the table, TRADOS TM Server can typically handle 32,000 sentences per hour (roughly 540 per minute or 9 per second) and is considerably faster if the translatable text is highly repetitive. When accessing a very large TM with 2,000,000 TUs, the system can typically handle 7,000 sentences per hour (about 2 per second). Even at this rate, with one database access per user per minute, the system would readily support over 100 simultaneous users. For comparative figures from other solution providers, see the table at the end of this article.
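The arithmetic behind these figures can be checked with a few lines of Python. The sketch below assumes that one database access corresponds to one sentence lookup, which the figures above imply but do not state explicitly. The function names are hypothetical. The per-minute and per-second values quoted above appear to be rounded.

```python
def per_minute(sentences_per_hour):
    """Convert an hourly throughput figure to sentences per minute."""
    return sentences_per_hour / 60


def per_second(sentences_per_hour):
    """Convert an hourly throughput figure to sentences per second."""
    return sentences_per_hour / 3600


def supported_users(sentences_per_hour, accesses_per_user_per_minute=1.0):
    """Users the server can keep up with, assuming one access = one sentence."""
    return int(per_minute(sentences_per_hour) // accesses_per_user_per_minute)


for label, rate in (("300,000 TUs", 32_000), ("2,000,000 TUs", 7_000)):
    print(label,
          round(per_minute(rate)), "per minute,",
          round(per_second(rate), 1), "per second,",
          supported_users(rate), "users at one access per minute")
```

With the 2,000,000-TU figure, the calculation gives 116 users at one access per user per minute, which is consistent with the statement that over 100 simultaneous users can be supported.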

Besides TM size and the number of database accesses, the performance of client/server solutions will also typically depend on the performance of the database server, meaning that a higher processor speed or more RAM on the database side will further increase performance. In addition, the server-side components can be distributed over a number of physically different machines, if applicable, in order to achieve optimum load balancing and performance depending on the number of users who will be accessing the TM databases.

Two other important factors are internet connection speed and bandwidth. The speed of an internet connection in terms of response time is measured as the time in milliseconds that it takes a packet of information to travel from a user’s computer to a remote server and back. This round-trip time is commonly referred to as the ping and is generally lower (faster) when using a broadband connection rather than a dial-up connection. It is important to remember that data transmission can be delayed for a number of reasons, such as the distance to the remote server. Although some of the solutions described above allow translators from anywhere in the world to connect to a central TM, they may experience different response times when querying the remote server depending on the distance to the server. In other words, a fuzzy search of the central TM over an ISDN or dial-up connection may be fast enough if the distance to the server is not too great and the ping is low; in other cases a faster connection will be desirable in order to reduce otherwise long response times.

The required bandwidth will also vary according to the amount of information that is transferred to and from the central TM. In this respect, the requirements of the available solutions range from narrow-band mobile phone connections (9.6 Kbps) and analog modem connections (max. 56 Kbps) to much faster DSL, ADSL, cable connections and leased lines. Performance may still be satisfactory with a lower bandwidth, but it can be increased with the use of a broadband connection.
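How ping and bandwidth combine into a response time can be estimated with a simple model: round-trip time plus server processing time plus the time needed to transfer the data. The Python sketch below uses this model. The 2,000-byte payload and the ping values are hypothetical numbers chosen for illustration, and only the 9.6 Kbps and 56 Kbps bandwidths come from the text above. Real connections add protocol overhead that the sketch ignores.

```python
def transfer_seconds(payload_bytes, bandwidth_kbps):
    """Time to move the payload at the given bandwidth (1 Kbps = 1000 bit/s)."""
    return payload_bytes * 8 / (bandwidth_kbps * 1000)


def response_seconds(payload_bytes, bandwidth_kbps, ping_ms, server_ms=0.0):
    """Rough response time: round trip + server processing + transfer."""
    return ping_ms / 1000 + server_ms / 1000 + transfer_seconds(
        payload_bytes, bandwidth_kbps)


PAYLOAD = 2_000  # hypothetical size of one query plus its TM matches, in bytes

mobile = response_seconds(PAYLOAD, 9.6, ping_ms=300)  # hypothetical ping
modem = response_seconds(PAYLOAD, 56, ping_ms=150)    # hypothetical ping
print(round(mobile, 2), "s over 9.6 Kbps;", round(modem, 2), "s over 56 Kbps")
```

Under these assumptions the transfer time dominates on the narrow-band connection, whereas on faster connections the ping becomes the larger share of the response time.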

Conclusion

A number of exciting solutions for internet-based TM sharing are currently available, with yet further promising solutions in the pipeline. Some are suited to the needs of global corporations and large agencies, whereas others can also be used flexibly by smaller agencies, virtual translator collegiums or even individual freelancers. TM sharing may not make the world go around yet, but the influence it continues to exert on the entire cascaded localization supply chain is increasing as more players begin to incorporate these solutions in their workflows.

The general consensus among many solution providers seems to be that the internet will become an increasingly important aspect of translation workflow, which is reflected by an increasing demand for such solutions by clients. Some of those solution providers who believe that internet-based sharing of translation resources is part of the road ahead have also introduced solutions for sharing terminology databases via the web.

The fragmented nature of the translation workflow when splitting up translation projects among several translators means that this approach often involves a lot more work than flow for all the project team members. Coordinating large multilingual, time-critical projects can often be a daunting prospect, with the need for intermediate synching of TMs being just one aspect to be taken into account. With the use of the translation tools with support for TM sharing, project management functionality and other automated processes available today, this process can be simplified and optimized.

—————————-
This article reprinted from #67 Volume 15 Issue 7 of MultiLingual Computing & Technology published by MultiLingual Computing, Inc., 319 North First Ave., Sandpoint, Idaho, USA, 208-263-8178, Fax: 208-263-6310.

This entry was posted on Tuesday, September 27th, 2005 at 13:03 and is filed under Translation Memory.