FOSS Localization/Recommendations
Implementing Localized FOSS in Asia
Asia is the world's battleground for FOSS localization. The monopolistic proprietary companies have not yet established dominance in Asia. For reasons of national interest, many Asian governments have adopted policies that encourage software alternatives, primarily FOSS.
In some countries the localization to date has been the work of a few dedicated enthusiasts scattered around the globe. Few have been paid for their efforts, and the disorganized nature of their translations has unintentionally produced ambiguities.
For the most part, the volunteers are programmers, not linguists. They need help from translators, technical writers and testers. Since localization deals primarily with language rather than programming issues, non-technical staff should outnumber programmers five to one. Even before adopting a formal FOSS policy, direct support for technical dictionaries and standards for localization can begin. Local language specialists require only professional salaries and offices. Technical writers and testers can be trained within a few months.
Supporting international localization initiatives, where languages that share significant similarities (e.g., Thai, Lao and Khmer) share programming and technical resources, is very cost effective. The value created for society as a whole, with new dictionaries and a technical standard enabling programmers and translators to consistently localize any FOSS for a low price, is undeniable. Other Asian countries should follow the example of the CJK initiative.
Localization initiatives should have very clear objectives, and the resources required to meet those objectives. They require funding, professional management and technical expertise. In addition, thorough linguistic knowledge is critical to success.
These initiatives will result in the creation of local centres where the knowledge is dispersed to those who will perform the actual work. Such centres can be the product of governmental action or business partnerships, or operate as part of a university. Regardless of how the centres are founded, they should enjoy the full support of government policy.
It is time to professionalize the process of Asian software localization, especially for developing countries. A great opportunity can be lost if haphazard efforts lead to undependable results. Where public good cannot be achieved by individual effort, the government is expected to help. A professional group can request the help of volunteers, but central coordination and basic work ought to be done by a dedicated team of paid staff.
It could be that the yearly fees paid by a developing country for a single department's commercial software are enough to underwrite that country's participation in the FOSS movement.
Establish localization centres to be the focal point for FOSS developers to share information, develop skills, and build on existing accomplishments. Where different countries have linguistic commonalities, a regional localization centre could share the cost of development. Specialists who are familiar with source code, linguists and analysts could be available to assist a wide variety of projects and build a knowledge base to accelerate future development.
Sponsor the creation of technical dictionaries and standards so that consistency is retained in all FOSS projects. With standard terminology, computer users are less likely to encounter frustration, and with standard technological procedures and processes, FOSS code can remain comprehensible to all IT professionals. Adherence to such standards should be mandated in all software procurement policies implemented by the government.
Move fast. It is important to have things done correctly, but it is also important to do them quickly. Writing a good computer glossary for a low-technology language can take over a year, but a first glossary that will suffice for translating the first versions of the programs can be done very quickly (e.g., in three months). Future versions of the programs will use the final glossary, but first versions can be available within months. An official "portal" detailing the prescribed terminology and standards should be the first priority.
Encourage the distribution of FOSS operating systems, applications and platforms. With little cost, governments can distribute localized FOSS to schools, businesses and other organizations. This would jumpstart the rate of adoption of computers and software in general, and prevent the unnecessary illegal copying of proprietary software. Because FOSS often works with older machines, the total price of providing computing access to the masses would be lower than that for any other approach.
Provide FOSS training not only for computer professionals, but also in primary and secondary schools. In developing countries where educational budgets are stretched thin, the use of localized FOSS operating on low-cost computers is well suited for increasing educational opportunities in rural communities. The natural curiosity of the youth should quickly result in a new generation that knows how to use computers in their native language. Those students who show a special talent for using computers can be encouraged to learn programming through scholarships, contests and other age-appropriate activities.
Beyond establishing governmental purchasing policies that favour localized FOSS, governments have an important role in removing obstacles, providing funding and coordinating standards. Without governmental support, "anglicisms" and inconsistencies will severely hamper the continued localization of FOSS, and limit the possibilities for growth of an indigenous software industry.
Skills and Tools Required for Localization Projects
Localization often occurs when the country is already using computers in a foreign language. Computer scientists and trainers are used to an English or French computer vocabulary. Localization therefore requires creating training materials based on the language used in the glossary, so that trainers and new users will start using the local language.
As it is difficult to engage linguists, preparatory work can be done first, such as looking for different translation options for each term.
After this, the work is mainly that of translators, who follow glossary guidelines and rules. There should be professional translators and computer scientists in the same team to assure linguistic and technical correctness of the terms used.
Localization can increasingly be performed without too many technical resources, once the first layer of the work is done (fonts, language support, etc.). In the future, it will become easier, since almost all FOSS projects are adopting new tools and techniques to make it easier for non-experts to perform the work.
The skilled workers who can perform software localization are often already available, or can be trained locally or abroad. Regional software localization training and coordination centres could act as clearinghouses and colleges for individuals to improve their skills, and thereby produce new workers for the years ahead. Fortunately, only the programmers need to have specialized knowledge of FOSS. The other professionals can have previous experience with any type of software.
Office space that is sufficient and appropriate for the work at hand is a must for any project where work is not distributed ad hoc around the world. For a professional localization effort, and especially for multilingual regional localization centres, a commercial space is best. This includes stable low-cost broadband connections to the Internet, LAN and development servers, sufficient client computers for each employee and three or four terminals for each tester.
Active participation and cooperation from universities, especially linguists and translators of English, should be solicited. Publishing rights for scholars who make significant contributions to technical dictionaries and standards should be granted, as well as public recognition for student volunteers.
Typically, the following people need to be trained, organized and provided with the tools to succeed:
- Project managers - technical and translation.
- Analysts and linguists.
- FOSS programmers.
- Translators and technical writers.
- Testers.
- Trainers.
Project management for localization should be split into two jobs: (i) Technical Managers direct the actual editing of code to ensure proper language support; and (ii) Translation Managers coordinate the creative efforts of linguists, technical writers and trainers.
Analysts and linguists work together with project managers, sociologists and programme sponsors to identify the technical challenges to be overcome and the cultural-linguistic requirements to be met. Their work results in requirement specifications and a project description that the project manager uses to guide the project to completion. This roadmap guides the programmers in their work, and provides the benchmark against which the software will be tested. The analysts are also responsible for gathering, organizing and disseminating the technical standards and specifications required by the programmers to perform their work.
Since both the operating system user interface and various application user interfaces should be localized, often several different types of programmers will be required. Enthusiasts can perform this work remotely with others worldwide, but only if the problem has been thoroughly documented by the analysts. Wherever possible, local programming staff should be used for this stage of the work. The lessons they learn, and write down for later reference, can be spread to others who are performing localization. They may also create the technical standards for that language if none exists. Compliance with such FOSS standards as "G11N, I18N, L10N" (please see Glossary) and others, will ensure that work proceeds quickly with the confidence that successive developers can continue to update and improve the software.
Translators and technical writers perform the lion's share of the work. All the error messages, buttons, menus, commands, help files and user guides must be translated. In consultation with linguists for consistency and accuracy, translators and technical writers compile technical dictionaries, often coining new technical words and phrases that enable future developers to communicate more effectively with their colleagues. Just as the technical standards for localization are vital to programmers, the technical dictionary used by the writers and translators is vital to the project's success.
Testers use the requirement specifications to check the complete work of both the programmers and technical writers. Their painstaking work identifies errors and inconsistencies to be corrected, and rechecked, before release to the users of the software. Additional apprentice testers, especially those who speak no English and who are computer novices, can provide excellent feedback for programmers and translators.
Trainers introduce the localized software to the users. Often, local teachers who have been taught how to use the system give seminars, answer questions and mentor computer enthusiasts. Local businesses and governments may also hire trainers to educate their workforce. It is important to ensure that these software trainers are locally recruited and speak the native language, rather than being English speakers imported at great expense.
For both proprietary and free software, training on how the software works is essential. To teach local users how to operate the software, one needs:
- Training equipment and materials.
- Classrooms.
- Instructors.
Most often, developers of the software 'train the trainer', who then instructs novices to make them advanced users. Training can be further divided between user training, system administrator training and developer training. Except for user training, which should be widespread, most specialized FOSS training takes place in educational institutions. Countries that have advanced quickly in FOSS localization have all devoted considerable resources to training and education. Without actual adoption of the software by a large segment of the population, the work of localization is an exercise in futility.
Tools and equipment for FOSS development and localization are less expensive than those required for proprietary localization. Version control, project management, documentation change management, and development tool kits for programmers are all available either free of charge or at a low cost. To work on FOSS, experience shows that it is best to use free/open source tools.
All other equipment, including most development computers, should be up to date, in a secure environment. A separate budget should be set aside for libraries, references and language tools specific to the language to be localized. If these materials do not already exist, they must be created.
Wherever possible, information on FOSS localization should be shared with the international FOSS development community so that the necessary tools do not have to be recreated by every team.
Costs of FOSS Localization
Technically speaking, localizing FOSS costs about as much as localizing commercial software. Only the techniques of programming are significantly different, since the linguistic and operational challenges exist no matter what type of software is to be localized. To localize any software, the following are needed:
- Office space.
- Office equipment and tools.
- Technical staff.
- Access to technical information.
- Access to linguists and translators.
The largest cost will be staff salaries. The total cost of a project depends heavily on the wage expectations of local technical, translation, writing and testing staff, and their individual levels of experience with software localization, language and cultural issues.
The programmers and project managers probably require a higher-than-average education and salary, but most of the other staff utilizes skills that are not particular to software and can be found more readily in the general population.
Trainers are hired when the software is near finalization, and presumably remain employed in teaching new users, system administrators and developers how to use the software.
For countries seeking independence from proprietary English language software, a permanent local office whose purpose is to train and disseminate technical information about localization could yield exponential savings. This establishment could be associated with a public library or university, where interested parties can access information at little or no cost.
FOSS can often operate well on older computers. This offers advantages to both developed countries with an overstock of used computers they must dispose of, and developing countries that can configure these computers to operate FOSS in the local language.
The total cost of localizing any particular piece of software is highly variable. Each project requires individual analysis for complexity, experience and availability of technical staff, and the characteristics of the local language.
Software cost and schedule estimating is not a simple calculation. In addition to a rough estimate based on the number of message strings to be translated, other factors must be considered.
- Experience
- Do the programmers, translators and testers have previous experience with this kind of work? If not, it will require extra time and effort to train them in the processes and standards of localization. But translators learn very quickly, and productivity increases dramatically after the first month or two. With a stable team, the members become very productive.
- Environment
- Does the staff have the tools and equipment needed to perform the work in a professional manner? Without modern office space, tools and techniques, it is unrealistic to expect the staff to perform at top efficiency.
- Linguistic factors
- How different is the local language from English? Translating from English to Swedish, for example, is fairly simple. The grammar, length of words and vocabulary are very similar. Fluency in English is nearly universal among Swedish speakers, and translators are easy to find. On the other hand, translating from English to Lao is very difficult. The grammar, spelling conventions, word length, collation and other factors are not similar at all, so the size and position of user interface elements must be changed. In addition, a lack of experienced translators or even of a basic technical glossary means that projects would begin from practically nothing and take much more time and effort.
- Scope
- How much is enough? Is it acceptable to merely change the primary user interface menus and commands? Should the help files also be translated? What about documentation and user training materials? Are anglicisms acceptable? How many new words will be introduced into the language? To avoid failure, a very clear definition of the project's scope is necessary.
- Metrics
- Professional software cost and schedule estimating relies on the experience of previous projects for determining future schedules. If there is little heuristic evidence to rely on for estimating, the first few project estimates can only be educated guesses. After several projects have been completed, the actual time for completion can be compared to the initial estimates in order to refine future estimates. So it is important that accurate records of person hours, resources, experience and other factors are collected for future reference.
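One simple way to use such records is to scale each new estimate by the average ratio of actual to estimated time on earlier projects. The following Python sketch illustrates that idea only; the record fields, function names and sample figures are hypothetical and are not taken from any real project.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ProjectRecord:
    """One completed localization project (all field names are hypothetical)."""
    name: str
    estimated_weeks: float
    actual_weeks: float


def calibration_factor(records: list[ProjectRecord]) -> float:
    """Average ratio of actual to estimated duration across past projects."""
    if not records:
        return 1.0  # no history yet: the estimate is only an educated guess
    return mean(r.actual_weeks / r.estimated_weeks for r in records)


def refined_estimate(raw_estimate_weeks: float, records: list[ProjectRecord]) -> float:
    """Scale a new raw estimate by how far off earlier estimates were."""
    return raw_estimate_weeks * calibration_factor(records)


# Illustrative figures only, not data from any real project.
history = [
    ProjectRecord("desktop menus", estimated_weeks=10.0, actual_weeks=15.0),
    ProjectRecord("office suite", estimated_weeks=20.0, actual_weeks=25.0),
]
print(round(calibration_factor(history), 3))      # 1.375
print(round(refined_estimate(12.0, history), 2))  # 16.5
```

A plain average is the crudest possible refinement. A team with more history might weight recent projects more heavily or keep separate factors for translation and testing.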
With the points mentioned above in mind, consider the following formula as a very rough "rule of thumb" for estimating localization project schedules.
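Example: Estimating Localization Project Schedules: Rule of Thumb
- Multiply the number of message strings by the minutes needed per string, and divide by 60 to find person-hours
- Divide by the actual working hours per week (always less than 40 per week; usually around 20 per week) to find person-weeks
- Add time for testing and editing, and for management and training
- Multiply by an experience factor: less experience = more time (Scale: 1.5 = less experience, 1.0 = average experience, and .75 = experienced)
- Divide by the number of staff members to find the schedule in weeks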
Note: If no English/local language technical dictionary is available, it must be created before the work can begin. This is a separate project. Once the technical dictionary is completed, it must then be entered into a translation memory (using a tool such as KBabel) to allow for consistency in the translation. Without these essential tools, no software can realistically be localized.
Example: Case 1
- 10,000 message strings to be translated
- 10 minutes per message string
- Less than average experience with software localization translating tools and processes.
- 10 staff members
Example: Case 1 Estimate
- 10,000 x 10 minutes (divided by 60) = 1,666.67 person-hours
- Divide by 20 "actual" person-hours per week = 83.33 person-weeks
- Add 16.66 person-weeks for testing and editing = 99.99 person-weeks
- Add 16.66 person-weeks for management and training = 116.65 person-weeks
- Multiply by 1.5 to reflect lack of experience = 174.97 person-weeks
- Divide by 10 staff members = about 17.5 weeks
In other words, such a project would require a staff of 10 people working for between four and five months. If the average salary for these professionals is a thousand dollars a month, staff costs alone are about USD50,000.00. Add 10 computers, office space, Internet connectivity, copy machines and other routine expenses, and a rough estimate of the overall cost of localization can be made.
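The Case 1 arithmetic can be sketched in Python as follows. The function and parameter names are illustrative, not part of any tool. The 20 percent overhead share is an assumption inferred from the example, where each 16.66 person-week addition equals one fifth of the 83.33 person-weeks of translation time; a real project should substitute its own figures.

```python
def estimate_weeks(
    strings: int,
    minutes_per_string: float,
    staff: int,
    experience_factor: float = 1.0,  # 1.5 = less experienced, 0.75 = experienced
    hours_per_week: float = 20.0,    # "actual" productive hours per person
    overhead_share: float = 0.2,     # assumption: each overhead is 20% of translation time
) -> float:
    """Rough calendar weeks needed to localize a set of message strings."""
    person_hours = strings * minutes_per_string / 60
    translation_weeks = person_hours / hours_per_week
    testing_editing = overhead_share * translation_weeks
    management_training = overhead_share * translation_weeks
    person_weeks = (
        translation_weeks + testing_editing + management_training
    ) * experience_factor
    return person_weeks / staff


# Case 1: 10,000 strings, 10 minutes each, 10 less experienced staff members.
case_1 = estimate_weeks(10_000, 10, staff=10, experience_factor=1.5)
print(round(case_1, 1))  # 17.5
```

The result differs slightly from the hand calculation (17.4975 weeks) only because the hand calculation truncates intermediate values to two decimal places.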
Consider the same example with the following change: average experience with software localization translating tools and processes.
- Multiply by 1.0 to reflect average level of experience = 116.65 person-weeks
- Divide by 10 staff members = 11.665 weeks.
A project with staff of average experience would require less than four months (about 12 weeks). If the average salary for these professionals is a thousand dollars a month, staff costs alone stay under USD40,000.00.
Consider the same example with the following change: staff with more experience with software localization, translation tools and processes.
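- Remove 8.33 person-weeks of training time from the 116.65 total, since this staff needs no additional training = 108.32 person-weeks
- Multiply by .75 to reflect above average experience = 81.24 person-weeks
- Divide by 10 staff members = 8.124 weeks.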
An above average staff of 10 people, requiring no additional training, will need only about two months. If the average salary for these professionals is a thousand dollars a month, cost for staff is only USD20,000.00. Compared to proprietary products, FOSS is ideal for localizing. When a few projects have been completed, the costs drop quickly because the underlying concepts and techniques remain the same. The first few projects must develop new dictionaries, tools and specialties relating to language and technical processes. The experience of the staff is the key to increased productivity.
Once these are in place and well understood, the time and money required to complete additional projects are reduced. Because FOSS developers tend to adhere to open standards, the developers of a local version do not have to reverse-engineer the code to guess what work must be done. The localization process should be very similar from project to project. With commercial proprietary closed software, the opposite is true.
As we have seen, a major pitfall of proprietary software is that only the owner of the copyright can maintain or modify it. With FOSS, any person with the appropriate skills can do the work. So, instead of users being locked into an expensive maintenance contract with a single foreign vendor, support and maintenance of software can be freely contracted to a wide variety of local companies.
Programmers working alone cannot localize software. When estimating the cost, time and effort required for any localization, set aside only about 10 percent of the budget for technical issues. All the rest goes to the time-consuming task of translating, writing, testing and training.
The Work of Localization
When a proprietary company localizes software, it first determines whether the effort would be commercially viable. Then it hires localization experts, including linguists and cultural experts, to develop a technical dictionary. Meanwhile, well-paid analysts and programmers modify the software to accept and display the script for the language.
The lion's share of localization work involves translating and then replacing the labels, menu items, error messages and help files of the software. Sometimes the appearance of the software must also be modified to fit awkwardly long or short words. The technical dictionary, in some cases, is the first of its kind for that language, and new terms are invented.
Closely following the technical dictionary and programming standards, teams of technical writers enter the new phrases alongside the English original. When it is all done, testers ensure not only that every message has been translated, but also that the terminology is consistent and logical.
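Part of this checking can be automated. The Python sketch below shows one possible approach, assuming the message catalog and the technical dictionary have each been loaded into a plain mapping; the function name, the data layout and the French sample strings are illustrative assumptions, not the format of any particular localization tool.

```python
def check_catalog(catalog: dict[str, str], glossary: dict[str, str]) -> list[str]:
    """Return human-readable problems found in a translated message catalog.

    catalog  maps each English source string to its translation ("" = missing).
    glossary maps an English technical term to the one approved local term.
    """
    problems = []
    for source, translation in catalog.items():
        if not translation.strip():
            problems.append(f"untranslated: {source!r}")
            continue
        for term, approved in glossary.items():
            # Naive substring match; a real checker would match whole words.
            uses_term = term.lower() in source.lower()
            if uses_term and approved.lower() not in translation.lower():
                problems.append(
                    f"glossary term {term!r} not rendered as {approved!r} in: {source!r}"
                )
    return problems


# Illustrative sample data only.
glossary = {"file": "fichier", "save": "enregistrer"}
catalog = {
    "Save file": "Enregistrer le fichier",
    "Open file": "Ouvrir le dossier",  # "file" translated inconsistently
    "Print": "",                       # not yet translated
}
problems = check_catalog(catalog, glossary)
for problem in problems:
    print(problem)
```

A check of this kind only flags candidates for review. Inflected forms, and terms that are legitimately translated differently in context, still need a human tester.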
Work of this sort follows an exponential curve, where the initial work is painfully slow, and then accelerates rapidly when the technical dictionary and standards are well established. After a few programs have been localized, an experienced team can localize additional software at greatly reduced costs. Marketing and training determine how successful they are in getting users to adopt the software. Often, they publish their technical dictionaries and standards, selling them for a profit or making them available free of charge to governments, universities and the online community.
Language Considerations
FOSS has typically been localized by a few volunteers working remotely, without the benefit of linguists or a technical dictionary for translation. The work can take a long time and it can be riddled with inconsistencies or errors.
The pace of FOSS localization is uneven. In countries where the language is similar to English and there are many bilingual volunteers, FOSS localization is well established. Where governments and other agencies have stepped in to provide financial support for localization, the results have also been impressive. (The "CJK" partnership of China, Japan, and Korea stands out as an example.)
In countries without much technical infrastructure, localization of both commercial software and FOSS is slow. It is far slower when the language is not of the Indo-European group. Commercial companies see little profit in the work, and few local professionals have the time or skills to localize FOSS. Even though the source code is freely available for localization work to begin, few specialized technical standards or technical dictionaries exist.
Some languages, particularly those with Latin-based scripts, are relatively easy to localize. Others can be very difficult. As an example, Lao and Thai use closely related consonant-based scripts (Thai alone has 42 consonants in current use), with vowel and tone marks. These scripts follow complex rules of layout involving consonants, vowels, special symbols, conjuncts and ligatures. Both writing systems share certain characteristics: spaces are not necessarily used to separate words, and vowels appear before, after, under and over consonants.
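The effect of these layout rules on software can be seen by inspecting how a short piece of Thai text is stored. The Python sketch below uses only the standard `unicodedata` module; the helper names are illustrative. It shows that one visible stack can consist of several code points, and that a vowel written before its consonant is also stored before it.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str]]:
    """Pair each code point's Unicode name with its general category."""
    return [(unicodedata.name(ch), unicodedata.category(ch)) for ch in text]


def count_marks(text: str) -> int:
    """Count non-spacing marks (category Mn), drawn above or below a base letter."""
    return sum(1 for ch in text if unicodedata.category(ch) == "Mn")


# A Thai word stored as consonant + vowel above + tone mark above the vowel.
word = "\u0e17\u0e35\u0e48"
print(len(word), count_marks(word))  # 3 2

# A vowel written before its consonant is stored first, as an ordinary letter (Lo).
leading = "\u0e40\u0e01"
for name, category in describe(leading):
    print(category, name)
```

Because character count, display width and word boundaries no longer coincide, code that truncates strings, measures labels or splits text on spaces needs script-aware handling for these languages.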
Thai and Lao volunteers responsible for localizing FOSS have saved a great deal of time and avoided frustration by cooperating on technical issues, and sharing information on resources and tools.
Across Asia, opportunities exist for shared localization efforts at the inter-governmental level. Many other Asian languages share similarities, and often the programming tasks are nearly identical across similar language groups. Properly funded and organized, pan-Asian software localization is a realistic goal.