Nora Díaz on Translation, Teaching, and Other Stuff: Multiterm 2011: Avoiding Duplicate Entries When Importing Glossaries

Thursday, November 8, 2012

Multiterm 2011: Avoiding Duplicate Entries When Importing Glossaries

Reimporting the same glossary into Multiterm with the standard import process (for example, after the glossary has grown) results in duplicate entries. The steps below help prevent this problem.

For the basics on importing glossaries into Multiterm, read this blog entry: Importing Excel Glossaries into Multiterm.

When you're ready to import your glossaries, you will see two standard import definitions in Multiterm, called "Default import definition" and "Synchronize on Entry Number". Neither of these prevents duplicate entries on its own, but they can be used as the basis for a new definition that does.

1. Right-click Synchronize on Entry Number and choose Duplicate. This makes a copy of the import definition.

2. Right-click Copy of Synchronize on Entry Number and choose Edit. This opens the Import Wizard, which lists the steps it will go through. The change we need to make to avoid duplicate entries is in Step 4 (Exclude invalid entries from the import), so click Next.

3. Give a more user-friendly name to your new import definition. I have chosen "Avoid Duplicates" for this example. You can also add a description. Click Next.

4. Choose the file that contains the entries you want to import. Alternatively, you can leave these fields blank and choose your import files later. Click Next. When the wizard asks for an Exclusion File name under Validation Settings, type any name.

5. This takes us to the key step, Step 4. The default here is "Synchronize entries on entry number", but choosing this will result in duplicate entries if for some reason your glossaries got rearranged. Select "Synchronize entries on index term" instead. Click Next.

6. Now we choose our Advanced Options for synchronizing on index terms. To keep things simple, I leave the Index as Source. Under "Index term does not exist in the target termbase", leave "Add import entry as new" selected, because otherwise the new entries will not be added to the termbase. Under "Index term exists in the target termbase", change the Action to "Omit import entry" if you want to skip the entry altogether, or choose one of the other options based on your needs.
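To make the effect of these settings easier to picture, here is a minimal Python sketch of the idea behind synchronizing on an index term. It is not how Multiterm is implemented; the function name `sync_on_index_term`, the `ExistsAction` enum, and the dictionary layout of an entry are all made up for illustration, and it assumes index terms are compared by exact match.

```python
from enum import Enum


class ExistsAction(Enum):
    """Hypothetical stand-ins for some "Index term exists" actions in the wizard."""
    ADD_AS_NEW = "add"
    OMIT = "omit"
    OVERWRITE = "overwrite"


def sync_on_index_term(termbase, import_entries, index="source",
                       exists_action=ExistsAction.OMIT):
    """Return a new entry list after importing import_entries into termbase.

    Entries are plain dicts; `index` names the field used as the index term.
    Matching is exact (an assumption made for this sketch).
    """
    result = list(termbase)
    # Remember where the first entry for each index term lives.
    positions = {}
    for i, entry in enumerate(result):
        positions.setdefault(entry[index], i)

    for entry in import_entries:
        key = entry[index]
        if key not in positions:
            # "Index term does not exist in the target termbase": add as new.
            positions[key] = len(result)
            result.append(entry)
        elif exists_action is ExistsAction.ADD_AS_NEW:
            # This is the choice that produces duplicates.
            result.append(entry)
        elif exists_action is ExistsAction.OVERWRITE:
            result[positions[key]] = entry
        # ExistsAction.OMIT: skip the import entry altogether.
    return result


termbase = [{"source": "invoice", "target": "factura"}]
glossary = [
    {"source": "invoice", "target": "factura"},
    {"source": "receipt", "target": "recibo"},
]

once = sync_on_index_term(termbase, glossary)
twice = sync_on_index_term(once, glossary)
print(len(once), len(twice))
```

With the omit action, importing the same glossary a second time leaves the list unchanged, which is the behavior we are after. Passing `ExistsAction.ADD_AS_NEW` instead would add "invoice" again on every import.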

Click Next a couple more times and the wizard closes, leaving you with a new import definition (named "Avoid Duplicates" in this example).

Finally, to make sure you can use this new import definition with any of your termbases, save it by right-clicking it and choosing Save. This saves a file with an .xdi extension, which you can later load into any other termbase by right-clicking the Import Definition area and choosing Load.

Now, whenever you need to reimport a glossary and you want to avoid getting duplicate entries in your termbase, run this new import definition by right-clicking it and choosing Process.
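If duplicates have already made their way into a termbase, one way to spot them outside Multiterm is to scan an XML export for terms that appear under more than one entry number. The sketch below is only an illustration: it assumes the export uses `conceptGrp`, `concept`, `languageGrp`, `language` and `term` elements as in the inline sample, so check those names against your own export before relying on it. The `find_duplicates` function and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Tiny inline sample shaped like the assumed export layout.
SAMPLE = """<mtf>
  <conceptGrp>
    <concept>1</concept>
    <languageGrp>
      <language lang="EN" type="English"/>
      <termGrp><term>invoice</term></termGrp>
    </languageGrp>
  </conceptGrp>
  <conceptGrp>
    <concept>2</concept>
    <languageGrp>
      <language lang="EN" type="English"/>
      <termGrp><term>receipt</term></termGrp>
    </languageGrp>
  </conceptGrp>
  <conceptGrp>
    <concept>3</concept>
    <languageGrp>
      <language lang="EN" type="English"/>
      <termGrp><term>invoice</term></termGrp>
    </languageGrp>
  </conceptGrp>
</mtf>"""


def find_duplicates(xml_text, lang="EN"):
    """Map each term found under more than one entry to its entry numbers."""
    root = ET.fromstring(xml_text)
    seen = defaultdict(list)
    for concept_grp in root.iter("conceptGrp"):
        number = concept_grp.findtext("concept")
        for lang_grp in concept_grp.iter("languageGrp"):
            language = lang_grp.find("language")
            if language is None or language.get("lang") != lang:
                continue  # only look at the chosen index language
            for term in lang_grp.iter("term"):
                text = (term.text or "").strip()
                if number not in seen[text]:
                    seen[text].append(number)
    # Keep only terms that occur in two or more entries.
    return {term: nums for term, nums in seen.items() if len(nums) > 1}


print(find_duplicates(SAMPLE))
```

The entry numbers in the result tell you which entries to open and compare in Multiterm; the script itself does not change the termbase.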

9 comments:

tan, February 8, 2013 at 2:54 AM
Thank you for this post! And what if I already have some entries imported which are repeated? Is there any solution to this?
Krysantem, February 22, 2013 at 7:40 AM
Hi Nora!
We tried the procedure you describe above several times before reading your post, but it does not suit our needs. At step 6 (Advanced Options for synchronizing on index terms), none of the options available under "Index term exists in the target termbase" gives the expected result.
Background:
We have a term base. We send it to a translator and ask him to use the termbase while translating the file. The translator is allowed (encouraged) to add new terms and/or edit existing terms if needed. At the end of the translation, the translator is asked to deliver only edited and/or newly added terms (default export to XML).
When we import this XML file, using your procedure, we have the following options at step 6:
- Add import entry as new -> we don't want this, as this will create duplicates.
- Omit Import entry -> we don't want this, because we want the newly added terms to be added.
- Omit Import entry and write to output file -> we don't want this because we want the newly added terms to be added.
- Merge entries -> we don't want this, because merging does not delete possible obsolete information in the existing entry.
- Overwrite existing entry with import entry -> we don't want this, because the import entry might have fewer fields (including indexes) than the existing one.

So what we end up doing is accepting duplicates ("Add import entry as new" option), then filtering for duplicated entries, checking them (the entry number tells me which entry is new and which is old), and merging them, deleting obsolete fields in the process if needed.
But this is manual work, so it is time-consuming and error-prone. Besides, it can only be done by an in-house native speaker with enough knowledge of the customer in question, and we don't have such a person for all our term bases.

Do you have hints on how we could proceed?
Thanks in advance!
Krysantem, February 24, 2013 at 11:34 PM
Hi,

Thanks for the quick answer.
Indeed, it would be easier to replace our master term base with the new/updated term base from the translator. But since we work with multilingual term bases, we cannot do that. At the end of the project, we would receive term bases from different translators, and we really need to find a way to merge all of them (or import terms for different language pairs into the master term base).

Also, because our term bases have several fields, we don't want new/edited entries to overwrite existing entries. Moreover, we cannot use the Glossary Converter (only converts indexes).

I guess we will have to make do with MultiTerm for the time being, until we find a tool with better import options.

Thank you!
Judith Pattinson, April 15, 2013 at 6:26 PM
Thank you so much for this post! Without expert contributions like yours, I would be completely unable to use Multiterm to consolidate my termbases!

Seb Winkel, May 4, 2013 at 8:09 AM
Hi Nora,
My name is Carlos and I have a problem with Multiterm 2011.
I converted a DE-ES-EN glossary from Multiterm 5 to MT 2011, and when I load the entries into MT 2011, the Spanish records show up as Chinese characters. The German and English ones display correctly.
Could you help me?
Thanks in advance
