Friday, December 11, 2015

Segmentation Exceptions in SDL Trados Studio

In addition to creating custom segmentation rules, Studio allows us to add exceptions to new or existing rules. This post will explain how to do just that in Studio 2015.

A sample use case

In the example below, the translation for "Made in China" will propagate from segment 2 to segment 4. This is not desirable in Spanish, where the verb needs to be singular in segment 2 and plural in segment 4, a distinction that English does not make. Gender also has to be taken into account; it doesn't matter in this case, but it will make a difference in other segments.

Adding an exception to a segmentation rule

Segmentation rules are modified in each TM's settings. For this example, we will go to Project Settings - Language Pairs - All Language Pairs - Translation Memory and Automated Translation, select the appropriate TM and click Settings, then Language Resources - Segmentation Rules.

I will then select the Colon rule and click Edit.

At the bottom of the Edit Segmentation Rule window, select the Add button next to the Exceptions pane, and then the Advanced View button in the window that opens, to get to the Add Rule Exception window.

This is where we will tell Studio when NOT to segment after a colon, so let's keep our source text in mind to make sure we don't miss anything.

Battery: Made in China
Batteries: Made in China

I want Studio to ignore the colon segmentation rule whenever the colon is followed by the phrase "Made in". The part of the rule that I need to change is therefore what comes after the break. It looks simple enough, but there is one little thing to keep in mind: there is also a space after the colon, and if I don't include it, the exception won't work. So what I really need is for the colon segmentation rule to be ignored when "space + Made in" is found after a colon.

\s matches any whitespace character in regular expressions, and it's already included in the "After break" pane as shown above, so I just need to add "Made in" right after it.
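To see why the space matters, here is a minimal Python sketch of the logic. It is only an illustration: the break pattern and the segment() helper are my own stand-ins, not Studio's internal rule, and Studio's regular expression flavor may differ from Python's in some details.

```python
import re

# Hypothetical stand-ins for the Colon rule and its exception;
# these are NOT Studio's internal patterns.
BREAK_AFTER_COLON = re.compile(r":(?=\s)")        # break right after a colon followed by whitespace
EXCEPTION_AFTER_BREAK = re.compile(r"\sMade in")  # the "After break" exception from this post

def segment(text, exception=EXCEPTION_AFTER_BREAK):
    """Split text after each colon unless the exception matches what follows."""
    segments = []
    start = 0
    for match in BREAK_AFTER_COLON.finditer(text):
        break_pos = match.end()
        # Skip this break if the exception matches at the break position.
        if exception is not None and exception.match(text, break_pos):
            continue
        segments.append(text[start:break_pos].strip())
        start = break_pos
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments

print(segment("Battery: Made in China", exception=None))
print(segment("Battery: Made in China"))
print(segment("Battery: Made in China", exception=re.compile(r"Made in")))
```

In this sketch, the first call splits the line into two segments, the second keeps it in one piece, and the third splits it again: without \s, the exception is tested against the space that follows the colon and never matches.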

After clicking OK, I see that the exception has now been added to the rule.

After closing all the open windows, it's time to process the file with the new segmentation. If the file was added to the project before I added the exception, I will need to remove it, add it back, and reprocess it for the exception to take effect.

And here's the resulting segmentation, just what I needed!

This example uses simple text content for the exception, but with regular expressions, even more powerful exceptions can be added to fit a variety of scenarios.
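As one possible illustration, the pattern below extends the same "After break" idea to several similar phrases. The word list and the exception_applies() helper are assumptions made up for this sketch, and it is written in Python, so it is worth testing the pattern in Studio before relying on it.

```python
import re

# Hypothetical "After break" exception: whitespace, one of several
# participles, then the whole word "in". The word list is illustrative only.
AFTER_BREAK = re.compile(r"\s+(?:Made|Assembled|Designed) in\b")

def exception_applies(text_after_colon):
    """Return True if the text right after the colon matches the exception."""
    return AFTER_BREAK.match(text_after_colon) is not None

for sample in [" Made in China", " Assembled in Mexico", " Keep dry"]:
    print(repr(sample), exception_applies(sample))
```

The \b at the end keeps the pattern from matching longer words that merely start with "in", such as "inside".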