That is pretty straightforward, as the Tab is one of the segmentation options offered in the list of Break characters.
But what if we want to add a new segmentation rule for soft returns? There is no such option in the dropdown menu, so we need to use a regular expression (regex). The steps are detailed below.
In order to get to the Segmentation Rules window, we first need to do the following:
Go into Project Settings, All Language Pairs, then Translation Memory Settings:
In the window that opens, select Language Resources on the left, and then Segmentation Rules on the right, then click on Edit:
This brings up one more box. For this example, since I want Studio to create a new segment every time it finds a soft return, I need to choose Add:
To create a segmentation rule for soft returns, add a name in the description field, choose "Anything" in the "Before break" dropdown menu and "Anything" in the "After break" dropdown menu. Since a soft return is not one of the options in the "Break characters" menu, we need to go to the Advanced View by clicking the button to the right of the Description.
This is where we add the regex for a soft return, which should look exactly like this (feel free to copy from below and paste into Studio):
.[\n]+
Disclaimer: My knowledge of Regex is extremely limited; I got this expression from one of Paul Filkin's posts in a forum and simply typed it in. Thank you, Paul!
After this, click OK several times to close all the open dialog boxes, and that's it. From now on, in files processed with this TM, a new segment will be created whenever Studio encounters a soft return.
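If you are curious what the expression actually matches, here is a small sketch you can run outside Studio. It is only an illustration: it uses Python's re module rather than the engine Studio uses, and the function name split_at_soft_returns and the sample text are made up for this example. The assumption is that a segment ends wherever the pattern matches, which is how the rule behaves in practice but is not taken from Studio's documentation.

```python
import re

# The expression from the post: any single character (except a line break)
# followed by one or more line-feed characters.
SOFT_RETURN_RULE = re.compile(r".[\n]+")


def split_at_soft_returns(text):
    """Split text into segments, ending a segment at each match of the rule.

    Hypothetical helper for illustration only; it mimics the effect of the
    segmentation rule, not Studio's actual implementation.
    """
    segments = []
    start = 0
    for match in SOFT_RETURN_RULE.finditer(text):
        # Everything up to the end of the match belongs to the current segment.
        segment = text[start:match.end()].strip()
        if segment:
            segments.append(segment)
        start = match.end()
    # Whatever follows the last line break is the final segment.
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


if __name__ == "__main__":
    sample = "First line\nSecond line\n\nThird line"
    for segment in split_at_soft_returns(sample):
        print(segment)
```

Note that the + after [\n] means several line breaks in a row are treated as a single break, so no empty segments are created between them.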
Thanks Nora! Your post was a big help for me.
Wow, that made a nasty project so much easier! Thank you!
Any idea how to do that for an Excel line break? Thanks a lot!