Thursday, July 23, 2015

A Collection of Regular Expressions for the Regex Match AutoSuggest Provider

The Regex Match AutoSuggest Provider app is one of my absolute favorite add-ons for Studio, and it has been a great reason to continue learning about regular expressions.

What does it do?
Based on rules that you can set up on the fly, the plug-in offers AutoSuggest proposals, as shown in this example:


Depending on how you set up your rules, the plug-in can:

  1. Offer an autosuggestion that is exactly the same as the word or phrase found in your source segment, so all you have to type is the first letter, instead of the whole thing. For example, you can have a rule that will suggest an entire phone number string, such as "1-800-012-3456", as soon as you type the first number.
  2. Offer an autosuggestion where the source text has been replaced by its translation. So, for example, when "July 22" is found in the source, "22 de julio" will be offered as an autosuggestion as soon as you type the first "2".
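The two behaviours above can be modelled roughly in Python. This is only a sketch to make the idea concrete, not the plug-in's actual implementation: the `suggestions` function and the rule list are hypothetical, and Python's `re` module writes replacements as `\g<0>` and `\1` where the plug-in uses `$0` and `$1`.

```python
import re

def suggestions(source, rules, typed):
    # Hypothetical model: apply every rule to the source segment and keep
    # the results that start with what the translator has typed so far.
    candidates = []
    for pattern, replacement in rules:
        for match in re.finditer(pattern, source):
            candidate = match.expand(replacement)
            if typed and candidate.startswith(typed):
                candidates.append(candidate)
    return candidates

rules = [
    # Behaviour 1: offer the matched source text unchanged ($0 in the plug-in).
    (r"\d-\d{3}-\d{3}-\d{4}", r"\g<0>"),
    # Behaviour 2: offer a rewritten version of the match.
    (r"July (\d{1,2})", r"\1 de julio"),
]

source = "Call 1-800-012-3456 before July 22."
print(suggestions(source, rules, "1"))  # ['1-800-012-3456']
print(suggestions(source, rules, "2"))  # ['22 de julio']
```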
What do you need to use it?
First, you need to download the plug-in from the SDL Open Exchange and install it. The Regex Match AutoSuggest Provider is available for Studio 2014 (and soon for Studio 2015).

Once installed, it is found under the View tab in Studio. Clicking on the button will open the pane where you can start entering your RegEx Entries.

The developer provides a very clear explanation of the set-up here.

You can safely close this pane when you're not entering any new rules, and open it again as needed. I find it useful to have it open as I work, so I can quickly add new rules as I need them.

Some examples
Here are some rules and examples of their application, with a disclaimer: I'm a novice regex user, so please don't expect these to be the most elegant or sophisticated regex rules out there. When I started creating regex rules for Studio, I decided to follow Paul Filkin's advice about "economy of accuracy" and stop worrying about writing clunky regular expressions. You may find some clunkiness here, but hopefully these rules will be useful either as they are or as a starting point for something else.

Date: First day of the month

Description: Dates: First of month. Ex: July 1, 2015 --> 1° de julio de 2015
Regex Pattern: (#Months#)\s(1|1st),\s(\d{4})
Replace Pattern: 1° de $1 de $3
Note: The "#Months#" part of the expression is a reference to a variable from a list (in this case, the names of the months), another very clever feature of the plug-in. You can watch Paul Filkin's video Manipulating Dates with RegexMatch AutoSuggest for a clear explanation of this feature.
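To try this rule outside Studio, the variable can be imitated in Python by expanding the placeholder into an alternation of month names. This is a sketch under assumptions: the `MONTHS` mapping stands in for whatever the variable list holds in your own set-up, and `expand_variables` is a hypothetical helper, not something the plug-in exposes.

```python
import re

# Assumed contents of the "Months" variable list (English source -> Spanish target).
MONTHS = {
    "January": "enero", "February": "febrero", "March": "marzo",
    "April": "abril", "May": "mayo", "June": "junio",
    "July": "julio", "August": "agosto", "September": "septiembre",
    "October": "octubre", "November": "noviembre", "December": "diciembre",
}

def expand_variables(pattern):
    # Hypothetical helper: swap the #Months# placeholder for an alternation.
    return pattern.replace("#Months#", "|".join(MONTHS))

FIRST_OF_MONTH = re.compile(expand_variables(r"(#Months#)\s(1|1st),\s(\d{4})"))

def first_of_month(segment):
    match = FIRST_OF_MONTH.search(segment)
    if match is None:
        return None
    # Mirrors the replace pattern "1° de $1 de $3", with $1 translated.
    return f"1° de {MONTHS[match.group(1)]} de {match.group(3)}"

print(first_of_month("Starts July 1, 2015"))    # 1° de julio de 2015
print(first_of_month("Starts July 1st, 2015"))  # 1° de julio de 2015
print(first_of_month("Starts July 12, 2015"))   # None
```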



Date: Full date


Description: Dates: July 22, 2015 ---> 22 de julio de 2015
Regex Pattern: (#Months#)\s(\d{1,2}),\s(\d{4})
Replace Pattern: $2 de $1 de $3



Date: MM/DD/YY to DD/MM/YY

Description: Dates: 8/7/15 to 7/8/15
Regex Pattern: (\d{1,2})/(\d{1,2})/(\d{2})
Replace Pattern: $2/$1/$3
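A quick check in Python shows the swap, along with one edge case: because the pattern ends at two digits, a four-digit year is matched only up to its first two digits. The stricter variant with `\b` word boundaries is my own suggestion, not part of the rule above, and the function name is hypothetical. Python writes the replacement as `\2/\1/\3` where the plug-in uses `$2/$1/$3`.

```python
import re

SWAP = re.compile(r"(\d{1,2})/(\d{1,2})/(\d{2})")
# Assumed variant: word boundaries stop the match from ending mid-number.
SWAP_STRICT = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{2})\b")

def swap_day_month(segment, pattern=SWAP):
    match = pattern.search(segment)
    return match.expand(r"\2/\1/\3") if match else None

print(swap_day_month("Due 8/7/15"))                 # 7/8/15
print(swap_day_month("Due 8/7/2015"))               # 7/8/20 (year cut short)
print(swap_day_month("Due 8/7/2015", SWAP_STRICT))  # None
```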



Date: Month and Year

Description: Month and year. Ex. July 2015 --> julio de 2015
Regex Pattern: (#Months#)\s(\d{4})
Replace Pattern: $1 de $2



Date: Day and Month

Description: Month and day w/o year. Ex. July 22 --> 22 de julio
Regex Pattern: (#Months#)\s(\d{1,2})(?!.*\d{4})
Replace Pattern: $2 de $1
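The negative lookahead `(?!.*\d{4})` is what keeps this rule from firing on full dates: it rejects the match if four digits in a row appear anywhere later in the segment. The Python sketch below tests that, using an abbreviated and assumed month list. It also shows a side effect I noticed while testing: "July 2015" is read as "July 20", because the "15" left over is not a four-digit number. The `\b` variant is a possible tightening, offered as a suggestion only.

```python
import re

# Abbreviated, assumed stand-in for the "Months" variable list.
MONTHS = {"July": "julio", "August": "agosto"}
ALTERNATION = "|".join(MONTHS)

ORIGINAL = re.compile(
    r"(#Months#)\s(\d{1,2})(?!.*\d{4})".replace("#Months#", ALTERNATION))
# Assumed variant: \b forces the day to be a complete one- or two-digit number.
STRICT = re.compile(
    r"(#Months#)\s(\d{1,2})\b(?!.*\d{4})".replace("#Months#", ALTERNATION))

def day_month(segment, pattern=ORIGINAL):
    match = pattern.search(segment)
    if match is None:
        return None
    return f"{match.group(2)} de {MONTHS[match.group(1)]}"

print(day_month("July 22"))            # 22 de julio
print(day_month("July 22, 2015"))      # None
print(day_month("July 2015"))          # 20 de julio (unintended)
print(day_month("July 2015", STRICT))  # None
```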



1x

Description: Ex: 1x ---> 1 vez
Regex Pattern: \b1x\b
Replace Pattern: 1 vez



50x

Description: Ex: 10x ---> 10 veces - excluding "1x"
Regex Pattern: (?!1x)(\d{1,2})x
Replace Pattern: $1 veces
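The "1x" and "Nx" rules work as a pair: the `(?!1x)` lookahead stops the plural rule from producing "1 veces". Here is a small Python sketch of how the two interact. The function name is hypothetical, and the last line shows what the plural rule would do without the lookahead.

```python
import re

ONCE = re.compile(r"\b1x\b")
TIMES = re.compile(r"(?!1x)(\d{1,2})x")

def frequency_suggestions(segment):
    # Collect what each rule would offer for this segment.
    offered = []
    if ONCE.search(segment):
        offered.append("1 vez")
    offered.extend(f"{count} veces" for count in TIMES.findall(segment))
    return offered

print(frequency_suggestions("Take 1x daily"))      # ['1 vez']
print(frequency_suggestions("Repeat 10x"))         # ['10 veces']
print(frequency_suggestions("Repeat 11x"))         # ['11 veces']
print(re.findall(r"(\d{1,2})x", "Take 1x daily"))  # ['1'], i.e. "1 veces"
```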



5-6

Description: Ex: 8-10 ---> de 8 a 10
Regex Pattern: (\d{1,2})-(\d{1,2})
Replace Pattern: de $1 a $2
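The range rule is easy to test in Python, again with `\1` and `\2` standing in for the plug-in's `$1` and `$2`. The function name is hypothetical. One thing the sketch shows is that the pattern also matches the start of a hyphenated phone number, so an extra suggestion may appear in segments that contain one.

```python
import re

RANGE = re.compile(r"(\d{1,2})-(\d{1,2})")

def spanish_range(segment):
    match = RANGE.search(segment)
    return match.expand(r"de \1 a \2") if match else None

print(spanish_range("Take 8-10 drops"))      # de 8 a 10
print(spanish_range("Call 1-800-123-1234"))  # de 1 a 80 (overlaps the phone rule)
```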



1-800 number

Description: Full phone numbers. Ex. 1-800-123-1234
Regex Pattern: \d-(\d{3})-(\d{3})-(\d{4})
Replace Pattern: $0



8:00 a.m.

Description: Time plus a.m. or p.m. Ex. 9:00 a.m.
Regex Pattern: (\d{1,2}:\d{2})\s(a\.|p\.)m\.
Replace Pattern: $0



Numbers with slashes


Description: Numbers with points and slashes. Ex. 36.5/93
Regex Pattern: \d+\.\d+/\d+
Replace Pattern: $0



Percentages

Description: Percentages. Ex. 35%
Regex Pattern: \d{1,3}%
Replace Pattern: $0



Trademark names

Description: Words with TM. Ex. AnyBrand™
Regex Pattern: \w+™
Replace Pattern: $0



Registered names

Description: Words with (R). Ex. AnyBrand®
Regex Pattern: \w+®
Replace Pattern: $0
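The last six rules all use `$0`, which stands for the whole match, so the suggestion is simply the source text itself. A rough Python check of what each pattern picks out of a sample segment follows. The rule names, the sample sentence and the "OtherBrand" name are made up for illustration, and `match.group(0)` plays the role of `$0`.

```python
import re

# The "$0" rules from above, keyed by hypothetical names.
IDENTITY_RULES = {
    "phone": r"\d-(\d{3})-(\d{3})-(\d{4})",
    "time": r"(\d{1,2}:\d{2})\s(a\.|p\.)m\.",
    "ratio": r"\d+\.\d+/\d+",
    "percent": r"\d{1,3}%",
    "trademark": r"\w+™",
    "registered": r"\w+®",
}

def whole_matches(segment):
    # group(0) is the entire matched text, like $0 in the replace pattern.
    found = {}
    for name, pattern in IDENTITY_RULES.items():
        match = re.search(pattern, segment)
        if match:
            found[name] = match.group(0)
    return found

segment = ("Call 1-800-123-1234 by 9:00 a.m. about AnyBrand™ "
           "and OtherBrand®: 36.5/93 is 35%.")
for name, text in whole_matches(segment).items():
    print(name, "->", text)
```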



Watch a video demonstration of the above here.


And here's the full list of the examples shown above.
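Description: Dates: First of month. Ex: July 1, 2015 --> 1° de julio de 2015
Regex Pattern: (#Months#)\s(1|1st),\s(\d{4})
Replace Pattern: 1° de $1 de $3

Description: Ex: 1x ---> 1 vez
Regex Pattern: \b1x\b
Replace Pattern: 1 vez

Description: Ex: 10x ---> 10 veces - excluding "1x"
Regex Pattern: (?!1x)(\d{1,2})x
Replace Pattern: $1 veces

Description: Dates: July 22, 2015 ---> 22 de julio de 2015
Regex Pattern: (#Months#)\s(\d{1,2}),\s(\d{4})
Replace Pattern: $2 de $1 de $3

Description: Ex: 8-10 ---> de 8 a 10
Regex Pattern: (\d{1,2})-(\d{1,2})
Replace Pattern: de $1 a $2

Description: Dates: 8/7/15 to 7/8/15
Regex Pattern: (\d{1,2})/(\d{1,2})/(\d{2})
Replace Pattern: $2/$1/$3

Description: Month and day w/o year. Ex. July 22 --> 22 de julio
Regex Pattern: (#Months#)\s(\d{1,2})(?!.*\d{4})
Replace Pattern: $2 de $1

Description: Full phone numbers. Ex. 1-800-123-1234
Regex Pattern: \d-(\d{3})-(\d{3})-(\d{4})
Replace Pattern: $0

Description: Time plus a.m. or p.m. Ex. 9:00 a.m.
Regex Pattern: (\d{1,2}:\d{2})\s(a\.|p\.)m\.
Replace Pattern: $0

Description: Numbers with points and slashes. Ex. 36.5/93
Regex Pattern: \d+\.\d+/\d+
Replace Pattern: $0

Description: Month and year. Ex. July 2015 --> julio de 2015
Regex Pattern: (#Months#)\s(\d{4})
Replace Pattern: $1 de $2

Description: Percentages. Ex. 35%
Regex Pattern: \d{1,3}%
Replace Pattern: $0

Description: Words with TM. Ex. AnyBrand™
Regex Pattern: \w+™
Replace Pattern: $0

Description: Words with (R). Ex. AnyBrand®
Regex Pattern: \w+®
Replace Pattern: $0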




And that's all there is to it. What I like about the Regex Match AutoSuggest Provider is that it's easy to use (although some basic knowledge of regex is required) and it's project-independent, which means the entries are available all the time, for all files and projects, so it's a great way to customize Studio's autosuggestions to our individual needs.

2 comments:

  1. This is a great list, Nora! I'd never thought of adding a Spanish number range (1 a 3) to my Regex Match rules. This one's particularly annoying because Studio recognises "1 a" as a token, which is completely misleading.
    Thanks for the idea!
    Emma
