MrC-temp: Difference between revisions
No edit summary |
(Regex section: Draft 1.) |
||
Line 15: | Line 15: | ||
* '''Regular expression''': ''This is the regular expression pattern used to perform matches and captures on the '''String to test'''.<br />The regular expression language assigns special meaning to many characters. A few of these meta-characters, such as forward slash "/", comma "," and parentheses "(" and ")", are also used by the Media Center expression language. To force the Media Center expression engine to ignore the meta-characters in regular expressions, surround the entire regular expression with '''/#''' and '''#/'''. This is Media Center's form of escapement, which tells the expression engine to ignore everything inside, so that the entire, uninterpreted regular expression can be provided to the Regex() regular expression evaluator. Although '''/#''' and '''#/''' are not necessary when no conflicting characters are in use, and you may manually escape such characters with a forward slash "/", it is best practice to always encase a regular expression in '''/#''' and '''#/'''.'' |
* '''Regular expression''': ''This is the regular expression pattern used to perform matches and captures on the '''String to test'''.<br />The regular expression language assigns special meaning to many characters. A few of these meta-characters, such as forward slash "/", comma "," and parentheses "(" and ")", are also used by the Media Center expression language. To force the Media Center expression engine to ignore the meta-characters in regular expressions, surround the entire regular expression with '''/#''' and '''#/'''. This is Media Center's form of escapement, which tells the expression engine to ignore everything inside, so that the entire, uninterpreted regular expression can be provided to the Regex() regular expression evaluator. Although '''/#''' and '''#/''' are not necessary when no conflicting characters are in use, and you may manually escape such characters with a forward slash "/", it is best practice to always encase a regular expression in '''/#''' and '''#/'''.'' |
||
* '''Mode''': ''Sets the mode in which Regex() will run. Optional, defaults to 0.'' |
* '''Mode''': ''Sets the mode in which Regex() will run. Optional, defaults to 0.'' |
||
** '''0''': ''Runs in test mode, returning 1 or 0, indicating whether the string matched (1) or did not match (0) the pattern. This mode is useful within an '''if()''' test, so that different true (1) or false (0) actions may be taken. |
** '''0''': ''Runs in test mode, returning 1 or 0, indicating whether the string matched (1) or did not match (0) the pattern. This mode is useful within an '''if()''' test, so that different true (1) or false (0) actions may be taken. Output will be a 0 or 1. This mode is the default.'' |
||
** '''1''' to '''9''': ''Outputs the specified N<sup>th</sup> capture group's contents, where N ranges from 1 to 9. Currently, only a single capture is output in this mode, but all captures are available in the [R1] ... [R9] capture variables. This mode is used to easily output a single matching sub-string.'' |
** '''1''' to '''9''': ''Outputs the specified N<sup>th</sup> capture group's contents, where N ranges from 1 to 9. Currently, only a single capture is output in this mode, but all captures are available in the [R1] ... [R9] capture variables. This mode is used to easily output a single matching sub-string.'' |
||
** '''-1''': ''Runs in silent mode, with no output. This mode is useful as a means to capture portions of the string, and later use those captures in subsequent portions of an expression.'' |
** '''-1''': ''Runs in silent mode, with no output. This mode is useful as a means to capture portions of the string, and later use those captures in subsequent portions of an expression.'' |
||
Line 23: | Line 23: | ||
|- valign="top" |
|- valign="top" |
||
! scope="row" style="background: #A8E4A0; color: black; border: 1px solid black;" | Examples |
! scope="row" style="background: #A8E4A0; color: black; border: 1px solid black;" | Examples |
||
|style="background: #ecfeea; color: black; border: 1px solid black" | ''' |
|style="background: #ecfeea; color: black; border: 1px solid black" | '''Example: Modes 1 through 9''' |
||
The examples in this section use one of the modes from 1 through 9, to output the specified capture. |
The examples in this section use one of the modes from 1 through 9, to output the specified capture. |
||
Line 38: | Line 38: | ||
:It's a Bigman Thing |
:It's a Bigman Thing |
||
‾‾‾‾ |
|||
'''<span style="font-family: Consolas, monospace;"><nowiki> |
'''<span style="font-family: Consolas, monospace;"><nowiki> |
||
Line 51: | Line 52: | ||
:(w/Emmylou Harris) |
:(w/Emmylou Harris) |
||
‾‾‾‾ |
|||
'''<span style="font-family: Consolas, monospace;"><nowiki> |
'''<span style="font-family: Consolas, monospace;"><nowiki> |
||
Line 62: | Line 64: | ||
---- |
---- |
||
'''Examples: Mode 0''' |
'''Examples: Mode 0''' |
||
The examples in this section use mode 0, to test if a string |
The examples in this section use mode 0, to test if a string does or does not match the pattern. The result of the test may be used to drive a conditional statement such as an if() statement. |
||
'''<span style="font-family: Consolas, monospace;"><nowiki> |
'''<span style="font-family: Consolas, monospace;"><nowiki> |
||
Line 79: | Line 80: | ||
: / --> Bootsy Collins/Fatboy Slim |
: / --> Bootsy Collins/Fatboy Slim |
||
‾‾‾‾ |
|||
'''<div style="font-family: Consolas, monospace;"><nowiki> |
'''<div style="font-family: Consolas, monospace;"><nowiki> |
||
Line 103: | Line 105: | ||
::: Seal and Jeff Beck |
::: Seal and Jeff Beck |
||
: → No Punctuation |
: → No Punctuation |
||
‾‾‾‾ |
|||
'''<span style="font-family: Consolas, monospace;"><nowiki> |
|||
if(Regex([Album], /#^([^-]+[^\s])- (.*)$#/, 0), [R1]: [R2], / No Change Necessary) |
|||
</nowiki></span>''' |
|||
''Some album names contain characters that are not legal in Windows file names, and after pulling properties from the file name, such characters will be translated into a dash "-" character (e.g. "Staring at the Sea: The Singles" becomes "Staring at the Sea- The Singles"). To identify albums that may have been renamed this way, an expression such as the one above might help. The expression matches characters from the beginning of the line that do not contain a dash, followed by a non-space character, followed by a dash, a space and everything else. Wrapped in an if() statement, these file names become apparent in an expression column.''
|||
---- |
---- |
||
Line 108: | Line 118: | ||
The examples in this section use mode -1, which causes Regex() to suppress output. This mode is only useful with captures, where the captures are utilized in subsequent portions of an expression. |
The examples in this section use mode -1, which causes Regex() to suppress output. This mode is only useful with captures, where the captures are utilized in subsequent portions of an expression. |
||
The previous example (which helped to identify album titles that may have been changed after tags were updated from file properties) used mode 0 to guide an if() evaluation. By using that expression column and selecting only the files whose album name should be changed, the identical Regex() statement can be used to easily fix the album property by changing the mode from 0 to -1: |
|||
'''<span style="font-family: Consolas, monospace;"><nowiki> |
|||
=Regex([Album], /#^([^-]+[^\s])- (.*)$#/, -1)[R1]: [R2] |
|||
</nowiki></span>''' |
|||
Editing the Album tag and pasting the expression above into the edit field will set the munged album name to use a colon rather than dash. Note: take care to select the correct files, and ensure the tags were changed as desired. Use Undo if not. |
|||
A safety mechanism can be installed into the tag assignment. By using the same Regex() statement from the last mode 0 example, and setting the false side of the if() statement to [Album], the expression would effectively only change those files whose album names matched the pattern. Non-matched album names would be assigned to themselves, essentially acting as a no-op. For completeness, that statement would be: |
|||
'''<span style="font-family: Consolas, monospace;"><nowiki> |
|||
=if(Regex([Album], /#^([^-]+[^\s])- (.*)$#/, 0), [R1]: [R2], [Album]) |
|||
</nowiki></span>''' |
|||
‾‾‾‾<br /> |
|||
'''<span style="font-family: Consolas, monospace;"><nowiki> |
'''<span style="font-family: Consolas, monospace;"><nowiki> |
||
Regex([Name], /#(\d{1,2})\.(\d{1,2}).(\d{4})#/, -1)[R3]//[R1]//[R2] |
Regex([Name], /#(\d{1,2})\.(\d{1,2}).(\d{4})#/, -1)[R3]//[R1]//[R2] |
||
Line 119: | Line 144: | ||
: 2009/01/22 |
: 2009/01/22 |
||
‾‾‾‾ |
|||
'''<span style="font-family: Consolas, monospace;"><nowiki> |
'''<span style="font-family: Consolas, monospace;"><nowiki> |
||
Line 124: | Line 150: | ||
</nowiki></span>''' |
</nowiki></span>''' |
||
''This example shows how rearranging segments of track titles can help call out naming inconsistencies. The expression captures parenthetical information at the end of a track, and moves it to the beginning. In an expression column, inconsistencies become clearer. |
''This example shows how rearranging segments of track titles can help call out naming inconsistencies. The expression captures parenthetical information at the end of a track, and moves it to the beginning. In an expression column, inconsistencies become clearer.'' |
||
Sample output: |
Sample output: |
||
Line 134: | Line 160: | ||
: feat Amel Larrieux: Gaze |
: feat Amel Larrieux: Gaze |
||
‾‾‾‾ |
|||
'''<span style="font-family: Consolas, monospace;"><nowiki> |
'''<span style="font-family: Consolas, monospace;"><nowiki> |
||
Line 140: | Line 166: | ||
</nowiki></span>''' |
</nowiki></span>''' |
||
... IN PROGRESS ... |
|||
|} |
|} |
||
<div style="text-align:right;">([[#top|Back to top]])</div> |
<div style="text-align:right;">([[#top|Back to top]])</div>
Revision as of 23:05, 28 August 2011
Note: The File Properties page is now at its permanent home ==> File_Properties_(tags)
Note: This is MrC's working space for Wiki page work-in-progress.
Regex(...): Regular expression pattern matching
Regex() | This function performs regular expression (RE) pattern matching on its input. It can be used in one of three different modes: a test mode to test for a match, a capture output mode to output the specified captured pattern, and a silent, capture-only mode. All match captures are placed into special variables referenced as [R1], [R2], ... [R9], which can be used in subsequent expressions. The contents of the captures [R1] ... [R9] are available until the entire expression completes, or Regex() is run again, whereby they are replaced. (Available since build 16.0.155.) |
---|---|
Construction | Regex(String to test, Regular expression, Mode, Case sensitivity)
|
Examples | Example: Modes 1 through 9
The examples in this section use one of the modes from 1 through 9, to output the specified capture. Regex([Name], /#(Big.*Man)#/, 1) Matches track names that contain Big followed by Man, with anything (including nothing) in between, and outputs the matched tracks. Sample output:
‾‾‾‾ Regex([Artist], /#([(].+)$#/, 1) Matches against the Artist field and returns items that contain an opening (left) parenthesis followed by additional characters until the end of the artist string. Only the sub-string from any opening parenthesis until the end of the string will be returned, since this is the only captured portion. Sample output:
‾‾‾‾ Regex([Name], /#([(][^)]+)$#/, 1) Similar to the previous example, but matches track names that contain an opening (left) parenthesis but are missing the closing (right) parenthesis before the end of the track name. This might be useful to help detect tagging inconsistencies. Sample output:
Examples: Mode 0 The examples in this section use mode 0, to test if a string does or does not match the pattern. The result of the test may be used to drive a conditional statement such as an if() statement. if(Regex([Artist],/#([[:punct:]])#/, 0),[R1] --> [Artist], No Punctuation) Matches against the Artist field looking for any punctuation character. The result of the Regex() expression will be a 0 (false) or 1 (true) since the mode is set to 0. The true side of the if() test is set to output the first (and only) capture, which is expressed as [R1], and is followed by the string " --> " and then the artist name. In the false case, the string "No Punctuation" is output. Sample output:
‾‾‾‾
if(Regex([Artist], /#([[:punct:]])#/, 0),
A more complex example, similar to the previous one. When used inside an expression column, builds an expandable tree with headings Contains Punctuation and No Punctuation that group artists based on whether or not their names contain any punctuation characters. Because semicolon and backslash are list separator characters for the expression language, for the example expression to work properly, these must be replaced (otherwise the list will not build as desired). In the list of punctuation, both backslash and semicolon characters have been replaced with their English equivalent words. In artist names, a semicolon is often used as a separator between the main artist and featured artists, so the expression replaces semicolons within an artist name with the word "and". Likewise, backslashes have been replaced with forward slashes. Sample output:
‾‾‾‾ if(Regex([Album], /#^([^-]+[^\s])- (.*)$#/, 0), [R1]: [R2], / No Change Necessary) Some album names contain characters that are not legal in Windows file names, and after pulling properties from the file name, such characters will be translated into a dash "-" character (e.g. "Staring at the Sea: The Singles" becomes "Staring at the Sea- The Singles"). To identify albums that may have been renamed this way, an expression such as the one above might help. The expression matches characters from the beginning of the line that do not contain a dash, followed by a non-space character, followed by a dash, a space and everything else. Wrapped in an if() statement, these file names become apparent in an expression column.
Examples: Mode -1 The examples in this section use mode -1, which causes Regex() to suppress output. This mode is only useful with captures, where the captures are utilized in subsequent portions of an expression.
The previous example (which helped to identify album titles that may have been changed after tags were updated from file properties) used mode 0 to guide an if() evaluation. By using that expression column and selecting only the files whose album name should be changed, the identical Regex() statement can be used to easily fix the album property by changing the mode from 0 to -1: =Regex([Album], /#^([^-]+[^\s])- (.*)$#/, -1)[R1]: [R2]
Editing the Album tag and pasting the expression above into the edit field will set the munged album name to use a colon rather than a dash. Note: take care to select the correct files, and ensure the tags were changed as desired. Use Undo if not.
A safety mechanism can be installed into the tag assignment. By using the same Regex() statement from the last mode 0 example, and setting the false side of the if() statement to [Album], the expression would effectively only change those files whose album names matched the pattern. Non-matched album names would be assigned to themselves, essentially acting as a no-op. For completeness, that statement would be: =if(Regex([Album], /#^([^-]+[^\s])- (.*)$#/, 0), [R1]: [R2], [Album])
‾‾‾‾ Regex([Name], /#(\d{1,2})\.(\d{1,2}).(\d{4})#/, -1)[R3]//[R1]//[R2] Matches and captures a date formatted as mm.dd.yyyy anywhere within a filename, and rearranges it in a standard format of yyyy/mm/dd. Since Mode is set to -1, Regex() itself produces no output. However, captured match segments are made available for subsequent use. The three captures, [R1], [R2] and [R3], are arranged in the textual output such that the year, month and day ordering are as desired. Sample output:
‾‾‾‾ Regex([Name], /#^(.+?) \(([^(]+)\)$#/, -1)[R2]: [R1] This example shows how rearranging segments of track titles can help call out naming inconsistencies. The expression captures parenthetical information at the end of a track, and moves it to the beginning. In an expression column, inconsistencies become clearer. Sample output:
‾‾‾‾ listbuild(1, \, Regex([Name], /#^(.+?) \(([^(]+)\)$#/, -1)[R2],[R1])&datatype=[list] Wrapping the previous expression in a listbuild() to create an expandable list provides quick grouping for even easier spotting of naming irregularities, especially when combined with Search to reduce the list size. |
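The three modes can be tried outside Media Center with a small stand-in. The Python sketch below is only an approximation for experimenting with patterns: the function regex() and the helpers fix_album() and reorder_date() are hypothetical names, not part of Media Center, and Python's re module may differ in syntax and defaults from the engine Media Center uses. The Case sensitivity argument is left out because its values are not described above. The /# ... #/ wrapping is dropped, since it exists only to shield the pattern from the expression engine. The input strings are assumed examples chosen to reproduce the sample outputs shown earlier.

```python
import re


def regex(text, pattern, mode=0):
    """Hypothetical stand-in for Regex(): returns (output, captures)."""
    match = re.search(pattern, text)
    # Stand-ins for [R1] ... [R9]; empty when nothing matched.
    captures = {}
    if match:
        for n, group in enumerate(match.groups()[:9], start=1):
            captures[f"R{n}"] = group if group is not None else ""
    if mode == 0:
        # Test mode: "1" on a match, "0" otherwise.
        return ("1" if match else "0"), captures
    if mode == -1:
        # Silent mode: no output, captures only.
        return "", captures
    if 1 <= mode <= 9:
        # Capture output mode: contents of the Nth capture group.
        return captures.get(f"R{mode}", ""), captures
    raise ValueError("mode must be -1, 0, or 1 through 9")


ALBUM_PATTERN = r"^([^-]+[^\s])- (.*)$"


def fix_album(album):
    """Mirrors =if(Regex(..., 0), [R1]: [R2], [Album])."""
    matched, r = regex(album, ALBUM_PATTERN, 0)
    if matched == "1":
        return f"{r['R1']}: {r['R2']}"
    return album  # non-matching names are left unchanged (the no-op case)


def reorder_date(name):
    """Mirrors Regex(..., -1)[R3]//[R1]//[R2]."""
    _, r = regex(name, r"(\d{1,2})\.(\d{1,2}).(\d{4})", -1)
    return "/".join(r.get(key, "") for key in ("R3", "R1", "R2"))


# Mode 1: output the first capture (assumed artist string).
print(regex("Some Artist (w/Emmylou Harris)", r"([(].+)$", 1)[0])
# prints "(w/Emmylou Harris)"

# Mode 0 inside a conditional: only matching album names change.
print(fix_album("Staring at the Sea- The Singles"))
# prints "Staring at the Sea: The Singles"
print(fix_album("Abbey Road"))
# prints "Abbey Road"

# Mode -1: no output from regex() itself, captures reused afterwards.
print(reorder_date("Live 01.22.2009"))
# prints "2009/01/22"
```

Returning the captures alongside the output, instead of storing them globally as [R1] ... [R9] are stored, keeps the sketch free of hidden state. The rule that captures are replaced when Regex() runs again then corresponds to simply using the captures from the most recent call.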