ColdFusion Regex Unicode Characters
I was having a grand time writing some regex patterns in my Regular Expression Tester extension for Firefox. I needed to match some special characters because the site I was working on catered to international users, so I had to be able to validate strings containing acute accents, umlauts, and so on. Here is a pattern I was happy with, written in typical regex fashion:
[\u0021-\u007E\u00A1-\u00FF\u2013-\u2030\u20AC]
This matches exclamation mark to tilde, inverted exclamation mark to lowercase “y” with diaeresis (ÿ), en dash to per mille sign, and the euro symbol. I allowed a couple of non-HTML 4/4.01 characters just to keep the pattern within reason. Now, I’m a CFer, and the typical format of an escaped “u” followed by the Unicode hex value does not work in ColdFusion. What I was not sure of was whether I could build a range of characters with ColdFusion’s CHR() function instead. It turns out you can: pass CHR() the decimal code point of each end of the range. Check a Unicode character table for the specific characters you want to match. Here’s some code testing the method:
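<!--- The literals below are non-ASCII, so tell ColdFusion the file encoding --->
<cfprocessingdirective pageencoding="utf-8" />
<!--- One character class, no pipes or line breaks inside the brackets --->
<cfset pattern = "[" & CHR(33) & "-" & CHR(126)
    & CHR(161) & "-" & CHR(255)
    & CHR(8211) & "-" & CHR(8240)
    & CHR(8364) & "]" />
<cfoutput>
#IsValid("regex","@",pattern)#
#IsValid("regex","Ñ",pattern)#
#IsValid("regex","€",pattern)#
</cfoutput>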
yes yes yes
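The decimal values handed to CHR() are just the hex code points from the \u escapes (for example, 20AC is 8364). If you would rather keep the hex values visible, here is a minimal sketch that does the conversion with InputBaseN(). The helper name uChr and the variable names are my own inventions for illustration, and I am assuming REFind() handles these characters the same way IsValid() does:

```cfm
<!--- Hypothetical helper: turn a hex code point such as "20AC" into its character --->
<cffunction name="uChr" returntype="string" output="false">
    <cfargument name="hex" type="string" required="true" />
    <!--- InputBaseN parses the string as base 16; Chr turns that number into a character --->
    <cfreturn Chr(InputBaseN(arguments.hex, 16)) />
</cffunction>

<!--- Same four ranges as the pattern above, built by concatenation so no
      whitespace or line breaks end up inside the brackets --->
<cfset charClass = "[" & uChr("0021") & "-" & uChr("007E")
    & uChr("00A1") & "-" & uChr("00FF")
    & uChr("2013") & "-" & uChr("2030")
    & uChr("20AC") & "]" />

<!--- Anchor the class so every character of the input has to fall in one of the ranges --->
<cfset wholeString = "^" & charClass & "+$" />

<cfoutput>
<!--- REFind returns the position of the first match, or 0 when nothing matches --->
#YesNoFormat(REFind(wholeString, form.username) GT 0)#
</cfoutput>
```

Here form.username is only a placeholder for whatever string you are validating. Note that a plain space (code point 32) is outside the first range, so multi-word input fails this check unless you add the space to the class.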
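PS, sorry about how narrow the content section is. If you copy a pattern from this post, make sure no line breaks or pipes end up inside the square brackets; inside a character class they are treated as literal characters to match, not as separators.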