-
Regex Evaluation Issues
Hi Group
I hope someone can help.
I have a Regex Evaluation step and the following expression to find the email address within a string:
([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+)
The string is as follows
Username: uceses9Name: Test SharifTelephone (including code): 01344322585Email: Testf958@gmail.com
The expression works in online regex testers, yet in PDI both the Regex Evaluation step and the Replace in String step bring back null values.
Can someone point me in the right direction?
Thanks in Advance
Chirag
-
It's because PDI adds an implicit ^ and $ to either end of the RegEx, so the whole field value has to match, not just part of it.
So the RegEx that is actually being run is ^([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+)$
Try that in your Online verifier.
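To see the difference outside PDI, here is a minimal sketch in Python. It assumes Python's re.search behaves like the "find anywhere" mode of most online testers and re.fullmatch behaves like the implicitly anchored match described above; it is an analogue, not PDI's own code.

```python
import re

# The original expression from the question, unchanged.
pattern = r"([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+)"

line = ("Username: uceses9Name: Test SharifTelephone (including code): "
        "01344322585Email: Testf958@gmail.com")

# "Find anywhere in the string": what most online testers do by default.
found = re.search(pattern, line)

# "The whole string must match": the same effect as wrapping the pattern in ^...$.
whole = re.fullmatch(pattern, line)

print(found.group(1) if found else None)
print(whole)
```

The search finds the address, while the full match returns nothing because the text before the address contains spaces, colons, and parentheses that the pattern does not allow.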
Note that your RegEx is not actually RFC5322 compliant (https://en.wikipedia.org/wiki/Email_address). Testf958+Pentaho@gmail.com will reach your mailbox, but will not pass your RegEx.
Also note that "Testf958@Pentaho"@gmail.com could actually be a valid email address.
In short... Don't try to write your own validation of email addresses. Split your incoming string (on space perhaps?) and then put the pieces into the correct place afterwards.
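As a rough illustration of splitting instead of validating, the sketch below cuts the string on its "Email:" label. It assumes the label is always present and that the address is the last field on the line; neither is guaranteed by the sample shown in the question.

```python
line = ("Username: uceses9Name: Test SharifTelephone (including code): "
        "01344322585Email: Testf958@gmail.com")

# Assumption: everything after the "Email:" label is the address.
_, label, tail = line.partition("Email:")

# partition() returns an empty label when "Email:" is not found.
email = tail.strip() if label else None

print(email)
```

Nothing here judges whether the address is valid, so addresses with a + or a quoted local part pass through untouched.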
-
Hi
Thank you for your swift reply. I've tried the verifier with the implicit ^ and $ and it comes up with no matches. The problem is that the text I have is unformatted, with no delimiters, so Regex is my only solution. I just keep getting null as the result. What should the syntax be for ([a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+) to work?
Thanks
Chirag
-
If you absolutely have to use this RegEx (Bad Idea!)
Try:
.*?([a-zA-Z0-9._+-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+).*?
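A wrapped pattern of this shape can be checked against a whole-string match before it goes into the step. The sketch below uses Python's re.fullmatch as a stand-in for PDI's implicit anchoring, which is an assumption about equivalence rather than PDI's actual code. It places + and - at the end of the character class, because a - between two characters inside a class is read as a range.

```python
import re

# Lazy .*? on both sides lets the whole line match while group 1 holds the address.
# '+' and '-' sit at the end of the first class so '-' stays a literal hyphen.
wrapped = r".*?([a-zA-Z0-9._+-]+@[a-zA-Z0-9._-]+\.[a-zA-Z0-9_-]+).*?"

line = ("Username: uceses9Name: Test SharifTelephone (including code): "
        "01344322585Email: Testf958@gmail.com")

m = re.fullmatch(wrapped, line)
email = m.group(1) if m else None

print(email)
```

The address ends up in capture group 1, so in the Regex Evaluation step it would be read from the field created for that capture group.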
Since you know your line contains "Email:" you can redirect your row using a Filter Rows step, and then process just that row using your formatting rules.
RegEx probably shouldn't be your go-to, especially if you are asking how to build specific RegEx values here.
Take the time to learn *ALL* the steps; it will make it easier for others to maintain your transformations.
-