<b>whatever text</b>I do NOT want to do anything here<b>another text</b>
And you'd like to replace every run of text enclosed in <b> and </b> with the word BOLD, so that the substitution produces the following text:
<b>BOLD</b>I do NOT want to do anything here<b>BOLD</b>
How do I do it with a regular expression?
I'll use PHP to show the solution but you can adapt it to any programming language.
Solution #1
Suppose you have a variable $s that holds the following text:
<b>whatever text</b>I do NOT want to do anything here<b>another text</b>
The following line of PHP code gives you the desired result:
preg_replace('/(<b>)(.+?)(<\/b>)/', '$1BOLD$3', $s)
Note that preg_replace() does a global replace when you do not supply the limit in the parameters.
The replacement is a plain string, so no 'e' modifier is needed. That modifier was deprecated in PHP 5.5 and removed in PHP 7.0, so a pattern that uses it no longer works on current PHP versions.
This line of code says "Replace every block of text surrounded by <b> and </b> with BOLD using un-greedy matching". Un-greedy matters here: a greedy (.+) would run from the first <b> all the way to the last </b> and swallow the text in between.
In this case what's matched by (<b>) is stored in $1, what's matched by (.+?) is stored in $2, and what's matched by (<\/b>) is stored in $3. The replacement string simply concatenates $1, the word BOLD, and $3 to get the final result; $2 is matched but thrown away.
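When the replacement has to be computed by PHP code rather than written as a fixed string, which is the job the 'e' modifier used to do, preg_replace_callback() is the supported route on PHP 7.0 and later. Below is a minimal sketch using the same pattern and the same sample text. The upper-casing inside the callback is only a hypothetical transformation chosen for illustration; it is not part of the original task.

```php
<?php
// Sample input from the article.
$s = '<b>whatever text</b>I do NOT want to do anything here<b>another text</b>';

// The callback receives the match array: $m[0] is the whole match,
// $m[1], $m[2] and $m[3] are the three groups (same numbering as $1, $2, $3).
$result = preg_replace_callback(
    '/(<b>)(.+?)(<\/b>)/',
    function ($m) {
        // Hypothetical transformation: upper-case the text between the tags.
        return $m[1] . strtoupper($m[2]) . $m[3];
    },
    $s
);

echo $result, "\n";
// <b>WHATEVER TEXT</b>I do NOT want to do anything here<b>ANOTHER TEXT</b>
```

Returning $m[1] . 'BOLD' . $m[3] from the callback instead would give the BOLD result this article is after.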
Solution #2
Suppose you have a variable $s that holds the following text:
<b>whatever text</b>I do NOT want to do anything here<b>another text</b>
The following line of PHP code gives you the desired result:
preg_replace('/(?<=<b>)(.+?)(?=<\/b>)/', 'BOLD', $s)
Note that preg_replace() does a global replace when you do not supply the limit in the parameters.
The replacement is the plain string BOLD, so no 'e' modifier is needed (that modifier was deprecated in PHP 5.5 and removed in PHP 7.0).
We use the look-behind and look-ahead operators to achieve this effect. This line of code basically says "Replace every block of text that is immediately preceded by <b> and immediately followed by </b> with BOLD using un-greedy matching".
Look-behind and look-ahead assertions are zero-width: what they match is NOT consumed and NOT stored in back references. Therefore (.+?) is the only group, $1, and the whole match is just the text between the tags. That is why the replacement can be BOLD on its own while the tags stay in place.
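To see the difference between the two patterns concretely, here is a small sketch that runs both through preg_match_all() on the same sample text and prints what each one actually matches. The variable names $withGroups and $withLookaround are made up for this illustration.

```php
<?php
// Sample input from the article.
$s = '<b>whatever text</b>I do NOT want to do anything here<b>another text</b>';

// Solution #1 style: the tags are consumed and captured as groups 1 and 3.
preg_match_all('/(<b>)(.+?)(<\/b>)/', $s, $withGroups);

// Solution #2 style: the tags are only asserted, so they are not part of the match.
preg_match_all('/(?<=<b>)(.+?)(?=<\/b>)/', $s, $withLookaround);

// Index 0 holds the whole matches, index 1 holds what group 1 captured.
print_r($withGroups[0]);     // '<b>whatever text</b>' and '<b>another text</b>'
print_r($withLookaround[0]); // 'whatever text' and 'another text'
print_r($withLookaround[1]); // identical to the whole matches above
```

With the look-around pattern the whole match and group 1 are identical, which is why nothing needs to be put back in the replacement.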
Easy, right?
If you have any questions let me know and I will do my best to help you!