Regular Expressions

The following is an ever-expanding cheat sheet of regular expressions I've found useful.

Patterns

Everything between two strings

Note that this retrieves only the first instance of the match.

Expression: #START#.*?#END#

Content: #START#sadfasdf#END# sadfasdf asdf as #START#sadfasdf#END# sadfasdf asdf as

Result: **#START#sadfasdf#END#** sadfasdf asdf as #START#sadfasdf#END# sadfasdf asdf as
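As a minimal sketch, here is the same pattern in Python, assuming the standard re module (the variable names are only illustrative). re.search returns the first match, re.findall returns every non-overlapping match, and the last line shows what happens if the lazy .*? is swapped for a greedy .*.

```python
import re

content = "#START#sadfasdf#END# sadfasdf asdf as #START#sadfasdf#END# sadfasdf asdf as"
pattern = r"#START#.*?#END#"

# re.search stops at the first match in the string.
first = re.search(pattern, content).group(0)

# re.findall collects every non-overlapping match instead.
all_matches = re.findall(pattern, content)

# Without the ?, .* is greedy and runs through to the LAST #END#.
greedy = re.search(r"#START#.*#END#", content).group(0)

print(first)
print(all_matches)
print(greedy)
```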

With line breaks

Expression: #START#\n.*?\n#END#

Content:

#START#
sadfasdf
#END#

sadfasdf asdf  as #START#sadfasdf#END# sadfasdf asdf  as

Result:

**#START#
sadfasdf
#END#**

sadfasdf asdf  as #START#sadfasdf#END# sadfasdf asdf  as
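A short Python sketch of the line-break variant, assuming the standard re module. The explicit \n pattern only finds the block whose markers sit on their own lines. As an alternative not covered above, the re.DOTALL flag lets . match line breaks too, so the simpler pattern finds both blocks.

```python
import re

content = (
    "#START#\n"
    "sadfasdf\n"
    "#END#\n"
    "\n"
    "sadfasdf asdf  as #START#sadfasdf#END# sadfasdf asdf  as"
)

# The explicit \n pattern matches only the block split across lines.
with_newlines = re.findall(r"#START#\n.*?\n#END#", content)

# With re.DOTALL, . also matches \n, so both blocks are found.
with_dotall = re.findall(r"#START#.*?#END#", content, flags=re.DOTALL)

print(with_newlines)
print(with_dotall)
```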

Everything between two strings on multiple lines

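Expression: ([^;]*)

The negated class [^;] matches any character other than a semicolon, line breaks included, so the match runs across lines until the next ; is reached.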
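A small Python sketch of the expression above, assuming the standard re module and a made-up sample string. Because [^;] accepts line breaks, the captured group spans both lines and stops just before the semicolon.

```python
import re

# Hypothetical sample text; the semicolon marks where the match should stop.
content = "first line\nsecond line; trailing text"

# [^;]* consumes everything, newlines included, up to the first semicolon.
captured = re.match(r"([^;]*)", content).group(1)

print(repr(captured))
```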

Similar but different strings

Content:

            <td class="c3" colspan="1" rowspan="1">
               <p class="c12"><span class="c8">Department</span></p>
            </td>
            <td class="c6" colspan="1" rowspan="1">
               <p class="c12"><span class="c8">Office Location</span></p>
            </td>

Goal: Remove all the class attributes

Expression: class="c."

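The . is a wildcard that matches any single character.

You might notice, though, that something like <td class="c33"> is not matched. That's because the . stands for exactly one character. You could run another search for class="c.." to catch the classes with two digits, but what if you want to get all of them at once?

Do this: class="c.*?"

The ? makes the match lazy, so it stops at the first closing quote. A plain class="c.*" is greedy and runs on to the last quote on the line, swallowing the colspan and rowspan attributes as well.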
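Here is a sketch of the removal in Python, assuming the standard re module and a trimmed copy of the sample markup (the c33 cell is added for illustration). It uses [^"]* rather than .* because a greedy .* runs to the last quote on the line, and the leading \s* is my own addition to avoid leaving a stray space behind.

```python
import re

html = (
    '<td class="c3" colspan="1" rowspan="1">\n'
    '   <p class="c12"><span class="c8">Department</span></p>\n'
    '</td>\n'
    '<td class="c33" colspan="1" rowspan="1">\n'
    '</td>'
)

# Greedy .* overshoots: it keeps going to the last quote on the line.
greedy = re.search(r'class="c.*"', html).group(0)

# [^"]* stops at the first closing quote, whatever the class length.
# \s* also removes the whitespace in front of the attribute.
cleaned = re.sub(r'\s*class="c[^"]*"', "", html)

print(greedy)
print(cleaned)
```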

From the excellent guide at regex101.com/, where "a" is the character(s) you want to target:

  • a? = match either no a or exactly one a (zero or one)
  • a* = match either no a or any number of a (zero or more)
  • a+ = match either one a or any number of a (one or more)
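The three quantifiers can be compared side by side with a quick Python sketch, assuming the standard re module. The helper full_matches is a hypothetical name introduced here; it reports which sample strings each pattern matches in full.

```python
import re

samples = ["", "a", "aa"]

def full_matches(pattern):
    """Return the samples that the pattern matches from start to end."""
    return [s for s in samples if re.fullmatch(pattern, s) is not None]

zero_or_one = full_matches(r"a?")   # no a, or exactly one a
zero_or_more = full_matches(r"a*")  # no a, or any number of a
one_or_more = full_matches(r"a+")   # at least one a

print(zero_or_one, zero_or_more, one_or_more)
```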

Resources