Abertay Ethical Hacking Society

Regular Expressions

Last updated 3 years ago

Intro

In theoretical computer science, a regular expression is a sequence of characters that defines a search pattern. It's basically a fancy way of doing text searches, and it's very useful in combination with command-line tools such as sed.

Basic Usage

If you wanted to search through a long file looking for email addresses, you might do something like:

grep -E '[a-z]+@[a-z]+\.(com|org)' file.txt

That looks like someone's mashed their face on the keyboard, so let's break it down into separate components to make it easier to understand.

[a-z] matches anything inside the square brackets exactly once (in this case, it's looking for any lowercase letter)

+ this means the preceding element gets matched one or more times (so multiple letters)

@ matches the character found in the middle of an email address

[a-z]+ same again, a sequence of one or more lowercase characters

\. this tells it to use the actual . character instead of using it as a metacharacter

(com|org) the brackets get interpreted as a subexpression, in this case 'com' or 'org'

This is obviously just an example to show you some features of regex. A fully RFC-compliant email regex is unreadable. In practice I'd probably just do \w+@\w+\.\w+ (word@word.word)
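
To experiment with the two patterns above without a terminal, here is a minimal sketch in Python. It assumes Python's re module, which is a different regex flavour from grep -E but treats these particular patterns the same way. The sample text, the addresses in it, and the find_emails helper are made up for illustration.

```python
import re

# Same pattern as the grep example above.
STRICT = re.compile(r"[a-z]+@[a-z]+\.(com|org)")
# The looser everyday pattern: word@word.word
LOOSE = re.compile(r"\w+@\w+\.\w+")


def find_emails(pattern, text):
    """Return every substring of text matched by pattern (hypothetical helper)."""
    # finditer + group(0) yields whole matches; findall would return only
    # the contents of the (com|org) group for STRICT.
    return [m.group(0) for m in pattern.finditer(text)]


# Made-up sample text for illustration only.
sample = "Contact alice@example.com or bob@example.net, not carol at example.org"

print(find_emails(STRICT, sample))  # ['alice@example.com']
print(find_emails(LOOSE, sample))   # ['alice@example.com', 'bob@example.net']
```

The strict pattern skips the .net address because (com|org) only allows those two endings, while the loose one accepts any word after the dot.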

BRE vs ERE

This tutorial assumes you're using ERE (Extended Regular Expressions). Basic, or BRE, is much the same, but you have to backslash parentheses and braces to give them their special meaning, and standard BRE has no ?, + or |. That's also why we used grep -E instead of plain grep, which defaults to BRE.

List of Metacharacters

Metacharacter   Description
.               Matches any single character (whether this includes newlines depends on the application).
[ ]             A bracket expression. Matches a single character that is contained within the brackets. For example, [abc] matches “a”, “b”, or “c”. [a-z] specifies a range which matches any lowercase letter from “a” to “z”.
[^ ]            Matches a single character that is not contained within the brackets. For example, [^abc] matches any character other than “a”, “b”, or “c”. [^a-z] matches any single character that is not a lowercase letter from “a” to “z”.
^               Matches the starting position within the string. In line-based tools, it matches the starting position of any line.
$               Matches the ending position of the string or the position just before a string-ending newline. In line-based tools, it matches the ending position of any line.
( )             Defines a marked subexpression. The string that gets matched in the parentheses can be recalled later, but that's a bit more advanced.
|               The choice (also known as alternation or set union) operator matches either the expression before or the expression after the operator. For example, abc|def matches “abc” or “def”.
*               Matches the preceding element zero or more times. For example, ab*c matches “ac”, “abc”, “abbbc”, etc. [xyz]* matches “”, “x”, “y”, “z”, “zx”, “zyx”, “xyzzy”, and so on. (ab)* matches “”, “ab”, “abab”, “ababab”, and so on.
?               Matches the preceding element zero or one time. For example, ab?c matches only “ac” or “abc”.
+               Matches the preceding element one or more times. For example, ab+c matches “abc”, “abbc”, “abbbc”, and so on, but not “ac”.
{m,n}           Matches the preceding element at least m and not more than n times. For example, a{3,5} matches only “aaa”, “aaaa”, and “aaaaa”. This is not found in a few older instances of regexes.
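
The quantifier and alternation rows above can be checked with a short Python sketch. It assumes Python's re module, which agrees with ERE on these operators, and uses re.fullmatch so a pattern has to cover the whole string. The matches helper and the candidate strings are made up for illustration.

```python
import re


def matches(pattern, candidates):
    """Return the candidates that pattern matches in full (hypothetical helper)."""
    return [s for s in candidates if re.fullmatch(pattern, s)]


print(matches(r"ab*c", ["ac", "abc", "abbbc", "abd"]))       # ['ac', 'abc', 'abbbc']
print(matches(r"ab?c", ["ac", "abc", "abbc"]))               # ['ac', 'abc']
print(matches(r"ab+c", ["ac", "abc", "abbc"]))               # ['abc', 'abbc']
print(matches(r"a{3,5}", ["aa", "aaa", "aaaaa", "aaaaaa"]))  # ['aaa', 'aaaaa']
print(matches(r"abc|def", ["abc", "def", "abcdef"]))         # ['abc', 'def']
print(matches(r"(ab)*", ["", "ab", "abab", "aba"]))          # ['', 'ab', 'abab']
print(matches(r"[^abc]", ["a", "d", "dd"]))                  # ['d']
```

Using fullmatch rather than search keeps the results in line with the table, which describes what each pattern matches as a whole string.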

More Examples

.at matches any three-character string ending with “at”, including “hat”, “cat”, and “bat”.

[hc]at matches “hat” and “cat”.

[^b]at matches all strings matched by .at except “bat”.

[^hc]at matches all strings matched by .at other than “hat” and “cat”.

^[hc]at matches “hat” and “cat”, but only at the beginning of the string or line.

[hc]at$ matches “hat” and “cat”, but only at the end of the string or line.

\[.\] matches any single character surrounded by “[” and “]” since the brackets are escaped, for example: “[a]” and “[b]”.

s.* matches s followed by zero or more characters, for example: “s” and “saw” and “seed”.

[hc]+at matches “hat”, “cat”, “hhat”, “chat”, “hcat”, “cchchat”, and so on, but not “at”.

[hc]?at matches “hat”, “cat”, and “at”.

[hc]*at matches “hat”, “cat”, “hhat”, “chat”, “hcat”, “cchchat”, “at”, and so on.

cat|dog matches “cat” or “dog”.
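
Several of the examples above depend on where in a line the match occurs. Here is a rough sketch that imitates a line-based tool like grep in Python, assuming Python's re module and a made-up list of lines. The grep function below is a hypothetical stand-in, not the real tool.

```python
import re

# Made-up input lines for illustration only.
lines = ["hat trick", "the cat", "bat", "[a] list", "chat"]


def grep(pattern, lines):
    """Return the lines that contain a match anywhere, roughly like grep -E."""
    return [line for line in lines if re.search(pattern, line)]


print(grep(r"^[hc]at", lines))  # ['hat trick']
print(grep(r"[hc]at$", lines))  # ['the cat', 'chat']
print(grep(r"[^b]at", lines))   # ['hat trick', 'the cat', 'chat']
print(grep(r"\[.\]", lines))    # ['[a] list']
```

Note that “chat” matches [hc]at$ because it ends in “hat”, but not ^[hc]at, since the line does not start with “hat” or “cat”.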

Further Reading

- if you don't want to reinvent the wheel

Wikipedia
regex debugger
more visual debugger
crossword to test your skills
more advanced puzzles
Regexlib
awk
RFC 822 compliant regex
xkcd 208 - Regular Expressions