Regex lookahead and lookbehind are zero-width assertions: they check whether a pattern does or does not appear next to your match, without consuming any characters or including them in the result. Lookahead (?=...) checks what follows, lookbehind (?<=...) checks what precedes, and both have negative forms for "not followed by" and "not preceded by."
The word "zero-width" is the part that trips everyone up. A normal regex token matches a character and moves the cursor forward. An assertion matches a position. It looks at the surrounding text, returns true or false, and leaves the cursor exactly where it was. That is why you can stack several lookaheads at the same spot, which is the trick behind almost every password-validation pattern you have ever pasted in.
The Four Lookaround Assertions
There are exactly four lookarounds, and they split cleanly along two axes: direction (ahead or behind) and polarity (positive or negative). Once you see them in a grid, the syntax stops feeling arbitrary.
| Assertion | Syntax | Meaning | Plain English |
|---|---|---|---|
| Positive lookahead | (?=...) | Match must be followed by ... | "is followed by" |
| Negative lookahead | (?!...) | Match must not be followed by ... | "is not followed by" |
| Positive lookbehind | (?<=...) | Match must be preceded by ... | "comes after" |
| Negative lookbehind | (?<!...) | Match must not be preceded by ... | "does not come after" |
The dots inside each assertion are a normal sub-pattern. The engine evaluates it, notes whether it matched, then throws away the cursor movement. Nothing a lookaround examines ends up in the overall match. The one exception is a capture group written inside a positive lookaround itself, which still records the text it matched.
Tip: if you ever want to confirm an assertion is truly zero-width, look at the matched text length. A pattern made only of lookarounds matches an empty string at a position. The interesting output is where it matched, not what.
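The tip above can be checked in a few lines. This is a minimal sketch using Python's built-in re module; the sample string and variable names are made up for illustration.

```python
import re

text = "42USD 7USD"

# A pattern made only of a lookahead: it matches positions, not characters.
matches = list(re.finditer(r"(?=USD)", text))

positions = [m.start() for m in matches]      # where each assertion held
widths = [len(m.group(0)) for m in matches]   # length of the matched text

print(positions)
print(widths)
```

Every match has width zero, so the only useful information is the list of positions.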
Positive Lookahead: Match Something Followed By Something Else
Say you have prices like 42USD, 99EUR, and 7GBP, and you want only the number that is immediately followed by USD, without grabbing the currency code itself.
\d+(?=USD)
Against 42USD 99EUR 7USD, this matches 42 and 7. The USD is required for the match to happen, but it is never captured, so you get clean numbers you can parse directly. Compare that to \d+USD, which forces you to strip the USD afterward.
That is the entire value of lookahead: a condition that gates the match without becoming part of it.
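To see the difference in practice, here is a short sketch in Python (built-in re module) that runs both patterns from this section on the same sample string. The variable names are illustrative only.

```python
import re

text = "42USD 99EUR 7USD"

# The lookahead gates the match on USD but leaves it out of the result.
with_lookahead = re.findall(r"\d+(?=USD)", text)

# Without the lookahead, USD is consumed and has to be stripped afterward.
without_lookahead = re.findall(r"\d+USD", text)

# The lookahead results are plain digit strings, ready for int().
amounts = [int(n) for n in with_lookahead]

print(with_lookahead, without_lookahead, amounts)
```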
Negative Lookahead: Match Not Followed By
Flip the assertion to (?!...) and you get "match a thing that is not followed by another thing." A classic case is matching a word that is not followed by a specific suffix.
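\bcat(?!alog)
Against cat catalog category cats, this matches the standalone cat and the cat at the start of category and cats (because neither continues with alog), but it skips catalog. The leading \b word boundary keeps it from matching a cat buried inside a longer word such as concat. Do not add a trailing \b here: it would demand that the word end right after cat, which rejects cats and category on its own and makes the lookahead redundant.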
Negative lookahead also powers "match a line that does not contain X" patterns when you anchor it:
^(?!.*ERROR).*$
With multiline mode enabled, so that ^ and $ anchor at every line, this matches any full line that does not contain the word ERROR anywhere. The lookahead scans the whole line from its start; if it finds ERROR, the assertion fails and the line is skipped. This is genuinely useful for filtering logs.
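As a rough illustration of the log-filtering idea, the sketch below applies the anchored pattern to a small invented log using Python's re module. The log contents are made up; the re.MULTILINE flag is what lets ^ and $ work line by line.

```python
import re

log = """INFO service started
ERROR disk full
WARN retrying
INFO ERROR count reset"""

# re.MULTILINE makes ^ and $ anchor at every line, not just the whole string.
no_error = re.compile(r"^(?!.*ERROR).*$", re.MULTILINE)

# The pattern has no capture groups, so findall returns whole lines.
clean_lines = no_error.findall(log)

print(clean_lines)
```

The last line is dropped too, because the lookahead rejects ERROR anywhere on the line, not just at the start.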
Lookbehind: Checking What Comes Before
Lookbehind does the same job in the other direction. Positive lookbehind (?<=...) requires a preceding pattern; negative lookbehind (?<!...) forbids one.
A common case is extracting an amount that follows a currency symbol, without grabbing the symbol:
(?<=\$)\d+(\.\d{2})?
Against Price: $42.50 and free, this matches 42.50 and leaves the $ out of the result. Negative lookbehind is just as handy. To match a number that is not preceded by a dollar sign:
(?<!\$)\b\d+\b
This matches loose numbers like 2024 while ignoring monetary $99 values. Lookbehind is where flavor differences start to bite, which is the next thing you need to know before you ship a pattern.
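Both lookbehind patterns can be tried in Python's built-in re module, since each lookbehind here is a single fixed-width character. This is a sketch under that assumption; the helper name full_matches and the second sample sentence are invented for the example.

```python
import re

# Positive lookbehind: digits that come right after a dollar sign.
priced = re.compile(r"(?<=\$)\d+(\.\d{2})?")
# Negative lookbehind: whole numbers that do not come right after one.
loose = re.compile(r"(?<!\$)\b\d+\b")

def full_matches(pattern, text):
    # group(0) is the whole match; findall would return the inner group instead.
    return [m.group(0) for m in pattern.finditer(text)]

amounts = full_matches(priced, "Price: $42.50 and free")
years = full_matches(loose, "In 2024 it cost $99")

print(amounts, years)
```

One caveat worth testing yourself: the negative pattern only looks one character back, so on a value like $42.50 it would still pick up the 50 after the decimal point.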
The Portability Trap: Lookbehind Is Not Universal
This is the part most tutorials skip, and it is the part that breaks production code. Lookahead is supported almost everywhere. Lookbehind is not, and even where it exists, the rules about whether it can be variable-length differ by engine.
| Engine / flavor | Lookahead | Lookbehind | Variable-length lookbehind |
|---|---|---|---|
| JavaScript (ES2018+) | Yes | Yes | Yes |
| PCRE / PCRE2 (PHP, many tools) | Yes | Yes | No (fixed-length only) |
| Python (re) | Yes | Yes | No (fixed-length only) |
| Python (regex module) | Yes | Yes | Yes |
| Go (regexp, RE2) | No | No | No |
Two things jump out. First, Go's standard regexp package follows the RE2 syntax and design, which deliberately has no lookaround at all, because RE2 guarantees linear-time matching and lookarounds are hard to reconcile with that guarantee. If you write (?<=\$) in Go, it will not compile. You restructure with capture groups instead.
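Second, "fixed-length lookbehind" means PCRE and Python's built-in re reject a lookbehind whose width can vary. (?<=cat|dog) works because both alternatives are 3 characters; (?<=cats?) fails in those engines because the match could be 3 or 4 characters wide. PCRE is slightly more lenient than re: it accepts top-level alternatives of different fixed lengths, such as (?<=cats|dog), which re rejects, and recent PCRE2 releases loosen the rule further, so check the version your tool bundles. JavaScript and Python's third-party regex module both allow variable-length lookbehind, so the same pattern that works in your browser console can throw in a Python backend.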
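The fixed-width rule is easy to confirm against Python's built-in re, which raises re.error at compile time. The sketch below is illustrative: the helper compiles is a made-up name, and the third pattern is just one possible fixed-width rewrite, not the only one.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if Python's built-in re accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# Both alternatives are 3 characters wide, so re accepts this.
fixed_ok = compiles(r"(?<=cat|dog)s")

# The lookbehind could be 3 or 4 characters wide, so re rejects it.
variable_ok = compiles(r"(?<=cats?)!")

# One possible rewrite: two fixed-width lookbehinds joined by alternation.
workaround = r"(?:(?<=cat)|(?<=cats))!"
workaround_ok = compiles(workaround)

print(fixed_ok, variable_ok, workaround_ok)
```

Running the same check in your real target engine is the point; this only covers Python's re.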
Warning: "works on regex101" is not the same as "works in my language." regex101 lets you pick a flavor, but it is easy to test under PCRE and then ship to a Go or Python service that rejects the pattern. Always confirm against your actual target engine.
A Real Use Case: Password Policy Validation
The most common place developers meet lookahead is password rules, and it is the clearest example of stacking zero-width assertions. Suppose your policy is: at least 8 characters, with at least one lowercase letter, one uppercase letter, and one digit.
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$
Walk through it left to right:
- ^ anchors to the start of the string.
- (?=.*[a-z]) asserts that somewhere ahead there is a lowercase letter. Cursor does not move.
- (?=.*[A-Z]) asserts an uppercase letter exists, again from the start.
- (?=.*\d) asserts a digit exists.
- .{8,}$ finally consumes 8 or more characters to the end.
Each lookahead independently scans the whole string from the same starting position, because none of them moved the cursor. That is exactly why you can express "must contain all of these, in any order" as a flat list of assertions. Without lookahead, you would need an ugly alternation covering every possible ordering.
To forbid spaces, add a negative lookahead: (?!.*\s). To require a special character, add (?=.*[!@#$%^&*]). The pattern grows by one assertion per rule, which is far more readable than the alternative.
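Wrapped in a function, the base policy might look like the sketch below. It uses Python's built-in re; the names PASSWORD_RE and is_valid_password are hypothetical, and fullmatch is a choice made here so that a trailing newline cannot slip past $.

```python
import re

# The pattern from this section: one lookahead per rule, then the length check.
PASSWORD_RE = re.compile(r"^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$")

def is_valid_password(candidate: str) -> bool:
    # fullmatch requires the match to span the entire string.
    return PASSWORD_RE.fullmatch(candidate) is not None

print(is_valid_password("Abcdef12"))   # has lowercase, uppercase, digit, 8 chars
print(is_valid_password("abcdef12"))   # no uppercase letter
```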
Step 1: Build the assertions one rule at a time
Start with the anchors and the length, then add a single lookahead, and confirm it behaves before adding the next. Paste the partial pattern into a free regex tester and try a string that should pass and one that should fail. If you add all four assertions at once and it misbehaves, you will not know which one is wrong.
Step 2: Test the boundary cases, not just the happy path
A password regex that accepts Abcdef12 is not proven correct. Try the strings that should fail: an all-lowercase string, one with no digit, one that is exactly 7 characters. Confirm each is rejected for the right reason. Watching the live highlight in a tester tells you whether a failure came from the length or from a missing assertion.
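If you want the same "rejected for the right reason" feedback outside a tester, one option is to check each assertion separately. The sketch below is an assumption about how you might structure that in Python; RULES and failed_rules are invented names, and the length rule is written as its own lookahead so every rule has the same shape.

```python
import re

# One zero-width assertion per policy rule, each checked on its own.
RULES = {
    "lowercase": r"(?=.*[a-z])",
    "uppercase": r"(?=.*[A-Z])",
    "digit": r"(?=.*\d)",
    "length": r"(?=.{8,}$)",
}

def failed_rules(candidate: str) -> list:
    # re.match anchors at the start, the same spot the stacked pattern uses.
    return [name for name, rule in RULES.items()
            if re.match(rule, candidate) is None]

for sample in ["Abcdef12", "abcdefgh1", "Abcdefgh", "Abcde12"]:
    print(sample, failed_rules(sample))
```

An empty list means the string passes; anything else names the rule that caused the rejection.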
Step 3: Lock the flavor before you copy the pattern out
Once it passes, set the tester to the engine you will actually run it in (JavaScript, PCRE, Python, or Go) and re-run the same cases. If you used a variable-length lookbehind or any lookaround at all in Go, this is where you catch it, in the editor, not in a failing CI build.
Lookahead vs Capture Groups: When to Use Which
A frequent point of confusion is when to reach for a lookahead versus a plain capture group, since both let you "find X near Y." The distinction is about what ends up in your match.
- Use a capture group (...) when you want to keep the surrounding text and pull out a piece of it. The whole match includes everything; the group isolates a part.
- Use a lookaround when the surrounding text is only a condition and you do not want it in the result. The match is just the target.
Concretely, (\d+)USD matches 42USD and captures 42 in group 1. \d+(?=USD) matches only 42 and captures nothing. Both give you 42, but the first keeps USD in the overall match and the second does not. If you are doing a search-and-replace and want to leave USD untouched, the lookahead is cleaner because there is nothing to put back.
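The search-and-replace difference is easiest to see side by side. This sketch uses Python's re.sub on an invented sentence and wraps each USD amount in brackets; the bracket format is arbitrary.

```python
import re

text = "Total: 42USD, deposit 7USD, fee 3EUR"

# Lookahead: only the digits are matched, so USD is left untouched.
with_lookahead = re.sub(r"\d+(?=USD)", lambda m: f"[{m.group(0)}]", text)

# Capture group: USD is part of the match, so the replacement must put it back.
with_group = re.sub(r"(\d+)USD", r"[\1]USD", text)

# What each approach reports as the match itself.
group_match = re.search(r"(\d+)USD", text)
lookahead_match = re.search(r"\d+(?=USD)", text)

print(with_lookahead)
print(group_match.group(0), group_match.group(1), lookahead_match.group(0))
```

Both substitutions produce the same string. The capture-group version is also the shape that ports to engines without lookaround, such as Go's regexp.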
For deeper pattern work, our walkthrough on how to master regular expressions with a live regex tester covers groups, anchors, and flags alongside lookarounds.
Quick Reference Cheat Sheet
Keep this next to your editor:
- (?=foo) match position followed by foo
- (?!foo) match position not followed by foo
- (?<=foo) match position preceded by foo
- (?<!foo) match position not preceded by foo
- Lookaheads are zero-width: they assert, never consume.
- Stack lookaheads at ^ to require multiple conditions in any order.
- Lookbehind is unsupported in Go (RE2) and must be fixed-length in PCRE and Python's built-in re.
When you need to validate or untangle a pattern, paste it into the Molixa regex tester and watch the matches and groups update live, then flip flavors to confirm it ports. If you are also wrangling API payloads, the JSON formatter pairs well for cleaning responses before you regex them, and if you have used regex101, our take on a free regex101 alternative explains the differences.
Frequently Asked Questions
What is the difference between lookahead and lookbehind in regex?
Lookahead (?=...) checks the text that comes after the current position, while lookbehind (?<=...) checks the text that comes before it. Both are zero-width assertions, meaning they test a condition without including the surrounding text in your match. Each also has a negative form, (?!...) and (?<!...), for "not followed by" and "not preceded by."
Does JavaScript support regex lookbehind?
Yes. JavaScript has supported both lookbehind and variable-length lookbehind since ES2018, so (?<=\$)\d+ works in modern browsers and Node. Older runtimes predating ES2018 do not, so if you target a legacy environment, test there or restructure with a capture group instead of a lookbehind.
Why does my lookbehind work in JavaScript but fail in Python?
Most likely it is variable-length. Python's built-in re module only allows fixed-length lookbehind, so (?<=cats?) fails because the width varies. JavaScript allows variable-length lookbehind, so the same pattern passes there. Use the third-party regex module in Python for variable-length support, or rewrite the assertion to a fixed width.
How do I use lookahead for password validation?
Stack one positive lookahead per rule at the start of the pattern, then consume the characters. For example, ^(?=.*[a-z])(?=.*[A-Z])(?=.*\d).{8,}$ requires a lowercase letter, an uppercase letter, a digit, and a minimum length of 8. Because lookaheads are zero-width, each one independently scans the whole string from the same position.
Does Go support regex lookahead and lookbehind?
No. Go's standard regexp package follows the RE2 syntax, which has no lookaround support at all, by design, to guarantee linear-time matching. Patterns with (?=...) or (?<=...) will not compile in Go. You restructure the logic with capture groups, or use a third-party PCRE binding if you truly need assertions.
When should I use a lookahead instead of a capture group?
Use a lookahead when the neighboring text is only a condition and you do not want it in your result. Use a capture group when you want the surrounding text in the match and need to isolate a part of it. For example, \d+(?=USD) returns just the number, while (\d+)USD returns the number but keeps USD in the overall match.


