Chapter 3: Quantifiers and Repetition
Learning Objectives
- Master basic quantifiers: *, +, ?
- Learn to use precise quantifiers: {n}, {n,}, {n,m}
- Understand the difference between greedy and non-greedy matching
- Master the use of non-greedy quantifiers: *?, +?, ??
3.1 Overview of Quantifiers
A quantifier specifies how many times the character, character class, or group immediately before it may repeat. Quantifiers are what let a single pattern match inputs of varying length.
Target Objects of Quantifiers
Quantifiers apply to the character or group immediately preceding them:
a* # * applies to character 'a'
(abc)* # * applies to group 'abc'
[0-9]+ # + applies to character class [0-9]
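The scope rule can be checked directly. The following is a minimal sketch; the names charScope and groupScope and the sample strings are illustrative and not part of the chapter.

```javascript
// Without a group, * binds only to the single character before it.
const charScope = /^ab*$/;     // "a" followed by zero or more "b"
// With a group, * binds to the whole parenthesized sequence.
const groupScope = /^(ab)*$/;  // zero or more repetitions of "ab"

console.log(charScope.test("abbb"));  // true
console.log(charScope.test("abab"));  // false
console.log(groupScope.test("abab")); // true
console.log(groupScope.test("abbb")); // false
```

The anchors ^ and $ are added here only so that each test has to account for the whole string.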
3.2 Asterisk (*) - Zero or More Times
The asterisk matches the preceding character or group zero or more times.
Basic Usage
a* # Matches "", "a", "aa", "aaa", ...
ab* # Matches "a", "ab", "abb", "abbb", ...
[0-9]* # Matches "", "1", "123", "999999", ...
Practical Applications
// Match optional whitespace characters
const pattern1 = /\s*/;
// Match digits (including empty string)
const pattern2 = /\d*/;
// Match filename (starts with letter, followed by any number of letters or digits)
const filename = /[a-zA-Z][a-zA-Z0-9]*/;
Common Pitfalls
const text = "abc";
const result = text.match(/a*b*/);
console.log(result[0]); // "ab"
const text2 = "xyz";
const result2 = text2.match(/a*b*/);
console.log(result2[0]); // "" (matches empty string)
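Because * can succeed without consuming anything, the g flag makes the empty matches visible. A small sketch, assuming a made-up sample string:

```javascript
const sample = "xaay"; // hypothetical sample string

// With the g flag, a* reports an empty match at every position
// where no "a" is present, including the end of the string.
console.log(sample.match(/a*/g)); // ["", "aa", "", ""]

// a+ needs at least one "a", so only the real run is reported.
console.log(sample.match(/a+/g)); // ["aa"]
```

If empty matches are not wanted, + is usually the better choice than filtering the results of *.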
3.3 Plus (+) - One or More Times
The plus sign matches the preceding character or group one or more times.
Basic Usage
a+ # Matches "a", "aa", "aaa", ... (doesn't match empty string)
ab+ # Matches "ab", "abb", "abbb", ...
[0-9]+ # Matches "1", "123", "999999", ... (at least one digit)
Practical Applications
// Match one or more digits
const numbers = /\d+/g;
const text = "I have 123 apples and 456 oranges";
console.log(text.match(numbers)); // ["123", "456"]
// Match words (at least one letter)
const words = /[a-zA-Z]+/g;
const sentence = "Hello world! 123";
console.log(sentence.match(words)); // ["Hello", "world"]
Difference from Asterisk
const text = "bcd";
// Using a*
console.log(/a*/.exec(text)[0]); // "" (matches empty string)
// Using a+
console.log(/a+/.exec(text)); // null (no match)
3.4 Question Mark (?) - Zero or One Time
The question mark matches the preceding character or group zero or one time, which makes that item optional.
Basic Usage
a? # Matches "" or "a"
ab? # Matches "a" or "ab"
colou?r # Matches "color" or "colour"
Practical Applications
// Optional protocol part
const url = /https?:\/\//;
console.log(url.test("http://example.com")); // true
console.log(url.test("https://example.com")); // true
// Optional negative sign
const number = /-?\d+/;
console.log(number.test("123")); // true
console.log(number.test("-123")); // true
// British and American spelling
const spelling = /colou?r/;
console.log(spelling.test("color")); // true
console.log(spelling.test("colour")); // true
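The same idea extends from a single character to a group. The sketch below is one possible pattern for signed decimal numbers; the name decimal and the test inputs are illustrative assumptions.

```javascript
// ? applied to a non-capturing group makes the whole fractional part optional.
const decimal = /^-?\d+(?:\.\d+)?$/;

console.log(decimal.test("42"));    // true
console.log(decimal.test("-3.14")); // true
console.log(decimal.test("3."));    // false: a dot must be followed by digits
console.log(decimal.test(".5"));    // false: at least one digit before the dot
```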
3.5 Precise Quantifiers {n}, {n,}, {n,m}
Precise quantifiers specify an exact repetition count, a minimum count, or a bounded range.
{n} - Exactly n Times
a{3} # Matches "aaa"
\d{4} # Matches exactly 4 digits
[A-Z]{2} # Matches exactly 2 uppercase letters
{n,} - At Least n Times
a{3,} # Matches "aaa", "aaaa", "aaaaa", ...
\d{2,} # Matches at least 2 digits
\w{5,} # Matches at least 5 word characters (letters, digits, or underscore)
{n,m} - Between n and m Times
a{2,5} # Matches "aa", "aaa", "aaaa", "aaaaa"
\d{3,6} # Matches 3 to 6 digits
[a-z]{1,10} # Matches 1 to 10 lowercase letters
Practical Applications
// Validate phone number (11 digits)
const phone = /^\d{11}$/;
console.log(phone.test("13812345678")); // true
// Validate password length (6-20 characters)
const password = /^.{6,20}$/;
console.log(password.test("123456")); // true
// Match postal code (6 digits)
const zipCode = /^\d{6}$/;
console.log(zipCode.test("100000")); // true
// Match hexadecimal color (3 or 6 digits)
const hexColor = /#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})/;
console.log(hexColor.test("#F00")); // true
console.log(hexColor.test("#FF0000")); // true
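When a counted pattern is used for validation rather than searching, anchors matter. Without ^ and $, test() only checks that some substring fits. The sketch below compares an unanchored pattern with one possible anchored variant; the names loose and strict are illustrative.

```javascript
const loose = /#([0-9a-fA-F]{3}|[0-9a-fA-F]{6})/;
// One possible strict form: one or two groups of three hex digits, anchored.
const strict = /^#(?:[0-9a-fA-F]{3}){1,2}$/;

console.log("#FF0000".match(loose)[0]); // "#FF0" (the 3-digit alternative is tried first)
console.log(loose.test("#FFFF"));       // true (only "#FFF" is examined)
console.log(strict.test("#FFFF"));      // false
console.log(strict.test("#F00"));       // true
console.log(strict.test("#FF0000"));    // true
```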
3.6 Greedy vs Non-Greedy Matching
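By default every quantifier is greedy. Whether a quantifier is greedy or non-greedy determines how much text it consumes when more than one match length is possible.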
Greedy Matching (Default Behavior)
Greedy quantifiers match as many characters as possible:
const text = "<div>Hello</div><div>World</div>";
// Greedy matching
const greedy = /<div>.*<\/div>/;
console.log(text.match(greedy)[0]);
// Result: "<div>Hello</div><div>World</div>"
// Matches from the first <div> to the last </div>
Problem with Greedy Matching
const html = '<p class="text">Content 1</p><p class="text">Content 2</p>';
const greedyPattern = /<p.*<\/p>/;
const result = html.match(greedyPattern);
console.log(result[0]);
// Matches the entire string, not individual <p> tags
Non-Greedy Matching (Lazy Matching)
Adding ? after a quantifier makes it non-greedy: it matches as few characters as possible while still allowing the rest of the pattern to succeed:
const text = "<div>Hello</div><div>World</div>";
// Non-greedy matching
const nonGreedy = /<div>.*?<\/div>/;
console.log(text.match(nonGreedy)[0]);
// Result: "<div>Hello</div>"
// Only matches the first complete div tag
// Match all div tags
const allDivs = text.match(/<div>.*?<\/div>/g);
console.log(allDivs);
// Result: ["<div>Hello</div>", "<div>World</div>"]
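A different technique that often gives the same result is a negated character class instead of a lazy quantifier. This is a sketch under the assumption that the element content contains no nested tags; sampleHtml and negated are illustrative names.

```javascript
const sampleHtml = "<div>Hello</div><div>World</div>";

// [^<]* cannot run past the next "<", so it stops at the first closing tag
// even though the quantifier itself is still greedy.
const negated = /<div>[^<]*<\/div>/g;

console.log(sampleHtml.match(negated));
// ["<div>Hello</div>", "<div>World</div>"]
```

The negated class states the boundary explicitly, which tends to reduce backtracking, but it stops working as soon as the content itself contains a "<".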
3.7 All Non-Greedy Quantifiers
Non-Greedy Quantifier List
*? # Zero or more times (non-greedy)
+? # One or more times (non-greedy)
?? # Zero or one time (non-greedy)
{n,m}? # n to m times (non-greedy)
{n,}? # At least n times (non-greedy)
Practical Comparison Examples
const text = "aaaa";
// Greedy matching
console.log(text.match(/a+/)[0]); // "aaaa"
console.log(text.match(/a{2,}/)[0]); // "aaaa"
// Non-greedy matching
console.log(text.match(/a+?/)[0]); // "a"
console.log(text.match(/a{2,}?/)[0]); // "aa"
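A non-greedy quantifier does not always return the shortest possible text: it starts with the minimum and grows until the rest of the pattern matches. A short sketch with a made-up sample string:

```javascript
const sample = "aaab"; // hypothetical sample string

console.log(sample.match(/a+?/)[0]);     // "a"    (nothing follows, so one "a" is enough)
console.log(sample.match(/a+?b/)[0]);    // "aaab" (must grow until "b" can match)
console.log(sample.match(/a??/)[0]);     // ""     (zero is tried before one)
console.log(sample.match(/a{2,3}?/)[0]); // "aa"   (stops at the minimum)
console.log(sample.match(/a{2,3}/)[0]);  // "aaa"  (greedy takes the maximum)
```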
HTML Tag Extraction Example
const html = `
<h1>Title 1</h1>
<p>Paragraph 1</p>
<h1>Title 2</h1>
<p>Paragraph 2</p>
`;
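// Greedy matching - fragile approach
const greedyTags = html.match(/<h1>.*<\/h1>/g);
console.log(greedyTags);
// Works here only because . does not match newlines and each <h1> is on its own line;
// it over-matches if two <h1> elements share a line or the s (dotAll) flag is set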
// Non-greedy matching - correct approach
const lazyTags = html.match(/<h1>.*?<\/h1>/g);
console.log(lazyTags);
// ["<h1>Title 1</h1>", "<h1>Title 2</h1>"]
3.8 Practical Use Cases
Extracting Content in Quotes
const text = 'He said "Hello" and she replied "Hi there!"';
// Use non-greedy matching to extract quoted content
const quotes = text.match(/".*?"/g);
console.log(quotes); // ['"Hello"', '"Hi there!"']
// Extract only the text inside quotes (without quotes)
const quotedText = text.match(/"(.*?)"/g).map(match => match.slice(1, -1));
console.log(quotedText); // ['Hello', 'Hi there!']
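The capture group can also be read directly instead of slicing the quotes off afterwards. The sketch below uses String.prototype.matchAll, which assumes an ES2020 or later runtime; quoted and inner are illustrative names.

```javascript
const quoted = 'He said "Hello" and she replied "Hi there!"';

// matchAll yields one match object per hit; index 1 holds the first capture group.
const inner = Array.from(quoted.matchAll(/"(.*?)"/g), m => m[1]);

console.log(inner); // ["Hello", "Hi there!"]
```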
Matching Repeated Characters
// Match repeated characters
const repeatedChars = /(.)\1+/g;
const text = "aabbcccddddd";
console.log(text.match(repeatedChars)); // ["aa", "bb", "ccc", "ddddd"]
Format Validation
// QQ number validation (5-11 digits, cannot start with 0)
const qqPattern = /^[1-9]\d{4,10}$/;
// Username validation (3-16 characters in total: a letter followed by 2-15 letters, digits, or underscores)
const usernamePattern = /^[a-zA-Z][a-zA-Z0-9_]{2,15}$/;
// Password validation (8-16 characters, contains letters and numbers)
const passwordPattern = /^(?=.*[a-zA-Z])(?=.*\d)[a-zA-Z\d]{8,16}$/;
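A quick way to check the boundaries of such patterns is to run them against inputs just inside and just outside the allowed range. The sample inputs below are made up for illustration.

```javascript
const qqPattern = /^[1-9]\d{4,10}$/;
const usernamePattern = /^[a-zA-Z][a-zA-Z0-9_]{2,15}$/;
const passwordPattern = /^(?=.*[a-zA-Z])(?=.*\d)[a-zA-Z\d]{8,16}$/;

console.log(qqPattern.test("10001"));          // true: 5 digits, no leading 0
console.log(qqPattern.test("01234"));          // false: leading 0
console.log(qqPattern.test("1234"));           // false: only 4 digits
console.log(usernamePattern.test("a_1"));      // true: minimum length of 3
console.log(usernamePattern.test("1abc"));     // false: starts with a digit
console.log(passwordPattern.test("abc12345")); // true
console.log(passwordPattern.test("12345678")); // false: no letter
console.log(passwordPattern.test("abc123!!")); // false: "!" is outside the allowed set
```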
3.9 Performance Considerations
Performance Issues with Greedy Matching
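// Potential performance issue: two adjacent quantifiers that can match the same characters
const text = "a".repeat(10000) + "b";
const problematicPattern = /a*a*b/; // fast when the final "b" is present, but backtracks heavily when it is missing
// Better approach: a single quantifier matches exactly the same strings
const betterPattern = /a*b/;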
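The cost of adjacent overlapping quantifiers shows up mainly on input that fails to match, because the engine has to try every way of splitting the run of "a" characters between them. The following is a rough timing sketch, not a benchmark; timeIt and noMatch are hypothetical names, and the numbers depend on the engine and machine.

```javascript
// Hypothetical failing input: many "a" characters and no "b" at all.
const noMatch = "a".repeat(600);

// Runs one test() call and reports the result with the elapsed milliseconds.
function timeIt(pattern, input) {
  const start = Date.now();
  const matched = pattern.test(input);
  return { matched, ms: Date.now() - start };
}

console.log(timeIt(/a*a*b/, noMatch)); // matched is false; elapsed time grows quickly with input length
console.log(timeIt(/a*b/, noMatch));   // matched is false; usually finishes sooner
```

Increasing the repeat count makes the difference easier to see, at the price of a longer run.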
Quantifier Efficiency Comparison
// Less efficient: each a? is a separate choice the engine may have to backtrack through
const inefficient = /a?a?a?aaa/;
// More efficient: using appropriate quantifiers
const efficient = /a{3,6}/;
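The two forms accept the same strings, which can be confirmed by brute force over short inputs. In this sketch both patterns are anchored so that the whole input must match; spelledOut and counted are illustrative names.

```javascript
const spelledOut = /^a?a?a?aaa$/;
const counted = /^a{3,6}$/;

// Compare both patterns on runs of 0 to 8 "a" characters.
for (let n = 0; n <= 8; n++) {
  const input = "a".repeat(n);
  console.log(n, spelledOut.test(input), counted.test(input));
}
// Both columns are true only for n = 3 through 6.
```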
3.10 Common Mistakes and Pitfalls
Mistake 1: Forgetting the Scope of Quantifiers
// Wrong: only 's' is optional
const wrong = /cats?/; // Matches "cat" or "cats"
// Correct: entire "cats" is optional
const correct = /(cats)?/; // Matches "" or "cats"
Mistake 2: Over-Matching with Greedy Matching
// Wrong: matches from the first <!-- to the last --> on the same line
const wrong = /<!--.*-->/;
// Correct: use non-greedy matching
const correct = /<!--.*?-->/;
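A related edge case is a comment that spans several lines. By default . does not match line breaks, so even the non-greedy pattern skips such comments. The sketch below shows two common workarounds on a made-up input; the s (dotAll) flag assumes an ES2018 or later runtime.

```javascript
const source = "<!-- one -->\n<p>text</p>\n<!-- two\nlines -->";

// . stops at the line break, so the second comment is never matched.
console.log(source.match(/<!--.*?-->/g));
// ["<!-- one -->"]

// [\s\S] matches any character, including line breaks.
console.log(source.match(/<!--[\s\S]*?-->/g));
// ["<!-- one -->", "<!-- two\nlines -->"]

// The s flag makes . match line breaks as well.
console.log(source.match(/<!--.*?-->/gs));
// ["<!-- one -->", "<!-- two\nlines -->"]
```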
Mistake 3: Edge Cases with Quantifiers
// Note: \d* succeeds with an empty match when no digit is present
const pattern = /\d*/;
console.log("abc".match(pattern)[0]); // "" (empty string)
// If you need at least one digit
const pattern2 = /\d+/;
console.log("abc".match(pattern2)); // null
3.11 Practice Exercises
Exercise 1: Basic Quantifier Usage
Write regular expressions:
- Match domain names with optional “www.” prefix
- Match one or more consecutive digits
- Match a numeric password of exactly 8 digits
// Answers
const domain = /(www\.)?[a-zA-Z0-9-]+\.[a-z]{2,}/;
const numbers = /\d+/;
const password = /^\d{8}$/;
Exercise 2: Greedy vs Non-Greedy
Given an HTML string, extract every element (opening tag, content, and closing tag):
const html = "<p>Paragraph 1</p><div>Content</div><p>Paragraph 2</p>";
// Use non-greedy matching
const tags = html.match(/<[^>]+>.*?<\/[^>]+>/g);
console.log(tags);
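The answer above accepts any closing tag, even one that does not belong to the opening tag. One possible refinement, sketched here for simple tags without attributes, is a backreference that requires the same tag name on both sides; paired and mixed are illustrative names.

```javascript
const sampleHtml = "<p>Paragraph 1</p><div>Content</div><p>Paragraph 2</p>";

// \1 must repeat whatever tag name the first group captured.
const paired = /<(\w+)>.*?<\/\1>/g;
console.log(sampleHtml.match(paired));
// ["<p>Paragraph 1</p>", "<div>Content</div>", "<p>Paragraph 2</p>"]

// Hypothetical malformed input with a stray closing tag.
const mixed = "<p>text</div></p>";
console.log(mixed.match(/<[^>]+>.*?<\/[^>]+>/g)); // ["<p>text</div>"]
console.log(mixed.match(paired));                 // ["<p>text</div></p>"]
```

Neither pattern handles nested elements of the same name; for real HTML a parser is the safer tool.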
Exercise 3: Practical Application
Write regular expressions to validate:
- Chinese mobile phone number (11 digits, starting with 1)
- ID card number (18 digits)
- Email address (simple version)
// Answers
const phone = /^1\d{10}$/;
const idCard = /^\d{18}$/;
const email = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
Summary
Quantifiers are important tools for controlling match counts in regular expressions:
- Basic Quantifiers: * (zero or more), + (one or more), ? (zero or one)
- Precise Quantifiers: {n}, {n,}, {n,m} provide exact count control
- Greedy vs Non-Greedy: Default is greedy matching, add ? for non-greedy
- Performance Considerations: Proper use of quantifiers can improve matching efficiency
- Common Pitfalls: Pay attention to quantifier scope and edge cases
Mastering quantifier usage is a key step in writing efficient and accurate regular expressions.