Chapter 1: Introduction to Regular Expressions
Learning Objectives
- Understand the basic concepts and uses of regular expressions
- Master the basic syntax structure of regular expressions
- Learn to use online tools to test regular expressions
- Understand regular expression support in different programming languages
1.1 What are Regular Expressions
Regular expressions (regex or regexp) are patterns used to match character combinations in strings. They are a powerful text processing tool that can be used to:
- Validate input formats (such as email addresses, phone numbers)
- Search and replace text
- Extract specific information
- Clean data and convert formats
1.2 Basic Structure of Regular Expressions
Regular expressions consist of the following parts:
- Literal characters: Directly matched characters, such as hello, which matches “hello” in text
- Metacharacters: Characters with special meanings, such as . * + ?
- Character classes: Sets of characters enclosed in square brackets, such as [abc]
- Quantifiers: Specify the number of matches, such as {3} and {2,5}
- Anchors: Specify position, such as ^ for start and $ for end
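As a rough illustration of the five building blocks listed above, the following Python sketch tries one small pattern per category. The sample strings and variable names are made up for this example and are not part of the chapter.

```python
import re

# Literal characters: the pattern "hello" matches the text "hello".
literal = re.search(r"hello", "say hello there")

# Metacharacter: "." stands for any single character except a newline.
meta = re.findall(r"h.t", "hat hit hot")

# Character class: exactly one character out of a, b, or c.
char_class = re.findall(r"[abc]", "a1b2c3d4")

# Quantifiers: {3} means exactly three repeats, {2,5} means two to five.
exactly_three = re.findall(r"o{3}", "oo ooo oooo")
two_to_five = re.fullmatch(r"a{2,5}", "aaaa")

# Anchors: ^ ties the match to the start, $ ties it to the end.
anchored = re.search(r"^hello$", "hello")
not_anchored = re.search(r"^hello$", "hello world")

print(literal.group())      # hello
print(meta)                 # ['hat', 'hit', 'hot']
print(char_class)           # ['a', 'b', 'c']
print(exactly_three)        # ['ooo', 'ooo']
print(two_to_five is not None, anchored is not None, not_anchored is None)
```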
1.3 Regular Expression Syntax Examples
Basic Examples
cat # Matches "cat"
c.t # Matches "cat", "cot", "cut", etc.
c[ao]t # Matches "cat" or "cot"
c[a-z]t # Matches c + any lowercase letter + t
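The four patterns above can be checked against a small word list. This is a minimal Python sketch; the word list and the helper name matching_words are hypothetical, chosen only to show where the patterns differ.

```python
import re

words = ["cat", "cot", "cut", "c@t", "coat", "ct"]
patterns = ["cat", "c.t", "c[ao]t", "c[a-z]t"]

def matching_words(pattern, candidates):
    # fullmatch succeeds only when the whole word fits the pattern
    return [w for w in candidates if re.fullmatch(pattern, w)]

results = {p: matching_words(p, words) for p in patterns}
for p, matched in results.items():
    print(p, "->", matched)
# cat -> ['cat']
# c.t -> ['cat', 'cot', 'cut', 'c@t']
# c[ao]t -> ['cat', 'cot']
# c[a-z]t -> ['cat', 'cot', 'cut']
```

Note that c.t accepts "c@t" because the dot is not limited to letters, while "coat" and "ct" fail every pattern because each pattern requires exactly one character between c and t.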
Common Metacharacters
- . - Matches any single character (except newline)
- * - Matches the preceding element 0 or more times
- + - Matches the preceding element 1 or more times
- ? - Matches the preceding element 0 or 1 time
- ^ - Matches the start of the string (or the start of each line in multiline mode)
- $ - Matches the end of the string (or the end of each line in multiline mode)
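To see how these metacharacters differ in practice, here is a small Python sketch. The test strings and the helper name accepted are illustrative assumptions, and the behaviour shown is that of Python's re module in its default (non-multiline) mode.

```python
import re

samples = ["ct", "cat", "caaat"]

def accepted(pattern):
    # Keep the samples that the pattern matches in full
    return [s for s in samples if re.fullmatch(pattern, s)]

star = accepted(r"ca*t")      # zero or more "a"
plus = accepted(r"ca+t")      # one or more "a"
question = accepted(r"ca?t")  # zero or one "a"

# "." does not match a newline by default
dot_newline = re.search(r"a.b", "a\nb")

# ^ and $ pin the match to the start and end of the string
starts_with_cat = re.search(r"^cat", "cat nap") is not None
starts_with_cat_2 = re.search(r"^cat", "the cat") is not None
ends_with_cat = re.search(r"cat$", "the cat") is not None

print(star)      # ['ct', 'cat', 'caaat']
print(plus)      # ['cat', 'caaat']
print(question)  # ['ct', 'cat']
print(dot_newline, starts_with_cat, starts_with_cat_2, ends_with_cat)
# None True False True
```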
1.4 Online Testing Tools
Recommended regular expression testing websites:
- RegexPal (https://regexpal.com/)
- Regex101 (https://regex101.com/)
- RegExr (https://regexr.com/)
- RegexTester (https://www.regextester.com/)
Features of these tools:
- Real-time testing and match result display
- Syntax highlighting and error hints
- Detailed match explanations
- Support for different programming language syntaxes
1.5 Programming Language Support
JavaScript
const regex = /hello/;
const result = "hello world".match(regex);
Python
import re
pattern = r"hello"
result = re.search(pattern, "hello world")
Java
import java.util.regex.*;
Pattern pattern = Pattern.compile("hello");
Matcher matcher = pattern.matcher("hello world");
boolean found = matcher.find(); // the search only runs when find() is called
PHP
$pattern = "/hello/";
preg_match($pattern, "hello world", $matches);
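Each of the calls above returns some form of match result that still has to be inspected. As one example, the Python sketch below shows what re.search gives back: a match object on success and None otherwise. The variable names are illustrative.

```python
import re

result = re.search(r"hello", "hello world")
if result is not None:
    matched_text = result.group()  # the text that matched
    position = result.span()       # (start, end) indexes of the match
    print(matched_text, position)  # hello (0, 5)

# When nothing matches, re.search returns None rather than raising an error
missing = re.search(r"goodbye", "hello world")
print(missing)  # None
```

Checking for None before calling group() avoids an AttributeError on inputs that do not match.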
1.6 Why We Need Regular Expressions
Limitations of Traditional String Operations
// Traditional way to validate email - complex and incomplete
function validateEmailOld(email) {
  return email.includes("@") &&
    email.includes(".") &&
    email.indexOf("@") > 0 &&
    email.lastIndexOf(".") > email.indexOf("@");
}
Advantages of Using Regular Expressions
// Regular expression approach - concise, and stricter than the checks above
function validateEmailNew(email) {
  const regex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return regex.test(email);
}
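For comparison, the same pattern can be tried in Python. This is a sketch rather than a complete validator: the names EMAIL_SHAPE and looks_like_email are hypothetical, and re.fullmatch is used in place of the ^ and $ anchors so that a trailing newline is not accepted. The last sample shows that the pattern only checks the overall shape of an address.

```python
import re

# Same shape as the JavaScript pattern: text, "@", text, ".", text,
# where "text" contains no whitespace and no "@".
EMAIL_SHAPE = re.compile(r"[^\s@]+@[^\s@]+\.[^\s@]+")

def looks_like_email(text):
    # fullmatch requires the entire string to fit the pattern
    return EMAIL_SHAPE.fullmatch(text) is not None

print(looks_like_email("user@example.com"))   # True
print(looks_like_email("userexample.com"))    # False (no "@")
print(looks_like_email("user@example"))       # False (no "." after "@")
print(looks_like_email("us er@example.com"))  # False (contains a space)
print(looks_like_email("a@b..c"))             # True, although not a usable address
```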
1.7 Application Scenarios for Regular Expressions
- Data Validation
  - Email address validation
  - Phone number format checking
  - Password strength validation
- Text Search and Replace
  - Code refactoring
  - Batch text processing
  - Log analysis
- Data Extraction
  - Web data scraping
  - Log information extraction
  - Configuration file parsing
- Data Cleaning
  - Removing extra whitespace
  - Unifying data formats
  - Removing special characters
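A few of these scenarios can be sketched in Python using mostly the syntax from this chapter. The sample strings and the "key = value" configuration format are assumptions made up for illustration, and the parentheses used to capture parts of a match have not been introduced yet.

```python
import re

# Search and replace: unify two spellings ("?" makes the "u" optional)
unified = re.sub(r"colou?r", "color", "colour and color")

# Data cleaning: collapse runs of spaces into a single space
cleaned = re.sub(r" +", " ", "too   many    spaces")

# Data extraction: split a hypothetical "key = value" config line.
# Each pair of parentheses captures one part of the match.
config_match = re.fullmatch(r"([a-z_]+) *= *(.*)", "timeout = 30")
key, value = config_match.groups()

print(unified)     # color and color
print(cleaned)     # too many spaces
print(key, value)  # timeout 30
```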
1.8 Learning Suggestions
- Step by Step: Start with simple literal matching, then gradually move on to more complex syntax
- Practice More: Use online tools to practice various patterns frequently
- Practical Application: Learn in combination with specific project requirements
- Reference Documentation: Familiarize yourself with the regular expression documentation of your target programming language
Summary
Regular expressions are a powerful tool for text processing. Although the syntax may seem complex at first, systematic learning and practice make it manageable, and regular expressions can greatly improve text processing efficiency. Understanding their basic concepts and structure is the foundation for further study.
Practice Exercises
- Use online tools to test the following regular expressions:
  - cat against the text “The cat is sleeping”
  - c.t against the text “cat, cot, cut, c@t”
  - ^hello against the text “hello world”
- Try writing a simple regular expression to match 3 digits.