Chapter 4: Parameterized Tests and Data-Driven Testing
Haiyue
16min
Chapter 4: Parameterized Tests and Data-Driven Testing
Learning Objectives
- Master the @pytest.mark.parametrize decorator
- Learn parameterization with multiple parameters and complex data
- Understand best practices for parameterized testing
- Master dynamic parameter generation
Knowledge Points
Parameterized Testing Concept
Parameterized testing allows running the same test function multiple times with different input data, which:
- Improves test coverage: Validate the same logic with multiple data sets
- Reduces code duplication: Avoid writing duplicate code for similar tests
- Data-driven testing: Separate test data from test logic
- Boundary value testing: Easily test various boundary conditions
Parameterization Decorator Syntax
@pytest.mark.parametrize("parameter_name", [parameter_value_list])
@pytest.mark.parametrize("param_name1,param_name2", [(value1, value2), ...])
@pytest.mark.parametrize("parameter_name", [value_list], ids=[identifier_list])
Advantages of Parameterization
| Advantage | Description |
|---|---|
| Code reuse | One test function handles multiple cases |
| Clear reports | Each parameter combination has independent test results |
| Easy to maintain | Adding new test data only requires modifying parameter list |
| Parallel execution | Tests with different parameters can run in parallel |
Example Code
Basic Parameterized Tests
# test_parametrize_basic.py
import pytest
def add(a, b):
"""Simple addition function"""
return a + b
@pytest.mark.parametrize("a,b,expected", [
(2, 3, 5),
(0, 0, 0),
(-1, 1, 0),
(10, -5, 5),
(100, 200, 300)
])
def test_add_function(a, b, expected):
"""Parameterized test for addition function"""
result = add(a, b)
assert result == expected
# Single parameter parameterization
@pytest.mark.parametrize("number", [1, 2, 3, 4, 5])
def test_positive_numbers(number):
"""Test positive numbers"""
assert number > 0
assert isinstance(number, int)
# String parameterization
@pytest.mark.parametrize("text", [
"hello",
"world",
"pytest",
"parametrize"
])
def test_string_length(text):
"""Test string length"""
assert len(text) > 0
assert isinstance(text, str)
Complex Data Structure Parameterization
# test_parametrize_complex.py
import pytest
# Dictionary parameterization
@pytest.mark.parametrize("user_data", [
{"name": "Alice", "age": 30, "email": "alice@example.com"},
{"name": "Bob", "age": 25, "email": "bob@example.com"},
{"name": "Charlie", "age": 35, "email": "charlie@example.com"}
])
def test_user_validation(user_data):
"""Test user data validation"""
assert "name" in user_data
assert "age" in user_data
assert "email" in user_data
assert user_data["age"] > 0
assert "@" in user_data["email"]
# List parameterization
@pytest.mark.parametrize("numbers", [
[1, 2, 3],
[10, 20, 30, 40],
[100],
[5, 5, 5, 5, 5]
])
def test_list_operations(numbers):
"""Test list operations"""
assert len(numbers) > 0
assert sum(numbers) > 0
assert max(numbers) >= min(numbers)
# Nested data structures
@pytest.mark.parametrize("test_case", [
{
"input": {"username": "alice", "password": "secret123"},
"expected": {"status": "success", "user_id": 1}
},
{
"input": {"username": "bob", "password": "password456"},
"expected": {"status": "success", "user_id": 2}
},
{
"input": {"username": "invalid", "password": "wrong"},
"expected": {"status": "error", "message": "Invalid credentials"}
}
])
def test_login_scenarios(test_case):
"""Test login scenarios"""
def mock_login(username, password):
"""Mock login function"""
users = {
"alice": {"password": "secret123", "user_id": 1},
"bob": {"password": "password456", "user_id": 2}
}
if username in users and users[username]["password"] == password:
return {"status": "success", "user_id": users[username]["user_id"]}
else:
return {"status": "error", "message": "Invalid credentials"}
input_data = test_case["input"]
expected = test_case["expected"]
result = mock_login(input_data["username"], input_data["password"])
assert result["status"] == expected["status"]
if "user_id" in expected:
assert result["user_id"] == expected["user_id"]
if "message" in expected:
assert result["message"] == expected["message"]
Multiple Parameterization and Combination Tests
# test_multiple_parametrize.py
import pytest
# Multiple parameterization - Cartesian product
@pytest.mark.parametrize("x", [1, 2, 3])
@pytest.mark.parametrize("y", [10, 20])
def test_multiplication_combinations(x, y):
"""Test multiplication combinations (3×2=6 tests)"""
result = x * y
assert result > 0
assert result == x * y
# Operating system and browser combination tests
@pytest.mark.parametrize("os", ["Windows", "macOS", "Linux"])
@pytest.mark.parametrize("browser", ["Chrome", "Firefox", "Safari"])
def test_browser_os_compatibility(os, browser):
"""Test browser and OS compatibility"""
# Simulate compatibility check
incompatible = [
("Linux", "Safari"),
("Windows", "Safari")
]
if (os, browser) in incompatible:
pytest.skip(f"{browser} not available on {os}")
# Execute compatibility test
assert f"Testing {browser} on {os}" is not None
# Data type and operation combination
@pytest.mark.parametrize("data_type", ["list", "tuple", "set"])
@pytest.mark.parametrize("operation", ["len", "bool", "iter"])
def test_data_type_operations(data_type, operation):
"""Test data type and operation combinations"""
# Create test data
test_data = {
"list": [1, 2, 3],
"tuple": (1, 2, 3),
"set": {1, 2, 3}
}
data = test_data[data_type]
if operation == "len":
assert len(data) == 3
elif operation == "bool":
assert bool(data) is True
elif operation == "iter":
assert list(iter(data)) is not None
Custom Test IDs and Descriptions
# test_custom_ids.py
import pytest
# Use ids parameter to customize test identifiers
@pytest.mark.parametrize("input,expected", [
(2, 4),
(3, 9),
(4, 16),
(5, 25)
], ids=["square_of_2", "square_of_3", "square_of_4", "square_of_5"])
def test_square_with_ids(input, expected):
"""Square test with custom IDs"""
assert input ** 2 == expected
# Use function to generate test IDs
def test_id_func(val):
"""Function to generate test IDs"""
if isinstance(val, dict):
return f"user_{val.get('name', 'unknown')}"
elif isinstance(val, list):
return f"list_len_{len(val)}"
else:
return str(val)
@pytest.mark.parametrize("test_data", [
{"name": "Alice", "age": 30},
{"name": "Bob", "age": 25},
[1, 2, 3, 4],
[10, 20],
"simple_string"
], ids=test_id_func)
def test_with_id_function(test_data):
"""Test using function-generated IDs"""
assert test_data is not None
# Use pytest.param to set marks for individual parameters
@pytest.mark.parametrize("value", [
pytest.param(1, id="positive"),
pytest.param(0, id="zero"),
pytest.param(-1, id="negative"),
pytest.param(1000, id="large_positive", marks=pytest.mark.slow),
pytest.param(-1000, id="large_negative", marks=pytest.mark.slow)
])
def test_with_param_marks(value):
"""Test using pytest.param"""
assert isinstance(value, int)
Dynamic Parameter Generation
# test_dynamic_params.py
import pytest
def generate_test_data():
"""Dynamically generate test data"""
base_cases = []
# Generate numeric test cases
for i in range(1, 6):
base_cases.append((i, i * 2, i + i))
# Generate boundary value test cases
boundary_cases = [
(0, 0, 0),
(-1, -2, -1 + -1),
(100, 200, 100 + 100)
]
return base_cases + boundary_cases
@pytest.mark.parametrize("a,b,expected", generate_test_data())
def test_dynamic_addition(a, b, expected):
"""Addition test with dynamically generated parameters"""
assert a + a == expected
assert a * 2 == b
# Read test data from external file
def read_test_data_from_config():
"""Read test data from configuration"""
# Simulate reading from file or database
return [
{"url": "https://api.example.com/users", "method": "GET", "expected_status": 200},
{"url": "https://api.example.com/posts", "method": "POST", "expected_status": 201},
{"url": "https://api.example.com/invalid", "method": "GET", "expected_status": 404}
]
@pytest.mark.parametrize("api_config", read_test_data_from_config())
def test_api_endpoints(api_config):
"""API endpoint test"""
def mock_api_call(url, method):
"""Mock API call"""
if "invalid" in url:
return 404
elif method == "POST":
return 201
else:
return 200
result = mock_api_call(api_config["url"], api_config["method"])
assert result == api_config["expected_status"]
# Conditional parameter generation
def get_math_test_cases():
"""Generate math test cases based on conditions"""
import sys
basic_cases = [
(2, 3, 5),
(10, 5, 15)
]
# Add additional tests on specific Python versions
if sys.version_info >= (3, 8):
basic_cases.append((100, 200, 300))
return basic_cases
@pytest.mark.parametrize("a,b,expected", get_math_test_cases())
def test_conditional_math(a, b, expected):
"""Conditional math test"""
assert a + b == expected
Parameterization Combined with Fixtures
# test_params_with_fixtures.py
import pytest
@pytest.fixture
def database_connection():
"""Database connection fixture"""
class MockDB:
def __init__(self):
self.data = {
1: {"name": "Alice", "age": 30},
2: {"name": "Bob", "age": 25},
3: {"name": "Charlie", "age": 35}
}
def get_user(self, user_id):
return self.data.get(user_id)
return MockDB()
@pytest.mark.parametrize("user_id,expected_name", [
(1, "Alice"),
(2, "Bob"),
(3, "Charlie")
])
def test_database_queries(database_connection, user_id, expected_name):
"""Parameterized database query test"""
user = database_connection.get_user(user_id)
assert user is not None
assert user["name"] == expected_name
# Parameterized fixture
@pytest.fixture(params=["json", "xml", "yaml"])
def data_format(request):
"""Parameterized data format fixture"""
return request.param
@pytest.fixture(params=[10, 100, 1000])
def sample_size(request):
"""Parameterized sample size fixture"""
return request.param
def test_data_processing(data_format, sample_size):
"""Test data processing (will run 3×3=9 times)"""
assert data_format in ["json", "xml", "yaml"]
assert sample_size in [10, 100, 1000]
# Simulate data processing
processing_time = sample_size * 0.001 # Assumed processing time
if data_format == "xml":
processing_time *= 1.5 # XML processing is slower
assert processing_time > 0
Conditional Skip and Marks
# test_conditional_params.py
import pytest
import sys
@pytest.mark.parametrize("platform,feature", [
pytest.param("windows", "ntfs", marks=pytest.mark.skipif(
sys.platform != "win32", reason="Windows only")),
pytest.param("linux", "ext4", marks=pytest.mark.skipif(
sys.platform == "win32", reason="Linux only")),
pytest.param("macos", "apfs", marks=pytest.mark.skipif(
sys.platform != "darwin", reason="macOS only")),
("any", "generic")
])
def test_platform_features(platform, feature):
"""Platform-specific feature test"""
assert platform is not None
assert feature is not None
# Slow test marks
@pytest.mark.parametrize("size", [
10,
100,
pytest.param(10000, marks=pytest.mark.slow),
pytest.param(100000, marks=pytest.mark.slow)
])
def test_performance_with_size(size):
"""Performance test (large datasets marked as slow)"""
# Simulate processing time
import time
if size > 1000:
time.sleep(0.01) # Simulate slow operation
assert size > 0
# Expected failure parameters
@pytest.mark.parametrize("input_val,expected", [
(1, 1),
(2, 4),
(3, 9),
pytest.param(4, 15, marks=pytest.mark.xfail(reason="Known bug in square calculation"))
])
def test_square_with_known_bug(input_val, expected):
"""Square test with known bug"""
result = input_val ** 2
assert result == expected
Parameterized Testing Best Practices
Test Data Organization
# test_data_organization.py
import pytest
# Separate test data at module level
VALID_EMAILS = [
"user@example.com",
"test.email+tag@domain.co.uk",
"user123@test-domain.com"
]
INVALID_EMAILS = [
"invalid-email",
"@domain.com",
"user@",
""
]
@pytest.mark.parametrize("email", VALID_EMAILS)
def test_valid_email_format(email):
"""Test valid email format"""
assert "@" in email
assert "." in email.split("@")[1]
@pytest.mark.parametrize("email", INVALID_EMAILS)
def test_invalid_email_format(email):
"""Test invalid email format"""
def is_valid_email(email_str):
return "@" in email_str and "." in email_str.split("@")[1] if "@" in email_str else False
assert not is_valid_email(email)
Running Parameterized Tests
# Run all parameterized tests
pytest -v
# Run tests with specific parameters
pytest -k "square_of_3" -v
# Run slow tests
pytest -m slow -v
# Skip slow tests
pytest -m "not slow" -v
# Show detailed information about parameterized tests
pytest --collect-only
Parameterized Testing Tips
- Meaningful naming: Use descriptive parameter names and test IDs
- Separate data: Separate test data from test logic
- Boundary testing: Include boundary values and exceptional cases
- Performance considerations: Use appropriate marks for large datasets
- Readability: Maintain readability and maintainability of parameterized tests
Considerations
- Avoid over-parameterization: Don’t parameterize just for the sake of it
- Memory usage: Many parameters may consume significant memory
- Execution time: Parameterized tests increase total execution time
- Debugging difficulty: Debugging parameterized tests may be more complex
Parameterized Test Output Example
$ pytest test_parametrize_basic.py -v
test_parametrize_basic.py::test_add_function[2-3-5] PASSED
test_parametrize_basic.py::test_add_function[0-0-0] PASSED
test_parametrize_basic.py::test_add_function[-1-1-0] PASSED
test_parametrize_basic.py::test_add_function[10--5-5] PASSED
test_parametrize_basic.py::test_add_function[100-200-300] PASSED
test_parametrize_basic.py::test_positive_numbers[1] PASSED
test_parametrize_basic.py::test_positive_numbers[2] PASSED
test_parametrize_basic.py::test_positive_numbers[3] PASSED
test_parametrize_basic.py::test_positive_numbers[4] PASSED
test_parametrize_basic.py::test_positive_numbers[5] PASSED
Parameterized testing is a powerful feature of pytest that can significantly improve test coverage and code reusability, making it a core tool for implementing data-driven testing.