URL Filter
Advanced URL detection and filtering guardrail that prevents access to unauthorized domains. Uses comprehensive regex patterns and robust URL parsing to detect various URL formats, validates them against security policies, and filters based on a configurable allow list.
Key Security Features:
- Prevents credential injection attacks (user:pass@domain)
- Blocks typosquatting and look-alike domains
- Restricts dangerous schemes (javascript:, data:)
- Supports IP addresses and CIDR ranges
- Configurable subdomain matching
Configuration
{
  "name": "URL Filter",
  "config": {
    "url_allow_list": ["example.com", "192.168.1.100", "https://api.service.com/v1"],
    "allowed_schemes": ["https"],
    "block_userinfo": true,
    "allow_subdomains": false
  }
}
Parameters
- url_allow_list (optional): List of allowed domains, IP addresses, CIDR ranges, or full URLs.
  - Default: [] (blocks all URLs)
- allowed_schemes (optional): Set of allowed URL schemes/protocols.
  - Default: ["https"] (HTTPS-only for security)
- block_userinfo (optional): Whether to block URLs containing userinfo (user:pass@domain) to prevent credential injection attacks.
  - true (default): Blocks URLs containing userinfo
  - false: Allows URLs containing userinfo
- allow_subdomains (optional): Whether to allow subdomains of allowed domains.
  - false (default): Only exact domain matches (e.g., example.com allows example.com and www.example.com)
  - true: Allows subdomains (e.g., example.com allows api.example.com)
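As a rough illustration of how these four parameters could combine into a single decision, here is a minimal Python sketch. The names check_url and _host_matches are hypothetical helpers written for this example and are not part of the guardrail's API. The sketch also simplifies full-URL allow list entries by comparing only their host part, which is an assumption rather than documented behavior.

```python
import ipaddress
from urllib.parse import urlsplit


def _host_matches(host, entry, allow_subdomains):
    """Hypothetical helper: does `host` match one allow list entry?"""
    # Assumption: full-URL entries are compared on their host part only.
    if "://" in entry:
        entry = urlsplit(entry).hostname or ""
    # IP and CIDR entries: match when the host is an address inside the network.
    try:
        network = ipaddress.ip_network(entry, strict=False)
    except ValueError:
        network = None
    if network is not None:
        try:
            return ipaddress.ip_address(host) in network
        except ValueError:
            return False  # host is a domain name, entry is an IP or CIDR
    # Domain entries: a leading "www." counts as the same domain, as in the
    # allow_subdomains=false example in the parameter list.
    bare = host[4:] if host.startswith("www.") else host
    if bare == entry or host == entry:
        return True
    return allow_subdomains and host.endswith("." + entry)


def check_url(url, url_allow_list=(), allowed_schemes=("https",),
              block_userinfo=True, allow_subdomains=False):
    """Hypothetical sketch: return (allowed, reason) for a single URL."""
    parts = urlsplit(url)
    if parts.scheme.lower() not in allowed_schemes:
        return False, f"Scheme '{parts.scheme}' not allowed"
    if block_userinfo and "@" in parts.netloc:
        return False, "Contains userinfo"
    host = (parts.hostname or "").lower()
    if not host:
        return False, "No host"
    for entry in url_allow_list:
        if _host_matches(host, entry.lower(), allow_subdomains):
            return True, "Allowed"
    return False, "Not in allow list"
```

With the defaults, check_url("https://api.example.com", ["example.com"]) is rejected, while passing allow_subdomains=True accepts it. An empty allow list rejects every URL, matching the documented default.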
Implementation Notes
- Detects URLs, domains, and IP addresses using regex patterns
- Validates URL schemes and security policies
- Supports exact domain matching or subdomain inclusion
- Handles IP addresses and CIDR ranges
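The detection step might look something like the following sketch. The pattern here is a deliberately simplified assumption that only finds scheme-prefixed URLs; the guardrail's actual patterns are described as also covering bare domains and IP addresses, which this example does not attempt. The names URL_PATTERN and detect_urls are hypothetical.

```python
import re

# Simplified, hypothetical pattern: a scheme, "://", then everything up to
# whitespace, a quote, or an angle bracket.
URL_PATTERN = re.compile(r"\b[a-zA-Z][a-zA-Z0-9+.-]*://[^\s<>\"']+")


def detect_urls(text):
    """Return scheme-prefixed URLs found in text, in order, without duplicates."""
    found = []
    for match in URL_PATTERN.finditer(text):
        # Drop trailing punctuation that usually belongs to the sentence.
        url = match.group(0).rstrip(".,;:!?)")
        if url not in found:
            found.append(url)
    return found
```

Each detected URL would then be parsed and validated against the scheme, userinfo, and allow list policies.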
What It Returns
Returns a GuardrailResult with the following info dictionary:
{
  "guardrail_name": "URL Filter (Direct Config)",
  "config": {
    "allowed_schemes": ["https"],
    "block_userinfo": true,
    "allow_subdomains": false,
    "url_allow_list": ["example.com"]
  },
  "detected": ["https://example.com", "https://user:pass@malicious.com"],
  "allowed": ["https://example.com"],
  "blocked": ["https://user:pass@malicious.com"],
  "blocked_reasons": ["https://user:pass@malicious.com: Contains userinfo (potential credential injection)"],
  "checked_text": "Visit https://example.com or login at https://user:pass@malicious.com"
}
Response Fields
- guardrail_name: Name of the guardrail that was executed
- config: Applied configuration including allow list, schemes, userinfo blocking, and subdomain settings
- detected: All URLs detected in the text using regex patterns
- allowed: URLs that passed all security checks and allow list validation
- blocked: URLs that were blocked due to security policies or allow list restrictions
- blocked_reasons: Detailed explanations for why each URL was blocked
- checked_text: Original input text that was scanned
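One way to consume these fields is to pair each blocked URL with its explanation. The sketch below works on a plain dictionary shaped like the example info dictionary. The helper name summarize_blocked is hypothetical, and the "URL: reason" layout of blocked_reasons entries is assumed from that single example rather than from a documented contract.

```python
def summarize_blocked(info):
    """Map each blocked URL to its reason, using the documented info fields."""
    reasons = {}
    for entry in info.get("blocked_reasons", []):
        for url in info.get("blocked", []):
            # Assumption: each reason string starts with "<url>: ".
            prefix = url + ": "
            if entry.startswith(prefix):
                reasons[url] = entry[len(prefix):]
    return reasons


info = {
    "detected": ["https://example.com", "https://user:pass@malicious.com"],
    "allowed": ["https://example.com"],
    "blocked": ["https://user:pass@malicious.com"],
    "blocked_reasons": [
        "https://user:pass@malicious.com: Contains userinfo (potential credential injection)"
    ],
}
summary = summarize_blocked(info)
```

Here summary maps the blocked URL to "Contains userinfo (potential credential injection)", which is convenient for logging or for building a user-facing message.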