Add .slinkignore support for URL and path exclusions

Introduce a new .slinkignore file format to allow users to specify paths and URLs to ignore during scanning. Update the CollectURLs and CollectURLsProgress functions to respect these ignore rules. Add tests to verify the functionality of the .slinkignore file, ensuring that specified paths and URLs are excluded from results. Update README.md to document the new feature and its usage.
Luke Hagar
2025-09-12 20:56:45 +00:00
parent ae5fcf868c
commit 54d7797089
7 changed files with 122 additions and 3 deletions

@@ -70,3 +70,28 @@ Notes:
- Skips likely binary files and files > 2 MiB.
- Uses a browser-like User-Agent to reduce false negatives.

### .slinkignore

Place a `.slinkignore` file at the repository root to exclude paths and/or specific URLs from scanning and reporting. The format is JSON with two optional arrays:

```json
{
  "ignorePaths": [
    "**/vendor/**",
    "**/*.bak"
  ],
  "ignoreURLs": [
    "https://example.com/this/path/does/not/exist",
    "*localhost:*",
    "*internal.example.com*"
  ]
}
```
- ignorePaths: gitignore-style patterns evaluated against repository-relative paths (uses doublestar `**`).
- ignoreURLs: patterns applied to the full URL string. Supports exact matches, substring matches, and doublestar-style wildcard matches.
Examples:
- Ignore generated folders: `"**/dist/**"`, backups: `"**/*.bak"`.
- Ignore known example or placeholder links: `"*example.com*"`, `"https://example.com/foo"`.
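
To make the `ignoreURLs` rules above concrete, here is a minimal Go sketch of how such a file could be parsed and applied. It is not slink's actual implementation: the names `ignoreConfig`, `parseIgnore`, `wildcardMatch`, and `urlIgnored` are hypothetical, and the wildcard matcher is a stdlib-only stand-in for a doublestar library. It also assumes that a pattern without `*` is tried as an exact match and then as a substring, which is one plausible reading of the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ignoreConfig mirrors the two optional arrays in .slinkignore.
// (Hypothetical type name, used only for this sketch.)
type ignoreConfig struct {
	IgnorePaths []string `json:"ignorePaths"`
	IgnoreURLs  []string `json:"ignoreURLs"`
}

// parseIgnore decodes the JSON content of a .slinkignore file.
func parseIgnore(data []byte) (ignoreConfig, error) {
	var cfg ignoreConfig
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

// wildcardMatch reports whether s matches pattern, where each '*'
// matches any run of characters, including '/'. This is a simplified
// stand-in for doublestar-style matching.
func wildcardMatch(pattern, s string) bool {
	parts := strings.Split(pattern, "*")
	if len(parts) == 1 {
		return pattern == s
	}
	// The text before the first '*' must be a prefix of s.
	if !strings.HasPrefix(s, parts[0]) {
		return false
	}
	s = s[len(parts[0]):]
	// Each middle piece must appear, in order, in what remains.
	for _, p := range parts[1 : len(parts)-1] {
		i := strings.Index(s, p)
		if i < 0 {
			return false
		}
		s = s[i+len(p):]
	}
	// The text after the last '*' must be a suffix of the remainder.
	return strings.HasSuffix(s, parts[len(parts)-1])
}

// urlIgnored applies the assumed rule order: exact match, then
// wildcard match for patterns containing '*', else substring match.
func urlIgnored(cfg ignoreConfig, url string) bool {
	for _, p := range cfg.IgnoreURLs {
		if p == url {
			return true
		}
		if strings.Contains(p, "*") {
			if wildcardMatch(p, url) {
				return true
			}
			continue
		}
		if strings.Contains(url, p) {
			return true
		}
	}
	return false
}

func main() {
	data := []byte(`{"ignoreURLs": ["*localhost:*", "https://example.com/foo"]}`)
	cfg, err := parseIgnore(data)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, u := range []string{
		"http://localhost:8080/health",
		"https://example.com/foo",
		"https://go.dev/doc/",
	} {
		fmt.Printf("%s ignored=%v\n", u, urlIgnored(cfg, u))
	}
}
```

Because both arrays are optional, a file containing only `ignorePaths` decodes to an empty `IgnoreURLs` slice and `urlIgnored` returns false for every URL. Path patterns such as `**/vendor/**` need real doublestar semantics and are deliberately left out of this sketch.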