Add option to filter duplicate results and deprecate --remove-extensions#1436

Merged
maurosoria merged 4 commits into master from filter
Jun 5, 2025

@shelld3v
Collaborator

@shelld3v shelld3v commented Nov 8, 2024

Description

Close #1293

@shelld3v shelld3v mentioned this pull request Nov 13, 2024
2 tasks
@mikhailevtikhov

Hi @shelld3v, sorry to bother you :c It works great! But there is a scenario in which this logic will miss false positives, and I have met it in reality. One case is an API: if the wordlist contains API paths, a request to a non-existent endpoint returns something like "{error: ... $uri ... not found}". Another is a WAF that blocks us after 1000 words out of 10000 and then returns a block-page template for each subsequent request, with the $uri reflected in it.
To reproduce, you can start nginx with this configuration:

server {
    listen       80;
    listen  [::]:80;
    server_name  localhost;

    location ^~ /admin {
        return 200 "You had been blocked, because u want to check ($uri)";
    }

    location / {
        return 200 $uri;
    }

    error_page   500 502 503 504  /50x.html;
    location = /50x.html {
        root   /usr/share/nginx/html;
    }
}

In this case, the result will be:

python3 dirsearch.py --url=http://exaple.com:80/ --extensions=json,txt,configz --threads=1 --timeout=4 --wordlist=/PATH/dirserach_dict.txt --filter-threshold=1

  _|. _ _  _  _  _ _|_    v0.4.3
 (_||| _) (/_(_|| (_| )

Extensions: json, txt, configz | HTTP method: GET | Threads: 1 | Wordlist size: 25

Target: http://exaple.com/

[16:08:19] Scanning:
[16:08:21] 200 -    54B - /admin
[16:08:21] 200 -    55B - /admin/
[16:08:22] 200 -    64B - /admin/something
[16:08:22] 200 -    59B - /admin/test

Task Completed

Do you think there is an opportunity to do something about it, or is it redundant functionality?
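
The scenario above can be sketched in a few lines of Python. This is only an illustration, not dirsearch's code: `body_for` and `count_by_hash` are hypothetical helpers, and SHA-256 is an assumed choice of hash. It shows that when a template reflects the requested path, every body hashes differently, so an exact-hash duplicate count never rises above 1.

```python
import hashlib
from collections import Counter


def body_for(path: str) -> str:
    # Mimics the nginx "location ^~ /admin" block above, which reflects $uri.
    return f"You had been blocked, because u want to check ({path})"


def count_by_hash(paths):
    """Count how many responses share each exact body hash (hypothetical helper)."""
    return Counter(
        hashlib.sha256(body_for(p).encode()).hexdigest() for p in paths
    )


paths = ["/admin", "/admin/", "/admin/something", "/admin/test"]
counts = count_by_hash(paths)

# Every body embeds a different path, so no hash repeats and a filter
# that only counts identical hashes has nothing to group.
print(max(counts.values()))
```

The body lengths this produces (54, 55, 64 and 59 bytes) match the sizes in the scan output above, which is why all four paths are still reported.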

@shelld3v
Collaborator Author

Hi @mikhailevtikhov, thanks for your feedback! Yes, I'm already aware of this problem, but storing and comparing anything more than a hash is just way too expensive, and I don't want people to complain about memory and performance. One idea is to store just a part of each response and find a way to identify potential duplicates before performing a higher-level comparison. However, I'm avoiding any big changes before the release of v0.4.5, so I won't work on it at the moment.
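
One way the idea in this comment might look is sketched below. It is not part of this PR or of dirsearch: the class name `NearDuplicateFilter`, the `PREFIX_LEN` constant, and both thresholds are assumptions chosen for illustration. It keeps only a prefix of each body, uses status code and length as a cheap pre-check, and runs the more expensive `difflib.SequenceMatcher` comparison only on candidates that pass it.

```python
from difflib import SequenceMatcher

PREFIX_LEN = 256  # assumed: how many characters of each body to keep


class NearDuplicateFilter:
    """Hypothetical two-stage near-duplicate check (illustration only)."""

    def __init__(self, max_len_diff=32, min_ratio=0.8):
        self.max_len_diff = max_len_diff  # assumed tolerance on body length
        self.min_ratio = min_ratio        # assumed similarity cutoff
        self.seen = []  # tuples of (status, full_length, stored_prefix)

    def is_duplicate(self, status, body):
        prefix = body[:PREFIX_LEN]
        for seen_status, seen_len, seen_prefix in self.seen:
            # Stage 1: cheap pre-check on status code and body length.
            if seen_status != status:
                continue
            if abs(seen_len - len(body)) > self.max_len_diff:
                continue
            # Stage 2: costlier similarity check on the stored prefixes only.
            if SequenceMatcher(None, seen_prefix, prefix).ratio() >= self.min_ratio:
                return True
        # Nothing similar seen yet: remember this response as a new reference.
        self.seen.append((status, len(body), prefix))
        return False


template = "You had been blocked, because u want to check ({})"
f = NearDuplicateFilter()
print(f.is_duplicate(200, template.format("/admin")))
print(f.is_duplicate(200, template.format("/admin/something")))
```

Memory stays bounded by the prefix length, but each new response is still compared against every stored reference that passes the pre-check, so the performance concern raised in this comment would still need measuring.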

@maurosoria
Owner

hello @shelld3v can u resolve and merge?

@shelld3v
Collaborator Author

@maurosoria Done 👍

@maurosoria
Owner

is this ready?

@shelld3v
Collaborator Author

@maurosoria Yes

@maurosoria maurosoria merged commit 8534bed into master Jun 5, 2025
13 checks passed

Development

Successfully merging this pull request may close these issues.

Suggestions for a filter flag to improvie accuracy

3 participants