A simple contact info scraping tool written in Python and available as an executable file. It has a CLI and a GUI. This script's main function is also integrated into ClatScope Info Tool.
This Python script is a comprehensive tool designed to extract and display contact information from websites. It retrieves phone numbers, fax numbers, email addresses, and social media profile URLs directly from a given website's HTML content.
The primary use of this script is for Open Source Intelligence (OSINT), digital marketing, customer relations management, competitive analysis, or investigative tasks that require identification and cataloguing of contact details from various online sources.
- Automatic Dependency Installation: Automatically checks and installs required Python libraries (requests, BeautifulSoup).
- Detailed Information Extraction: Precisely extracts phone numbers, fax numbers, emails, and social media URLs.
- Regular Expression-Based Parsing: Uses robust regex patterns to reliably identify contact information.
- Contextual Identification: Differentiates between phone and fax numbers by context within the webpage.
- Duplicate and Spam Reduction: Filters redundant or repetitive contact information.
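
The regex parsing, phone/fax differentiation, and duplicate filtering listed above can be illustrated with a minimal sketch. The patterns, the `extract_contacts` name, and the "look for the word fax within 20 characters before the number" heuristic are assumptions made for illustration; the script's actual expressions and rules may differ.

```python
import re

# Illustrative patterns only; the script's real expressions may differ.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")
FAX_HINT_RE = re.compile(r"fax", re.IGNORECASE)


def unique(items):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(items))


def extract_contacts(text, window=20):
    """Return emails, phone numbers, and fax numbers found in plain text.

    A number is treated as a fax number when the word "fax" appears within
    `window` characters before it (a hypothetical heuristic).
    """
    phones, faxes = [], []
    for match in PHONE_RE.finditer(text):
        context = text[max(0, match.start() - window):match.start()]
        (faxes if FAX_HINT_RE.search(context) else phones).append(match.group())
    return {
        "emails": unique(EMAIL_RE.findall(text)),
        "phones": unique(phones),
        "faxes": unique(faxes),
    }


sample = "Call 555-123-4567 or 555-123-4567. Fax: 555-111-2222. Mail info@example.com"
result = extract_contacts(sample)
```

A fixed look-behind window is crude: a phone number printed directly after a fax label on the same line could be misclassified, so a real implementation would likely inspect the surrounding HTML element instead.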
- Python 3.x installed on your system.
- Save the script file to your local machine.
- Ensure Python is installed and available via the command line.
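
Because the script installs its own dependencies, no manual `pip install` step is listed here. One way such a check could work is sketched below; the function names are hypothetical, and the only facts relied on are that BeautifulSoup is imported as `bs4` and published on PyPI as `beautifulsoup4`.

```python
import importlib.util
import subprocess
import sys


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def ensure_installed(requirements):
    """Install any missing packages with pip.

    `requirements` maps an import name to its pip package name,
    e.g. {"bs4": "beautifulsoup4"}.
    """
    for name in missing_modules(requirements):
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", requirements[name]]
        )


# Example call (left commented out because it may install packages):
# ensure_installed({"requests": "requests", "bs4": "beautifulsoup4"})
```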
Run the script from a command line or terminal.
- Upon execution, the script prompts the user for a full URL.
- Enter the complete URL (e.g., https://example.com) to scrape contact details.
- Pressing Enter without entering a URL will terminate the script gracefully.
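
The prompt behaviour described above can be sketched as a small generator. This assumes the script keeps prompting until a blank line is entered, which the source does not state explicitly; `prompt_for_urls` and the prompt text are hypothetical.

```python
def prompt_for_urls(read_line=input):
    """Yield URLs entered by the user until an empty line is submitted."""
    while True:
        url = read_line("Enter a full URL (blank to quit): ").strip()
        if not url:
            break  # pressing Enter on an empty line ends the session
        yield url
```

Passing the reader as a parameter keeps the loop easy to exercise without a real terminal, for example by supplying a function that returns canned answers.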
The extracted contact information is clearly formatted as JSON:
{
"Phone numbers": "555-123-4567, 555-765-4321",
"Fax number": "555-111-2222",
"Email addresses": "info@example.com, contact@example.org",
"Social media profiles": "https://twitter.com/example, https://linkedin.com/company/example"
}

The script includes basic error handling and provides feedback if the URL is invalid or inaccessible.
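
As a rough illustration of the output shape and of a basic URL validity check, the sketch below uses only the standard library. The key names are taken from the sample output above; `is_full_url` and `format_report` are hypothetical helpers, not the script's own functions, and the sketch does not cover network errors.

```python
import json
from urllib.parse import urlparse


def is_full_url(url):
    """Return True when url has an http(s) scheme and a host."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def format_report(phones, faxes, emails, profiles):
    """Join each list with ", " and serialize it under the sample's keys."""
    report = {
        "Phone numbers": ", ".join(phones),
        "Fax number": ", ".join(faxes),
        "Email addresses": ", ".join(emails),
        "Social media profiles": ", ".join(profiles),
    }
    return json.dumps(report, indent=2)


if is_full_url("https://example.com"):
    print(format_report(
        ["555-123-4567", "555-765-4321"],
        ["555-111-2222"],
        ["info@example.com"],
        ["https://twitter.com/example"],
    ))
```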
- Relies on publicly visible HTML content; it cannot extract data that is rendered by JavaScript or that sits behind authenticated portals.
- Sensitive to changes in website HTML structures.
This project is licensed under the Apache 2.0 open source license.
Joshua M Clatney (Clats97)
Ethical Pentesting Enthusiast
Copyright © 2024-2025 Joshua M Clatney (Clats97) All Rights Reserved
DISCLAIMER: This project comes with no warranty, express or implied. The author is not responsible for abuse, misuse, or vulnerabilities. Please use responsibly and ethically in accordance with relevant laws, regulations, legislation and best practices.
