Thank you for your interest in contributing to simple_NER! This document provides guidelines and instructions for contributing.
- Be respectful and inclusive
- Provide constructive feedback
- Focus on what's best for the community
- Fork the repository
- Clone your fork:
git clone https://github.com/your-username/simple_NER.git
- Create a branch:
git checkout -b feature/your-feature-name
- Python 3.10 or higher
- pip
- Create a virtual environment:
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install development dependencies:
pip install -e ".[dev,all]"
- Install pre-commit hooks (optional but recommended):
pre-commit install
We use the following tools to maintain code quality:
- Ruff: For linting and formatting
- mypy: For type checking
Run these tools before submitting changes:
# Format code
ruff format .
# Check for issues
ruff check .
# Type check
mypy simple_NER/
Add type hints to your code:
from typing import Any, Generator
def extract_entities(text: str, as_json: bool = False) -> Generator[Entity | dict[str, Any], None, None]:
    ...
- Add docstrings to public functions and classes
- Update the README.md if you add new features
- Include examples in the examples/ directory
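To illustrate the type-hint and docstring guidelines together, here is a minimal sketch of a documented public function. The `Entity` dataclass, the `extract_emails` function, and the regex are hypothetical stand-ins written for this example; they are not the actual simple_NER API.

```python
import re
from collections.abc import Iterator
from dataclasses import dataclass


@dataclass
class Entity:
    """Hypothetical stand-in for an extracted entity (not the real simple_NER class)."""

    value: str
    entity_type: str


# Deliberately simple pattern for illustration; real email matching is more involved.
_EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_emails(text: str) -> Iterator[Entity]:
    """Yield an Entity for each email-like substring found in ``text``.

    Args:
        text: The input string to scan.

    Yields:
        One ``Entity`` per match, in order of appearance.
    """
    for match in _EMAIL_RE.finditer(text):
        yield Entity(value=match.group(0), entity_type="email")
```

The docstring states what the function yields and in what order, which is the kind of detail a reviewer would otherwise have to infer from the body.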
Run all tests:
pytest test/ -v
Run with coverage:
pytest test/ --cov=simple_NER --cov-report=term-missing
- Write tests for new features
- Aim for good coverage of edge cases
- Follow the existing test structure
- Use descriptive test names:
test_<feature>_<scenario>_<expected_result>
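As a rough sketch of how the naming pattern and the edge-case guideline might look in practice, the tests below cover a single match, several matches, and no match. The `find_emails` helper is a hypothetical function defined only to keep the sketch runnable; in the project you would test a real extractor instead.

```python
import re


# Hypothetical helper, defined here only so the sketch is self-contained.
def find_emails(text: str) -> list[str]:
    return re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", text)


def test_email_single_address_returns_one_match():
    assert find_emails("my email is test@example.com") == ["test@example.com"]


def test_email_multiple_addresses_returns_all_in_order():
    text = "write to a@example.com or b@example.org"
    assert find_emails(text) == ["a@example.com", "b@example.org"]


def test_email_no_address_returns_empty_list():
    # Edge case: input without any email should produce no results.
    assert find_emails("no contact details here") == []
```

Each name reads as feature, scenario, expected result, so a failing test identifies the broken behavior without opening the file.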
Example:
class TestEmailNER:
    def test_single_email(self):
        ner = EmailNER()
        text = "my email is test@example.com"
        results = list(ner.extract_entities(text))
        assert len(results) == 1
        assert results[0].value == "test@example.com"
Write clear, concise commit messages:
- Use the present tense ("Add feature" not "Added feature")
- Use the imperative mood ("Move cursor to..." not "Moves cursor to...")
- Limit the first line to 72 characters or fewer
- Reference issues and pull requests liberally after the first line
Example:
Add email entity extractor
- Implement EmailNER class with regex-based email detection
- Add tests for single and multiple email extraction
- Update documentation with usage examples
Closes #123
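The first-line length limit is easy to check mechanically. The following is a minimal sketch of such a check; `check_commit_subject` is a hypothetical helper written for illustration and is not part of the project's tooling or its pre-commit configuration.

```python
MAX_SUBJECT_LENGTH = 72  # limit stated in the commit message guidelines


def check_commit_subject(message: str) -> list[str]:
    """Return a list of problems found in the first line of a commit message."""
    lines = message.splitlines()
    subject = lines[0] if lines else ""
    problems = []
    if not subject.strip():
        problems.append("subject line is empty")
    if len(subject) > MAX_SUBJECT_LENGTH:
        problems.append(
            f"subject line is {len(subject)} characters (limit {MAX_SUBJECT_LENGTH})"
        )
    return problems


if __name__ == "__main__":
    example = "Add email entity extractor\n\nCloses #123"
    print(check_commit_subject(example))
```

Tense and mood are not checked here, since they cannot be verified reliably with simple string rules.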
- Update the CHANGELOG.md with your changes
- Ensure all tests pass and coverage is maintained
- Update documentation as needed
- Submit a pull request with a clear description
- Request review from maintainers
## Description
Brief description of changes
## Type of Change
- [ ] Bug fix (non-breaking change that fixes an issue)
- [ ] New feature (non-breaking change that adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation update
## Checklist
- [ ] My code follows the style guidelines
- [ ] I have performed a self-review
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have added tests that prove my fix/feature works
- [ ] All tests pass locally
- [ ] Documentation has been updated
- [ ] CHANGELOG.md has been updated
Feel free to open an issue for any questions or discussions.
Thank you for contributing! 🎉