Contributing to the Hub
Parsers, Scenarios, and Collections allow the CrowdSec Security Engine
to detect and block malevolent behavior. Supporting new services or improving detection capabilities for existing software is a great way to contribute to the CrowdSec ecosystem.
Sharing your parsers, scenarios and collections on the hub allows other users to use them to protect themselves.
Communication
The main communication channels for hub contributions are:
- Discord: Best for live interactions and quick questions about hub development
- Discourse: Great for discussing ideas, suggesting improvements, or asking detailed questions
- GitHub Issues: Use for bug reports and feature requests
Getting Started
Anyone can open an issue about parsers/scenarios, or contribute a change with a pull request (PR) to the crowdsecurity/hub GitHub repository. You need to be comfortable with git and GitHub to work effectively.
Find something to work on
Here are some things you can do today to start contributing:
- Help improve existing parsers and scenarios
- Add support for new services and applications
- Create comprehensive test coverage
- Help triage issues and review pull requests
- Improve documentation and examples
Find a good first topic
The hub repository has beginner-friendly issues that are a great place to get started:
- Look for issues labeled `good first issue` - these don't require high-level CrowdSec knowledge
- Issues labeled `help wanted` indicate that community help is particularly welcome
Prerequisites
Before contributing, make sure you have:
- Basic understanding of YAML syntax
- Familiarity with grok patterns (for parsers); see the sketch after this list
- Git and GitHub knowledge
- A local CrowdSec installation for testing (optional but recommended)
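If you are new to grok patterns, here is a minimal sketch of a hub parser that uses one. The service name, pattern, and field names are illustrative only, not an existing hub component:

```yaml
# Hypothetical s01-parse stage parser (illustrative names and fields)
onsuccess: next_stage
filter: "evt.Parsed.program == 'myservice'"
name: crowdsecurity/myservice-logs
description: "Parse myservice authentication logs"
grok:
  pattern: "Failed login for user %{USERNAME:user} from %{IP:source_ip}"
  apply_on: message
statics:
  - meta: log_type
    value: myservice_failed_auth
  - meta: source_ip
    expression: "evt.Parsed.source_ip"
```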
Contribution Workflow
The basic workflow for contributing:
- Have a look at open issues and pull requests
- Fork and clone the hub repository
- Create/Modify parsers/scenarios/collections
- Create/Modify tests to ensure proper coverage
- Open a pull request
Testing
Before submitting your contribution, ensure proper testing using `cscli hubtest`:
Local Testing with cscli hubtest
`cscli hubtest` is the primary tool for testing hub components. It creates and runs tests for parsers, scenarios, and collections.
1. Create a test
```bash
# Create a test for a parser
cscli hubtest create my-parser-test --type syslog

# Create a test for a scenario (skip parser testing)
cscli hubtest create my-scenario-test --type syslog --ignore-parsers

# Create a test for specific components
cscli hubtest create my-test --parsers crowdsecurity/nginx --scenarios crowdsecurity/http-probing
```
2. Configure your test
Edit the generated configuration file (`.tests/<test-name>/config.yaml`):

```yaml
parsers:
  - crowdsecurity/syslog-logs
  - ./parsers/s01-parse/crowdsecurity/my-parser.yaml
scenarios:
  - ./scenarios/crowdsecurity/my-scenario.yaml
postoverflows:
log_file: my-test.log
log_type: syslog
ignore_parsers: false
```
3. Add test data and assertions
- Log file: Add sample logs to `.tests/<test-name>/<test-name>.log`
- Parser assertions: Define expected parsed fields in `parser.assert`
- Scenario assertions: Define expected alerts in `scenario.assert`

Note: When you first run `cscli hubtest run`, it will output the generated assertions that you need to fill out in the `parser.assert` or `scenario.assert` files. You can find examples of assertion files in the hub repository at `.tests/<existing-test>/parser.assert` and `.tests/<existing-test>/scenario.assert`.
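For reference, assertion files are lists of expressions, one per line, that must all evaluate to true against the test results. A minimal `parser.assert` sketch, with a hypothetical parser name and field values:

```
len(results) == 2
results["s00-raw"]["crowdsecurity/syslog-logs"][0].Success == true
results["s01-parse"]["crowdsecurity/my-parser"][0].Success == true
results["s01-parse"]["crowdsecurity/my-parser"][0].Evt.Meta["source_ip"] == "1.2.3.4"
```

`scenario.assert` files use the same expression-per-line format, asserting on the generated alerts instead of parsed fields.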
4. Run your test
```bash
# Run a specific test
cscli hubtest run my-parser-test

# Run all tests
cscli hubtest run --all

# Run with verbose output
cscli hubtest run my-parser-test -v
```
5. Debug your test
```bash
# Explain test results
cscli hubtest explain my-parser-test

# Show only failures
cscli hubtest explain my-parser-test --failures

# Verbose explanation
cscli hubtest explain my-parser-test -v
```
CI/CD Testing
The hub repository uses GitHub Actions for automated testing:
- All parsers and scenarios are automatically tested when you open a PR
- Tests include syntax validation, pattern matching, and integration tests
- Make sure all tests pass before requesting review
Git Workflow / Branch Management
We receive contributions on the `master` branch. To contribute:

- Fork the repository on GitHub
- Clone your fork locally:

  ```bash
  git clone https://github.com/yourusername/hub.git
  cd hub
  ```

- Create a dedicated branch for your contribution:

  ```bash
  git checkout -b feature/your-feature-name
  ```

- Make your changes and commit them with descriptive messages
- Push to your fork:

  ```bash
  git push origin feature/your-feature-name
  ```

- Open a Pull Request targeting the `master` branch
Branch Naming Convention
Use descriptive branch names that indicate the type of contribution:
- `feature/parser-nginx-access-logs`
- `fix/scenario-ssh-bruteforce-labels`
- `docs/collection-apache-examples`
Commit Messages
Write clear, descriptive commit messages:
- Use imperative mood: "Add parser for Apache access logs"
- Reference issues when applicable: "Fix scenario labels (#123)"
- Keep the first line under 50 characters
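For instance, a commit following these conventions might look like this (the content is illustrative):

```
Add parser for Apache access logs

Extract client IP and request metadata so existing HTTP
scenarios can consume Apache events. Fixes #123.
```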
Guidelines
YAML Best Practices
When creating parsers, scenarios, and collections, follow these YAML guidelines:
Avoid YAML Anchors
❌ Don't use YAML anchors - they make YAML files harder to maintain and understand:
```yaml
# BAD - Using anchors in nodes
nodes:
  - grok:
      pattern: 'pattern1'
      apply_on: message
    nodes: &message_parsers
      - grok:
          pattern: "user pattern"
          apply_on: parsedmessage
        statics:
          - meta: sub_type
            value: user_enumeration
  - grok:
      pattern: 'pattern2'
      apply_on: message
    nodes: *message_parsers
```
✅ Duplicate the node structure instead of using anchors:
```yaml
# GOOD - No anchors, explicit node definitions
nodes:
  - grok:
      pattern: 'pattern1'
      apply_on: message
    nodes:
      - grok:
          pattern: "user pattern"
          apply_on: parsedmessage
        statics:
          - meta: sub_type
            value: user_enumeration
  - grok:
      pattern: 'pattern2'
      apply_on: message
    nodes:
      - grok:
          pattern: "user pattern"
          apply_on: parsedmessage
        statics:
          - meta: sub_type
            value: user_enumeration
```
AI-Assisted Generation
We do allow AI-assisted generation of parsers, scenarios, and collections, but with important requirements:
Requirements for AI-Generated Contributions
- Follow all guidelines: AI-generated code must still follow all the guidelines in this document
- Include comprehensive tests: A pull request with AI-generated code but no tests will be immediately closed
- Proper documentation: Include complete documentation and examples as required
- Code quality: Ensure the generated code follows CrowdSec conventions and best practices
Disclosure Requirement
Important: When submitting AI-assisted contributions, you must check the "AI was used to generate any/all content of this PR" box in the PR template. You must understand, and be able to explain, everything that was generated. AI-generated code will receive additional scrutiny to ensure quality and correctness.
What We Expect
- Test coverage using `cscli hubtest`
- Proper error handling and edge cases
- Clear documentation and examples
- Adherence to CrowdSec patterns and conventions
- Human review and validation of the AI output
AI is a powerful tool, but it should augment human expertise, not replace proper testing and review processes.
Technical Documentation
The following sections explain how to create and test the different hub components:
Collections
Collections group related parsers, scenarios, and postoverflows together. It often makes sense to add a new parser or scenario to an existing collection, or to create a new one.
When to create a new collection:
- Your parsers and/or scenarios cover a new or specific service
- You're adding multiple related components that work together
- The existing collections don't fit your use case
When to add to existing collections:
- Adding a parser for a specific web server's access logs that would benefit from existing HTTP-related scenarios
- Your contribution enhances an existing service's detection capabilities
- Your scenario complements existing parsers in a collection
Collection structure:
Each collection should include:
- Descriptive name and description
- Proper labels and tags
- Documentation with setup instructions
- Example acquisition configuration
- Related parsers, scenarios, and postoverflows
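To make this concrete, here is a minimal sketch of a collection file; every name in it is illustrative rather than an existing hub component:

```yaml
# Hypothetical collection file (illustrative names)
parsers:
  - crowdsecurity/myservice-logs
scenarios:
  - crowdsecurity/myservice-bf
description: "myservice support: log parser and bruteforce detection"
author: crowdsecurity
tags:
  - linux
  - bruteforce
```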
Scenarios
Scenarios define the logic for detecting attacks and suspicious behavior. When you create a scenario, you must fill in certain fields in its `labels` section, otherwise the CI won't accept the contribution.
Required Labels
Those `labels` are:
- `classification`: an array containing the CVE IDs and MITRE techniques related to the scenario (when applicable)
- `spoofable`: between 0 and 3, the chance that the attacker behind the attack can spoof its origin
- `confidence`: between 0 and 3, the confidence that the scenario will not trigger false positives
- `behaviors`: an existing behavior from the behaviors taxonomy file in the hub repository
- `label`: a human-readable name for the scenario
- `cti`: (optional) true or false, used to specify that a scenario is mainly used for audit rather than detecting a threat

See the `labels` documentation for more information.
Example Scenario Labels
```yaml
labels:
  service: ssh
  confidence: 3
  spoofable: 0
  classification:
    - attack.T1110
  label: "SSH Bruteforce"
  behavior: "ssh:bruteforce"
  remediation: true
```
Label Guidelines
- Confidence: Start with 3 (high confidence) and reduce if you're unsure
- Spoofable: Consider if the attack source can be easily faked (0 = cannot be spoofed, 3 = easily spoofed)
- Classification: Use MITRE ATT&CK techniques when applicable
- Behavior: Choose from the predefined taxonomy to ensure consistency
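To show how these labels fit into a complete scenario, here is a minimal leaky-bucket sketch; the name, filter, and thresholds are illustrative, not a drop-in detection:

```yaml
# Hypothetical bruteforce scenario (illustrative names and thresholds)
type: leaky
name: crowdsecurity/myservice-bf
description: "Detect myservice credential bruteforce"
filter: "evt.Meta.log_type == 'myservice_failed_auth'"
groupby: evt.Meta.source_ip
capacity: 5
leakspeed: 10s
blackhole: 1m
labels:
  service: myservice
  confidence: 3
  spoofable: 0
  classification:
    - attack.T1110
  label: "Myservice Bruteforce"
  behavior: "myservice:bruteforce"
  remediation: true
```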
Preparing your contribution
Before asking for a review of your PR, please ensure you have the following:
Tests
Test creation is covered in the parser and scenario creation documentation. Ensure that each of your parsers and scenarios is properly tested:
- Parser tests: Include sample log files that cover various scenarios (success, failure, edge cases)
- Scenario tests: Test with different input data to ensure proper triggering and no false positives
- Integration tests: Verify that your components work well with existing parsers/scenarios
Documentation
Please provide a `.md` file with the same name as each of your parsers, scenarios, or collections. The markdown is rendered in the hub.
Documentation Requirements
- Clear description of what the component does
- Setup instructions and prerequisites
- Configuration examples
- Troubleshooting tips
- Links to relevant documentation
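A possible skeleton for such a file; the structure below is a suggestion, not a hub requirement:

```markdown
## crowdsecurity/myservice-logs

Parses authentication logs from myservice.

### Setup

Install the collection, then configure an acquisition for
`/var/log/myservice/*.log`.

### Troubleshooting

If fields come back empty, compare the log format against
the parser's grok pattern.
```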
Collection Documentation
If you're creating a collection targeting a specific log file, be sure to provide an acquisition example:
```yaml
filenames:
  - /var/log/xxx/*.log
labels:
  type: something
```
Code Quality
- Follow existing naming conventions
- Use consistent indentation and formatting
- Add comments for complex logic
- Ensure all required fields are properly filled
Opening your PR
Everything is all set; you can now open a PR that will be reviewed and merged!
PR Checklist
Before opening your PR, ensure you can check all items in the PR template. Additional requirements:
- Documentation is complete and accurate
- Code follows the project's style guidelines
- Commit messages are clear and descriptive
- PR description explains the changes and motivation
- Related issues are referenced (if applicable)
Review Process
- PRs are reviewed by maintainers and community members
- Feedback may be requested for improvements
- All CI checks must pass before merging
- Maintainers will merge approved PRs
Troubleshooting
Common Issues
Parser Issues
- Grok patterns not matching: Use online grok testers to validate patterns
- Missing fields: Ensure all required fields are extracted
Scenario Issues
- False positives: Adjust thresholds and conditions
- Not triggering: Check that events are properly parsed and available
- Label validation: Ensure all required labels are present and valid
Testing Issues
- Hubtest creation fails: Ensure you're in the hub repository root directory
- Test configuration errors: Check YAML syntax in `.tests/<test-name>/config.yaml`
- Parser assertions failing: Use `cscli hubtest explain <test-name>` to debug parser output
- Scenario assertions failing: Verify scenario logic and thresholds with `cscli hubtest explain <test-name>`
- Missing test data: Provide comprehensive log samples and assertion files
- CI tests failing: Review the GitHub Actions logs for specific errors
Getting Help
If you encounter issues:
- Check existing GitHub Issues for similar problems
- Ask for help on Discord or Discourse
- Open a new issue with detailed information about your problem