Contributing to the Hub
Parsers, Scenarios, and Collections allow the CrowdSec Security Engine
to detect and block malevolent behavior. Supporting new services or improving detection capabilities for existing software is a great way to contribute to the CrowdSec ecosystem.
Sharing your parsers, scenarios and collections on the hub allows other users to use them to protect themselves.
Communication
The main communication channels for hub contributions are:
- Discord: Best for live interactions and quick questions about hub development
- Discourse: Great for discussing ideas, suggesting improvements, or asking detailed questions
- GitHub Issues: Use for bug reports and feature requests
Getting Started
Anyone can open an issue about parsers/scenarios, or contribute a change with a pull request (PR) to the crowdsecurity/hub GitHub repository. You need to be comfortable with git and GitHub to work effectively.
Find something to work on
Here are some things you can do today to start contributing:
- Help improve existing parsers and scenarios
- Add support for new services and applications
- Create comprehensive test coverage
- Help triage issues and review pull requests
- Improve documentation and examples
Find a good first topic
The hub repository has beginner-friendly issues that are a great place to get started:
- Look for issues labeled `good first issue` - these don't require high-level CrowdSec knowledge
- Issues labeled `help wanted` indicate that community help is particularly welcome
Prerequisites
Before contributing, make sure you have:
- Basic understanding of YAML syntax
- Familiarity with grok patterns (for parsers); see the sketch after this list
- Git and GitHub knowledge
- A local CrowdSec installation for testing (optional but recommended)
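If you are new to grok patterns, here is a minimal sketch of a hub parser that uses one. The service name, pattern, and field names are illustrative only, not an existing hub component:

```yaml
# Hypothetical s01-parse stage parser (illustrative names and fields)
onsuccess: next_stage
filter: "evt.Parsed.program == 'myservice'"
name: crowdsecurity/myservice-logs
description: "Parse myservice authentication logs"
grok:
  pattern: "Failed login for user %{USERNAME:user} from %{IP:source_ip}"
  apply_on: message
statics:
  - meta: log_type
    value: myservice_failed_auth
  - meta: source_ip
    expression: "evt.Parsed.source_ip"
```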
Contribution Workflow
The basic workflow for contributing:
- Have a look at open issues and pull requests
- Fork and clone the hub repository
- Create/Modify parsers/scenarios/collections
- Create/Modify tests to ensure proper coverage
- Open a pull request
Testing
Before submitting your contribution, ensure proper testing using `cscli hubtest`:
Local Testing with cscli hubtest
`cscli hubtest` is the primary tool for testing hub components. It creates and runs tests for parsers, scenarios, and collections.
1. Create a test
```bash
# Create a test for a parser
cscli hubtest create my-parser-test --type syslog

# Create a test for a scenario (skip parser testing)
cscli hubtest create my-scenario-test --type syslog --ignore-parsers

# Create a test for specific components
cscli hubtest create my-test --parsers crowdsecurity/nginx --scenarios crowdsecurity/http-probing
```
2. Configure your test
Edit the generated configuration file (`.tests/<test-name>/config.yaml`):

```yaml
parsers:
  - crowdsecurity/syslog-logs
  - ./parsers/s01-parse/crowdsecurity/my-parser.yaml
scenarios:
  - ./scenarios/crowdsecurity/my-scenario.yaml
postoverflows:
log_file: my-test.log
log_type: syslog
ignore_parsers: false
```
3. Add test data and assertions
- Log file: Add sample logs to `.tests/<test-name>/<test-name>.log`
- Parser assertions: Define expected parsed fields in `parser.assert`
- Scenario assertions: Define expected alerts in `scenario.assert`

Note: When you first run `cscli hubtest run`, it will output the generated assertions that you need to fill out in the `parser.assert` or `scenario.assert` files. You can find examples of assertion files in the hub repository at `.tests/<existing-test>/parser.assert` and `.tests/<existing-test>/scenario.assert`.
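For reference, assertion files are lists of expressions, one per line, that must all evaluate to true against the test results. A minimal `parser.assert` sketch, with a hypothetical parser name and field values:

```
len(results) == 2
results["s00-raw"]["crowdsecurity/syslog-logs"][0].Success == true
results["s01-parse"]["crowdsecurity/my-parser"][0].Success == true
results["s01-parse"]["crowdsecurity/my-parser"][0].Evt.Meta["source_ip"] == "1.2.3.4"
```

`scenario.assert` files use the same expression-per-line format, asserting on the generated alerts instead of parsed fields.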
4. Run your test
```bash
# Run a specific test
cscli hubtest run my-parser-test

# Run all tests
cscli hubtest run --all

# Run with verbose output
cscli hubtest run my-parser-test -v
```
5. Debug your test
```bash
# Explain test results
cscli hubtest explain my-parser-test

# Show only failures
cscli hubtest explain my-parser-test --failures

# Verbose explanation
cscli hubtest explain my-parser-test -v
```
CI/CD Testing
The hub repository uses GitHub Actions for automated testing:
- All parsers and scenarios are automatically tested when you open a PR
- Tests include syntax validation, pattern matching, and integration tests
- Make sure all tests pass before requesting review
Git Workflow / Branch Management
We receive contributions on the `master` branch. To contribute:

- Fork the repository on GitHub
- Clone your fork locally:

  ```bash
  git clone https://github.com/yourusername/hub.git
  cd hub
  ```

- Create a dedicated branch for your contribution:

  ```bash
  git checkout -b feature/your-feature-name
  ```

- Make your changes and commit them with descriptive messages
- Push to your fork:

  ```bash
  git push origin feature/your-feature-name
  ```

- Open a Pull Request targeting the `master` branch
Branch Naming Convention
Use descriptive branch names that indicate the type of contribution:
- `feature/parser-nginx-access-logs`
- `fix/scenario-ssh-bruteforce-labels`
- `docs/collection-apache-examples`
Commit Messages
Write clear, descriptive commit messages:
- Use imperative mood: "Add parser for Apache access logs"
- Reference issues when applicable: "Fix scenario labels (#123)"
- Keep the first line under 50 characters
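For instance, a commit following these conventions might look like this (the content is illustrative):

```
Add parser for Apache access logs

Extract client IP and request metadata so existing HTTP
scenarios can consume Apache events. Fixes #123.
```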
Guidelines
YAML Best Practices
When creating parsers, scenarios, and collections, follow these YAML guidelines:
Avoid YAML Anchors
❌ Don't use YAML anchors - they make YAML files harder to maintain and understand:
```yaml
# BAD - Using anchors in nodes
nodes:
  - grok:
      pattern: 'pattern1'
      apply_on: message
    nodes: &message_parsers
      - grok:
          pattern: "user pattern"
          apply_on: parsedmessage
        statics:
          - meta: sub_type
            value: user_enumeration
  - grok:
      pattern: 'pattern2'
      apply_on: message
    nodes: *message_parsers
```
✅ Duplicate the node structure instead of using anchors:
```yaml
# GOOD - No anchors, explicit node definitions
nodes:
  - grok:
      pattern: 'pattern1'
      apply_on: message
    nodes:
      - grok:
          pattern: "user pattern"
          apply_on: parsedmessage
        statics:
          - meta: sub_type
            value: user_enumeration
  - grok:
      pattern: 'pattern2'
      apply_on: message
    nodes:
      - grok:
          pattern: "user pattern"
          apply_on: parsedmessage
        statics:
          - meta: sub_type
            value: user_enumeration
```
AI-Assisted Generation
We do allow AI-assisted generation of parsers, scenarios, and collections, but with important requirements:
Requirements for AI-Generated Contributions
- Follow all guidelines: AI-generated code must still follow all the guidelines in this document
- Include comprehensive tests: A pull request with AI-generated code but no tests will be immediately closed
- Proper documentation: Include complete documentation and examples as required
- Code quality: Ensure the generated code follows CrowdSec conventions and best practices
Disclosure Requirement
Important: When submitting AI-assisted contributions, you must check the "AI was used to generate any/all content of this PR" box in the PR template. You must understand, and be able to explain, everything that was generated. AI-generated code will receive additional scrutiny to ensure quality and correctness.
What We Expect
- Test coverage using `cscli hubtest`
- Proper error handling and edge cases
- Clear documentation and examples
- Adherence to CrowdSec patterns and conventions
- Human review and validation of the AI output
AI is a powerful tool, but it should augment human expertise, not replace proper testing and review processes.
Technical Documentation
The following sections explain how to create and test the different hub components:
Collections
Collections group related parsers, scenarios, and postoverflows together. It often makes sense to add a new parser or scenario to an existing collection, or to create a new one.
When to create a new collection:
- Your parsers and/or scenarios cover a new or specific service
- You're adding multiple related components that work together
- The existing collections don't fit your use case
When to add to existing collections:
- Adding a parser for a specific web server's access logs that would benefit from existing HTTP-related scenarios
- Your contribution enhances an existing service's detection capabilities
- Your scenario complements existing parsers in a collection
Collection structure:
Each collection should include:
- Descriptive name and description
- Proper labels and tags
- Documentation with setup instructions
- Example acquisition configuration
- Related parsers, scenarios, and postoverflows
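To make this concrete, here is a minimal sketch of a collection file; every name in it is illustrative rather than an existing hub component:

```yaml
# Hypothetical collection file (illustrative names)
parsers:
  - crowdsecurity/myservice-logs
scenarios:
  - crowdsecurity/myservice-bf
description: "myservice support: log parser and bruteforce detection"
author: crowdsecurity
tags:
  - linux
  - bruteforce
```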
Scenarios
Scenarios define the logic for detecting attacks and suspicious behavior. When you create a scenario, you must fill in certain fields in its `labels` section, otherwise the CI won't accept the contribution.
Required Labels
Those `labels` are:
- `classification`: an array containing the CVE IDs and MITRE techniques related to the scenario (when applicable)
- `spoofable`: between 0 and 3, the chance that the attacker behind the attack can spoof its origin
- `confidence`: between 0 and 3, the confidence that the scenario will not trigger false positives
- `behaviors`: an existing behavior from the behaviors taxonomy file in the hub repository
- `label`: a human-readable name for the scenario
- `cti`: (optional) true or false, used to specify that a scenario is mainly used for audit rather than detecting a threat

See the `labels` documentation for more information.
Example Scenario Labels
```yaml
labels:
  service: ssh
  confidence: 3
  spoofable: 0
  classification:
    - attack.T1110
  label: "SSH Bruteforce"
  behavior: "ssh:bruteforce"
  remediation: true
```
Label Guidelines
- Confidence: Start with 3 (high confidence) and reduce if you're unsure
- Spoofable: Consider if the attack source can be easily faked (0 = cannot be spoofed, 3 = easily spoofed)
- Classification: Use MITRE ATT&CK techniques when applicable
- Behavior: Choose from the predefined taxonomy to ensure consistency
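To show how these labels fit into a complete scenario, here is a minimal leaky-bucket sketch; the name, filter, and thresholds are illustrative, not a drop-in detection:

```yaml
# Hypothetical bruteforce scenario (illustrative names and thresholds)
type: leaky
name: crowdsecurity/myservice-bf
description: "Detect myservice credential bruteforce"
filter: "evt.Meta.log_type == 'myservice_failed_auth'"
groupby: evt.Meta.source_ip
capacity: 5
leakspeed: 10s
blackhole: 1m
labels:
  service: myservice
  confidence: 3
  spoofable: 0
  classification:
    - attack.T1110
  label: "Myservice Bruteforce"
  behavior: "myservice:bruteforce"
  remediation: true
```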
Preparing your contribution
Before asking for a review of your PR, please ensure you have the following:
Tests
Test creation is covered in the parser and scenario creation documentation. Ensure that each of your parsers and scenarios is properly tested:
- Parser tests: Include sample log files that cover various scenarios (success, failure, edge cases)
- Scenario tests: Test with different input data to ensure proper triggering and no false positives
- Integration tests: Verify that your components work well with existing parsers/scenarios
Documentation
Please provide a `.md` file with the same name as each of your parsers, scenarios, or collections. The markdown is rendered in the hub.
Documentation Requirements
- Clear description of what the component does
- Setup instructions and prerequisites
- Configuration examples
- Troubleshooting tips
- Links to relevant documentation
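A possible skeleton for such a file; the structure below is a suggestion, not a hub requirement:

```markdown
## crowdsecurity/myservice-logs

Parses authentication logs from myservice.

### Setup

Install the collection, then configure an acquisition for
`/var/log/myservice/*.log`.

### Troubleshooting

If fields come back empty, compare the log format against
the parser's grok pattern.
```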
Collection Documentation
If you're creating a collection targeting a specific log file, be sure to provide an acquisition example:
```yaml
filenames:
  - /var/log/xxx/*.log
labels:
  type: something
```
Code Quality
- Follow existing naming conventions
- Use consistent indentation and formatting
- Add comments for complex logic
- Ensure all required fields are properly filled
Opening your PR
Everything is all set; you can now open a PR that will be reviewed and merged!
PR Checklist
Before opening your PR, ensure you can check all items in the PR template. Additional requirements:
- Documentation is complete and accurate
- Code follows the project's style guidelines
- Commit messages are clear and descriptive
- PR description explains the changes and motivation
- Related issues are referenced (if applicable)
Review Process
- PRs are reviewed by maintainers and community members
- Feedback may be requested for improvements
- All CI checks must pass before merging
- Maintainers will merge approved PRs
Troubleshooting
Common Issues
Parser Issues
- Grok patterns not matching: Use online grok testers to validate patterns
- Missing fields: Ensure all required fields are extracted
Scenario Issues
- False positives: Adjust thresholds and conditions
- Not triggering: Check that events are properly parsed and available
- Label validation: Ensure all required labels are present and valid
Testing Issues
- Hubtest creation fails: Ensure you're in the hub repository root directory
- Test configuration errors: Check YAML syntax in `.tests/<test-name>/config.yaml`
- Parser assertions failing: Use `cscli hubtest explain <test-name>` to debug parser output
- Scenario assertions failing: Verify scenario logic and thresholds with `cscli hubtest explain <test-name>`
- Missing test data: Provide comprehensive log samples and assertion files
- CI tests failing: Review the GitHub Actions logs for specific errors
Getting Help
If you encounter issues:
- Check existing GitHub Issues for similar problems
- Ask for help on Discord or Discourse
- Open a new issue with detailed information about your problem