A RESTBase queuing module for Apache Kafka
The purpose of the change propagation service is to execute actions based on events. The service listens to Kafka topics and executes handlers for events according to configurable rules. Currently, a rule can issue HTTP requests, produce new messages, or make an HTCP purge request. The list of supported actions can be extended by creating new modules with internal HTTP endpoints and calling them from the rules.
A Rule is a semantically meaningful piece of service functionality. For example, 'Rerender RESTBase if the page was changed' and 'Update summary if RESTBase render was changed' are both rules. To specify the rules, you need to add a property to the kafka module config template property. Each rule is executed by a single worker, but an internal load-balancing mechanism tries to distribute rules among workers equally.
The rule can contain the following properties:

- match: the rule applies only when the match properties are satisfied by the message. Properties can be nested objects, constants, or regular expressions. A regex can contain capture groups, and the captured values are later accessible in the exec part of the rule. Capture groups can be named using the (?<name>group) syntax; the captured value is then accessible under match.property_name.capture_name within the exec part. Named and unnamed captures cannot be mixed.
- match_not: [...] exec part of the rule. match_not may be an array with the semantics of logical OR: if any of the array items match, match_not matches.
- exec: [...] context that has a message global property holding the original message, and a match property holding the values extracted by the match.
- [...] consumer property of the kafka module config. See the librdkafka documentation for available properties.

Here's an example of a rule that matches all resource_change messages emitted by RESTBase and purges Varnish caches for the resources. It does so by issuing an HTTP request to a special internal module, which converts it to an HTCP purge and makes the HTCP request:
    purge_varnish:
      topic: resource_change
      match:
        meta:
          uri: '/^https?:\/\/[^\/]+\/api\/rest_v1\/(?<rest>.+)$/'
        tags:
          - restbase
      exec:
        method: post
        uri: '/sys/purge/'
        body:
          - meta:
              uri: '//{{message.meta.domain}}/api/rest_v1/{{match.meta.uri.rest}}'
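To make the data flow in this rule concrete, here is a minimal JavaScript sketch of how the named capture group and the two template variables could combine into the purge URI. It is an illustration only, not the service's matching or templating code: the buildPurgeUri function, the sample message, and the reading of the tags entry as "the message tags include restbase" are all assumptions made for the example.

```javascript
// Illustration only: not the service's actual matching or templating code.
// Same pattern as the rule's match.meta.uri, written as a JS regex literal.
const uriPattern = /^https?:\/\/[^\/]+\/api\/rest_v1\/(?<rest>.+)$/;

// Hypothetical resource_change message; all field values are made up.
const message = {
  meta: {
    uri: 'https://en.wikipedia.org/api/rest_v1/page/html/Foo',
    domain: 'en.wikipedia.org'
  },
  tags: ['restbase']
};

// Hypothetical helper: returns the purge URI, or null if the rule does not match.
function buildPurgeUri(msg) {
  const found = uriPattern.exec(msg.meta.uri);
  // Assumption: the tags entry means the message tags must include 'restbase'.
  if (!found || !msg.tags.includes('restbase')) {
    return null;
  }
  // Mirrors the match.property_name.capture_name layout described above.
  const match = { meta: { uri: { rest: found.groups.rest } } };
  // Equivalent of '//{{message.meta.domain}}/api/rest_v1/{{match.meta.uri.rest}}'.
  return `//${msg.meta.domain}/api/rest_v1/${match.meta.uri.rest}`;
}

const purgeUri = buildPurgeUri(message);
console.log(purgeUri);
```

For the sample message, the rest capture is page/html/Foo, so the resulting body URI is protocol-relative and keeps only the domain and the REST path.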
For testing locally, you need to set up and start Apache Kafka, set the KAFKA_HOME environment variable to point to the Kafka home directory, and set the KAFKA_VERSION environment variable to the desired Kafka version (1.1.0). Here's a sample script you need to run:
    export KAFKA_HOME=<your desired kafka install path>
    export KAFKA_VERSION=1.1.0
    echo "KAFKA_HOME=$KAFKA_HOME" >> ~/.bash_profile
    echo "PATH=\$PATH:\$KAFKA_HOME/bin" >> ~/.bash_profile
    npm install
    npm run install-kafka
Apart from the above you need to have a running Redis server locally on your machine.
Before starting the development version of change propagation or running tests, you need to start Zookeeper and Kafka with the start-kafka npm script. To stop Kafka and Zookeeper, run the stop-kafka npm script.
To run tests against local schemas, you must use Node 7.6.0 or higher and set DEV_BASE_URI to the directory of the schemas:
    export DEV_BASE_URI=<directory_of_schemas>
    npm test
To run the service locally, you need to have Kafka and Zookeeper installed and running. An example of installation and configuration can be found in the Testing section of this readme. After Kafka is installed, configured, and started with the npm run start-kafka command, copy the example config and run the service:
    cp config.example.yaml config.yaml
    npm start
Also, before using the service you need to ensure that all topics used in your config exist in Kafka. Topics should be prefixed with a datacenter name (the default is default). Each topic must also have a retry topic. So, if you are using a topic named test_topic, the following topics must exist in Kafka:
- 'default.test_topic'
- 'default.change-prop.retry.test_topic'
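The naming convention above can be sketched as a small JavaScript helper that lists the topics to create for a given config topic. The function name requiredTopics and its parameters are hypothetical and not part of the service; the sketch only encodes the two name patterns shown in the list.

```javascript
// Hypothetical helper, not part of the service: derives the Kafka topic names
// that must exist for one topic used in the config.
function requiredTopics(topic, dcName = 'default') {
  return [
    // The topic itself, prefixed with the datacenter name.
    `${dcName}.${topic}`,
    // Its retry topic, following the pattern shown in the list above.
    `${dcName}.change-prop.retry.${topic}`
  ];
}

const topics = requiredTopics('test_topic');
console.log(topics.join('\n'));
```

Running it for every topic in the config gives a checklist of names to compare against the topics that already exist in Kafka.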
The service is maintained by the Wikimedia Services Team. For bug reporting, use the EventBus project on Phabricator or the #wikimedia-services IRC channel on freenode.