Stream sliding window quote analytics

Filtered stream of Nasdaq major bank quote analytics

Provide a trading consumer application with quote analytics for all major banks trading on the Nasdaq stock market for the current business week.

Sliding window analytics

Resources

The getting started project can be found here.

Key takeaways

This tutorial teaches you how to use Joule's out-of-the-box (OOTB) features to filter events, perform sliding window analytics, and publish AVRO-formatted events to a Kafka topic and a Parquet file.

This first use case covers a number of key features:

  • Subscribe and consume events: Subscribe to, consume, parse and present events ready for pipeline processing using Kafka.

  • Event filtering: Apply a filter to select a subset of events using JavaScript expressions.

  • Sliding window analytics: Define a set of analytics, grouped by symbol, to be executed over a sliding window of events.

  • Publishing events: Send processed events to a persistent Parquet file and to a Kafka topic using a defined AVRO domain data structure.

Use case development

1. Define the use case objective

Provide a trading consumer application with quote analytics for all major banks trading on the Nasdaq stock market for the current business week.

Additionally:

  • The use case should only process events for a single, defined market business week.

  • Events are to be sent to a Kafka topic and a persistent Parquet file using the same data format.

Change the valid from and to dates to match the business week you want to process.
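To illustrate the business-week constraint, here is a minimal Python sketch that derives a Monday-to-Friday date range and checks whether an event date falls inside it. It is not Joule's implementation: the function names `business_week` and `is_in_valid_range` are hypothetical, and treating both the valid from and valid to dates as inclusive is an assumption.

```python
from datetime import date, timedelta


def business_week(any_day):
    """Return (monday, friday) for the week containing any_day."""
    # date.weekday() is 0 for Monday, so subtracting it lands on Monday.
    monday = any_day - timedelta(days=any_day.weekday())
    return monday, monday + timedelta(days=4)


def is_in_valid_range(event_date, valid_from, valid_to):
    """Assumed semantics: both bounds are inclusive."""
    return valid_from <= event_date <= valid_to


valid_from, valid_to = business_week(date(2024, 1, 3))
print(valid_from, valid_to)
```

The resulting pair of dates is the kind of value you would use for the valid from and to dates of the use case.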

2. Define processing pipeline

This use case dives into Joule's analytic window features:

  • Filter events by 'Major Banks' industry

  • Apply an analytic sliding time window to calculate aggregate functions and window functions. Window definition: the window covers a total of 2.5 seconds of events and slides forward every 500 ms.

  • Send a quote analytics record for every event with the following attributes: symbol, ask_EMA, bid_EMA, volume_SUM, volatility_MEAN, ask_MINMAX_NORM, bid_MINMAX_NORM, ask_ZSCORE and bid_ZSCORE.
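To make the window semantics concrete, the pipeline steps above can be sketched in plain Python. This is an illustrative sketch, not Joule's DSL or implementation. The event field names (`industry`, `symbol`, `ask`, `bid`, `volume`, `time_ms`), the `SlidingWindow` class, emitting on each 500 ms slide boundary, and the use of the population standard deviation for the z-score are all assumptions. The EMA and volatility attributes are omitted for brevity.

```python
from collections import defaultdict, deque
from statistics import mean, pstdev

WINDOW_MS = 2500  # total window size (assumed reading of "2.5 seconds")
SLIDE_MS = 500    # slide interval (assumed reading of "500ms")


def is_major_bank(event):
    """Filter step: keep only events in the 'Major Banks' industry."""
    return event.get("industry") == "Major Banks"


def minmax_norm(values):
    """Min-max normalise the latest value against the window, in [0, 1]."""
    lo, hi = min(values), max(values)
    return 0.0 if hi == lo else (values[-1] - lo) / (hi - lo)


def zscore(values):
    """Z-score of the latest value against the window."""
    sd = pstdev(values)
    return 0.0 if sd == 0 else (values[-1] - mean(values)) / sd


class SlidingWindow:
    """Hypothetical time-based sliding window, grouped by symbol."""

    def __init__(self, window_ms=WINDOW_MS, slide_ms=SLIDE_MS):
        self.window_ms = window_ms
        self.slide_ms = slide_ms
        self.events = defaultdict(deque)  # symbol -> events in the window
        self.next_emit = None

    def add(self, event):
        """Add an event; return analytics records on a slide boundary."""
        if not is_major_bank(event):
            return []
        now = event["time_ms"]
        self.events[event["symbol"]].append(event)
        if self.next_emit is None:
            self.next_emit = now + self.slide_ms
        if now < self.next_emit:
            return []
        # Advance past 'now' so a long gap does not trigger repeated emits.
        while self.next_emit <= now:
            self.next_emit += self.slide_ms
        return self._emit(now)

    def _emit(self, now):
        records = []
        for symbol, window in self.events.items():
            # Evict events that have fallen out of the 2.5 second window.
            while window and window[0]["time_ms"] <= now - self.window_ms:
                window.popleft()
            if not window:
                continue
            asks = [e["ask"] for e in window]
            bids = [e["bid"] for e in window]
            records.append({
                "symbol": symbol,
                "volume_SUM": sum(e["volume"] for e in window),
                "ask_MINMAX_NORM": minmax_norm(asks),
                "bid_MINMAX_NORM": minmax_norm(bids),
                "ask_ZSCORE": zscore(asks),
                "bid_ZSCORE": zscore(bids),
            })
        return records
```

Feeding quote events to `SlidingWindow.add` in timestamp order returns an empty list until a slide boundary is crossed, then one record per symbol that still has events inside the window.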

Stream definition

3. Subscribe to data sources

We shall use the getting started data simulator as the source feed. The source definition subscribes to what would be live Nasdaq quote data; note that in this tutorial the data is simulated.

Source definition

4. Define output destinations

Parquet file output

A quick and easy way to validate your use case processing is to send the resulting events to a Parquet file.

Avro Schema

Publish events to consumers

  1. The user emit projection is transformed to the provided domain data type using the same AVRO schema definition used for the Parquet file output (see above).

  2. The resulting events are then published to the nasdaq_major_bank_quote_analytics Kafka topic.

A quick recap of how events will be transformed to AVRO data structures:

The same events published to the Parquet file are published to a Kafka consumer topic using the same AVRO domain schema.
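As a rough illustration of how one schema can drive both outputs, the Python sketch below builds an Avro record schema from the emitted attribute names and projects an event onto it. The record name `QuoteAnalytics`, the namespace `com.example.quotes`, the field types, and the helper `to_domain_record` are assumptions for illustration only. They are not the schema shipped with the getting started project.

```python
import json

# Analytic attributes emitted by the pipeline (taken from the use case).
ANALYTIC_FIELDS = [
    "ask_EMA", "bid_EMA", "volume_SUM", "volatility_MEAN",
    "ask_MINMAX_NORM", "bid_MINMAX_NORM", "ask_ZSCORE", "bid_ZSCORE",
]

# Hypothetical Avro record schema: name, namespace and types are assumed.
QUOTE_ANALYTICS_SCHEMA = {
    "type": "record",
    "name": "QuoteAnalytics",
    "namespace": "com.example.quotes",
    "fields": [{"name": "symbol", "type": "string"}]
    + [{"name": name, "type": "double"} for name in ANALYTIC_FIELDS],
}


def to_domain_record(event, schema=QUOTE_ANALYTICS_SCHEMA):
    """Keep only the attributes declared in the schema.

    Raises KeyError if the event is missing a declared field.
    """
    return {field["name"]: event[field["name"]] for field in schema["fields"]}


# Avro schemas are plain JSON, so the same text can be shared by both sinks.
schema_json = json.dumps(QUOTE_ANALYTICS_SCHEMA, indent=2)
```

Because the projection is driven by the schema's field list, any extra attributes on the event (for example the industry used for filtering) are dropped before publishing.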

Sink Definition

5. Deploying the use case

Now that we have all the use case definitions, we can deploy them to Joule via the REST API using Postman, following the same getting started deployment steps for this project.

Go to the "Build your first use case" folder under the Joule - Banking demo / Tutorials Postman examples within the getting started project.

6. Review Parquet file contents

Open your favorite Parquet viewer to review the output.

Example tools

  • IntelliJ Parquet viewer

  • PyPI parquet-tools

  • Visual Studio parquet-viewer

Summary

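This example covers a number of key features: subscribing to and consuming events, event filtering, sliding window analytics, and publishing events.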
