SignalFx Developers Guide

partition_filter()

partition_filter returns a filter object that restricts its output to a subset of the input time series. Use partition_filter to create independent parallel data streams, each of which processes a disjoint subset of the time series. Together, the subsets cover the entire set of time series of interest.
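The partitioning contract described above can be modeled in a few lines of Python. This is only a sketch of the semantics, not SignalFlow code. The helpers partition_of and partition_filter_model are hypothetical, and the hash-based assignment is an assumption made for illustration, since this page does not state how time series are assigned to partitions.

```python
import zlib

def partition_of(ts_id: str, size: int) -> int:
    """Hypothetical assignment of a time series ID to a partition in 1..size."""
    # Assumption: a stable hash modulo size. The real assignment rule is not documented here.
    return zlib.crc32(ts_id.encode("utf-8")) % size + 1

def partition_filter_model(ts_ids, index: int, size: int):
    """Return the IDs that a filter for partition `index` of `size` would keep."""
    return {ts for ts in ts_ids if partition_of(ts, size) == index}

all_ids = {f"host-{n}" for n in range(1000)}
parts = [partition_filter_model(all_ids, i, 3) for i in (1, 2, 3)]

# No series appears in two partitions, and no series is left out.
disjoint = all(parts[i].isdisjoint(parts[j]) for i in range(3) for j in range(i + 1, 3))
covering = set().union(*parts) == all_ids
```

Because every series lands in exactly one partition, results computed per partition can be combined without double counting.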

Syntax

partition_filter(index, size)

Table 1. Parameter definitions

Parameter  Type    Description
index      number  Specifies the index of the time series partition to include. The range of this value is 1 to size, inclusive. For example, if you split the input time series into 5 partitions, specifying index=5 returns the 5th partition.
size       number  Total number of partitions to split the time series into.

Returns a filter object.

Throws

invalid argument
  • Either argument is <= 0
  • index > size
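If you generate SignalFlow programs from client code, you may want to reject bad arguments before submitting the program. The following Python sketch mirrors the conditions listed above. The function name check_partition_args is hypothetical, and ValueError merely stands in for the invalid argument error that SignalFlow itself reports.

```python
def check_partition_args(index: int, size: int) -> None:
    """Raise ValueError for the argument combinations listed under Throws."""
    # Either argument is <= 0
    if index <= 0 or size <= 0:
        raise ValueError("index and size must both be greater than 0")
    # index > size
    if index > size:
        raise ValueError("index must not exceed size")
```

For example, check_partition_args(3, 3) passes, while check_partition_args(4, 3) and check_partition_args(0, 3) raise ValueError.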

Usage

SignalFlow limits the number of time series that a single stream can process. The limit is 5000 unless your organization is specifically configured for a different limit. Use partition_filter to work around this limit and perform unified computations over larger numbers of time series.

To do so, create several streams that use the same size and a different index each, so that each stream takes one part of the overall set of time series, and then combine the per-stream results.
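When the number of partitions is not known in advance, the SignalFlow program text can be generated. The Python sketch below estimates a partition count and builds a program that takes the maximum of each partition and then the maximum of those results. The helper names partitions_needed and partitioned_max_program and the stream names P1, P2, ... are hypothetical. The estimate assumes that time series are spread evenly across partitions, which this page does not guarantee, so consider adding headroom.

```python
import math

def partitions_needed(series_count: int, stream_limit: int = 5000) -> int:
    """Smallest partition count if series were split perfectly evenly (an assumption)."""
    return max(1, math.ceil(series_count / stream_limit))

def partitioned_max_program(metric: str, size: int, label: str) -> str:
    """Build SignalFlow text: one max() stream per partition, then the max of those."""
    names = [f"P{i}" for i in range(1, size + 1)]
    lines = [
        f"{name} = data('{metric}', filter=partition_filter({i}, {size})).max()"
        for i, name in enumerate(names, start=1)
    ]
    lines.append(f"max({', '.join(names)}).publish('{label}')")
    return "\n".join(lines)

size = partitions_needed(14000)  # 3 partitions for 14,000 series at a limit of 5000
program = partitioned_max_program("jvm.cpu.load", size, "global max")
print(program)
```

Note that partition indexes start at 1, so the generated calls run from partition_filter(1, size) through partition_filter(size, size).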

Examples

The following examples assume that the total number of time series is approximately three times the stream size limit, so the time series have to be split into 3 partitions.

To compute the global maximum across all time series:

A = data('jvm.cpu.load', filter=partition_filter(1, 3)).max()
B = data('jvm.cpu.load', filter=partition_filter(2, 3)).max()
C = data('jvm.cpu.load', filter=partition_filter(3, 3)).max()
max(A, B, C).publish('global max')

To compute the global mean across all time series:

A = data('jvm.cpu.load', filter=partition_filter(1, 3))
B = data('jvm.cpu.load', filter=partition_filter(2, 3))
C = data('jvm.cpu.load', filter=partition_filter(3, 3))
total = A.sum() + B.sum() + C.sum()
count = A.count() + B.count() + C.count()
(total / count).publish('global mean')
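The example above divides the combined sum by the combined count instead of averaging the three partition means. The Python sketch below, which uses made-up values for a single timestamp, shows why: partitions generally hold different numbers of time series, so a mean of means weights them incorrectly.

```python
from statistics import mean

# Hypothetical per-partition values at one timestamp; the partitions differ in size.
A = [0.25, 0.75]
B = [0.5]
C = [1.0, 1.0, 1.0]

total = sum(A) + sum(B) + sum(C)   # 4.5
count = len(A) + len(B) + len(C)   # 6
global_mean = total / count        # 0.75, same as the mean over all six values

# Averaging the partition means gives each partition equal weight, which is wrong here.
mean_of_means = mean([mean(A), mean(B), mean(C)])  # (0.5 + 0.5 + 1.0) / 3
```

Here global_mean matches the mean over all six values, while mean_of_means comes out lower because the single-series partition B counts as much as the three-series partition C.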

To compute the global top 10 across all time series:

A = data('jvm.cpu.load', filter=partition_filter(1, 3)).top(10)
B = data('jvm.cpu.load', filter=partition_filter(2, 3)).top(10)
C = data('jvm.cpu.load', filter=partition_filter(3, 3)).top(10)
union(A, B, C).top(10).publish('global top 10')
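Taking the top 10 of each partition first is safe because any time series in the global top 10 is also in the top 10 of its own partition. The union therefore holds at most 30 candidates and still contains every global winner. The Python sketch below checks this on random data. The top helper and the slicing used to form partitions are stand-ins for illustration, not SignalFlow behavior.

```python
import heapq
import random

def top(values, k):
    """Return the k largest values, largest first."""
    return heapq.nlargest(k, values)

random.seed(0)
series = [random.random() for _ in range(300)]  # hypothetical latest values
parts = [series[i::3] for i in range(3)]         # stand-in for 3 disjoint partitions

# Union of the per-partition top 10, then a final top 10 over the candidates.
candidates = [v for p in parts for v in top(p, 10)]
merged = top(candidates, 10)
```

The merged list equals top(series, 10), while the final stage only has to process 30 values instead of 300.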

© Copyright 2019 SignalFx.
