Collections API

Collections

Streaming([stream, example]) Superclass for streaming collections
Streaming.map_partitions(func, *args, **kwargs) Map a function across all batch elements of this stream
Streaming.accumulate_partitions(func, *args, …) Accumulate a function with state across batch elements
Streaming.verify(x) Verify elements that pass through this stream

Batch

Batch
Batch.filter
Batch.map
Batch.pluck
Batch.to_dataframe
Batch.to_stream

Dataframes

DataFrame
DataFrame.groupby
DataFrame.rolling
DataFrame.assign
DataFrame.sum
DataFrame.mean
DataFrame.cumsum
DataFrame.cumprod
DataFrame.cummin
DataFrame.cummax
DataFrame.plot
GroupBy
GroupBy.count
GroupBy.mean
GroupBy.size
GroupBy.std
GroupBy.sum
GroupBy.var
Rolling(sdf, window, min_periods) Rolling aggregations
Rolling.aggregate(*args, **kwargs) Rolling aggregation
Rolling.count(*args, **kwargs) Rolling count
Rolling.max() Rolling maximum
Rolling.mean() Rolling mean
Rolling.median() Rolling median
Rolling.min() Rolling minimum
Rolling.quantile(*args, **kwargs) Rolling quantile
Rolling.std(*args, **kwargs) Rolling standard deviation
Rolling.sum() Rolling sum
Rolling.var(*args, **kwargs) Rolling variance
DataFrame.window
Window.apply
Window.count
Window.groupby
Window.sum
Window.size
Window.std
Window.var
Random([freq, interval, dask]) A streaming dataframe of random data

Details

class streamz.collection.Streaming(stream=None, example=None)

Superclass for streaming collections

Do not create this class directly; use one of the subclasses instead.

Parameters:

stream: streamz.Stream

example: object

An object to represent an example element of this stream

See also

streamz.dataframe.StreamingDataFrame, streamz.dataframe.StreamingBatch

Methods

accumulate_partitions(func, *args, **kwargs) Accumulate a function with state across batch elements
emit(x) Push data into the stream at this point
map_partitions(func, *args, **kwargs) Map a function across all batch elements of this stream
verify(x) Verify elements that pass through this stream
accumulate_partitions(func, *args, **kwargs)

Accumulate a function with state across batch elements

map_partitions(func, *args, **kwargs)

Map a function across all batch elements of this stream

The output stream type will be determined by the action of that function on the example

verify(x)

Verify elements that pass through this stream

class streamz.dataframe.Rolling(sdf, window, min_periods)

Rolling aggregations

This intermediate class enables rolling aggregations across either a fixed number of rows or a time window.

Examples

>>> sdf.rolling(10).x.mean()  
>>> sdf.rolling('100ms').x.mean()  

Methods

aggregate(*args, **kwargs) Rolling aggregation
count(*args, **kwargs) Rolling count
max() Rolling maximum
mean() Rolling mean
median() Rolling median
min() Rolling minimum
quantile(*args, **kwargs) Rolling quantile
std(*args, **kwargs) Rolling standard deviation
sum() Rolling sum
var(*args, **kwargs) Rolling variance
aggregate(*args, **kwargs)

Rolling aggregation

count(*args, **kwargs)

Rolling count

max()

Rolling maximum

mean()

Rolling mean

median()

Rolling median

min()

Rolling minimum

quantile(*args, **kwargs)

Rolling quantile

std(*args, **kwargs)

Rolling standard deviation

sum()

Rolling sum

var(*args, **kwargs)

Rolling variance

class streamz.dataframe.Random(freq='100ms', interval='500ms', dask=False)

A streaming dataframe of random data

The x column is uniformly distributed. The y column is Poisson distributed. The z column is normally distributed.

This class is experimental and will likely be removed in the future

Parameters:

freq: timedelta

The time interval between records

interval: timedelta

The time interval between new dataframes; this should be significantly larger than freq

Attributes

columns
dtypes
index

Methods

accumulate_partitions(func, *args, **kwargs) Accumulate a function with state across batch elements
assign(**kwargs) Assign new columns to this dataframe
cummax() Cumulative maximum
cummin() Cumulative minimum
cumprod() Cumulative product
cumsum() Cumulative sum
emit(x) Push data into the stream at this point
groupby(other) Groupby aggregations
map_partitions(func, *args, **kwargs) Map a function across all batch elements of this stream
mean() Average
plot([backlog, width, height]) Plot streaming dataframe as Bokeh plot
rolling(window[, min_periods]) Compute rolling aggregations
round([decimals]) Round elements in frame
stop()
sum() Sum frame
tail([n]) Last n rows of the frame
to_frame() Convert to a streaming dataframe
verify(x) Verify consistency of elements that pass through this stream