streamsx.topology.schema

Schemas for streams.

Overview

A stream represents an unbounded flow of tuples with a declared schema so that each tuple on the stream complies with the schema. A stream’s schema may be one of:

  • StreamSchema structured schema - a tuple is a sequence of attributes, and an attribute is a named value of a specific type.

  • Json - a tuple is a JSON object.

  • String - a tuple is a string.

  • Python - a tuple is any Python object, effectively an untyped stream.

Structured schemas

A structured schema is a sequence of attributes, and an attribute is a named value of a specific type. For example, a stream of sensor readings can be represented as a schema with three attributes, sensor_id, ts and reading, with types int64, int64 and float64 respectively.

This schema can be declared in a number of ways:

Python 3.6:

class SensorReading(typing.NamedTuple):
    sensor_id: int
    ts: int
    reading: float

sensors = raw_readings.map(parse_sensor, schema=SensorReading)

Python 3:

SensorReading = typing.NamedTuple('SensorReading',
    [('sensor_id', int), ('ts', int), ('reading', float)])

sensors = raw_readings.map(parse_sensor, schema=SensorReading)

Python 3:

sensors = raw_readings.map(parse_sensor,
    schema='tuple<int64 sensor_id, int64 ts, float64 reading>')

The supported types are defined by IBM Streams and are listed in StreamSchema.
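The declarations above all describe the same three attributes. As a minimal sketch using only the standard library, the following derives the tuple<...> string from the named tuple class. The helper spl_tuple_string and the _SPL_TYPES mapping are hypothetical names introduced for illustration and are not part of streamsx; the mapping covers only the int to int64 and float to float64 correspondence taken from the sensor example.

```python
import typing


class SensorReading(typing.NamedTuple):
    sensor_id: int
    ts: int
    reading: float


# Hypothetical mapping for illustration only: it covers just the two
# Python types used in the sensor example.
_SPL_TYPES = {int: 'int64', float: 'float64'}


def spl_tuple_string(named_tuple_cls):
    """Build a tuple<...> schema string from a typing.NamedTuple class."""
    hints = typing.get_type_hints(named_tuple_cls)
    # _fields preserves the declaration order of the attributes.
    attrs = ['{} {}'.format(_SPL_TYPES[hints[name]], name)
             for name in named_tuple_cls._fields]
    return 'tuple<' + ', '.join(attrs) + '>'


schema_string = spl_tuple_string(SensorReading)
print(schema_string)
```

A Python type outside the mapping raises KeyError here, which keeps the sketch from guessing at a type correspondence that the text does not state.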

Structured schemas provide type safety and efficient network serialization when compared to passing a dict using Python streams.

Streams with structured schemas can be interchanged with any IBM Streams application using publish() and subscribe(), maintaining type safety.

Defining a stream’s schema

Every stream within a Topology has a defined schema. The schema may be defined explicitly (for example, by map() or subscribe()) or implicitly (for example, filter() produces a stream with the same schema as its input stream).

Explicitly defining a stream’s schema is flexible: various types of values are accepted as the schema.

  • Builtin types as aliases for common schema types:

    • json (module) - for Json

    • str - for String

    • object - for Python

  • Values of the enumeration CommonSchema

  • A named tuple class created with typing.NamedTuple (Python 3)

  • An instance of StreamSchema

  • A string of the format tuple<...> defining the attribute names and types. See StreamSchema for details on the format and types supported.

  • A string containing a namespace qualified SPL stream type (e.g. com.ibm.streams.geospatial::FlightPathEncounterTypes.Observation3D)
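The kinds of value listed above can be told apart by their type and form. The following is a minimal sketch of that distinction, not the library's own logic: schema_kind is a hypothetical function that is not part of streamsx. CommonSchema values and StreamSchema instances are left out so that the example runs with only the standard library, and the test for '::' assumes that a namespace qualified SPL type always contains that separator, as in the example above.

```python
import json
import typing


def schema_kind(schema):
    """Name the kind of schema a value would select (illustrative only)."""
    # Builtin aliases for the common schemas.
    if schema is json:
        return 'Json'
    if schema is str:
        return 'String'
    if schema is object:
        return 'Python'
    # A class created with typing.NamedTuple is a tuple subclass
    # that carries a _fields attribute.
    if (isinstance(schema, type) and issubclass(schema, tuple)
            and hasattr(schema, '_fields')):
        return 'structured (named tuple)'
    if isinstance(schema, str):
        if schema.startswith('tuple<') and schema.endswith('>'):
            return 'structured (tuple<...> string)'
        # Assumption: a namespace qualified SPL type contains '::'.
        if '::' in schema:
            return 'structured (named SPL type)'
    raise ValueError('Unrecognized schema value: {!r}'.format(schema))


SensorReading = typing.NamedTuple(
    'SensorReading', [('sensor_id', int), ('ts', int), ('reading', float)])

for value in (json, str, object, SensorReading,
              'tuple<int64 sensor_id, int64 ts, float64 reading>',
              'com.ibm.streams.geospatial::FlightPathEncounterTypes.Observation3D'):
    print(schema_kind(value))
```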

Module contents

Functions

is_common

Tests whether schema is a common schema.

Classes

CommonSchema

Common stream schemas for interoperability within Streams applications.

StreamSchema

Defines a schema for a structured stream.