Event Log (Streams) Data Structure Standards?

Per Joel’s idea here, is anyone else trying to standardize across event streams in the community?

  • Data structure standards
  • DID authentication
  • Timestamps through anchors

Data structure standards are important.
Do we have data scientist input here?

I want to stress to the group that we shouldn't think of this as “dynamic vs static” - event streams are foundational to all structured data.

Given immutable storage like blockchains or IPFS, event logs are the best way to model the evolution of state of any kind, so it’s crucial to standardize them to make Web3 structured data interoperable.
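To make the "event log models evolution of state" point concrete, here is a minimal sketch in Python. The `Event` shape, the `set` / `delete` operations and the `replay` function are all hypothetical names invented for illustration, not part of any existing spec. The idea is only that the current state is never stored directly; it is derived by folding over an append-only log.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass(frozen=True)
class Event:
    """Hypothetical immutable log entry (illustrative shape only)."""
    seq: int          # position in the log
    op: str           # "set" or "delete"
    key: str
    value: Any = None


def replay(events: Iterable[Event]) -> Dict[str, Any]:
    """Derive state by applying events in sequence order."""
    state: Dict[str, Any] = {}
    for e in sorted(events, key=lambda e: e.seq):
        if e.op == "set":
            state[e.key] = e.value
        elif e.op == "delete":
            state.pop(e.key, None)
        else:
            raise ValueError(f"unknown op: {e.op}")
    return state


log = [
    Event(1, "set", "temp", 20.5),
    Event(2, "set", "unit", "C"),
    Event(3, "set", "temp", 21.0),
    Event(4, "delete", "unit"),
]

current = replay(log)        # state after the whole log
as_of_2 = replay(log[:2])    # state as it was after event 2
```

Because the log is never mutated, any past state can be rebuilt by replaying a prefix, which is why this fits immutable storage well.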

There are hundreds of things to think about here:

  • Using compact analytics-friendly formats for raw data (e.g. Parquet / Arrow)
  • Metadata formats (e.g. IPLD-based)
  • Common schema definition languages (e.g. DDL, JSON Schema, Avro)
  • Identity, signing, encryption, ownership, permissions
  • Evolution of streams over time
    • e.g. how schema change can be performed without disrupting all consumers
    • or how to “compact” the stream produced by some high-volume IoT device
  • Enforcing certain properties like bitemporality for event-time vs ingest-time processing
  • etc.
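Two of the points above (bitemporality and compaction) can be sketched together. This is just one possible policy under stated assumptions: the `Reading` record, its field names and the `compact` function are hypothetical, and "keep the latest value per device" is only one of many compaction strategies (and a lossy one). The point it illustrates is that carrying both event time and ingest time lets a late-arriving record be handled correctly.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Reading:
    """Hypothetical bitemporal IoT record (illustrative shape only)."""
    device_id: str
    value: float
    event_time: datetime    # when it happened at the source
    ingest_time: datetime   # when the stream recorded it


def compact(readings: Iterable[Reading]) -> List[Reading]:
    """Keep only the latest reading per device, judged by event_time."""
    latest: Dict[str, Reading] = {}
    for r in readings:
        cur = latest.get(r.device_id)
        if cur is None or r.event_time > cur.event_time:
            latest[r.device_id] = r
    return sorted(latest.values(), key=lambda r: r.device_id)


def t(minute: int) -> datetime:
    """Helper: a UTC timestamp on an arbitrary example day."""
    return datetime(2024, 1, 1, 12, minute, tzinfo=timezone.utc)


readings = [
    Reading("a", 1.0, t(0), t(1)),
    Reading("a", 3.0, t(10), t(11)),
    Reading("a", 2.0, t(5), t(20)),   # late arrival: older event, newer ingest
    Reading("b", 7.0, t(3), t(4)),
]

compacted = compact(readings)
values = [r.value for r in compacted]
```

If compaction were keyed on ingest time instead, the late arrival (value 2.0) would wrongly replace the newer measurement (value 3.0) for device "a".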

We have parts of this covered in our protocol spec, and would be very interested in finding overlaps and common ground. Structured data definitely deserves something better than CSV / JSON.