Data Commons

Data Commons — Focus on “Material” ESG Factors

In 2021-2022, OS-C is focusing on the factors that asset owners, asset managers, banks, and regulators have identified as “highly material” priorities from among those covered by SASB, TCFD, CDSB, GRI, and CDP. For example:

Data Governance as Architecture

To build the Data Commons…

We need to build active, organized communities that deliver:

Data Platform Layered Architecture

Ingestion Layer: Responsible for connecting to the source systems and bringing both batch and streaming data into the data platform.

Storage: Provides persistent storage for long-term retention, plus a low-latency message store (with data expiry policies) for data processing.

Technical Metadata: Interface and store for information about the activity status of the different data platform layers.

Delta Lake: Serving layer for data consumers, providing Atomic, Consistent, Isolated & Durable (ACID) transaction management.

Data Query: Unified, ANSI SQL-compliant query federation engine, providing a single point of authentication and authorization for data access by the contributor and user communities.
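
The last two layers can be illustrated with a small, self-contained sketch. It is not the Data Commons implementation: Python's built-in sqlite3 module is swapped in as a stand-in for both the ACID serving layer and the federated query engine, and every name in it (the `emissions` schema, the `companies` and `scope1` tables, the helper functions) as well as the sample figures are hypothetical. The sketch shows two ideas only: a batch load that either commits in full or leaves no trace, and a single SQL statement that joins tables held in two separate databases.

```python
import sqlite3


def build_demo_connection() -> sqlite3.Connection:
    """Open two in-memory databases that stand in for separate data sources."""
    # isolation_level=None leaves transaction control to explicit BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # Attach a second, independent database under a hypothetical schema name.
    conn.execute("ATTACH DATABASE ':memory:' AS emissions")
    conn.execute(
        "CREATE TABLE main.companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE emissions.scope1 ("
        " company_id INTEGER NOT NULL,"
        " year INTEGER NOT NULL,"
        " tco2e REAL NOT NULL CHECK (tco2e >= 0))"
    )
    return conn


def load_atomically(conn: sqlite3.Connection, rows: list) -> bool:
    """Insert every row or none: any failure rolls the whole batch back."""
    conn.execute("BEGIN")
    try:
        conn.executemany("INSERT INTO emissions.scope1 VALUES (?, ?, ?)", rows)
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # A constraint violation aborts the statement; undo the rest as well.
        conn.execute("ROLLBACK")
        return False


def total_emissions_by_company(conn: sqlite3.Connection) -> list:
    """One SQL statement that joins tables living in two different databases."""
    query = """
        SELECT c.name, SUM(e.tco2e) AS total
        FROM main.companies AS c
        JOIN emissions.scope1 AS e ON e.company_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_connection()
    # Invented sample data, for illustration only.
    conn.execute("INSERT INTO main.companies VALUES (1, 'Acme'), (2, 'Globex')")
    print(load_atomically(conn, [(1, 2020, 100.0), (2, 2020, 50.0)]))
    # The second row violates the CHECK constraint, so neither row is kept.
    print(load_atomically(conn, [(1, 2021, 80.0), (2, 2021, -5.0)]))
    print(total_emissions_by_company(conn))
```

In this sketch the rejected batch leaves the table unchanged, which is the "atomic" part of ACID, and the consumer addresses both sources through one connection and one SQL dialect, which is the basic idea behind query federation. A production engine would add the authentication and authorization checks described above, which sqlite3 does not provide.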

Watch Videos about Data Commons

Data Commons Architecture and Infrastructure Demo

Speakers: Vincent Caldeira and Erik Erlandson from Red Hat

Summary: Get details on the Data Commons components, including the data management architecture. Learn how we manage data like code, turning a “data mess” into a “data mesh”.

Data Sources and Linkages Across Datasets (Bonus Video)

Speaker: Michael Tiemann from Red Hat

Summary: Learn more about the data available in the Data Commons, along with how different datasets can be combined to drive insights into climate-smart investing.

Extraction/Transformation of Corporate Data from structured and unstructured sources

Speakers: Lea Deleris from BNP Paribas, Ismail Demir from Allianz IDS GmbH, Jeremy Goh from BNP Paribas, Karan Chauhan from Red Hat, Christian Meyndt

Summary: Learn how users can leverage OS-C’s NLP toolkit to extract key climate data and metrics from unstructured documents such as ESG reports and annual reports.

View Project Repos on GitHub