Note

This project is under active development.

Welcome to LabGraph’s documentation!

LabGraph is a Python library for storing and retrieving materials science data in MongoDB. It enforces a data model tailored to experimental materials data.

At a high level, data is stored as a directed graph of four node types: Material, Action, Measurement, and Analysis. The content of these nodes is up to you – we just make sure that any data you enter results in a valid graph.

Figure: an example graph for a single Sample. Learn more in the Schema Overview section.

Node Types

Here is a brief overview of the four node types and their roles in the data model. Further details on allowed node relationships can be found in the Schema Overview section.

  • Material nodes are the fundamental building blocks of the data model. These represent a material in a given state.

  • Action nodes are operations that generate Material nodes. They have incoming edges from any input Material nodes. An Action can also generate a Material without any input Material, for example when obtaining a Material from a vendor or receiving a Material from a collaborator.

  • Measurement nodes act upon a Material node to yield some form of raw data.

  • Analysis nodes act upon Measurement and/or Analysis nodes to yield some form of processed data.
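The relationships described above can be sketched in plain Python. This is only an illustration, not LabGraph's API: the names `NodeType`, `ALLOWED_EDGES`, and `edge_is_allowed` are hypothetical, and the edge rules are inferred from the bullet descriptions in this section. The Schema Overview section remains the authoritative reference.

```python
from enum import Enum


class NodeType(Enum):
    """The four node types named in this section (hypothetical helper enum)."""
    MATERIAL = "Material"
    ACTION = "Action"
    MEASUREMENT = "Measurement"
    ANALYSIS = "Analysis"


# (upstream, downstream) pairs inferred from the descriptions above.
# Assumption: this is not taken from LabGraph's source code.
ALLOWED_EDGES = {
    (NodeType.MATERIAL, NodeType.ACTION),       # input Material feeds an Action
    (NodeType.ACTION, NodeType.MATERIAL),       # an Action generates a Material
    (NodeType.MATERIAL, NodeType.MEASUREMENT),  # a Measurement acts upon a Material
    (NodeType.MEASUREMENT, NodeType.ANALYSIS),  # an Analysis acts upon a Measurement
    (NodeType.ANALYSIS, NodeType.ANALYSIS),     # an Analysis may build on another Analysis
}


def edge_is_allowed(upstream: NodeType, downstream: NodeType) -> bool:
    """Return True if an edge upstream -> downstream fits the rules above."""
    return (upstream, downstream) in ALLOWED_EDGES
```

For example, `edge_is_allowed(NodeType.ACTION, NodeType.MATERIAL)` is true, while an edge from a Measurement back to a Material is rejected.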

Samples = Graphs

As materials scientists, we execute sets of Actions to synthesize and study Materials. A Sample is simply a graph of nodes that captures the steps performed in an experiment. In typical usage, we will enter nodes into the database as part of a Sample. This achieves a few things:

  • We can ensure that the graph we are entering is valid, i.e. it is a directed acyclic graph (DAG) with no isolated nodes.

  • Given a node, we can easily retrieve the most related nodes that belong to the same Sample.

  • We can record any Sample-level metadata (e.g. sample name, sample description, sample author, etc.).
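To make the validity condition concrete, here is a minimal sketch of a check for "directed acyclic graph with no isolated nodes". It uses only the standard library and Kahn's algorithm for cycle detection. The function name `is_valid_sample_graph` and the node identifiers are hypothetical, and LabGraph's own validation may work differently.

```python
from collections import defaultdict, deque


def is_valid_sample_graph(nodes, edges):
    """Check that the graph is a DAG in which every node touches an edge.

    nodes: iterable of hashable node identifiers
    edges: iterable of (upstream, downstream) pairs
    """
    nodes = set(nodes)
    in_degree = {node: 0 for node in nodes}
    children = defaultdict(list)
    connected = set()

    for upstream, downstream in edges:
        if upstream not in nodes or downstream not in nodes:
            return False  # edge refers to a node that is not in the graph
        children[upstream].append(downstream)
        in_degree[downstream] += 1
        connected.update((upstream, downstream))

    if connected != nodes:
        return False  # at least one node has no edges (isolated)

    # Kahn's algorithm: repeatedly remove nodes with no remaining incoming edges.
    queue = deque(node for node, degree in in_degree.items() if degree == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for child in children[node]:
            in_degree[child] -= 1
            if in_degree[child] == 0:
                queue.append(child)

    # If a cycle exists, the nodes on it are never removed.
    return visited == len(nodes)


# Hypothetical Sample: obtain a powder, measure it, analyze the measurement.
nodes = ["procure", "powder", "xrd", "phase_fit"]
edges = [("procure", "powder"), ("powder", "xrd"), ("xrd", "phase_fit")]
print(is_valid_sample_graph(nodes, edges))
```

Adding an edge from `"phase_fit"` back to `"procure"` introduces a cycle, and adding a node with no edges introduces an isolated node; the check fails in both cases.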

Database Backend

LabGraph uses MongoDB as its backend database and communicates with it using the pymongo package.
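For readers unfamiliar with pymongo, the sketch below shows roughly what writing node-like documents to MongoDB involves. It does not reflect LabGraph's actual schema or API: the document layout, the helper names `node_to_document` and `save_documents`, the database name `labgraph_demo`, the collection name `nodes`, and the connection URI are all assumptions made for illustration. Only `MongoClient`, collection lookup by name, and `insert_many` are real pymongo calls.

```python
def node_to_document(name, node_type, sample_name, upstream=()):
    """Build a plain dict describing one node (hypothetical layout)."""
    return {
        "name": name,
        "type": node_type,           # e.g. "Material", "Action", ...
        "sample": sample_name,       # the Sample this node belongs to
        "upstream": list(upstream),  # names of nodes with edges into this one
    }


def save_documents(documents, uri="mongodb://localhost:27017",
                   db_name="labgraph_demo", collection_name="nodes"):
    """Insert the documents into MongoDB and return the generated ids.

    Assumes a MongoDB server is reachable at `uri`. All names here are
    placeholders, not LabGraph defaults.
    """
    # Imported here so the document-building helper works without pymongo.
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        result = client[db_name][collection_name].insert_many(documents)
        return result.inserted_ids
    finally:
        client.close()


documents = [
    node_to_document("procure", "Action", "sample_1"),
    node_to_document("powder", "Material", "sample_1", upstream=["procure"]),
]
# With a running MongoDB server: ids = save_documents(documents)
```

In practice LabGraph handles this layer for you; the sketch is only meant to show what kind of communication with the database is involved.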