ERD

ERD stands for entity relationship diagram. Objects of type dj.ERD allow visualizing portions of the data pipeline in graphical form. Tables are depicted as nodes and dependencies as directed edges between them. The draw method plots the graph.

Diagram notation

Consider the following ERD:

[Figure: ../_images/mp-erd.png]

DataJoint uses the following conventions:

  • Tables are indicated as nodes in the graph. Each node is labeled with the corresponding class name.
  • Data tiers are indicated as colors and symbols: Lookup=gray asterisk, Manual=green square, Imported=blue circle, Computed=red star, Part=black dot. The names of part tables are indicated in a smaller font.
  • Dependencies are indicated as edges in the graph and are always directed downward, forming a directed acyclic graph.
  • Foreign keys contained within the primary key are indicated as solid lines. This means that the primary key of the referenced table becomes part of the primary key of the dependent table.
  • Foreign keys that are outside the primary key are indicated by dashed lines.
  • If the primary key of the dependent table has no other attributes besides the foreign key, the foreign key is drawn as a thick solid line, indicating a 1:{0,1} relationship.
  • Foreign keys made without renaming the foreign key attributes are drawn in black, whereas foreign keys that rename the attributes are drawn in red.
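The line-style rules above amount to a small decision procedure. The following is a minimal sketch in plain MATLAB (R2016b or later, since it uses a local function in a script). It is not DataJoint's drawing code: the function name edgeStyle and all attribute names are hypothetical and serve only to illustrate the conventions.

```matlab
% Hypothetical primary key of a dependent table, as a cell array of names.
pk = {'species_id', 'genome_id'};

disp(edgeStyle({'species_id'}, pk))              % foreign key inside a larger primary key
disp(edgeStyle({'species_id'}, {'species_id'}))  % foreign key is the entire primary key
disp(edgeStyle({'lab_id'}, pk))                  % foreign key outside the primary key

function style = edgeStyle(fkAttrs, pkAttrs)
    % Classify an edge according to the notation rules listed above.
    % fkAttrs: attributes of the foreign key in the dependent table
    % pkAttrs: primary key attributes of the dependent table
    if ~all(ismember(fkAttrs, pkAttrs))
        style = 'dashed';        % foreign key lies outside the primary key
    elseif isempty(setdiff(pkAttrs, fkAttrs))
        style = 'thick solid';   % primary key has nothing besides the foreign key
    else
        style = 'solid';         % foreign key is part of a larger primary key
    end
end
```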

Diagramming an entire schema

The schema object for a package can be obtained using its getSchema function. (See Creating Schemas.)

draw(dj.ERD(seq.getSchema))   % draw the ERD

DataJoint provides shortcuts for plotting the ERD of a schema or of a table's neighborhood using the erd command:

% plot the ERD of the stimulus schema
erd stimulus

% plot the neighborhood of the stimulus.Trial table
erd stimulus.Trial

% plot the stimulus and experiment schemas and the neighborhood of preprocess.Sync
erd stimulus experiment preprocess.Sync

Initializing with a single table

A dj.ERD object can be initialized with a single table.

draw(dj.ERD(seq.Genome))

A single node makes a rather boring graph, but ERDs can be added together or subtracted from each other using graph algebra.

Adding ERDs together

Two ERDs can be added, resulting in a new graph containing the union of the sets of nodes from the two original graphs. The corresponding foreign keys will be automatically included.

% plot the ERD with tables Genome and Species from package +seq.
draw(dj.ERD(seq.Genome) + dj.ERD(seq.Species))
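
The idea behind adding two ERDs can be sketched without a database connection using MATLAB's built-in digraph type. This is only an illustration of the set union described above, not how dj.ERD is implemented; the table names and the pipeline itself are made up for the example.

```matlab
% Hypothetical pipeline: each edge points from a referenced table to a
% table that depends on it. The table names are illustrative only.
src = {'Species', 'Genome',   'Genome'};
dst = {'Genome',  'Assembly', 'Annotation'};
pipeline = digraph(src, dst);

% Model each diagram as a set of node names.
a = {'Genome'};
b = {'Species'};

% Adding two diagrams takes the union of their node sets.
nodes = union(a, b);

% Restricting the pipeline to those nodes keeps every dependency whose
% two endpoints are both in the set, so the Species -> Genome edge
% appears without being requested explicitly.
combined = subgraph(pipeline, nodes);
disp(combined.Edges.EndNodes)
```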

Expanding ERDs upstream and downstream

Adding a number to an ERD object adds nodes downstream in the pipeline, while subtracting a number from an ERD object adds nodes upstream in the pipeline.

Examples:

% Plot all the tables directly downstream from seq.Genome:
draw(dj.ERD(seq.Genome)+1)
% Plot all the tables directly upstream from seq.Genome:
draw(dj.ERD(seq.Genome)-1)
% Plot the local neighborhood of seq.Genome:
draw(dj.ERD(seq.Genome)+1-1+1-1)
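
To see why alternating +1 and -1 reaches a neighborhood rather than cancelling out, the expansion can be modeled on a plain MATLAB digraph. The sketch below assumes that each step adds the direct children (for +1) or direct parents (for -1) of every node currently in the diagram and never removes nodes. The helper expand and the example pipeline are hypothetical, and the actual dj.ERD behavior may differ in details.

```matlab
% Hypothetical pipeline; edges point from referenced to dependent tables.
src = {'Species', 'Genome',   'Genome',     'Lab'};
dst = {'Genome',  'Assembly', 'Annotation', 'Assembly'};
pipeline = digraph(src, dst);

start = {'Genome'};
down1 = expand(pipeline, start, @successors);     % analogous to +1
up1   = expand(pipeline, start, @predecessors);   % analogous to -1

% Analogous to +1-1+1-1: alternate downstream and upstream steps.
nodes = start;
for step = {@successors, @predecessors, @successors, @predecessors}
    nodes = expand(pipeline, nodes, step{1});
end
disp(nodes)

function nodes = expand(G, nodes, neighborFcn)
    % Add the direct neighbors (children or parents) of every node in the set.
    added = {};
    for i = 1:numel(nodes)
        added = [added; neighborFcn(G, nodes{i})]; %#ok<AGROW>
    end
    nodes = union(nodes, added);
end
```

In this model, the first +1 brings in the tables that depend on Genome, and the following -1 then also brings in the other parents of those tables (here Lab), which a single -1 from Genome alone would not reach.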