As Apache AGE is an extension to PostgreSQL. I am curious about how and where the data of vertices (nodes) and edges are stored in Apache AGE. Does it use PostgreSQL for this? If yes, then how.
-
Please refer to this [article](https://dev.to/rafsun42/how-apache-age-turns-a-relational-dbms-into-a-graph-dbms-1i48). – Rafsun Masud Mar 27 '23 at 18:46
6 Answers
When you create a graph with SELECT * FROM create_graph('ag_graph')
it adds it's name and namespace to ag_catalog.ag_graph
, and also create two tables within this namespace: _ag_label_vertex
and _ag_label_edge
. These will be the parent tables of any new vertex or edge label you create.
So then, if you want to see all the vertices or edges in your graph, you can execute the following queries:
SELECT * FROM "ag_graph"._ag_label_vertex;
SELECT * FROM "ag_graph"._ag_label_edge;
Then, whenever you add a vertex or edge label, it is going to store them in a new table for the label, like "ag_graph"."new_label"
which will inherit from one of these parent labels. Because of this inheritance system which postgres allows, querying for the parent label tables also retrieves the child label tables.

- 716
- 1
- 10
-
Now I am clear about the edge and vertex labels storage. They are stored in Tables. My question is related to actual data of any edge or node. Is it stored the same way as any table's data is stored in PostgreSQL? But conceptually and organization of data is with respect to graphs (node and edge) not (foreign key, primary key). Is it correct? – Muneeb Khan Mar 24 '23 at 12:47
-
They are stored with two columns for each table: the `id` column and the `properties` column, which will be a JSON like format that stores the content of every vertex or edge. And they don't have primary and foreign key, – Matheus Farias Mar 24 '23 at 19:50
Yes, the nodes and edges get stored in separate tables. You can see the full list of tables if you do:
SELECT * FROM information_schema.tables;
You can see the node/edge data if you do:
SELECT * FROM <graph_name>.<node/edge_label> LIMIT 10;
If you are unsure of the name you gave your graph, you can do:
SELECT * FROM ag_catalog.ag_graph
... to get a full list of graphs that you've stored using AGE.
Here are examples of two different tables in a test data set that I use comprised of Airports and defined airline routes between Airports. The first table is of vertices where each Airport is a vertex:
postgresDB=# \d airroutes.airport
Table "airroutes.airport"
Column | Type | Collation | Nullable | Default
------------+---------+-----------+----------+-------------------------------------------------------------------------------------------------------------------
id | graphid | | not null | _graphid((_label_id('airroutes'::name, 'airport'::name))::integer, nextval('airroutes.airport_id_seq'::regclass))
properties | agtype | | not null | agtype_build_map()
Indexes:
"airport_prop_idx" btree (agtype_access_operator(VARIADIC ARRAY[properties, '"code"'::agtype]))
Inherits: airroutes._ag_label_vertex
And then I have edges that define the routes between airports:
postgresDB=# \d airroutes.route
Table "airroutes.route"
Column | Type | Collation | Nullable | Default
------------+---------+-----------+----------+---------------------------------------------------------------------------------------------------------------
id | graphid | | not null | _graphid((_label_id('airroutes'::name, 'route'::name))::integer, nextval('airroutes.route_id_seq'::regclass))
start_id | graphid | | not null |
end_id | graphid | | not null |
properties | agtype | | not null | agtype_build_map()
Inherits: airroutes._ag_label_edge
A view of the first 5 airports:
postgresDB=# SELECT * FROM airroutes.airport LIMIT 5;
id | properties
-----------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
844424930131969 | {"id": "1", "lat": "33.63669968", "lon": "-84.42810059", "city": "Atlanta", "code": "ATL", "desc": "Hartsfield - Jackson Atlanta International Airport", "elev": "1026", "icao": "KATL", "__id__": 1, "region": "US-GA", "country": "US", "longest": "12390", "runways": "5"}
844424930131970 | {"id": "2", "lat": "61.17440033", "lon": "-149.9960022", "city": "Anchorage", "code": "ANC", "desc": "Anchorage Ted Stevens", "elev": "151", "icao": "PANC", "__id__": 2, "region": "US-AK", "country": "US", "longest": "12400", "runways": "3"}
844424930131971 | {"id": "3", "lat": "30.19449997", "lon": "-97.66989899", "city": "Austin", "code": "AUS", "desc": "Austin Bergstrom International Airport", "elev": "542", "icao": "KAUS", "__id__": 3, "region": "US-TX", "country": "US", "longest": "12250", "runways": "2"}
844424930131972 | {"id": "4", "lat": "36.12450027", "lon": "-86.67819977", "city": "Nashville", "code": "BNA", "desc": "Nashville International Airport", "elev": "599", "icao": "KBNA", "__id__": 4, "region": "US-TN", "country": "US", "longest": "11030", "runways": "4"}
844424930131973 | {"id": "5", "lat": "42.36429977", "lon": "-71.00520325", "city": "Boston", "code": "BOS", "desc": "Boston Logan", "elev": "19", "icao": "KBOS", "__id__": 5, "region": "US-MA", "country": "US", "longest": "10083", "runways": "6"}
(5 rows)
A view of the first 5 routes:
postgresDB=# SELECT * FROM airroutes.route LIMIT 5;
id | start_id | end_id | properties
------------------+-----------------+-----------------+-------------------------------------------------------------------
1688849860263937 | 844424930131969 | 844424930131971 | {"dist": "809", "route_id": "3749", "end_vertex_type": "airport"}
1688849860263938 | 844424930131969 | 844424930131972 | {"dist": "214", "route_id": "3750", "end_vertex_type": "airport"}
1688849860263939 | 844424930131969 | 844424930131973 | {"dist": "945", "route_id": "3751", "end_vertex_type": "airport"}
1688849860263940 | 844424930131969 | 844424930131974 | {"dist": "576", "route_id": "3752", "end_vertex_type": "airport"}
1688849860263941 | 844424930131969 | 844424930131975 | {"dist": "546", "route_id": "3753", "end_vertex_type": "airport"}
(5 rows)

- 1,963
- 6
- 12
-
Now I am clear about the edge and vertex labels storage. They are stored in Tables. My question is related to actual data of any edge or node. Is it stored the same way as any table's data is stored in PostgreSQL? But conceptually and organization of data is with respect to graphs (node and edge) not (foreign key, primary key). Is it correct? – Muneeb Khan Mar 24 '23 at 12:48
-
Updated my response above to include more detail. It uses the same underlying storage that a Postgres table would use. Although properties are not stored as separate columns, but as JSON/map objects within a single column. The tables don't have defined primary/foreign keys, but those are somewhat implicit in the data structure/model. – Taylor Riggan Mar 24 '23 at 13:34
YES, Apache AGE uses PostgreSQL for storing data. The graph data can be stored by creating separate tables for Vertices (Nodes) and edges. In the vertices table, each row represents a vertex and can have additional columns to store properties related to the vertices. Similarly, in the Edges table, each row represents an edge between two vertices and can have additional columns to store properties related to the edges.
Here is a simple example of tables.
Note: You can add columns for additional properties.
--Table for vertices
CREATE TABLE vertices (
id SERIAL PRIMARY KEY,
label TEXT,
);
-- Table for edges
CREATE TABLE edges (
id SERIAL PRIMARY KEY,
label TEXT,
source_id INT REFERENCES vertices(id),
target_id INT REFERENCES vertices(id),
);
[apache-age] [PostgreSQL][graph-database]

- 23
- 4
Yes, Apache AGE uses PostgreSQL to store data. It makes use of the underlying relational database management system, where each graph is represented as a separate schema within the PostgreSQL database. Within this schema, each label in the graph has its own table. The entities, which can represent vertices (nodes), are stored in their respective label's table as a row. If entities do not have a label, they are stored in one of two default tables. For edges I think they are stored in a similar manner to nodes, with additional information to indicate the start and end nodes of the edge.

- 11
- 2
Apache AGE uses postgreSQL tables to store the vertices and edges. To summarise, there are a few ways to do this, e.g. tables and columns to store the properties and attributes associated with every vertex and edge. Next, foreign keys are also used to create relationships among the vertices and edges. Indexes are also used to improve the performance of the queries and the users can also create indices on specific properties. Lastly, SQL queries are also used to perform different functions such as traversing the graph, finding paths, and performing different functionalities on the graph.
Hope this answer helps and gives you a better understanding of the topics!

- 205
- 2
when you run create_graph()
function which is a function from the ag_catalog
schema it creates a new schema or namespace in that database using the graph_name
where it stores all the object pertaining to that graph. You can list out the schemas in your database before and after you create your graph using the following command
FROM information_schema.schemata;
In the graph any vertices, label, edges etc... created would be stored in that schema created specifically for the graph, using the underlying relational database technology using traditional tables

- 43
- 4