You can query a knowledge graph to find a subset of the entities and relationships it contains and to identify how different entities are connected. Provenance records can be used in the query and can optionally be included in the query results. See the following examples:
- From a knowledge graph representing the spread of an infectious disease, work with humans and animals associated through any relationship with a given facility.
- From a knowledge graph representing a manufacturing supply chain, work with any content associated with a specific part including suppliers, means of delivery, warehouses, and so on.
- From a knowledge graph representing an organization, work with devices of a given type, and list their properties, including the name of the responsible employee.
- From a knowledge graph representing tortoises and their habitats, identify habitats where the level of risk was established using information in a specific environmental impact assessment.
You can identify the subset of entities and relationships, or their properties, by querying the knowledge graph. Write queries in the openCypher query language to discover related entities and their properties, and work with this restricted set of information in the knowledge graph, a map, or a link chart.
Write an openCypher query
openCypher queries are to graph databases what SQL queries are to relational databases. The basic structure of the query is to find, or match, entities and return those entities, where the entities you want to find are identified in parentheses. For example, the query MATCH (e) RETURN e returns entities of any type. The number of entities returned is limited only by the knowledge graph's configuration. To restrict the number of graph items returned, use a LIMIT expression. For example, the query MATCH (e) RETURN e LIMIT 5 will return at most five entities of any type.
The query can identify entities that are related using symbols that create an arrow. For example, the query MATCH (e1)-->(e2) RETURN e1,e2 will return pairs of entities, e1 and e2, where any type of relationship exists between the two entities and any path from entity e1 to entity e2 connects the entities. If the query were written with the arrow pointing in the other direction, paths would be considered starting from the origin entity, e2, and ending at the destination entity, e1: MATCH (e1)<--(e2) RETURN e1,e2. The manner in which entities are related to each other is referred to as a pattern.
The query can identify specific relationships that should be considered in square brackets. For example, the query MATCH (e1)-[]->(e2) RETURN e1,e2 will return pairs of entities, e1 and e2, where a single relationship of any type connects the two entities. This query shows another way to represent the same queries illustrated above, and illustrates the preferred query syntax. The query can be amended to return the entire tuple describing the relationship by returning the origin entity, e1, the relationship, r, and the destination entity, e2, as follows: MATCH (e1)-[r]->(e2) RETURN e1,r,e2. Similar queries MATCH (e1)-[ ]->( )-[ ]->(e2) RETURN e1,e2 or MATCH (e1)-[*2]->(e2) RETURN e1,e2 will return pairs of entities that are connected by two relationships in the same direction. Queries can also identify patterns where relationships have different directions such as MATCH (e1)-[ ]->(e2)<-[ ]-(e3) RETURN e1,e2,e3.
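To make the pattern semantics concrete, the following is a minimal Python sketch over a hypothetical in-memory edge list. The names EDGES, one_hop, two_hops, and converging are illustrative only and are not part of ArcGIS Knowledge. The sketch assumes, as openCypher specifies, that the same relationship is not matched twice within a single pattern.

```python
# Hypothetical toy graph: each tuple is (origin, relationship type, destination).
EDGES = [
    ("Ann", "HasVehicle", "Car1"),
    ("Bob", "HasVehicle", "Car1"),
    ("Car1", "ParkedAt", "Garage"),
]


def one_hop(edges):
    """Pairs matched by MATCH (e1)-[]->(e2) RETURN e1,e2."""
    return [(origin, dest) for origin, _, dest in edges]


def two_hops(edges):
    """Pairs matched by MATCH (e1)-[]->()-[]->(e2) RETURN e1,e2."""
    return [
        (o1, d2)
        for i, (o1, _, d1) in enumerate(edges)
        for j, (o2, _, d2) in enumerate(edges)
        if i != j and d1 == o2  # two different relationships, chained end to start
    ]


def converging(edges):
    """Triples matched by MATCH (e1)-[]->(e2)<-[]-(e3) RETURN e1,e2,e3."""
    return [
        (o1, d1, o2)
        for i, (o1, _, d1) in enumerate(edges)
        for j, (o2, _, d2) in enumerate(edges)
        if i != j and d1 == d2  # two different relationships sharing a destination
    ]


print(one_hop(EDGES))
print(two_hops(EDGES))
print(converging(EDGES))
```

In this toy graph the converging pattern yields each pair of owners twice, once in each order, because either owner can be bound to e1.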
The example queries above can be used with any knowledge graph.
Tailor a query to a specific knowledge graph by referencing the entity types, relationship types, and properties defined in its data model. Include the name of a specific entity type in your query to constrain the graph items that are considered. For example, the query MATCH (e1:Person)-[r]->(e2) RETURN e1,r,e2 will return all Person entities, e1, in which any relationship, r, connects the Person to another entity, e2, which can be an entity of any type. Compared to the previous example, relationships in which a Pet, Vehicle, or Document entity is the origin of a relationship aren't included in the results.
You can constrain the query to consider specific relationship types and specific related entities by adding relationship types and entity types to the other facets of the query. For example, MATCH (p:Person)-[v:HasVehicle]->(e) RETURN p,v,e will return all Person entities, p, in which a HasVehicle relationship, v, connects the Person to another entity of any type, e. The variables p and v are assigned to the Person entities and HasVehicle relationships, respectively, so information about them can be returned by the query. Compared to the previous example, relationships in which a Pet or Document entity is the destination of a relationship aren't included in the results. Depending on the knowledge graph's data model, the destination entity, e, could be a generic Vehicle entity, or it could be one of a series of specific entity types such as Automobile, Motorcycle, Boat, Airplane, Commercial Vehicle, and so on.
Specific properties of entities and relationships can be included in the query results. For example, MATCH (p:Person)-[:HasVehicle]->(e) RETURN p,e.make,e.model,e.year will run the same query defined previously. However, instead of showing the destination entity itself, the results will show the values stored in several of its properties: the make, model, and year of the vehicle, respectively. In this example, a variable was not assigned for the specific relationship being considered by the query because the relationship's data is not included in the query results or evaluated elsewhere in the query.
Similarly, you can constrain the entities and relationships that are evaluated by specifying properties that define the entities and relationships of interest. The properties to consider are defined by adding a WHERE clause to the query. As with the examples above, variables must be assigned to reference specific information about entities and relationships in the WHERE clause. For example, in the following query, only Person entities with a specific lastName property value are evaluated; HasVehicle relationships are only considered if they have a NULL value in the endDate property; and related Vehicle entities are only considered if the year property has a value that is earlier than 1980: MATCH (p:Person)-[hv:HasVehicle]->(v:Vehicle) WHERE p.lastName = 'Doe' and hv.endDate IS NULL and v.year < 1980 RETURN p,p.firstName,v,v.make,v.year.
Instead of returning a series of individual entities and relationships, your query can return the complete path represented by a pattern. To do this, assign the pattern defined in the MATCH statement to a variable and return that variable. For example, the query MATCH path = (:Person)-[:HasVehicle]->(:Vehicle) RETURN path will return a list of paths for all entity and relationship combinations that satisfy the specified pattern. Each path will contain all parts of the matched pattern: the Person, HasVehicle relationship, and Vehicle. You do not need to assign variables to the individual parts of this pattern since they are not returned by the query.
Use date-time values in a query
When you create a knowledge graph using a hosted graph store or a NoSQL data store with ArcGIS-managed data, several options are available for storing temporal data with ArcGIS Enterprise 11.2 and later versions. When storing and querying temporal data, you must consider where the events are located and where people accessing the data are located.
If you are storing local data and using it in the same area, date and time information can be stored without a concern for the time zone. This data can be stored in properties with the date data type. When the data originates in or is queried from different time zones, care must be taken to ensure correct values are stored and queried despite your location and time zone. Properties with a timestamp offset data type allow you to record date-time values that account for the time zone where the event occurred and the time zone where the data is being used.
You can access date and time values stored in the knowledge graph using an openCypher query. To obtain the correct results, specific functions must be used to query properties of each data type as described in the table below. Using alternate functions can result in an error or unexpected results. When temporal values are provided to or returned by the query, they will always use the ISO 8601 date-time format.
Field type | Function | Syntax | Example |
---|---|---|---|
Date | localdatetime() | localdatetime('YYYY-MM-DDThh:mm:ss.sss') | localdatetime('2015-07-24T21:40:53.142') |
Date only | date() | date('YYYY-MM-DD') | date('2015-07-24') |
Time only | localtime() | localtime('hh:mm:ss.sss') | localtime('21:40:53.142') |
Timestamp offset | datetime() | datetime('YYYY-MM-DDThh:mm:ss.sssZ') or datetime('YYYY-MM-DDThh:mm:ss.sss+00:00') | datetime('2015-07-24T21:40:53.142Z') or datetime('2015-07-21T21:40:53.142-08:00') |
To evaluate a knowledge graph property with the timestamp offset data type, use the datetime() function in your openCypher query. Use the date-time format YYYY-MM-DDThh:mm:ss.sssZ when the specified value is in coordinated universal time (UTC). Use the format YYYY-MM-DDThh:mm:ss.sss+00:00 when the value is a date in a local time with the appropriate time zone offset. If significant parts of the date-time value are omitted when the value is provided in the query, the query returns an error.
For example, to find all vehicles purchased after a specific time, use a query such as MATCH path=(:Person)-[hv:HasVehicle]->(:Vehicle) WHERE hv.acquisitionDate > datetime('2014-10-18T12:36-08:00') RETURN path. All paths are returned where the HasVehicle relationship has an acquisitionDate property with a value after 12:36pm Pacific Standard Time on October 18, 2014; Pacific Standard Time is eight hours behind UTC.
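As a rough illustration of how an offset relates a local time to UTC, the following Python sketch parses the literal from the query above and the same instant written with a zero offset. It uses only the Python standard library and does not call ArcGIS Knowledge. The variable names are illustrative.

```python
from datetime import datetime, timezone

# The literal used in the query above: 12:36 at an offset of -08:00.
local = datetime.fromisoformat("2014-10-18T12:36-08:00")

# The same instant written with a zero offset (equivalent to a trailing Z).
utc = datetime.fromisoformat("2014-10-18T20:36+00:00")

# Offset-aware values compare as instants, not as wall-clock text.
print(local == utc)
print(local.astimezone(timezone.utc).isoformat())
```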
To find all vehicles purchased in the year 1998, use a query such as MATCH path=(:Person)-[hv:HasVehicle]->(:Vehicle) WHERE hv.acquisitionDate > datetime('1998-01-01T00:00Z') and hv.acquisitionDate < datetime('1998-12-31T23:59Z') RETURN path. All paths are returned where the HasVehicle relationship has an acquisitionDate property with a value after 00:00 UTC on January 1, 1998, and before 23:59 UTC on December 31, 1998.
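The 1998 query above uses strict comparisons against 00:00 on January 1 and 23:59 on December 31, so a value that falls in the final minute of the year is not matched. The Python sketch below contrasts those bounds with a half-open range that ends at the start of 1999. The function names are illustrative. A similar range could presumably be written in the query with >= and an upper bound of datetime('1999-01-01T00:00Z'), although that variant is an assumption and has not been checked against ArcGIS Knowledge.

```python
from datetime import datetime

START = datetime.fromisoformat("1998-01-01T00:00+00:00")
LAST_MINUTE = datetime.fromisoformat("1998-12-31T23:59+00:00")
NEXT_YEAR = datetime.fromisoformat("1999-01-01T00:00+00:00")


def matches_query(ts):
    """Strict comparisons, as written in the query above."""
    return START < ts < LAST_MINUTE


def in_1998(ts):
    """Half-open range: from the start of 1998 up to, but not including, 1999."""
    return START <= ts < NEXT_YEAR


# A hypothetical purchase made 30 seconds before the end of the year.
late = datetime.fromisoformat("1998-12-31T23:59:30+00:00")
print(matches_query(late), in_1998(late))
```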
Beginning with ArcGIS Enterprise 11.3, an openCypher query can assess temporal values in the knowledge graph using time durations. This allows you to understand time differences between data points. You can find entities and relationships where events captured in their properties last for a given period of time, or where a given amount of time elapses between events.
When a time duration is provided to or returned by the query, it always uses the ISO 8601 time duration format. For example, a duration of P1Y2M3DT4H5M6S represents a time interval of one year, two months, three days, four hours, five minutes, and six seconds. Portions of the duration can be omitted if their value is zero. For example, a time period of one and a half days can be represented as P1DT12H or PT36H. If the duration string includes time components, the character T must be specified after the date components and before the time components.
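The duration format described above can be sketched with a small Python parser. This is a simplified illustration that accepts only whole-number components and omits weeks and fractional values. The names parse_duration and day_time_part are hypothetical. It shows that P1DT12H and PT36H describe the same length of time, and that a string with time components but no T is rejected.

```python
import re
from datetime import timedelta

# Date components come before T, time components after it.
_DURATION = re.compile(
    r"^P(?:(?P<years>\d+)Y)?(?:(?P<months>\d+)M)?(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Split a duration string into its components; omitted parts become 0."""
    match = _DURATION.match(text)
    if match is None or text == "P" or text.endswith("T"):
        raise ValueError(f"not a supported duration: {text!r}")
    return {name: int(value) if value else 0
            for name, value in match.groupdict().items()}


def day_time_part(text):
    """Convert a duration without years or months to a fixed length of time."""
    parts = parse_duration(text)
    if parts["years"] or parts["months"]:
        raise ValueError("years and months do not have a fixed length")
    return timedelta(days=parts["days"], hours=parts["hours"],
                     minutes=parts["minutes"], seconds=parts["seconds"])


print(parse_duration("P1Y2M3DT4H5M6S"))
print(day_time_part("P1DT12H") == day_time_part("PT36H"))
```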
Several duration functions can be used to calculate the period between two times, as described below. The duration can be calculated in specific units of time, if needed.
- duration.between(a, b)—The amount of time elapsed between time a and time b. A time duration is returned that specifies the full amount of time elapsed in years, months, days, and so on.
- duration.inMonths(a, b)—The amount of time elapsed between time a and time b is calculated as a number of whole months and returned as a time duration. Amounts of time less than a whole month are discarded.
- duration.inDays(a, b)—The amount of time elapsed between time a and time b is calculated as a number of whole days and returned as a time duration. Amounts of time less than a whole day are discarded.
- duration.inSeconds(a, b)—The amount of time elapsed between time a and time b is calculated as a number of whole seconds and returned as a time duration. Amounts of time less than a whole second are discarded.
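The whole-unit behavior described in the list above can be approximated in Python as follows. This is a sketch under stated assumptions, not the ArcGIS Knowledge implementation: it returns plain integers rather than durations, assumes time a is not later than time b, and uses a simple calendar rule for months that may differ from the server in edge cases such as month ends. The function names are illustrative.

```python
from datetime import datetime, timedelta


def whole_months(a, b):
    """Whole calendar months from a to b (assumes a <= b)."""
    months = (b.year - a.year) * 12 + (b.month - a.month)
    # The final month is incomplete if b falls earlier in its month than a does.
    if (b.day, b.time()) < (a.day, a.time()):
        months -= 1
    return months


def whole_days(a, b):
    """Whole days from a to b; any remainder is discarded."""
    return (b - a) // timedelta(days=1)


def whole_seconds(a, b):
    """Whole seconds from a to b; any fraction of a second is discarded."""
    return (b - a) // timedelta(seconds=1)


# Hypothetical instants: 58 days, 23 hours, and half a second apart.
a = datetime(2020, 1, 15, 10, 0)
b = datetime(2020, 3, 14, 9, 0, 0, 500000)
print(whole_months(a, b), whole_days(a, b), whole_seconds(a, b))
```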
For example, to find people who sold a vehicle that they owned for less than two years, use a query such as MATCH path=(:Person)-[hv:HasVehicle]->(:Vehicle) WHERE hv.endDate < (hv.acquisitionDate + duration('P2Y')) RETURN path, duration.between(hv.acquisitionDate,hv.endDate) as TimeOwned. All paths are returned where the HasVehicle relationship has an endDate property and the duration between the endDate and the acquisitionDate property is less than two years. The query returns both the path and the duration of time that the vehicle was owned.
When calculating a duration, you can specify the current date or time as one of the two time instants by providing a date-time function in the appropriate place. Also, portions of the duration can be accessed and returned or used in the query. The following example identifies people who bought a vehicle five or more years ago: MATCH path=(:Person)-[hv:HasVehicle]->(:Vehicle) WHERE duration.between(hv.acquisitionDate, datetime()).years > 4 RETURN path. The current date-time is used in the query by providing the function datetime() without providing a specific date. The years part of the calculated duration is then accessed and evaluated to find the appropriate paths. Because the years part counts only whole years, the condition is met once five full years have elapsed. All parts of the duration can be accessed in this manner, including the days, hours, minutes, and so on.
Durations can also be calculated between events that happen between related parties. For example, the following query assesses a knowledge graph to determine the relationship between Phone entities and Called relationships and the amount of time between phone calls: MATCH (:Phone)-[c1:Called]-(:Phone)-[c2:Called]-(:Phone) WHERE duration.between(c1.datecalled, c2.datecalled).minutes < 45 RETURN c1, c2. In this query, a duration is calculated between the time that a phone receives one call and places a second call. The minutes property of the duration is accessed, and if less than forty-five minutes elapsed between the calls, both the first and the second phone calls are returned by the query.
Use provenance records in a query
When you create a knowledge graph using a hosted graph store or a NoSQL data store with ArcGIS-managed data, provenance can be enabled. Provenance records allow you to indicate where the content of the knowledge graph originated. You can specify that the value stored in a property of an entity or a relationship was derived from a specific document or website. This information can be used later when you query the knowledge graph. You can find entities and relationships with property values that originated from a specific source of information.
Provenance records are excluded from all queries by default. In ArcGIS Pro, you must check the Include Provenance option to use information stored in provenance records in your query and include provenance records in the query results. If you create a custom application that communicates with an ArcGIS Knowledge Server site using one of the available developer APIs, your client application must specify that the query results should include provenance records.
Your query must determine which entities and relationships have properties associated with provenance records by comparing the instance identifiers of the graph items to the instance identifiers stored in the provenance records. The query can also evaluate properties of the graph items and the provenance records to get results. Provenance records themselves or their properties can optionally be included in the query results.
For example, to find the owners of all vehicles where the California Department of Motor Vehicles is the source of the information stored in properties of the HasVehicle relationship, use a query such as MATCH (p:Person)-[hv:HasVehicle]->(v:Vehicle), (pr:Provenance) WHERE ID(hv)=pr.instanceID and pr.sourceName ="California Department of Motor Vehicles" RETURN p,hv,v,pr. The query evaluates which HasVehicle relationships have provenance records by comparing instance identifiers of the relationships to the instanceID property of the provenance records. The provenance records evaluated are only those where the record's sourceName property has the appropriate value. Additionally, Person and Vehicle entities associated with the HasVehicle relationships that were found and the provenance records themselves are also included in the query results.
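The comparison of instance identifiers in this query behaves like a join. The Python sketch below mimics it with hypothetical in-memory records. The identifiers, names, and the second source are invented for illustration, and the dictionaries are far simpler than real graph items or provenance records.

```python
# Hypothetical HasVehicle relationships, keyed by an instance identifier.
relationships = [
    {"id": "rel-1", "person": "Jane Doe", "vehicle": "Sedan"},
    {"id": "rel-2", "person": "John Doe", "vehicle": "Pickup"},
]

# Hypothetical provenance records that point back to those identifiers.
provenance = [
    {"instanceID": "rel-1", "sourceName": "California Department of Motor Vehicles"},
    {"instanceID": "rel-2", "sourceName": "Insurance claim"},
]


def join_on_source(relationships, provenance, source_name):
    """Pair each relationship with its provenance records from the named source."""
    return [
        (rel, rec)
        for rel in relationships
        for rec in provenance
        if rel["id"] == rec["instanceID"] and rec["sourceName"] == source_name
    ]


matches = join_on_source(
    relationships, provenance, "California Department of Motor Vehicles"
)
for rel, rec in matches:
    print(rel["person"], rel["vehicle"], rec["sourceName"])
```

As in the query, a relationship with several matching provenance records would appear once for each record.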
Use spatial operators in a query
ArcGIS Knowledge supports using spatial operators in openCypher queries with point, multipoint, line, and polygon geometries. These are the same geometry types supported for entity types created in a knowledge graph using a hosted graph store or a NoSQL data store with ArcGIS-managed data. If your knowledge graph uses a NoSQL data store with user-managed data, different geometry types may be supported.
The following spatial operators are supported:
- ST_Equals—Returns entities with equal geometries. The syntax is esri.graph.ST_Equals(geometry1, geometry2).
- ST_Intersects—Returns entities with intersecting geometries. The syntax is esri.graph.ST_Intersects(geometry1, geometry2).
- ST_Contains—Returns entities whose geometries are contained by the specified geometry. The syntax is esri.graph.ST_Contains(geometry1, geometry2).
Spatial operators can be incorporated into the WHERE clause of a query. The geometry parameters can reference an entity's geometry or you can specify a geometry that represents a spatial location. You can construct a geometry from a string using the operator esri.graph.ST_WKTToGeometry(string) where the string parameter is an OGC simple feature specified in the well-known text format. For example, to create a geometry representing the coordinates 117.1964763°W 34.0572046°N, you would use the operator esri.graph.ST_WKTToGeometry("POINT (-117.1964763 34.0572046)"). A geometry constructed in this manner can only be specified in the first geometry argument for the spatial operators. The second geometry argument must always reference the geometry associated with an entity in the knowledge graph.
Consider the following examples where entities of the Person type can have point geometries and entities of the Facility type can have polygon geometries:
- The query MATCH (p1:Person), (p2:Person) WHERE esri.graph.ST_Equals(p1.shape, p2.shape) RETURN p1, p2 returns Person entities, p1 and p2, with equal shapes; that is, both Person entities have identical location geometries.
- The query MATCH (e:Employee), (f:Facility) WHERE esri.graph.ST_Intersects(e.shape, f.shape) RETURN e, f returns Employee entities and Facility entities, e and f, respectively, where geometries for the Employee and Facility entities intersect.
- The query MATCH (f:Facility) WHERE esri.graph.ST_Contains(esri.graph.ST_WKTToGeometry("POINT (-117.1964763 34.0572046)"), f.shape) RETURN f returns Facility entities, f, whose geometries contain the specified point.
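When the location comes from numeric coordinates, the well-known text string and the query can be assembled programmatically. The following Python sketch rebuilds the ST_Contains query from the last example. The helper names point_wkt and facilities_containing are hypothetical, and the sketch only produces the query text; it does not send it to a knowledge graph.

```python
def point_wkt(longitude, latitude):
    """Format a WKT point: x (longitude) first; west and south are negative."""
    return f"POINT ({longitude} {latitude})"


def facilities_containing(longitude, latitude):
    """Build the query text, with the constructed geometry as the first argument."""
    wkt = point_wkt(longitude, latitude)
    return (
        "MATCH (f:Facility) WHERE esri.graph.ST_Contains("
        f'esri.graph.ST_WKTToGeometry("{wkt}"), f.shape) RETURN f'
    )


# 117.1964763°W 34.0572046°N, expressed as signed decimal degrees.
print(facilities_containing(-117.1964763, 34.0572046))
```

Passing numbers rather than free text keeps the generated WKT well formed; text taken from user input would need validation before being placed in a query.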
The spatial utility ST_GeoDistance is also available, which has the syntax esri.graph.ST_GeoDistance(geometry, geometry). This utility returns the geodesic distance between the two geometries. For example, in the query MATCH (n), (e) RETURN n, e, esri.graph.ST_GeoDistance(n.shape, e.shape) AS distance, the distance value returned with each pair of entities is the geodesic distance that is calculated between the entities n and e.
Learn more about openCypher queries
You can learn more about the openCypher query language using a document provided by the openCypher Implementers Group. ArcGIS Knowledge does not support all aspects of the openCypher query language. For example, queries can't be used to update the knowledge graph, only to return values.
In ArcGIS Pro, you can learn about openCypher by seeing the queries that retrieve data from a knowledge graph to build histograms. In the Search and Filter pane, on the Histogram tab, click the Settings button, and click Send query to Query tab. The query used to retrieve data for the current set of histograms appears in the Query text box.