This question is partially answered in neo4j-legacy-indexes-and-auto-index-vs-new-label-bases-schema-indexes and the-difference-between-legacy-indexing-auto-indexing-and-the-new-indexing-approach

I can't comment on them yet, so I am opening a new thread here. In my db, I have a legacy index 'topic' and a label 'Topic'.

I know that:

  • a. the pattern MATCH (n:Label) scans the nodes;
  • b. the pattern START (n:Index) searches a legacy index;
  • c. the auto-index is a kind of legacy index and should give me the same results as (b), but in my case it does not;
  • d. the START clause should be replaced by MATCH as "good practice".

I get inconsistent results between a. and b. (see below), and I cannot figure out the proper syntax for using MATCH to search on an index instead of on labels.

Here are some examples:

1#

start n=node:topic('name:(keyword1 AND keyword2)') return n

6 rows, 3ms

start n=node:node_auto_index('name:(keyword1 AND keyword2)') return n;

0 rows

MATCH (n:Topic) where n.name =~ '(?i).*keyword1*.AND.*keyword2*.' return n;

0 rows, 10K ms

2#

start n=node:topic('name:(keyword1)') return n

212 rows, 122 ms [all coherent results containing substring keyword1]

start n=node:node_auto_index('name:(keyword1)') return n

0 rows

MATCH (n:Topic) where n.name =~ '(?i).*keyword1*.' return n

835 rows, 8K ms [also includes results that are not coherent, containing the substring eyword]

MATCH (n:Topic) where n.name =~ 'keyword1' return n;

1 row, >6K ms [exact match]

MATCH (n:topic) where n.name =~ 'keyword1' return n;

no results (here I used an index 'topic' not a label 'Topic'!)

MATCH (node:topic) where node.name =~ 'keyword1' return node;

no results (attempt to use node "object" directly, as in auto-index syntax)

Could you help shed some light on the following?

  • What is the difference between a legacy index and the auto-index, and why are the results inconsistent between the two?

  • How do I use the MATCH clause with indexes rather than labels? I want to reproduce the results of the full-text search.

  • Which syntax performs a full-text search applied ONLY to the neighbors of a node, not to the full db? MATCH? The START clause? A legacy index? A label? I am confused.

user305883
1 Answer

The auto index (there is only one for nodes) is a manual (aka legacy) index named node_auto_index. This special index tracks changes to the graph by hooking into the transaction processing. So if you declared name as part of your auto index for nodes in the config, any change to a node having a name property is reflected in that index.

Note that auto indexes do not automatically populate on an existing dataset when you add e.g. a new property for auto indexing.

Note further that manual or auto indexes are totally independent of labels.

The only way to query a manual or auto index is by using the START clause:

START n=node:<indexName>(<lucene query expression>) // index query
START n=node:<indexName>(key='<value>') // exact index lookup

Schema indexes are completely different and are used in MATCH when appropriate.

A blog post of mine covers all the index capabilities of neo4j.

In general, you use an index in a graph database to identify the start points for traversals. Once you've got a reference inside the graph, you just follow relationships and no longer do index lookups.

For full text indexing, see another blog post.

Updates based on comments below

In fact, MATCH (p:Topic {name: 'DNA'}) RETURN p and MATCH (n:Topic) where n.name = 'DNA' return n are equivalent. Both result in the same query plan. If there is a schema index on label Topic and property name (created by CREATE INDEX ON :Topic(name)), Cypher will implicitly use the schema index to find the specified node(s).
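As a rough mental model only (this is not a description of Neo4j's internals), a schema index can be pictured as a hash map from property value to nodes: an exact comparison becomes a single lookup, whereas a regex predicate such as =~ has to test every :Topic node, which would explain the multi-second timings in the question despite the index. A minimal Python sketch, with hypothetical sample names and helper functions:

```python
import re
from collections import defaultdict

# Hypothetical :Topic nodes, reduced to their name property.
topics = ["DNA", "DNA repair", "gene", "DNA gene expression", "RNA"]

# Stand-in for a schema index on :Topic(name): value -> node ids.
name_index = defaultdict(list)
for node_id, name in enumerate(topics):
    name_index[name].append(node_id)

def exact_lookup(value):
    """Like n.name = 'DNA': one hash lookup, no scan."""
    return name_index.get(value, [])

def regex_scan(pattern):
    """Like n.name =~ pattern: every node is tested, whole string must match."""
    compiled = re.compile(pattern)
    return [i for i, name in enumerate(topics) if compiled.fullmatch(name)]

print(exact_lookup("DNA"))
print(regex_scan("DNA.*"))
```

The exact lookup never touches the other names, while the regex scan visits all of them regardless of how many match.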

At the moment you cannot use full text searches based on schema indexes. Full text is only available in manual / auto indexing.

All the examples you've provided with START n=node:topic(...) rely on a manual index. It's your responsibility to keep it in sync with your graph contents, so I assume the differences are due to modifications in the graph that were not reflected in the manual index.

In any case, a query using START n=node:topic(....) will never use a schema index.
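Independently of index consistency, the regular expressions in the question are a likely further source of the differing row counts. In keyword1*. the * applies only to the 1 (zero or more of them) and the trailing . demands exactly one more character, so the pattern is not a "contains keyword1" test; likewise .AND. looks for the literal letters AND rather than combining two conditions. The sketch below uses Python's re.fullmatch as a stand-in for =~, assuming both require the whole string to match and that the two regex dialects agree on these simple patterns; the sample names and the helper are hypothetical, and the rewritten patterns have not been run against Neo4j.

```python
import re

def cypher_like_match(pattern, value):
    """Approximate n.name =~ pattern: the whole string must match."""
    return re.fullmatch(pattern, value) is not None

# Hypothetical node names.
names = ["keyword1", "my keyword1 topic", "keywords", "keyword1 and keyword2"]

asked = r"(?i).*keyword1*."                    # pattern from the question
contains = r"(?i).*keyword1.*"                 # "contains keyword1"
both = r"(?i)(?=.*keyword1)(?=.*keyword2).*"   # both terms, in any order

for name in names:
    print(
        repr(name),
        "asked:", cypher_like_match(asked, name),
        "contains:", cypher_like_match(contains, name),
        "both:", cypher_like_match(both, name),
    )
```

Even with a corrected pattern, a regex is a substring test over every node, whereas the Lucene query matches indexed terms, so the two approaches are not guaranteed to return identical rows.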

Stefan Armbruster
  • Thank you @Stefan, but then: what's the difference between syntax: `MATCH (p:Topic {name: 'DNA'}) RETURN p;` and `MATCH (n:Topic) where n.name = 'DNA' return n;` ? And how can I look for full-text search using schema index? Example, how can I reproduce same results as: `start n=node:topic('name:(DNA*)') return n;` and `start n=node:topic('name:(DNA* AND gene*)') return n`? Why `MATCH (node:Topic) where node.name =~ "DNA*" return node;` produces different results than `start n=node:topic('name:(DNA)') return n` and why it takes much longer despite `CREATE INDEX ON :Topic(name);` has been set ? – user305883 Aug 02 '15 at 17:30
  • ah-ha! So could you please provide some examples of the best use of schema indexes, if full-text is not supported? In that case, is the best way to begin a graph traversal only a numeric lookup (an exact MATCH on id)? In your blog I read "Currently schema indexes cannot be spawned over multiple properties but you can have multiple indexes for the same label": does that mean I could combine a full-text search on a legacy index with a schema index? E.g. a full-text search on a legacy index matching the string pattern 'Leonardo' together with the schema index 'Artist'? Could you also provide an example of the syntax in Cypher? – user305883 Aug 03 '15 at 00:39
  • Combining schema indexes and manual indexes is kind of anti-pattern. Think of schema indexes as a large hash map for exact lookups. For anything else use manual/auto indexes. On a side note: Neo4j 2.3 will have a schema index backed `LIKE` allowing to do prefix search: `MATCH (n:Person) where n.name like "Jo%" RETURN n` will find John and Joanne via the index. – Stefan Armbruster Aug 03 '15 at 08:14
  • If full-text indexing is supported, **Which criteria is natively used to sort results for a best match against a keyword?** and how it would be possible to interact with it in cypher? I opened another thread, you may want to contribute here: [http://stackoverflow.com/q/31862761/305883] – user305883 Aug 07 '15 at 09:36