I am new to Gremlin and I am using the Gremlin console to read data from a graph database. In the graph, there are vertices with label "Device". These vertices have a property "name" associated with them. I need to find out if there is a vertex that has a certain name. This check has to be case insensitive.

If I were doing this in a relational database, I could write the following query:

SELECT * FROM device d WHERE LOWER(d.name) = 'mydevice'

I am searching for a function similar to 'LOWER' in Gremlin. If there isn't a function for that, can somebody tell me how I can search properties of vertices without considering alphabetic case?

Thank you.

Samuel Konat

2 Answers

1

Officially, Gremlin currently has only three text predicates, exposed by TextP: startingWith, endingWith, and containing (plus their negations), and their default implementations are case sensitive:

gremlin> g.V().has('person','name',containing('ark')).values('name')
==>marko
gremlin> g.V().has('person','name',containing('Ark')).values('name')
gremlin> 

Depending on the TinkerPop-enabled graph database you use, this sort of feature may be available to you, along with more advanced search options (e.g. regex). For example, JanusGraph supports case-insensitive full-text search as well as a host of other options. DSE Graph also has a rich text search system on top of the basic Gremlin options. So, if you have an explicit need for the type of search you describe, you may need to look into the options provided by individual graph systems.

While it's not recommended for a number of reasons, you can use a lambda:

gremlin> g.V().filter{it.get().value('name').toUpperCase() == 'MARKO'}.values('name')
==>marko

The downsides of lambdas are that:

  1. they aren't supported by all providers, so your code becomes less portable
  2. they force requests to be evaluated in a manner that can be more costly than a traversal that strictly uses Gremlin steps
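For comparison, the check that the lambda performs on each vertex is ordinary string comparison. The following is a minimal sketch in plain Java of the same test applied on the client to a list of names that has already been fetched. The class name, the method name, and the sample list are hypothetical and only stand in for values returned from the graph; Locale.ROOT is used so that the upper-casing does not depend on the JVM's default locale.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper: not part of TinkerPop, just plain Java.
class NameFilterSketch {

    // Keeps the names that equal 'wanted' when case is ignored.
    public static List<String> matchIgnoringCase(List<String> names, String wanted) {
        String target = wanted.toUpperCase(Locale.ROOT);
        return names.stream()
                .filter(n -> n != null && n.toUpperCase(Locale.ROOT).equals(target))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample data standing in for the 'name' values read from the graph.
        List<String> names = Arrays.asList("marko", "vadas", "josh");
        System.out.println(matchIgnoringCase(names, "MARKO")); // prints [marko]
    }
}
```

Like the lambda, this compares every candidate value one by one, so it only makes sense once the set of vertices has been narrowed by other criteria.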

TinkerPop is slowly identifying commonalities among search options provided by different vendors and will continue to generalize those features as opportunity presents itself, so that they are available as first-class citizens in the Gremlin language itself.

UPDATE: As of TinkerPop 3.6.0, Gremlin now has a TextP.regex predicate that can help with these sorts of searches:

gremlin> g.V().has('name', TextP.regex('[Mm].*o')).elementMap()
==>[id:1,label:person,name:marko,age:29]
gremlin> g.V().has('name', TextP.regex('(?i)M.*O')).elementMap()
==>[id:1,label:person,name:marko,age:29]

Note that (?i) enables Java's Pattern.CASE_INSENSITIVE matching mode.
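To see what the (?i) flag does independently of any graph, here is a small sketch using only java.util.regex. The class and method names are hypothetical, and the use of find() (a partial match anywhere in the value) is an assumption about how a provider evaluates the pattern. Anchoring with ^ and $ makes the pattern match the whole value either way, which is closer to the LOWER(d.name) = 'mydevice' comparison in the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: plain Java, no TinkerPop dependency.
class CaseInsensitiveRegexDemo {

    // True if the regex is found anywhere in the value (partial match).
    public static boolean found(String regex, String value) {
        return Pattern.compile(regex).matcher(value).find();
    }

    public static void main(String[] args) {
        // Without the flag, upper-case letters in the pattern do not match lower-case text.
        System.out.println(found("M.*O", "marko"));                // false
        // The embedded (?i) flag turns on case-insensitive matching.
        System.out.println(found("(?i)M.*O", "marko"));            // true
        // The same mode can be requested through the Pattern.CASE_INSENSITIVE constant.
        System.out.println(Pattern.compile("M.*O", Pattern.CASE_INSENSITIVE)
                .matcher("marko").find());                         // true
        // Anchored pattern: the whole value must equal "mydevice", ignoring case.
        System.out.println(found("(?i)^mydevice$", "MyDevice"));   // true
        System.out.println(found("(?i)^mydevice$", "MyDevice2"));  // false
    }
}
```

If the name being searched for comes from user input, Pattern.quote can be used to escape any regex metacharacters before the pattern is built.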

stephen mallette
  • Thank you for the quick reply. When I try to use the lambda you defined, I am getting an exception that starts like this: WARN org.apache.tinkerpop.gremlin.driver.MessageSerializer - Request [RequestMessage{, requestId=580bb852-074f-4cfd-979f-8929d07c45f6, op='bytecode', processor='traversal', args={gremlin=[[], [V(), filter(groovysh_evaluate$_run_closure1@5e7c141d), values(name)]], aliases={g=g}}}] could not be serialized by org.apache.tinkerpop.gremlin.driver.ser.AbstractGryoMessageSerializerV3d0. java.lang.IllegalArgumentException: Class is not registered: java.lang.reflect.Invocation... – Samuel Konat Sep 25 '19 at 17:42
  • if you remote to Gremlin Server then you need to use a string lambda - assuming you're using java then this link will help: http://tinkerpop.apache.org/docs/current/reference/#_the_lambda_solution if you are using another language there are similar sections for each one - just scroll a bit or use the table of contents on the left. – stephen mallette Sep 25 '19 at 18:09
  • also...you could send that exact Gremlin traversal as a script to Gremlin Server: http://tinkerpop.apache.org/docs/current/reference/#_submitting_scripts (but that's not preferred) – stephen mallette Sep 25 '19 at 18:10
  • It works. Thank you very much. Another workaround I had in mind was to just get all the values of the 'name' property and iterate in code to see if a match exists. Do you know if using a lambda could somehow prove to be less efficient than this approach? – Samuel Konat Sep 25 '19 at 19:25
  • I suppose it depends on the width of your search. If you have a million "device" vertices, then returning them all to iterate over the "name" values in code isn't going to work so well. If you can limit that number to a few dozen "device" vertices with other search criteria that use indices, then that could work. Note that you will probably need to do that anyway, because a lambda won't resolve to index lookups, as graphs can't optimize a lambda expression. So assuming you limit to a few dozen vertices, I don't know if a lambda will end up faster than iterating with code on the client - you will have to test – stephen mallette Sep 25 '19 at 19:52
  • That makes sense. Appreciate the amount of detail included in your responses. Thank you once again. – Samuel Konat Sep 26 '19 at 12:22
  • Is the lambda option available in gremlinpython? I am not able to find it in the statics/import. – Thirumal Aug 01 '22 at 12:01
  • https://tinkerpop.apache.org/docs/current/reference/#gremlin-python-lambda – stephen mallette Aug 01 '22 at 14:30
0

I just had the same problem, but found an acceptable solution (at least for me):

String search = "any"

g.V().hasLabel(label)
  .as("V")
  .properties(prop1, prop2, ...)
  .or(
    // Use fuzzy search with 1 to allow swapped chars
    hasValue(tokenFuzzy(search, 1)),
    // Search for exact contain match
    hasValue(containing(search)),
    // Case-insensitive prefix match
    hasValue(tokenPrefix(search))
  )
  .select("V")
  .dedup()

Searching for "ger" or "any" now returns "It's sunny currently in germany".

I use as to store the current vertex before calling properties, because properties would otherwise leave only the selected props in the traversal, and I want the full vertex after the search.
I also need dedup, because a hit could occur in both prop1 and prop2, producing duplicate results.

Shinigami