I am trying to use the Virtuoso plugin (https://pythonhosted.org/virtuoso/) with RDFLib, but I keep getting the following import error:

~\miniconda3\envs\dlvr\lib\site-packages\virtuoso\__init__.py in <module>
      2 from pkg_resources import DistributionNotFound
      3 
----> 4 try: import alchemy
      5 except DistributionNotFound: pass
      6 

ModuleNotFoundError: No module named 'alchemy'

My code for using the plugin is as follows:

import rdflib
from rdflib.store import Store
from rdflib.plugin import get as plugin

Virtuoso = plugin("Virtuoso", Store)
store = Virtuoso("DSN=VOS;UID=dba;PWD=dba;WideAsUTF16=Y")

Does rdflib not support the Virtuoso plugin? Is there a better external triple store that I can use to speed up queries on large datasets with rdflib?

TallTed
Aniqa295
  • Did you read the plugin text? "This has been tested with rdflib3 only and may or may not work with rdflib2." I'm sure that you're using either the latest (5.0) or the 4.0 version of rdflib – UninformedUser Feb 18 '21 at 15:00
  • Other than that, there is SPARQLWrapper for accessing any triple store that supports the standard SPARQL protocol, which Virtuoso certainly does. Weird that you didn't find it, given that it's directly mentioned: https://github.com/RDFLib/sparqlwrapper – UninformedUser Feb 18 '21 at 15:01
  • @UninformedUser I am trying to query a local RDF file using rdflib. As far as I know (my knowledge may be limited, I apologize), SPARQLWrapper is for querying an online endpoint. rdflib takes too much time parsing and processing the file, which is why I wanted to upload the data to a triple store first. I am having difficulty with the uploading part – Aniqa295 Feb 18 '21 at 20:15
  • This looks very likely to be [an XY Problem](https://meta.stackexchange.com/a/66378). What is your *real* goal? Virtuoso is a fine triple store (among other things), and *can* provide an interface over filesystem documents including RDF data files, but you're building in several latencies by going that route; you'd be far better off loading the RDF into Virtuoso's triple/quadstore, and then querying through Virtuoso's built-in SPARQL endpoint, potentially using SPARQLWrapper (which doesn't care *where* the endpoint it's querying is, so long as it's accessible through the normal TCP/IP stack). – TallTed Feb 18 '21 at 20:33
  • @Aniqa295 How big is the "local file"? I mean, Virtuoso is blazingly fast, but in my opinion that is when you use it as a triple store, i.e. load the data and then query it. And yes, SPARQLWrapper is for SPARQL over HTTP only, but I don't see the problem here. Loading the data into a triple store like Virtuoso takes what, a few seconds? What do you mean by "difficulties" in uploading data? Do you use Virtuoso? Where is the issue? – UninformedUser Feb 18 '21 at 21:02
  • A quick Google search suggests you need to install the alchemy library for Python. https://www.roseindia.net/answers/viewqa/pythonquestions/37045-ModuleNotFoundError-No-module-named-alchemy.html – TallTed Feb 23 '21 at 13:01
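
To make the suggestion in the comments concrete (load the data into Virtuoso, then query its SPARQL endpoint over HTTP), here is a minimal sketch of such a query. It uses only the Python standard library rather than SPARQLWrapper, so it runs without extra installs. The endpoint URL `http://localhost:8890/sparql` is an assumption (a local Virtuoso instance on its default HTTP port), and `build_request` and `run_select` are hypothetical helper names, not part of rdflib, SPARQLWrapper, or Virtuoso.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local Virtuoso instance exposing its SPARQL endpoint at the
# default address. Change this to match your own installation.
ENDPOINT = "http://localhost:8890/sparql"


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL Protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_select(query, endpoint=ENDPOINT, timeout=30):
    """Send a SELECT query and return the list of result bindings."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

With the data already loaded and the endpoint reachable, `run_select("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5")` should return a list of dictionaries in which, for example, `row["s"]["value"]` holds the subject of each triple. SPARQLWrapper wraps the same protocol in a higher-level interface.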

1 Answer

A quick Google search suggests you need to install the alchemy library for Python.

TallTed