I'm working with an instance of the ACUMOS platform (Clio release). I need to use Jupyter notebooks in ACUMOS to analyze and prepare data for subsequent modelling. So far, the only way I have found to get data into my notebooks is to upload the files from the Jupyter homepage, copy their path, and load them in the notebook with `pandas.read_csv(path)`.
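
For reference, the upload-then-read workflow can be sketched roughly as follows. This is only a minimal sketch: the `data` folder, the file name `train.csv`, and both helper functions are hypothetical names chosen for illustration, not ACUMOS conventions. The only library call it relies on is `pandas.read_csv`.

```python
from pathlib import Path
from typing import List

import pandas as pd

# Hypothetical folder in the Jupyter workspace where the files were uploaded.
DATA_DIR = Path("data")


def list_uploaded_csvs(data_dir: Path = DATA_DIR) -> List[Path]:
    """Return the CSV files found in the upload folder, sorted by name."""
    return sorted(data_dir.glob("*.csv"))


def load_uploaded_csv(name: str, data_dir: Path = DATA_DIR) -> pd.DataFrame:
    """Read one uploaded CSV into a DataFrame, failing clearly if it is missing."""
    path = data_dir / name
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; upload it from the Jupyter file browser first"
        )
    return pd.read_csv(path)
```

In a notebook this would be used as `df = load_uploaded_csv("train.csv")`, assuming a file with that name was uploaded into the `data` folder.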

What I would like is to have the data files visible from ACUMOS, in some kind of data repository (for instance from ML Workbench), but I don't know how to do it or whether it is even possible. Ideally, the files would be uploaded directly to ACUMOS and then read from there in the notebooks.

Is there any way to do this?

datariel
  • Hi datariel, did you try creating a data pipeline? (I think it requires some configuration during installation.) – Phil Dec 15 '20 at 08:50
  • Hi @Phil, thanks. I tried but I didn't manage to do it. From Data Pipelines I got "Unknown server Error: Unable to create Data Pipeline"; then from Projects I created one, but it remained stuck in INPROGRESS status. Might these be signs of missing configuration during installation, or of problems with it? – datariel Dec 15 '20 at 17:41
  • Hi datariel, could you try the following: in the three pipeline pods (mlwb-pipeline, mlwb-pipeline-webcomponent and mlwb-pipeline-catalog-webcomponent), change the value of the "external pipeline" flag to true. – Phil Dec 16 '20 at 09:28
  • Hi @datariel, after reading your post again and checking internally, I think external pipelines are not what you are looking for. An "external pipeline" is just the URL of an existing pipeline outside Acumos. Unfortunately, "internal pipeline" doesn't work in Acumos Clio, as you experienced, and there is no other functionality in Acumos to manage data. So the best way to train your model is to upload the training dataset through the JupyterLab GUI. – Phil Dec 16 '20 at 16:11
  • Hi @Phil, thanks again. OK, I understand. And once I upload data from the Jupyter GUI, it is not possible to see that data from the Acumos GUI (for instance from AcuCompose), right? And what about Demeter? Is it possible to manage data in that release? – datariel Dec 16 '20 at 19:19
  • Yes, you are right @datariel: it is not possible to manage data in the Acumos GUI in either release, Clio or Demeter. This topic has already been discussed in the Acumos community, and the decision was taken not to do it because data storage is strongly linked to local physical resources. – Phil Dec 17 '20 at 12:47
  • Hi @Phil, so what is the purpose of the new Datasource component in Demeter? Is it just a kind of connector to external sources? It is still not possible to upload data files (csv, avro, parquet, xlsx, etc.) and use the platform as a repository, correct? – datariel Dec 21 '20 at 09:10
  • Hi @datariel, sorry for my late answer. I'm not aware of the "datasource component"; could you be more precise? Regarding your last question, you are right: it is not possible to upload data to Acumos. – Phil Jan 06 '21 at 07:52
  • Hi again @Phil, I am referring to this [datasource component](https://docs.acumos.org/en/demeter/submodules/workbench/docs/mlwb-user-guide/datasource-component/datasource-wc-user-guide.html#datasource-component-overview) in the Demeter release. What is the purpose of this component, which is not present in Clio? – datariel Jan 27 '21 at 16:35
  • Hi @datariel, it may be surprising, but I wasn't aware of this component in Demeter. I remember that there were some discussions about it, but I didn't know that any development had been done, and I haven't installed MLWB in my Demeter instance. This is good news; I will try to find out more about it and give you some feedback. – Phil Jan 29 '21 at 08:52
