Questions tagged [manifoldcf]

Apache ManifoldCF is an open source connector framework for website and enterprise search engines.

Apache ManifoldCF is an effort to provide an open source framework for connecting source content repositories, like Microsoft SharePoint and EMC Documentum, to target repositories or indexes, such as Apache Solr, OpenSearchServer, or Elasticsearch. Apache ManifoldCF also defines a security model for target repositories that permits them to enforce source-repository security policies.

Currently included connectors support FileNet P8 (IBM), Documentum (EMC), LiveLink (OpenText), Meridio (Autonomy), Windows shares (Microsoft), and SharePoint (Microsoft). Also included are a general CMIS connector, a generic file system connector, a general JDBC connector, an RSS feed connector, a Wiki connector, a Dropbox connector, an email connector, and a general web connector. Currently supported targets include Apache Solr, QBase (formerly MetaCarta) GTS, OpenSearchServer, and Elasticsearch.

30 questions
0
votes
0 answers

Will the authority service return the ACL tokens for a user who has access to the library through an AD group in Apache ManifoldCF?

We get the ACL for the user (abc@xyz.com) on all the SharePoint libraries through the Authority service in ManifoldCF. The user (abc@xyz.com) does not have direct access to the particular SharePoint library and also does not have access through defined…
0
votes
1 answer

Web crawl using ManifoldCF

I'm trying to web crawl data from a specific website using ManifoldCF, but unfortunately I keep getting 0 results, and I don't know what I'm doing wrong. I tried creating new Repository Connections as "Generic" and as "WEB" but when I create a job and…
0
votes
1 answer

Alfresco Community Edition, ManifoldCF and Elasticsearch to optimize full-text search

How can I integrate an Alfresco Community Edition CMIS repository holding millions of documents and an Oracle RDBMS repository that stores metadata of the same documents, through Apache ManifoldCF, to index composed metadata in Elasticsearch to build clustered…
user9038848
0
votes
2 answers

Apache ManifoldCF Elasticsearch output connector version compatibility

I am trying to connect Elasticsearch as an output connector from Apache ManifoldCF. I am using Elasticsearch version 7.1.X, which is not working. Can you tell me whether ManifoldCF will work with the latest Elasticsearch version? Tried configuring the repository and…
Muthu
0
votes
1 answer

Apache ManifoldCF: Get a history report for a repository connection over REST API

I'm trying to get a history report for a repository connection over the ManifoldCF REST API. According to the documentation: https://manifoldcf.apache.org/release/release-2.11/en_US/programmatic-operation.html#History+query+parameters it should be…
Marta G.
0
votes
1 answer

ManifoldCF and PostgreSQL to crawl 1.5 million documents

We used ManifoldCF with PostgreSQL (9.6) to crawl our websites. The crawling speed is good (approximately 20,000 docs/hour) up to 500,000 docs. After that, performance decreases, and we see very long freezes in the crawling. We…
0
votes
1 answer

ManifoldCF Documentum crawling slowness

We are crawling data from a DCTM repository using the ManifoldCF Documentum connector and writing the crawled data to MongoDB. Crawling is triggered with a throttling value of 500, but crawling speed is very slow: per minute, the connector is fetching only 170…
0
votes
1 answer

Extracting contents using Tika transformation - ManifoldCF

We are indexing Documentum contents to Elasticsearch using ManifoldCF. We are not able to get the contents from attachments, but metadata is available. Is there any way to get the contents using Tika transformation? Or please suggest some ways to…
User1203
0
votes
1 answer

Writing a MongoDB output connector for ManifoldCF

We are trying to push repository contents to MongoDB through Apache ManifoldCF, and we can't find any sample code for a custom output connector. Is it possible? Can someone please help with this? Thanks!!
User1203
0
votes
2 answers

Best way to crawl through file system and index

I am working on a project where I need to crawl through more than 10TB of data and index it. I need to implement incremental crawling that takes less time. My question is: which is the most suitable tool that all the big organizations are using for…
Shashank Raj
0
votes
0 answers

ManifoldCF error with JCIFS connector, agents crash

I used ManifoldCF 2.7 with multiprocess-zk and after 10 minutes my 2 agents crash. ERROR : jcifs.smb.SmbException jcifs.util.transport.TransportException java.lang.InterruptedException at java.lang.Object.wait(Native Method) at…
MaxenceS
0
votes
1 answer

ManifoldCF SharePoint Elasticsearch

I'm trying to create a crawler job in ManifoldCF 2.7.1. I create the elastic output and everything is fine, then I create the SharePoint repository and everything is fine. Now when I'm creating a job and I add the elastic output, I cannot see the elasticsearch…
0
votes
0 answers

Can you connect ManifoldCF to Documentum without Webtop?

I am working on a proof of concept connecting Documentum as a repository to ManifoldCF and Solr as an output. The ManifoldCF widget to connect to Documentum is asking for a Webtop URL and it won't allow me to leave it blank. We have not…
Dan
0
votes
2 answers

ManifoldCF error when creating Elasticsearch output connector

I have Elasticsearch 2.2 running on a Linux VM. I'm running ManifoldCF 2.3 on another VM in the same network. Using ManifoldCF's browser UI, I added the Elasticsearch output connector, and when I save it I get an error in the connector status: Name: …
Andrey
0
votes
2 answers

Searching metadata from images using Datafari

I'm looking for an open source document management system to index all kinds of files (texts: [pdf, doc...], images: [jpg, png, bmp...], videos: [mov, mp4...]), and I stumbled upon Datafari. It uses the Solr search engine, and ManifoldCF to manage content…
Overdose