MarkLogic does not 'handle' EPUB. Neither does CPF, and neither does MLCP.
An EPUB is a zip archive containing mainly XHTML, XML and images. I can rename it to .zip and load it with MLCP, but renaming is not so nice: the new name will show up in the URI unless I add a replace step to the URI creation, and so on.
Also, the .opf file contains useful information. It is XML, but it is loaded as binary. I can add .opf to the MIME types, but this does not work in combination with loading from an archive with MLCP: the file still ends up as binary.
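To illustrate what I mean by "a zip with XML inside", here is a minimal Python sketch. It is not MarkLogic code and not a proposed solution. The sample archive, the file names (`OEBPS/content.opf`, `OEBPS/chapter1.xhtml`) and the helper functions are made up for the example. It only shows that an archive with EPUB-style contents opens as a zip regardless of its extension, and that the .opf entry parses as ordinary XML.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical, minimal OPF package document used only for this illustration.
OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Sample Book</dc:title>
  </metadata>
</package>"""

DC_NS = "http://purl.org/dc/elements/1.1/"


def build_sample_epub() -> bytes:
    """Build a tiny EPUB-like archive in memory (assumed layout, not a valid full EPUB)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("OEBPS/content.opf", OPF)
        zf.writestr(
            "OEBPS/chapter1.xhtml",
            "<html xmlns='http://www.w3.org/1999/xhtml'><body><p>Hi</p></body></html>",
        )
    return buf.getvalue()


def read_opf_title(epub_bytes: bytes) -> str:
    """Open the archive as a plain zip and read the title from the first .opf entry."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        # Match the extension case-insensitively (.opf / .OPF).
        opf_name = next(n for n in zf.namelist() if n.lower().endswith(".opf"))
        root = ET.fromstring(zf.read(opf_name))
    return root.findtext(f".//{{{DC_NS}}}title")


if __name__ == "__main__":
    print(read_opf_title(build_sample_epub()))
```

This is exactly the kind of information (title, metadata) that I would like to end up as queryable XML in the database rather than as a binary blob.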
I'd hate to add an extra layer for 'preparing' the data before it is loaded into MarkLogic, and I would like to keep the information as readable and indexable as I can.
Is there a better way than renaming, unpacking and MIME-typing to load EPUB files into MarkLogic?