
This is my first post, and I'm new to Java. I'm working on a file parser. I've tried to identify whether the input is CSV or some other file format, but it doesn't look like a standard format. I'm working on an Apache Camel solution (my first and last idea :( ), but maybe some of you recognize this kind of file format? Additionally, I have an .imp file as my output.

Here is my example input:

NrDok:FS-2222/17/W
Data:12.02.2017
SposobPlatn:GOT
NazwaWystawcy:MAAKAI Gawron
AdresWystawcy:33-123 bABA
KodWystawcy:33-112
MiastoWystawcy:bABA
UlicaWystawcy:czysfa 8
NIPWystawcy:123-19-85-123
NazwaOdbiorcy:abc abc-HANDLOWO-USŁUGOWE
AdresOdbiorcy:33-123 fghd
KodOdbiorcy:33-123
MiastoOdbiorcy:Tdsfs
UlicaOdbiorcy:dfdfdA 39
NIPOdbiorcy:82334349
TelefonOdbiorcy:654-522-124
NrOdbiorcyWSieciSklepow:efdsS-sffgsA
IloscLinii:1
Linia:Nazwa{ĆWIARTKA KG}Kod{C1}Vat{5}Jm{kg.}Asortyment{dfgv}Sww{}PKWIU{10.12.10}Ilosc{3.40}Cena{n3.21}Wartosc{n11.83}IleWOpak{1}CenaSp{b0.00}
DoZaplaty:252.32

And here is my example output file:

FH 2015.07.31 2015.07.31 F04443 Gotowka FO 812-123-45-11 P.a.b.Uc"fdad" abcd deffF UL.fdfgdfdA 12/33 33-123 afvdf FS 779-19-06-082 badfdf S.A. ul. Wisniowa 89 60-003 Poznan FP 00218746 CHRZAN TARTY EXTRA POLONAISE 180G SZT 32.00 2.21 8 10.39.17.0 32.00 5900138000055

Is there an easy way to convert the first file into the second file's format? Maybe you know what type of file this is? In the meantime, I'm continuing my work with Apache Camel.

Thanks in advance for your time and help!

2 Answers


I suggest trying Apache Tika: https://tika.apache.org/1.1/detection.html#Mime_Magic_Detection

It's a very good library for file type detection.

There is a simple example here: https://www.tutorialspoint.com/tika/tika_document_type_detection.htm

Yaroslav

Your file can be read as a standard Java .properties file. This format allows both = and : as key-value separators. However, the non-ISO-8859-1 characters it contains, such as the Polish Ć, may prevent Java from parsing it correctly.
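As a rough illustration of the .properties idea, here is a minimal sketch. It assumes the input has one key:value entry per line and is available as text; the class name PropertiesSketch and the method parse are made up for this example. Loading through a Reader rather than an InputStream sidesteps the ISO-8859-1 assumption, so characters such as Ł survive. Note that Properties keeps only the last value for a repeated key, so a file with several Linia entries would need different handling.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of the original post.
class PropertiesSketch {

    // Parses "key:value" lines using the standard .properties rules.
    // Assumption: one entry per line, no backslashes in the data.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            // load(Reader) works on characters, so no ISO-8859-1 decoding is involved.
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        String sample = "NrDok:FS-2222/17/W\n"
                + "Data:12.02.2017\n"
                + "NazwaOdbiorcy:abc abc-HANDLOWO-USŁUGOWE\n";
        Properties props = parse(sample);
        System.out.println(props.getProperty("NrDok")); // prints FS-2222/17/W
    }
}
```

For a real file, the same call should work with a reader opened in the file's actual encoding, for example via Files.newBufferedReader(path, charset); which charset that is has to be checked against the file.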

This line

Nazwa{ĆWIARTKA  KG}Kod{C1}Vat{5}Jm{kg.}Asortyment{dfgv}Sww{}PKWIU{10.12.10}Ilosc{3.40}Cena{n3.21}Wartosc{n11.83}IleWOpak{1}CenaSp{b0.00}

seems to be a custom serialization format for an object, of the form

key1{value1}key2{value2}...
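One way to split such a line into fields is a small regular expression, sketched below. It assumes that values never contain braces and that there is no nesting, which holds for the sample line but is only a guess about the format in general. The names LineFieldsSketch and parseFields are made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original post.
class LineFieldsSketch {

    // A key (anything without braces) followed by a brace-delimited value.
    // Assumption: no nested or escaped braces inside values.
    private static final Pattern FIELD = Pattern.compile("([^{}]+)\\{([^{}]*)\\}");

    // Returns the fields in the order they appear; empty values such as Sww{} are kept.
    static Map<String, String> parseFields(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher matcher = FIELD.matcher(line);
        while (matcher.find()) {
            fields.put(matcher.group(1).trim(), matcher.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "Nazwa{ĆWIARTKA KG}Kod{C1}Vat{5}Sww{}Ilosc{3.40}";
        Map<String, String> fields = parseFields(line);
        System.out.println(fields.get("Kod")); // prints C1
    }
}
```

A LinkedHashMap is used so the field order of the input is preserved, which may matter if the output format is positional.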

Your output file contains a lot of data that does not appear in the input, which makes me think the output is built partly by querying external systems. You will have to investigate that yourself; nobody can guess the transformation from the input provided.

Aleh Maksimovich