
Hello, I am having problems reading back a list of tuples of lists after saving and appending it to a file.

Saving to the file works without problems.

I am saving to the file with

import qualified Data.ByteString            as BS
import qualified Data.Serialize             as S (decode, encode)
import Data.Either 

toFile path = do
   let a = take 1000 [100..] :: [Float]
   let b = take 100 [1..] :: [Float]
   BS.appendFile path $ S.encode (a,b)

and reading with

fromFile path = do 
    bstr<-BS.readFile path
    let d = S.decode bstr :: Either String ([Float],[Float])
    return (Right d)

but reading from that file with fromFile only gives me one element, although I append to that file multiple times.

Since I'm appending to the file, it should have multiple elements inside it, so I'm missing something like a map in my fromFile function, but I couldn't work out how to do it.

I appreciate any help or any other solution; using Data.Serialize and ByteString is not a must. Another possibility I thought of is JSON files with Data.Aeson, if I can't get it to work with Serialize.

Edit :

I realized that I made a mistake in the decoding type in fromFile

let d = S.decode bstr :: Either String ([Float],[Float])

it should be like this

let d = S.decode bstr :: Either String [([Float],[Float])]
Ninexreaker
  • Wrt your edit, it would work with that second type if you add eight bytes at the beginning of the file indicating the number of chunks (i.e. increment this at every append). – Thomas M. DuBuisson Sep 17 '18 at 14:15
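
The count-header idea from the comment above could look roughly like the sketch below. It relies on cereal encoding a list as a 64-bit big-endian element count followed by the elements, so a file that starts with the number of chunks decodes directly as `[([Float],[Float])]`. The names `Chunk`, `appendChunk` and `readCounted` are made up for illustration, and the sketch assumes the file is either empty or was written only by `appendChunk`.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Serialize  as S
import System.IO

-- Hypothetical alias for one appended element.
type Chunk = ([Float], [Float])

-- Bump the 8-byte chunk count at the start of the file, then append the chunk.
appendChunk :: FilePath -> Chunk -> IO ()
appendChunk path chunk = withBinaryFile path ReadWriteMode $ \h -> do
  size  <- hFileSize h
  count <- if size == 0
             then pure 0
             else do
               header <- BS.hGet h 8
               either fail pure (S.runGet S.getWord64be header)
  -- Overwrite the header with the incremented count.
  hSeek h AbsoluteSeek 0
  BS.hPut h (S.runPut (S.putWord64be (count + 1)))
  -- Write the new chunk after everything already in the file.
  hSeek h SeekFromEnd 0
  BS.hPut h (S.encode chunk)

-- With the count in front, the whole file decodes as an ordinary list.
readCounted :: FilePath -> IO (Either String [Chunk])
readCounted path = S.decode <$> BS.readFile path
```

The trade-off is that every append now touches the start of the file as well as the end, so this is not safe with several concurrent writers.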

1 Answer


The Problem In Brief

The default format used by serialize (or binary) encoding isn't trivially append-able.

The Problem (Longer)

You say you appended:

S.encode (a,b)

to the same file "multiple times". So the format of the file is now:

[ 64 bit length field | # floats encoded | 64 bit length field | # floats encoded ]

Repeated however many times you appended to the file. That is, each append will add new length fields and list of floats while leaving the old values in place.

After that you returned to read the file and decode some floats using, morally, S.decode <$> BS.readFile path. This will decode the first two lists of floats by first reading the length field (of the first time you wrote to the file) then the following floats and the second length field followed by its related floats. After reading the stated length worth of floats the decoder will stop.

It should now be clear that just because you appended more data does not make your encoding or decoding script look for any additional data. The default format used by serialize (or binary) encoding isn't trivially append-able.
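
To see this without touching the file system, here is a small sketch that concatenates two encodings in memory, which is what two appends leave in the file, and decodes the result once. The names `Chunk`, `firstChunk`, `secondChunk`, `twoAppends` and `decodedOnce` are only for illustration.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Serialize  as S

-- Hypothetical alias for one appended element.
type Chunk = ([Float], [Float])

firstChunk, secondChunk :: Chunk
firstChunk  = ([1.5, 2.5], [3.5])
secondChunk = ([4.5], [5.5, 6.5])

-- What the file holds after two appends: two encodings back to back.
twoAppends :: BS.ByteString
twoAppends = BS.append (S.encode firstChunk) (S.encode secondChunk)

-- A single decode reads one pair of lists and ignores the trailing bytes,
-- so this should give back firstChunk only.
decodedOnce :: Either String Chunk
decodedOnce = S.decode twoAppends
```

The bytes of `secondChunk` are still in `twoAppends`; nothing in the decoder asks for them.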

Solutions

You mentioned switching to Aeson, but using JSON to encode instead of binary won't help you. Decoding two appended JSON strings like { "first": [1], "second": [2]}{ "first": [3], "second": [4]} is logically the same as your current problem. You have some unknown number of interleaved chunks of lists - just write a decoder to keep trying:

import qualified Data.ByteString as BS
import qualified Data.Serialize  as S

fromFile :: FilePath -> IO (Either String ([Float],[Float]))
fromFile path = do
    bstr <- BS.readFile path
    return (S.runGet getMultiChunks bstr)

getMultiChunks :: S.Get ([Float],[Float])
getMultiChunks = go ([], [])
  where
    go (l,r) = do
        b <- S.isEmpty
        if b then pure (l,r)  -- input exhausted: return what was accumulated
             else do (lNext, rNext) <- S.get
                     go (l ++ lNext, r ++ rNext) -- inefficient

So we've written our own getter (untested) that checks whether bytes remain and, if so, decodes another pair of lists of floats. Each time it decodes a new chunk it appends that chunk to the lists accumulated so far (which is inefficient, use something like a dlist if you want it to be respectable).
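
If each append should stay a separate element instead of being merged into one pair, a variant along the following lines may be closer to what is wanted. This is only a sketch: `Chunk`, `getChunks` and `readChunks` are names chosen here, and it conses each decoded chunk onto an accumulator and reverses once at the end to avoid the repeated `++`.

```haskell
import qualified Data.ByteString as BS
import qualified Data.Serialize  as S

-- Hypothetical alias for one appended element.
type Chunk = ([Float], [Float])

-- Decode chunks until the input is exhausted.
-- Chunks are collected newest-first and reversed once at the end.
getChunks :: S.Get [Chunk]
getChunks = go []
  where
    go acc = do
      done <- S.isEmpty
      if done
        then pure (reverse acc)
        else do
          chunk <- S.get
          go (chunk : acc)

-- Read every chunk that was appended to the file, in append order.
readChunks :: FilePath -> IO (Either String [Chunk])
readChunks path = S.runGet getChunks <$> BS.readFile path
```

The writing side stays as in the question: each call to `BS.appendFile path $ S.encode (a,b)` adds one more element that `readChunks` will return.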

Thomas M. DuBuisson
  • First of all, thank you. I didn't know that directly appending encoded data wouldn't work. I also realized that I was missing the list around the tuple, so the type of `S.runGet getMultiChunks bstr :: Either String ([Float],[Float])` should be `:: Either String [([Float],[Float])]`. What I wanted to achieve was to add whole tuples to the file, like this: `[([1.1,2.2],[3.3,4.4]),([9.9,3.2,4.2],[1.1,3.5])]`. I thought appending like that would be possible, but from what I can see with `getMultiChunks`, what I need is something like `getMultiChunks :: Get [([Float],[Float])]`. – Ninexreaker Sep 16 '18 at 16:51
  • Also, to avoid the inefficient appending, each new element could be added at the front of the list with `:`. The workaround that recently came to my mind was to read the file, decode what's inside, add the tuple, encode, and write the file again. But that would be costly if the file was big. – Ninexreaker Sep 16 '18 at 16:55