I expected decoded to be equal to text, but the decoded string has a space between every character:
import tokenizers
text = "Hello World!"
tokenizer = tokenizers.Tokenizer(tokenizers.models.Unigram())
tokenizer.train_from_iterator(text)
encoded = tokenizer.encode(text)
decoded = tokenizer.decode(encoded.ids)
print(decoded)
# 'H e l l o W o r l d !'
How can I change the tokenizer so that decoding returns the original text ("Hello World!")?
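A possible explanation, sketched under assumptions rather than taken from the library source: passing a bare string to train_from_iterator iterates over it one character at a time, so the vocabulary probably ends up character-level, and with no decoder set, decode appears to simply join the tokens with spaces. The plain-Python stand-ins below (all function names are hypothetical and are not part of tokenizers) mimic that behaviour and contrast it with a simplified Metaspace-style scheme, where spaces are kept as a marker symbol and restored when decoding.

```python
# Illustration only: plain-Python stand-ins, NOT the tokenizers library itself.
MARKER = "\u2581"  # the "▁" symbol that Metaspace-style schemes use for a space


def decode_by_joining(tokens):
    """Assumed behaviour when no decoder is set: join tokens with a space."""
    return " ".join(tokens)


def to_metaspace_tokens(text):
    """Character-level tokens, with each space replaced by the marker."""
    return [MARKER if ch == " " else ch for ch in text]


def decode_metaspace(tokens):
    """Concatenate the tokens, then turn each marker back into a space."""
    return "".join(tokens).replace(MARKER, " ")


text = "Hello World!"

# Hypothetical character-level tokens in which the original space is lost.
plain_tokens = [ch for ch in text if ch != " "]
print(decode_by_joining(plain_tokens))               # H e l l o W o r l d !

# Keeping the space as a marker lets the decoder rebuild the input exactly.
print(decode_metaspace(to_metaspace_tokens(text)))   # Hello World!
```

In the library itself, the closest counterparts seem to be pre_tokenizers.Metaspace() assigned to tokenizer.pre_tokenizer and decoders.Metaspace() assigned to tokenizer.decoder, together with training on a list such as [text] instead of the bare string. I have not verified that combination on an input this small.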