I want to read a file containing Urdu words and write them to another file. The problem I'm facing is that Urdu language doesn't have spaces, so the output file looks as if all the words are joined to each other. How can I separate words or detect spaces? My code is this:

#include <fstream>
#include <iostream>
#include <string>

int main()
{
    std::ifstream file("urduwords.txt");     // File containing Urdu words (saved in UTF-8)
    std::ofstream myfile("urduoutput.txt");  // Output file
    if (!file.is_open()) return 1;

    std::string word;   // operator>> fills this with one whitespace-delimited token per read
    while (file >> word)
    {
        myfile << word;
        // What to write here to separate words or detect spaces? Input file is saved in UTF-8
    }
    myfile.close();
    file.close();
    std::cout << "Check output" << std::endl;
    return 0;
}
Moeen
  • *"The problem I'm facing is that Urdu language doesn't have spaces"* .... *"How can i separate words or detect spaces"* I'm not sure I understand your question. You'd like to detect spaces, but just said there are no spaces? So what is there to detect? – Cory Kramer Nov 18 '14 at 13:18
  • I mean that even if I put spaces between the words, the code doesn't detect those spaces. It treats the whole sentence as a single word. – Moeen Nov 18 '14 at 13:20
  • I want the output file to be written exactly as the input file is. – Moeen Nov 18 '14 at 13:22
  • You're reading and writing character by character, and `counter` is the number of characters. To separate words, you probably need to use a dictionary (and settle for an approximate word count). Or, if you manually insert whitespace in the input, `string word;`. – molbdnilo Nov 18 '14 at 13:59
  • If I use a string `word` or a character array it works fine, but the issue is that the output file is different from the input file: in the input file there are spaces between words, but in the output file all the words are joined together. – Moeen Nov 18 '14 at 14:35
  • And remember, the input file containing the words is in UTF-8. – Moeen Nov 18 '14 at 14:38

1 Answer


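Oh, I got the answer. You have to put the spaces between the Urdu words yourself: `file >> word` skips whitespace and extracts only the word, so the spaces in the input file never reach the output. (Urdu text also has the Space Omission Problem, where the spaces between words are often left out by the writer.) So in the while loop: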

while (file >> word)
{
    myfile << word;
    myfile << " ";   // Put spaces between words.
}
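
A slightly fuller sketch of the same idea, not the original code: the loop above also drops line breaks and leaves a trailing space after the last word. The version below, with a hypothetical helper named `copy_words`, reads the input line by line, writes a single space between words, and keeps each line break. It assumes words are separated by ASCII whitespace; that is safe for UTF-8 input because ASCII bytes never occur inside a multi-byte UTF-8 sequence, but other Unicode space characters would not be treated as separators.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original post): copies text word by word,
// writing one space between words and one newline per input line.
void copy_words(std::istream& in, std::ostream& out)
{
    std::string line;
    while (std::getline(in, line))          // keep the line structure
    {
        std::istringstream words(line);
        std::string word;
        bool first = true;
        while (words >> word)               // >> skips ASCII whitespace
        {
            if (!first) out << ' ';         // separator only between words
            out << word;
            first = false;
        }
        out << '\n';
    }
}
```

With the streams from the question, the call would be `copy_words(file, myfile);`. Runs of several spaces in the input collapse to one space in the output, so this normalizes spacing rather than reproducing it exactly.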
Moeen