
Note: I am asking this question because I currently need to import a separate library for each file type, which increases the size of the application. Switching between libraries at run time is also time consuming, because the files to process arrive in random order. I just want the text in the files.

Hi, I recently started working on a project that needs to read different types of files, such as txt, PDF, Word, Excel and many more.

I am currently reading:

Excel - using Microsoft Excel interop

PDF - using iTextSharp

txt - using Stream-based classes.

My question is: can I read all of these files using Stream-based classes, since they convert all file data to bytes?

Or can I read only text files using Stream classes, because text files contain only plain text and no images, unlike other file types such as PDF?

Nithin B
  • Well you can read the bytes in the file... but in order to understand the *meaning* of those bytes, you'd need to understand the format. By the way, for text files you should be using `TextReader` rather than `Stream`... – Jon Skeet Aug 18 '16 at 06:50
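
To illustrate the point in the comment above, here is a minimal sketch of the difference between reading a file through a `TextReader` and reading it as raw bytes. The class name `TextFileDemo` and its method names are made up for this example, and UTF-8 is an assumption; the actual encoding of your text files may differ.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class, for illustration only.
public static class TextFileDemo
{
    // Reads the file as text. StreamReader is a TextReader: it decodes
    // bytes into characters. UTF-8 is assumed here; a byte-order mark
    // at the start of the file takes precedence if one is present.
    public static string ReadText(string path)
    {
        using (TextReader reader = new StreamReader(path, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Reads the same file as raw bytes. Nothing is decoded, so the
    // caller has to know the format to make sense of the result.
    public static byte[] ReadBytes(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (var buffer = new MemoryStream())
        {
            stream.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

`ReadBytes` works on any file, including a PDF or an Excel workbook, but only `ReadText` gives you characters, and only when the file really is plain text in the assumed encoding.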

1 Answer


You can read all of these files as bytes, but not every file type stores its data as plain text the way a *.txt file does, because they use different formats to store the content.
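
As a rough sketch of what "reading the bytes" can tell you, the first few bytes of a file often identify its format: a PDF starts with the ASCII text `%PDF-`, and a ZIP container (which includes *.xlsx and *.docx) normally starts with `PK` followed by the bytes 0x03 0x04. The class `FormatSniffer` and its method names are hypothetical, and the check only covers these two signatures.

```csharp
using System;
using System.IO;

// Hypothetical helper class, for illustration only.
public static class FormatSniffer
{
    private static readonly byte[] PdfMagic = { 0x25, 0x50, 0x44, 0x46, 0x2D }; // "%PDF-"
    private static readonly byte[] ZipMagic = { 0x50, 0x4B, 0x03, 0x04 };       // "PK\x03\x04"

    // Guesses the container format from the leading bytes of a file.
    public static string Guess(byte[] header)
    {
        if (StartsWith(header, PdfMagic)) return "pdf";
        if (StartsWith(header, ZipMagic)) return "zip";
        return "unknown";
    }

    // Reads up to 8 bytes from the start of the file and guesses the format.
    public static string GuessFile(string path)
    {
        var header = new byte[8];
        using (FileStream stream = File.OpenRead(path))
        {
            // Read may return fewer bytes than requested (e.g. a tiny file).
            int read = stream.Read(header, 0, header.Length);
            Array.Resize(ref header, read);
        }
        return Guess(header);
    }

    private static bool StartsWith(byte[] data, byte[] prefix)
    {
        if (data.Length < prefix.Length) return false;
        for (int i = 0; i < prefix.Length; i++)
        {
            if (data[i] != prefix[i]) return false;
        }
        return true;
    }
}
```

This only tells you which parser you need. It does not extract any text, which is why a format-specific library is still required afterwards.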

For example, *.xlsx is an Open XML format: the file is a zipped folder containing a lot of XML files. *.pdf is also a special format, and it is very complicated to get the content out of the binary.
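
To make the *.xlsx case concrete, here is a minimal sketch that opens a workbook as a ZIP archive with `System.IO.Compression.ZipArchive` and lists or reads the XML parts inside it. The class `XlsxInspector` and its method names are made up for this example. On .NET Framework you need a reference to the `System.IO.Compression` assembly.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper class, for illustration only.
public static class XlsxInspector
{
    // Returns the names of all parts (files) inside the ZIP container.
    // The stream should be seekable, e.g. a FileStream or MemoryStream.
    public static List<string> ListParts(Stream xlsx)
    {
        var names = new List<string>();
        using (var archive = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                names.Add(entry.FullName);
            }
        }
        return names;
    }

    // Returns the raw XML text of one part, or null if it does not exist.
    public static string ReadPart(Stream xlsx, string partName)
    {
        using (var archive = new ZipArchive(xlsx, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.GetEntry(partName);
            if (entry == null) return null;

            using (var reader = new StreamReader(entry.Open()))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

A typical workbook contains parts such as `xl/workbook.xml`, `xl/sharedStrings.xml` and `xl/worksheets/sheet1.xml`, although the exact set depends on the file. Getting usable cell text still means interpreting that XML, so this shows the structure rather than replacing a spreadsheet library.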

Read this answer to get more information!

Fruchtzwerg