Update: this was apparently a package bug. The conversion worked fine once I switched to the NuGet package DocX.

I'm trying to read in a file from a folder and save it with a different extension. What is the correct way to handle this? While reading through the folder I encounter file paths like:

C:\Users\xx\Desktop_REPOS\scraper\Reading Questions\Week 1\239523-1094170 - yyy - Aug 24, 2017 148 PM - Short Answer Aug 21.docx

The error from my code:

FileLoadException: Could not load file or assembly 'System.IO.Packaging, Version=4.0.2.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a'. The located assembly's manifest definition does not match the assembly reference. (Exception from HRESULT: 0x80131040)

I tried the solution from this post and got this error:

System.IO.IOException: 'The filename, directory name, or volume label syntax is incorrect : 'C:\Users\king\Desktop_REPOS\scraper\scraper\bin\Debug\netcoreapp2.1\"C:\Users\king\Desktop_REPOS\scraper\Reading Questions\Week 1\239523-1094170 - yyy - Aug 24, 2017 148 PM - Short Answer Aug 21.docx"''
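The path in that message is the build output directory followed by the quoted path, which suggests the quoted string was treated as a relative path. Below is a minimal sketch of why wrapping a path in quotes does not help with .NET file APIs. The `QuotedPathDemo` class, its `Quote` helper, and the temp-folder file are made up for illustration and are not part of my project.

```csharp
using System;
using System.IO;

static class QuotedPathDemo
{
    // Hypothetical helper: wraps a path in double quotes,
    // the way one would for a shell command line.
    public static string Quote(string path) => "\"" + path + "\"";

    static void Main()
    {
        // Create a real file whose directory and name both contain spaces.
        string dir = Path.Combine(Path.GetTempPath(), "Reading Questions", "Week 1");
        Directory.CreateDirectory(dir);
        string file = Path.Combine(dir, "Short Answer Aug 21.txt");
        File.WriteAllText(file, "sample");

        // .NET file APIs take the raw path; spaces need no escaping.
        Console.WriteLine($"unquoted exists: {File.Exists(file)}");

        // The quote characters become part of the path itself,
        // so no file with that name is found.
        Console.WriteLine($"quoted exists:   {File.Exists(Quote(file))}");
    }
}
```

Quoting is only needed when a path is passed through a command line, where the shell splits arguments on spaces. That is consistent with the comment below that path length was not the problem.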

My code, based on the code example I used:

foreach (string path in Directory.EnumerateFiles(@"C:\Users\xx\Desktop\_REPOS\scraper\Reading Questions\Week 1", "*.*", SearchOption.AllDirectories)
    .Where(s => s.EndsWith(".pdf") || s.EndsWith(".docx")))
{
    FileToTxt(path);
    //FileToTxt(AddQuotesIfRequired(path));
    Console.WriteLine("converted: " + Path.GetFileName(path));
}

public static void FileToTxt(string filepath)
{
    //Install-Package sautinsoft.document
    string textFilePath = Path.ChangeExtension(filepath, ".txt");
    DocumentCore docx = DocumentCore.Load(filepath); // <-- error thrown here
    docx.Save(textFilePath, SaveOptions.TxtDefault);
}
Rilcon42
  • @Saruman The tag for the software (Sautinsoft) doesn't exist. The post you linked deals with path length, not spaces (which seems to be my issue as paths without spaces work fine) – Rilcon42 Sep 23 '18 at 02:02
  • You are right, the length is only 199. However, this seems like an error with the library, and as such you should contact their support – TheGeneral Sep 23 '18 at 02:04
  • The code you linked has a `string docxFile = Path.Combine(workingDir, "romeo.docx");`. It looks like, even though it's not shown here, a partial path has been combined with a full file path in quotes. Maybe the library is expecting a separate path and file name, which are then combined and quoted. Here, they are all mixed up. – Jimi Sep 23 '18 at 02:53
  • @Jimi, thanks for the suggestion, but I tried that. If I remove all spaces from the path and the file name, everything works perfectly, indicating that Path.Combine() isn't the issue – Rilcon42 Sep 23 '18 at 04:33

1 Answer

I wound up confirming this was a package bug and switched to using DocX.

This was my final working solution:

    public static bool FileToTxt(string filepath)
    {
        try
        {
            // Install-Package DocX
            string textFilePath = Path.ChangeExtension(filepath, ".txt");
            DocX docx = DocX.Load(filepath);
            File.WriteAllText(textFilePath, docx.Text);
        }
        catch (Exception e)
        {
            Console.WriteLine($"{Path.GetFileName(filepath)} error: {e.Message}");
            return false;
        }
        return true;
    }
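
For the calling side, here is a minimal sketch of a directory walk that could feed `FileToTxt`. The names `DocumentFinder` and `FindDocuments` are hypothetical, and the case-insensitive extension match is an addition that the loop in the question does not have.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class DocumentFinder
{
    // Hypothetical helper: lists every file under root whose extension
    // matches one of the given extensions (e.g. ".docx"), ignoring case.
    public static List<string> FindDocuments(string root, params string[] extensions)
    {
        var wanted = new HashSet<string>(extensions, StringComparer.OrdinalIgnoreCase);

        // Path.GetExtension returns the extension including the leading dot.
        // ToList() finishes the walk before any .txt files are written
        // into the same folders.
        return Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories)
                        .Where(p => wanted.Contains(Path.GetExtension(p)))
                        .ToList();
    }
}
```

It could be used as `foreach (string path in DocumentFinder.FindDocuments(root, ".docx")) { ... }`, printing "converted" only when `FileToTxt(path)` returns `true`.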
Rilcon42