9

I'm trying to pass the handle returned from u_fgetfile into fseek/fread functions.

When linking my application with the debug runtime libraries (/MTd /MDd) there is no crash, but if I link against the static versions this simple code crashes:

#include <stdio.h>
#include "unicode\ustdio.h"

int main()
{
    UFILE* file;
    file = u_fopen("C:\\test.txt","r",NULL,"UTF-8");
    fseek(u_fgetfile(file),3,SEEK_SET);
}

Now this happens with both official builds of ICU and when I build custom builds with Visual Studio 2012 (building ICU in debug or release doesn't matter).

The only thing I have found out is that there seems to be some mismatch in the FILE structure, but I really don't know.

Edit:

As part of adding a bounty to this question, here's a fully functional VS2012 project containing both reproducer program (same as the code posted above) and icu with source and binaries. Get it here: http://goo.gl/urTuU

monoceres
  • 4,722
  • 4
  • 38
  • 63
  • I'm sure you tested this, but just for the sake of completeness: `u_fopen()` and `u_fgetfile()` do not return `NULL`, don't they? – alk Jul 12 '13 at 08:55
  • I have tested that, so no. It seems to be some sort of structure mismatch between icu and my program for the FILE structure. – monoceres Jul 12 '13 at 09:38
  • `FILE` is defined in `` which should be the same for both builds (libicu **and** the test app). Are sure your aren't mixing 32 bit and 64 bit builds? – alk Jul 12 '13 at 09:44
  • Yeah, they should be the same, but obviously something is wrong with the handle. I can't see how that would even work at all, surely if I tried to link a 64 bit lib into my project (or vice versa), the build would fail (or at least crash before reaching the fseek line)? – monoceres Jul 12 '13 at 09:49
  • I cannot test the code as I am on Linux, but is it a compilation error or a runtime error ? – Vincent Jul 12 '13 at 19:27

4 Answers4

9

It seems to me like the issue is within _lock_file where it says:

    /*
     * The way the FILE (pointed to by pf) is locked depends on whether
     * it is part of _iob[] or not
     */
    if ( (pf >= _iob) && (pf <= (&_iob[_IOB_ENTRIES-1])) )
    {
        /*
         * FILE lies in _iob[] so the lock lies in _locktable[].
         */
        _lock( _STREAM_LOCKS + (int)(pf - _iob) );
        /* We set _IOLOCKED to indicate we locked the stream */
        pf->_flag |= _IOLOCKED;
    }
    else
        /*
         * Not part of _iob[]. Therefore, *pf is a _FILEX and the
         * lock field of the struct is an initialized critical
         * section.
         */
        EnterCriticalSection( &(((_FILEX *)pf)->lock) );

A "normal" FILE* will enter the top branch, the pointer returned from u_fgetfile will enter the bottom branch. Here it is assumed that it is a _FILEX*, which is most likely simply not correct.

As we see, the runtime compares to see if the file pointer fb is within _iob. But, in the debugger, we can see clearly that it is far outside of it (at least in the release build).

Given that u_fgetfile just returns a FILE* that was stored within the UFILE structure, we can inspect finit_owner in ufile.c to see how the FILE* ends up in our structure in the first place. After reading that code, I must assume that in a release build, two separate instances of the _iob array exist in the CRT, but in the debug build, only a single instance exists.

To get around this problem, you're going to want to make sure that the FILE* is created in the same thread as your main application. To do that, you can utilize u_finit, like so:

FILE* filePointer = fopen("test.txt","r");
UFILE* file = u_finit(filePointer,NULL,"UTF-8");

fseek(filePointer,3,SEEK_SET); // <- won't crash

Regarding your issue that came up after this, it seems to me like the underlying problem is simply sharing a FILE* between libraries, which fails because they have separate storage areas for FILE*. I find this somewhat confusing, but I don't have the necessary understanding of the involved components (and the style of the Windows C runtime code isn't helping either).

So, if the FILE* is allocated in ICU, then you can't lock it in your main application and vice versa (and trying to read or seek will always involve locking).

Unless there is a very obvious solution to this problem, which I'm missing, I would recommend emulating the behavior of u_fgets() (or whatever else you'll need) in your main application.
From what I can tell, u_fgets() just calls fread() to read data from the file and then uses ucnv_toUnicode(), with the converter stored in the UFILE (which you can retrieve with u_fgetConverter()), to convert the read data into a UChar*.

One way that seems to work is linking ICU statically. I don't know if that is an option for you, but it seems to resolve the issue on my end.

I downloaded the latest release of ICU (51.2) and compiled it with this helpful script. I then linked the project against the libraries in icu-release-static-win32-vs2012 (link with sicuuc.lib, sicuio.lib, sicudt.lib, sicuin.lib).

Now u_fgets() no longer causes an access violation. Of course, now my .exe is almost 23 MB big.

Oliver Salzburg
  • 21,652
  • 20
  • 93
  • 138
  • Thanks for this! The fseek line now finally doesn't crash! However I now get the reverse behavior, using the UFILE* handle from u_finit or u_fadopt cannot be used with u_fgets without crashing :( – monoceres Jul 16 '13 at 12:17
  • @monoceres: Ouch. I only tried to `fread` the `FILE*` to see if it was valid. But I guess that was only half the story :) I'll look into it this evening and see what I can find out. – Oliver Salzburg Jul 16 '13 at 12:21
  • Yep, I just added debug info the release build of icu and I can confirm that u_fgets crashes in the else branch on the EnterCriticalSection. – monoceres Jul 16 '13 at 12:30
  • @monoceres: I gave it another look. Maybe you'll find the results helpful. – Oliver Salzburg Jul 16 '13 at 20:55
  • I definitely find the results helpful. At least I know how to work around the problem and why it crashes. I'm keeping the question open for a little longer in case someone comes in with something magic, otherwise I'm awarding the bounty to you. Oh, and thanks for the effort. I really appreciate it! – monoceres Jul 16 '13 at 22:01
  • See my answer below... you have a wrong setting in the ICO-IO project for release mode!!! – Jochen Kalmbach Jul 18 '13 at 08:52
5

The problem is "icuio51.dll" (Release) is linked against the static CRT! Therefore it does not share the same FILE pointer with the shared CRT! And that´s the reason why it crashes in "lock"...

On the other hand: ´icuio51d.dll´ (Debug) is linked against the same shared CRT (msvcr110d.dll) and therefor are using the same shared FILE* pointer.

That´s the reason while it is working in "Debug" but not in "Release".

Solution: You have to recompile the ICU with the correct settings for always using "Shared CRT" (/MD and /MDd)... To do this, you have to make the follwoing steps:

  1. Open the Solution "allinone\allinone.sln"
  2. Edit the properties of the project "io"
  3. Change "C/C++|Code Generation|Runtime Library" from "Multi-threaded (/MT)" to "Multi-threaded DLL (/MD)" for "Win32" Platform (x64 seems to be correct!)
  4. Recompile your project
Jochen Kalmbach
  • 3,549
  • 17
  • 18
3

Your problem is due to linking with the wrong libraries.

I have reproduced exactly the reverse results with the official ICU and VS2010 - Release build works while Debug build crashes.

The problem is (IMHO) in linking both to debug and release VC runtime and passing FILE pointer between functions in its different versions. Result from fopen in the Release is passed to fseek in Debug. They seem incompatible in the part of differentiating between _iob and FILEX files (different _IOB_ENTRIES).

Linking a Debug project with Release ICU and vice versa will lead to this problem.

bbonev
  • 1,406
  • 2
  • 16
  • 33
  • This makes sense and is what I believe(d) also. However I just tried the following: 1) Cleaned solutions. 2) Built ICU in release mode. 3) Built my application in release mode, linked with icuuc.lib and icuio.lib and copied icuuc51.dll, icuio51.dll, icudt51.dll, icuin51.dll to output directory. 4) Crash still remains :( – monoceres Jul 14 '13 at 15:36
  • There are instruction on how to build ICU without function renaming aiming to install it as a system library. Maybe be this way of building it may decrease the mess? Also I have tried similar things as you and couldn't make the DEBUG build to work (in my case Release works). – bbonev Jul 14 '13 at 23:57
  • 1
    Thank you, this solved the issue for me. The official build contains binaries for release only. I used the following build, which also contains binaries for debug: http://www.npcglib.org/~stathis/blog/precompiled-icu/ – fdermishin Jun 24 '14 at 10:14
0

First question: is it possible to compile and use c++11 features (because then we will have more tools to analyze what is actually happening.) ?

The first thing would be to try to look inside your file structure using this kind of code:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

int main()
{
    unsigned int size = sizeof(FILE);
    FILE* file = fopen("main.c", "r");
    unsigned int i = 0;
    unsigned char buffer[size];
    memcpy(buffer, file, size);
    printf("The %u bytes at address %p are: \n", size, file);
    for (i = 0; i < size; ++i) {
        printf("%02X ", (unsigned int)(buffer[i]));
        if ((i+1)%64 == 0) {
            printf("\n");
        }
    }
    printf("\n");
    fclose(file);
    return 0;
}

replacing FILE by your UFILE and the fopen function by your u_fopen, for two different files and for both debug and static libraries.

It will display the bytes of your file structure, and maybe we will learn usefull things about where the problem is.

Vincent
  • 57,703
  • 61
  • 205
  • 388