OpenGL is a forward renderer that restricts the objects it can rasterise to points, lines and polygons. The starting point is that all 3d shapes are built from those two-or-fewer-dimensional primitives. OpenGL does not in itself have a concept of solid, filled 3d geometry, and therefore has no built-in notion of how far through an object a particular fragment conceptually runs, only of how many times the view ray enters or exits the surface.
Since it became possible to write shader programs, a variety of ways around the problem have opened up, the most obvious for your purposes being ray casting. You could upload a cube as geometry, set to render back faces rather than front faces, and your actual object in voxel form as a 3d texture map. In your shader, for each pixel you start at one place in the 3d texture, get a vector towards the camera and walk forwards, resampling at suitable intervals.
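As a rough sketch of that fragment shader (the uniform and varying names here, uVolume, uCameraObjPos and so on, are purely illustrative, and it assumes a unit cube whose object-space coordinates double as 3d texture coordinates):

    #version 330 core

    // Ray-marching sketch: drawn on the cube's back faces, so vObjectPos is
    // where the view ray exits the volume; march from there towards the camera.
    in vec3 vObjectPos;            // in [0,1]^3, matching the 3d texture coordinates
    out vec4 fragColour;

    uniform sampler3D uVolume;     // the object in voxel form
    uniform vec3 uCameraObjPos;    // camera position in the cube's object space
    uniform int uSteps;            // e.g. 256
    uniform float uDensityScale;   // overall opacity control

    void main()
    {
        vec3 dir = normalize(uCameraObjPos - vObjectPos);
        float stepLen = 1.7320508 / float(uSteps);   // cube diagonal / step count
        vec3 pos = vObjectPos;

        float accumulated = 0.0;
        for (int i = 0; i < uSteps; ++i)
        {
            // Stop once the ray leaves the unit cube.
            if (any(lessThan(pos, vec3(0.0))) || any(greaterThan(pos, vec3(1.0))))
                break;
            accumulated += texture(uVolume, pos).r * stepLen * uDensityScale;
            pos += dir * stepLen;
        }

        // Denser paths through the volume come out more opaque.
        fragColour = vec4(1.0, 1.0, 1.0, clamp(accumulated, 0.0, 1.0));
    }

Drawn with blending enabled, the output alpha then reflects how much material the ray passed through.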
A faster and easier-to-debug solution would be to build a BSP tree of your object for the purposes of breaking it into convex sections that can be drawn in back-to-front order. Prepare two depth buffers and a single pixel buffer. For clarity, call one depth buffer the back buffer and the other the front buffer.
You're going to step through the convex sections of the model from back to front, alternating between rendering to the back depth buffer with no colour output and rendering to the front depth buffer with colour output. You could get by with a single depth buffer in a software renderer, but OpenGL doesn't allow a buffer to be read from while it is the render target, for various pipeline reasons.
For each convex section, first render back-facing polygons to the back buffer. Then render front-facing polygons to the front buffer and the colour buffer. Write a shader so that every pixel you output calculates its opacity as the difference between its depth and the depth stored at its location in the back buffer.
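A minimal version of that shader might look like the following, assuming the back depth buffer is bound as a depth texture (uBackDepth, uViewportSize and the other names are made up for illustration):

    #version 330 core

    // Front-face pass sketch: opacity comes from the depth difference between
    // this fragment and the back-face depth already stored for this pixel.
    out vec4 fragColour;

    uniform sampler2D uBackDepth;   // the back buffer from the previous pass
    uniform vec2 uViewportSize;     // to map gl_FragCoord.xy to texture coordinates
    uniform vec3 uColour;
    uniform float uThicknessScale;

    void main()
    {
        vec2 uv = gl_FragCoord.xy / uViewportSize;
        float backDepth  = texture(uBackDepth, uv).r;
        float frontDepth = gl_FragCoord.z;

        // The further apart the two surfaces are, the thicker this convex
        // section is at this pixel, so the more opaque the output. Window-space
        // depth is non-linear, so linearising both values first gives a truer
        // thickness if you need it.
        float thickness = max(backDepth - frontDepth, 0.0);
        fragColour = vec4(uColour, clamp(thickness * uThicknessScale, 0.0, 1.0));
    }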
If you're concerned about the camera intersecting your model, you could also render the front-facing polygons to the back buffer (most conveniently immediately after rendering them to the front buffer, when you switch targets), then at the end draw a full-screen polygon at the front plane that outputs a suitable alpha wherever the value of the back buffer differs from that of the front buffer.
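One plausible shape for that final full-screen pass, with both depth buffers assumed to be bound as textures and the near plane standing in for the clipped front surface:

    #version 330 core

    // Full-screen pass sketch for the camera-inside-the-model case. Where the
    // two depth buffers agree, the front face was drawn normally and nothing
    // more is needed; where they differ, the front face was clipped by the
    // near plane, so measure thickness from the near plane to the back face.
    out vec4 fragColour;

    uniform sampler2D uFrontDepth;
    uniform sampler2D uBackDepth;
    uniform vec2 uViewportSize;
    uniform vec3 uColour;
    uniform float uThicknessScale;

    void main()
    {
        vec2 uv = gl_FragCoord.xy / uViewportSize;
        float front = texture(uFrontDepth, uv).r;
        float back  = texture(uBackDepth, uv).r;

        if (abs(back - front) < 1e-5)
            discard;    // buffers agree: nothing was clipped at this pixel

        // The near plane sits at window-space depth 0, so the back-buffer value
        // approximates the thickness of the clipped section at this pixel.
        fragColour = vec4(uColour, clamp(back * uThicknessScale, 0.0, 1.0));
    }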
Addition: if the source data is some sort of voxel data, as from a CT or MRI scanner, then an alternative to ray casting is to upload it as a 3d texture and draw a series of slices (parallel to the view plane if possible, along the current major axis otherwise). You can see some documentation and a demo at Nvidia's Developer Zone.
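The per-slice shader for that approach is tiny; as a sketch (uVolume, vTexCoord and uSliceOpacity are assumed names), each slice quad carries a 3d texture coordinate and standard back-to-front alpha blending does the compositing:

    #version 330 core

    // Per-slice fragment shader sketch for texture-slicing volume rendering.
    // The application draws a stack of quads back to front with blending on;
    // each vertex carries a coordinate into the 3d texture.
    in vec3 vTexCoord;
    out vec4 fragColour;

    uniform sampler3D uVolume;
    uniform float uSliceOpacity;   // scaled so the whole stack adds up sensibly

    void main()
    {
        float density = texture(uVolume, vTexCoord).r;
        fragColour = vec4(vec3(density), density * uSliceOpacity);
    }

Blending would typically be glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) with the slices drawn back to front.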