2

I am writing some font drawing shaders in OpenGL 3.3. I will render my font into a texture atlas and then generate some display lists for some text I want to draw. I would like the rendering of text to consume the least amount of resources (CPU, GPU memory, GPU time). How can I accomplish this?

Looking at Freetype-gl, I noticed that the author generates 6 indices and 4 vertices per character.
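
For comparison, the layout described above (4 vertices and 6 indices per character) can be sketched on the CPU roughly as follows. This is only an illustration of the general indexed-quad technique, not Freetype-gl's actual code; the function name `build_indexed_quads` and the glyph tuple layout are assumptions made for the example.

```python
def build_indexed_quads(glyphs):
    """glyphs: list of (x, y, w, h, u0, v0, u1, v1) tuples, one per character.

    The tuple layout is hypothetical. Returns (vertices, indices) with
    4 (x, y, u, v) vertices and 6 indices (two triangles) per glyph.
    """
    vertices = []
    indices = []
    for i, (x, y, w, h, u0, v0, u1, v1) in enumerate(glyphs):
        base = 4 * i  # index of this glyph's first vertex
        vertices += [
            (x,     y,     u0, v0),
            (x,     y + h, u0, v1),
            (x + w, y + h, u1, v1),
            (x + w, y,     u1, v0),
        ]
        # Two triangles sharing the diagonal base -> base + 2.
        indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    return vertices, indices


vertices, indices = build_indexed_quads([
    (0.0, 0.0, 8.0, 12.0, 0.00, 0.0, 0.25, 0.5),
    (9.0, 0.0, 6.0, 12.0, 0.25, 0.0, 0.50, 0.5),
])
print(len(vertices), len(indices))  # 8 12
```
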

Since I am using OpenGL 3.3, I have some additional freedom. My plan was to generate 1 vertex per character plus one integer "code" per character. The character code can be used in texelFetch operations to retrieve texture coördinates and character size information. A geometry shader turns the size information and vertex into a triangle strip.
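
To make the buffer-size trade-off concrete, here is a rough sketch that packs the one-vertex-plus-code layout and compares its size with an indexed quad. The attribute formats are assumptions (32-bit floats, a 32-bit integer code, 16-bit indices); a different choice of formats changes the numbers.

```python
import struct

# Conventional layout (assumed formats): 4 vertices of (x, y, u, v) floats
# plus 6 unsigned 16-bit indices per character.
QUAD_VERTEX = struct.Struct("<4f")
QUAD_INDEX = struct.Struct("<H")
# Proposed layout: one (x, y) float position plus one 32-bit integer code.
POINT_VERTEX = struct.Struct("<2fi")


def bytes_per_character_quads():
    return 4 * QUAD_VERTEX.size + 6 * QUAD_INDEX.size


def bytes_per_character_points():
    return POINT_VERTEX.size


def pack_points(chars):
    """chars: iterable of (x, y, code) -> bytes ready to upload to a vertex buffer."""
    return b"".join(POINT_VERTEX.pack(x, y, code) for x, y, code in chars)


data = pack_points([(0.0, 0.0, 72), (9.0, 0.0, 105)])
print(bytes_per_character_quads(), bytes_per_character_points(), len(data))  # 76 12 24
```

Note that an integer attribute such as the character code is normally set up with `glVertexAttribIPointer` rather than `glVertexAttribPointer`, so that it reaches the shader as an integer instead of being converted to float.
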

Is texelFetch going to be slower than sending more vertices/texture coördinates? Is this worth doing, or is there a reason why it's not done in the font libraries I looked at?


Final code:

Vertex shader:

#version 330

uniform sampler2D font_atlas;
uniform sampler1D code_to_texture;
uniform mat4 projection;
uniform vec2 vertex_offset;  // in view space.
uniform vec4 color;
uniform float gamma;

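in vec2 vertex;  // vertex in view space of each character adjusted for kerning, etc.
in int code;  // index into code_to_texture (an integer attribute, so set it up with glVertexAttribIPointer)

out vec4 v_uv;  // (u0, v0, u1, v1): the glyph's rectangle in the atlas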

void main()
{
    v_uv = texelFetch(
            code_to_texture,
            code,
            0);
    gl_Position = projection * vec4(vertex_offset + vertex, 0.0, 1.0);
}
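
The vertex shader expects `code_to_texture` to hold one texel per character code, containing the glyph's normalized rectangle in the atlas. A minimal host-side sketch of building that table is shown here; the function name, the pixel-rectangle input, and the choice of 32-bit float RGBA texels (for example, uploaded with an internal format such as `GL_RGBA32F`) are assumptions, not something the shaders dictate.

```python
import struct


def build_code_to_texture(glyph_rects, atlas_width, atlas_height):
    """glyph_rects: list of (x, y, w, h) pixel rectangles in the atlas, indexed by code.

    Returns packed float texels (u0, v0, u1, v1), one per code, in normalized
    texture coordinates.
    """
    texels = bytearray()
    for x, y, w, h in glyph_rects:
        u0 = x / atlas_width
        v0 = y / atlas_height
        u1 = (x + w) / atlas_width
        v1 = (y + h) / atlas_height
        texels += struct.pack("<4f", u0, v0, u1, v1)
    return bytes(texels)


table = build_code_to_texture([(0, 0, 8, 12), (8, 0, 6, 12)], 64, 64)
first = struct.unpack_from("<4f", table, 0)
print(first)  # (0.0, 0.0, 0.125, 0.1875)
```
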

Geometry shader:

#version 330

layout (points) in;
layout (triangle_strip, max_vertices = 4) out;

uniform sampler2D font_atlas;
uniform mat4 projection;

in vec4 v_uv[];

out vec2 g_uv;

void main()
{
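    vec4 pos = gl_in[0].gl_Position;
    vec4 uv = v_uv[0];  // (u0, v0, u1, v1) of the glyph in the atlas
    // Glyph size in texels, then transformed by the projection to find the opposite corner.
    vec2 size = vec2(textureSize(font_atlas, 0)) * (uv.zw - uv.xy);
    vec2 pos_opposite = pos.xy + (mat2(projection) * size);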

    gl_Position = vec4(pos.xy, 0, 1);
    g_uv = uv.xy;
    EmitVertex();

    gl_Position = vec4(pos.x, pos_opposite.y, 0, 1);
    g_uv = uv.xw;
    EmitVertex();

    gl_Position = vec4(pos_opposite.x, pos.y, 0, 1);
    g_uv = uv.zy;
    EmitVertex();

    gl_Position = vec4(pos_opposite.xy, 0, 1);
    g_uv = uv.zw;
    EmitVertex();

    EndPrimitive();
}
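
One way to sanity-check the geometry shader's arithmetic is to mirror it on the CPU. The sketch below does that for a single glyph; the function name and the example values (a 64x64 atlas and a projection that scales by 1/64) are made up for illustration.

```python
def expand_glyph(pos, uv, atlas_size, proj2x2):
    """CPU mirror of the geometry shader's corner computation.

    pos: (x, y) position after projection; uv: (u0, v0, u1, v1);
    atlas_size: (width, height) in texels;
    proj2x2: ((m00, m01), (m10, m11)), rows of the projection's upper-left 2x2 block.
    Returns four (position, uv) pairs in the same triangle-strip order as the shader.
    """
    u0, v0, u1, v1 = uv
    # Glyph size in texels, recovered from the normalized rectangle.
    size = (atlas_size[0] * (u1 - u0), atlas_size[1] * (v1 - v0))
    # Equivalent of mat2(projection) * size.
    dx = proj2x2[0][0] * size[0] + proj2x2[0][1] * size[1]
    dy = proj2x2[1][0] * size[0] + proj2x2[1][1] * size[1]
    ox, oy = pos[0] + dx, pos[1] + dy
    return [
        ((pos[0], pos[1]), (u0, v0)),
        ((pos[0], oy), (u0, v1)),
        ((ox, pos[1]), (u1, v0)),
        ((ox, oy), (u1, v1)),
    ]


corners = expand_glyph(
    pos=(-1.0, -1.0),
    uv=(0.0, 0.0, 0.125, 0.1875),
    atlas_size=(64, 64),
    proj2x2=((1 / 64, 0.0), (0.0, 1 / 64)),
)
print(corners[3])  # ((-0.875, -0.8125), (0.125, 0.1875))
```
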

Fragment shader:

#version 330

uniform sampler2D font_atlas;
uniform vec4 color;
uniform float gamma;

in vec2 g_uv;

layout (location = 0) out vec4 fragment_color;

void main()
{
    float a = texture(font_atlas, g_uv).r;
    fragment_color.rgb = color.rgb;
    fragment_color.a = color.a * pow(a, 1.0 / gamma);
}
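
The effect of the `gamma` uniform on the glyph's alpha can be checked outside the shader. This small sketch repeats the fragment shader's formula; the function name and the sample values are chosen only for illustration.

```python
def glyph_alpha(coverage, color_alpha, gamma):
    """Same arithmetic as the fragment shader: color.a * pow(a, 1.0 / gamma)."""
    return color_alpha * coverage ** (1.0 / gamma)


for gamma in (1.0, 2.2):
    samples = [round(glyph_alpha(c, 1.0, gamma), 3) for c in (0.0, 0.25, 0.5, 1.0)]
    print(gamma, samples)
```

With `gamma` equal to 1.0 the atlas coverage passes through unchanged; values above 1.0 raise the alpha of partially covered texels, which makes the glyph edges look heavier.
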
Neil G

2 Answers

1

Maybe you can use an atomic counter to track the current position in the text.

Here is an interesting paper on memory bandwidth and GPU performance...

You can cache the result in an FBO.

For really fast rendering, as you said, you could build a geometry shader that takes points as input and outputs quads, and sample a texture to get additional glyph info.

This does appear to be the best solution...

j-p
  • Interesting use of atomic counter, and sending the string as yet another texture! Why can't I use code points, which the card translates into atlas displacements using another texture and some texelFetch commands? – Neil G Mar 20 '14 at 09:16
  • Because if you arrange your atlas the same way as Unicode code points, you'll rapidly run out of memory :-), so you will surely need a CodePoint->AtlasIndex converter when inputting characters – j-p Mar 20 '14 at 09:21
  • texelFetch for getting your indices, why not, but it seems a bit heavy to me – j-p Mar 20 '14 at 09:23
  • That's my question: will it be slower to send UV into the texture atlas as you're suggesting or code points, which are translated into UV using texelFetch? – Neil G Mar 20 '14 at 09:24
  • A couple of arithmetic operations or a texel fetch: I would opt for the one or two ops... – j-p Mar 20 '14 at 09:27
  • Yes, but without the texelFetch, you add maybe 5 words per character (2 UVs, 1 character size, all 3 of which are vec2). Isn't there a cost to buffering the data, cache misses, etc.? – Neil G Mar 20 '14 at 09:32
  • No, the UV may be computed from the char index. The char size should be a constant multiplier for a shader run, whose result will be cached in an FBO that could be blended for multiple char sizes; the size would be a uniform, potentially modified between shader runs – j-p Mar 20 '14 at 09:37
  • I see, you're assuming a monospace font… :) – Neil G Mar 20 '14 at 09:38
  • Lol, right. See the paper I linked in my response; there are memory bandwidth considerations that would answer most of your concerns – j-p Mar 20 '14 at 09:40
  • Is it possible to fetch a pixel from a geometry shader? – j-p Mar 20 '14 at 09:47
  • Yes, after some quick study, your solution appears to me to be the good choice – j-p Mar 20 '14 at 10:03
  • Thanks for your input. I got it working and would be happy to post my shaders if anyone is interested. Since I only learned opengl 3 a couple weeks ago, it was nice to have confirmation that it would work before starting down this path. – Neil G Mar 26 '14 at 08:43
  • @NeilG Sure I'm interested, nice job :-) – j-p Mar 26 '14 at 09:50
  • Thank you. Added to the question. (Feedback welcome.) – Neil G Mar 26 '14 at 10:07
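
The CodePoint->AtlasIndex converter mentioned in the comments could be as simple as a dictionary built when the atlas is generated, so that only characters actually present in the atlas get an index. A minimal sketch, with hypothetical function names and a made-up fallback convention (index 0 for unknown characters):

```python
def build_codepoint_index(characters):
    """Assign consecutive atlas indices to the distinct characters in the atlas."""
    index = {}
    for ch in characters:
        # Only the first occurrence of a character gets a new index.
        index.setdefault(ord(ch), len(index))
    return index


def encode(text, index, fallback=0):
    """Translate text into per-character atlas codes; unknown characters map to `fallback`."""
    return [index.get(ord(ch), fallback) for ch in text]


index = build_codepoint_index("?abcé")
print(encode("cab€", index))  # [3, 1, 2, 0]
```
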
1

I wouldn't expect there to be a significant performance difference between your proposed method and storing the quad vertex positions and texture coordinates in a vertex buffer. On the one hand, your method requires a smaller vertex buffer and less work for the CPU. On the other hand, the texelFetch calls will be at more-or-less random locations and will not make the best use of the cache. This last point may not be very significant, as I guess that texture won't be very large. Also, the execution model of geometry shaders means they can quickly become the bottleneck of the pipeline.

To answer "is this worth doing?" - I suspect not for performance reasons. Unfortunately you can't tell until you implement it and measure the performance. I think it's quite a cool idea though, so I don't think you'd be wasting your time trying it out.

GuyRT
  • Thanks for your input. I got it working and would be happy to post my shaders if anyone is interested. Since I only learned opengl 3 a couple weeks ago, it was nice to have confirmation that it would work before starting down this path. – Neil G Mar 26 '14 at 08:39