
I work in autonomous robotics. I will often simulate a robot without visualization, export position and rotation data to a file at ~30 fps, and then play that file back at a later time. Currently, I save the animation data in a custom-format JSON file and animate using three.js.
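To give a sense of the format, here is a simplified, hypothetical sketch of what one frame of such a file might look like (the field names and values are illustrative, not my actual schema), expressed as TypeScript:

    // Hypothetical shape of one exported sample (names are illustrative):
    interface Frame {
      t: number;                                   // timestamp in seconds
      position: [number, number, number];          // world-space position
      rotation: [number, number, number, number];  // quaternion (x, y, z, w)
    }

    // A recording is then just an array of ~30 samples per simulated second:
    const frames: Frame[] = [
      { t: 0.000, position: [1.2, 3.4, 4.5], rotation: [0.0, 0.0, 0.0, 1.0] },
      { t: 0.033, position: [1.3, 3.4, 4.6], rotation: [0.1, 0.0, 0.0, 0.995] },
      // ...
    ];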

I am wondering if there is a better way to export this data.

I am not well versed in animation, but I suspect that I could be exporting to something like COLLADA or glTF and gain the benefits of using a format that many systems are already set up to import.

I have a few questions (some specific and some general):

  1. How do animations usually get exported in these formats? It seems that most of them have something to do with skeletons or morphing, but neither of those concepts appears to apply to my case. (Could I get a pointer to an overview of general animation concepts?)

  2. I don't really need key-framing. Is it reasonable to have key-frames at 30 to 60 fps without any need for interpolation?

  3. Do any standard animation formats save data in a format that doesn't assume some form of interpolation?

  4. Am I missing something? I'm sure my lack of knowledge in the area has hidden something that is obvious to animators.

drajc

2 Answers


The type of animation you describe is often called "baked" animation, where some calculation has been sampled, possibly at 30 to 60 fps, with keyframes saved at the high sample rate. For such animations, linear interpolation is usually applied. For example, in Blender, there's a way to run the Blender Game Engine and record the physics simulation to (dense) keyframes.

As for interpolation, here's a thought experiment: Consider for a moment that a polygon-based render engine wants to render a circle, but must use only straight lines. Some limited number of points are calculated around the edge of the circle, and dozens or hundreds of small line segments fill in the gaps between the points. With enough density, or with the camera far enough back, it looks round, but the line segments ensure there are no leaks or gaps in the would-be circle. The same concept applies (in time rather than in space) to baked keyframes. There's a high sample density, and straight lines (linear interpolation) fill in the gaps. If you play it in super-slow motion, you might be able to detect subtle changes in speed as new keyframes are reached. But at normal speed, it looks normal, and the frame rate doesn't need to stay locked to the sample rate.
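To make the analogy concrete, here's a minimal sketch (the function is my own illustration, not part of any particular format) of sampling one channel of a densely baked track with linear interpolation:

    // Sample a baked scalar track at an arbitrary playback time using
    // linear interpolation. Assumes `times` is sorted ascending and that
    // `values[i]` is the sample taken at `times[i]`.
    function sampleLinear(times: number[], values: number[], t: number): number {
      if (t <= times[0]) return values[0];
      if (t >= times[times.length - 1]) return values[values.length - 1];
      // Find the keyframe interval containing t (linear scan for clarity;
      // a real player would use a binary search or a running cursor).
      let i = 0;
      while (times[i + 1] < t) i++;
      const alpha = (t - times[i]) / (times[i + 1] - times[i]); // 0..1
      return values[i] + alpha * (values[i + 1] - values[i]);
    }

One caveat: this per-component blend is fine for positions, but for rotations glTF's LINEAR mode means spherical linear interpolation (slerp) of the quaternions.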

There's a section on animations for glTF 2.0 that I'd recommend reading here (disclaimer: I'm a glTF contributor and member of the working group). In particular, look at the descriptions of node-based animations with linear interpolation.

For robotics, you'll want to steer clear of skins and skeleton-based animation. Such things are not always compatible with node-based animations anyway (we've run into problems there just recently). The node-based animations are much more applicable to non-deforming robots with articulated joints and such.
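Since you're already animating with three.js: a node-based glTF animation of this kind plays back with very little code. Here's a sketch, assuming the stock GLTFLoader and AnimationMixer (the file name is a placeholder):

    import * as THREE from 'three';
    import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js';

    const scene = new THREE.Scene();
    const clock = new THREE.Clock();
    let mixer: THREE.AnimationMixer | undefined;

    // 'robot.gltf' is a placeholder for your exported asset.
    new GLTFLoader().load('robot.gltf', (gltf) => {
      scene.add(gltf.scene);
      // The baked node animations arrive as AnimationClips; play them all.
      mixer = new THREE.AnimationMixer(gltf.scene);
      gltf.animations.forEach((clip) => mixer!.clipAction(clip).play());
    });

    function animate() {
      requestAnimationFrame(animate);
      mixer?.update(clock.getDelta()); // advance playback by elapsed time
      // renderer.render(scene, camera); // renderer/camera setup omitted
    }
    animate();

Slow-motion replay then becomes a matter of scaling the delta passed to the mixer (or setting the mixer's timeScale).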

emackey
  • Thank you for the thorough answer. So, are you saying that I should still be concerned with interpolation even when I export dense key-frames (since it would improve smoothness of playback)? That seems to make sense as I also have the ability to do slo-mo replay in my visualizer. Before I posted, I read a bit about the COLLADA and glTF animation formats, but the terminology is foreign enough to me that I asked the above question. Could you recommend a source for a quick primer on what these terms mean? – drajc Dec 14 '17 at 16:58

You specifically mentioned autonomous robots, and position and rotation in particular. So I assume that the robot itself is the level of granularity that is supposed to be stored here (just to differentiate it from an articulated robot, i.e. a manipulator ("arm") with several rotational or translational joints that may have different angles).

For this case, here is a very short, high-level description of how this could be stored in glTF(*):

You would store the robot (or each robot) as one node of a glTF asset. Each of these nodes can contain a translation and a rotation property (given as a 3D vector and a quaternion, respectively). These nodes then simply describe the position and orientation of your robots. You could imagine the robot being "attached" to these nodes. (In fact, you can attach a mesh to these nodes in glTF, which could then be the visual representation of the robot.)
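Sketched as an object literal, such a node might look like this (all values are placeholders; a real rotation would be a unit quaternion):

    // One glTF node per robot. The translation is a 3D vector and the
    // rotation a quaternion (x, y, z, w); values here are placeholders.
    const robotNode = {
      name: "robot",
      translation: [1.2, 3.4, 4.5],
      rotation: [0.0, 0.0, 0.0, 1.0],
      mesh: 0  // optional: index of a mesh serving as the visual representation
    };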

The animation data itself is then a description of how these properties (translation and rotation) change over time. The way this information is stored can be imagined as a table that associates a translation and a rotation with each time stamp:

time (s)        0.1   0.2  ...  1.0

translation x   1.2   1.3  ...  2.3
translation y   3.4   3.4  ...  4.3
translation z   4.5   4.6  ...  4.9

rotation x      0.12  0.13 ...  0.42
rotation y      0.32  0.43 ...  0.53
rotation z      0.14  0.13 ...  0.34
rotation w      0.53  0.46 ...  0.45

This information is then stored in binary form and provided via so-called accessor objects.

The animation of a glTF asset then establishes the connection between this binary animation data and the node properties it affects: each animation refers to such a "data table" and to the node whose properties will be filled with the new translation and rotation values as time progresses.
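In glTF terms, that connection looks roughly like this (sketched as an object literal; the accessor and node indices are placeholders):

    // One sampler per animated property: `input` is the index of the
    // accessor holding the timestamps, `output` the accessor holding the
    // sampled values. Each channel routes a sampler to a node property.
    const animation = {
      samplers: [
        { input: 0, output: 1, interpolation: "LINEAR" }, // times -> translations
        { input: 0, output: 2, interpolation: "LINEAR" }  // times -> rotations
      ],
      channels: [
        { sampler: 0, target: { node: 0, path: "translation" } },
        { sampler: 1, target: { node: 0, path: "rotation" } }
      ]
    };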

Regarding interpolation:

In your case, where the output is sampled at a high rate from the simulation, basically every frame is a "key frame", and no explicit key-frame selection or elaborate interpolation scheme has to be stored. Just declaring the animation interpolation to be of type LINEAR or STEP should be sufficient for this use case.

(The option to declare it as LINEAR interpolation will mainly be relevant for playback. Imagine you stop your playback exactly after 0.15 seconds: should it then show the state that the robot had at time stamp 0.1, the state at time stamp 0.2, or one that is interpolated linearly? This, however, would mainly apply to a standard viewer, and not necessarily to a custom playback.)
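As a tiny worked example with the numbers from the table above:

    // Stopping playback at t = 0.15 s falls halfway between the samples
    // taken at 0.1 s and 0.2 s (translation x: 1.2 and 1.3 in the table).
    const stepX   = 1.2;                      // STEP: hold the previous keyframe
    const linearX = 1.2 + 0.5 * (1.3 - 1.2);  // LINEAR: blend halfway -> 1.25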


(*) A side note: on a conceptual level, the way the information is represented in glTF and COLLADA is similar. Roughly speaking, COLLADA is an interchange format for authoring applications, whereas glTF is a transmission format that can be transferred and rendered efficiently. So although the answers so far refer to glTF, you should consider COLLADA as well, depending on your priorities, your use cases, and how the "playback" you mentioned is supposed to be implemented.

Disclaimer: I'm a glTF contributor as well. I also created the glTF tutorial section showing a simple animation and the one that explains some concepts of animations in glTF. You might find them useful, but they obviously build upon some of the concepts that are explained in the earlier sections.

Marco13
  • Thank you for your answer. Yours combined with @emackey's has been very helpful, and I think I've been put on the right track as far as implementing my visualizer. – drajc Dec 19 '17 at 16:53