The affine filter is a great way to do it.
Here is an example:
echo "
description=Square
frame_rate_num=24
frame_rate_den=1
width=640
height=640
progressive=1
sample_aspect_num=1
sample_aspect_den=1
display_aspect_num=1
display_aspect_den=1
colorspace=709
" > square_profile.txt
melt -profile ./square_profile.txt clip.mp4 -filter affine transition.geometry="0=0,0:1138x640; 720=-498,0:1138x640"
The example assumes:
- clip.mp4 is a 16x9 source (1920x1080 would work)
- Clip is 720 frames long (e.g. 30 seconds at 24 fps)
Let me break down the example for you.
The first part specifies a custom profile that is 640x640 and has a square aspect ratio. You don't need to create the file every time. You can customize it to your specifications.
-profile ./square_profile.txt
This tells melt to use your custom profile.
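If you need the same kind of profile at other sizes, a small shell function can write it for you. This is only a sketch: write_profile and its arguments are hypothetical names (not part of melt), the keys are simply the ones used in the example profile, and colorspace=709 (Rec. 709) is an assumption you may need to change.

```shell
#!/bin/sh
# Hypothetical helper (not part of melt): writes a square-pixel, progressive
# profile using the same keys as the example profile.
# usage: write_profile WIDTH HEIGHT FPS OUTPUT_FILE
write_profile() {
    cat > "$4" <<EOF
description=Custom ${1}x${2}
frame_rate_num=${3}
frame_rate_den=1
width=${1}
height=${2}
progressive=1
sample_aspect_num=1
sample_aspect_den=1
display_aspect_num=${1}
display_aspect_den=${2}
colorspace=709
EOF
}

# Recreate the 640x640, 24 fps profile from the example.
write_profile 640 640 24 square_profile.txt
```

The display aspect is written as WIDTH/HEIGHT because the sample aspect is 1:1 (square pixels); with non-square pixels the two would differ.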
transition.geometry= ...
This is how you tell the affine transition (which the affine filter uses internally) what you want it to do. The first number of each geometry entry is the frame number that it applies to. The filter will interpolate values between frames. The syntax for a geometry entry is: "K=X,Y:WxH" where "K" is the key frame that the geometry applies to.
0=0,0:1138x640
The first geometry entry tells the affine filter to scale the image to 1138x640 and to position the image at 0,0.
640 is the height of the output, so this tells affine to scale the original image to a height of 640 to fill the output frame. 1138 is the width of a 16x9 image that is 640 pixels high (640 * 16 / 9, rounded). Because 1138 is wider than the 640-pixel output and the image is positioned at 0,0, the right part of the image is cropped off by the affine filter.
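For a source with different dimensions, the scaled width can be worked out the same way. A minimal sketch in shell arithmetic, assuming a 1920x1080 source; the variable names are mine:

```shell
#!/bin/sh
# Source and output dimensions from the example (adjust to your clip).
src_w=1920
src_h=1080
out_h=640

# Scale the source to the output height, keeping its aspect ratio.
# Adding src_h / 2 before dividing rounds to the nearest pixel.
scaled_w=$(( (src_w * out_h + src_h / 2) / src_h ))

echo "${scaled_w}x${out_h}"   # prints 1138x640
```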
720=-498,0:1138x640
The second geometry entry tells the affine filter to keep the same scaling but to position the image at an x location of -498. 498 = 1138 - 640, that is, the number of pixels that were cropped off the right side of the image in the first frame. The negative value places the image's left edge to the left of the output frame, so the left part of the image is cropped off instead. "720=" specifies that this is the geometry for frame 720.
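The offset and the whole geometry string can be derived from the sizes rather than typed by hand. A rough sketch, with variable names of my own choosing and the example's numbers filled in:

```shell
#!/bin/sh
scaled_w=1138     # width of the scaled image
out_w=640         # width of the output frame
out_h=640         # height of the output frame
last_frame=720    # clip length in frames (30 s * 24 fps)

# Pixels that overhang the output frame; the final x position is its negative.
overhang=$(( scaled_w - out_w ))

geometry="0=0,0:${scaled_w}x${out_h}; ${last_frame}=-${overhang},0:${scaled_w}x${out_h}"
echo "$geometry"
# prints 0=0,0:1138x640; 720=-498,0:1138x640
```

The resulting string is what goes after transition.geometry= on the melt command line.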
The x position for all frames between 0 and 720 will be interpolated automatically by the affine filter, so the visible window pans from the left side of the image to the right side as the clip plays.
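To get a feel for where the image sits at a given frame, you can estimate the interpolated x position yourself. This sketch assumes plain linear interpolation between the two key frames, which I have not verified against the affine source; x_at is a hypothetical helper, and integer division means the results are approximate.

```shell
#!/bin/sh
# Assumes plain linear interpolation between the two key frames.
x_start=0         # x at frame 0
x_end=-498        # x at the last key frame
last_frame=720

# x_at FRAME: prints the estimated x position at that frame.
x_at() {
    echo $(( x_start + (x_end - x_start) * $1 / last_frame ))
}

for frame in 0 180 360 540 720; do
    echo "frame $frame: x=$(x_at "$frame")"
done
```

Under that assumption the image is halfway across (x = -249) at frame 360.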
You can add more key frames to the geometry to make it pause at a particular position or to make it go back and forth. The affine transition (which the affine filter uses) also has other interesting operations like mirror and cycle. You can see the full documentation here:
http://www.mltframework.org/bin/view/MLT/TransitionAffine#scale
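As an illustration of the extra key frames mentioned above, here is a sketch that pans across, pauses, and then pans back. It reuses the same K=X,Y:WxH syntax and the same clip assumptions (16x9 source, 720 frames); the frame numbers 240 and 480 are arbitrary choices of mine.

```shell
#!/bin/sh
# Pan to the right edge over frames 0-240, hold there until frame 480,
# then pan back to the left edge by frame 720.
geometry="0=0,0:1138x640; 240=-498,0:1138x640; 480=-498,0:1138x640; 720=0,0:1138x640"

# Build the command as text so it can be inspected before running it.
cmd="melt -profile ./square_profile.txt clip.mp4 -filter affine transition.geometry=\"$geometry\""
echo "$cmd"
```

Repeating the same position in two consecutive key frames (240 and 480 here) is what produces the pause.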