Actually, your problem is quite common in tasks like NLP where sequences have different lengths. In your comment you discard all of the previous outputs by using return_sequences=False,
which is not common practice and usually results in a lower-performing model.
Note: there is no single best solution in neural network architecture design.
Here is what I can suggest.
Method 1 (No custom layer required)
You can use the same latent dimension in both LSTMs, concatenate their outputs along the time dimension, and treat the result as one big hidden-state tensor.
from keras.layers import Input, LSTM, Dense, Flatten, concatenate  # used in all snippets below

input1 = Input(shape=(50, 1))
x1 = LSTM(100, return_sequences=True)(input1)
input2 = Input(shape=(25, 1))
x2 = LSTM(100, return_sequences=True)(input2)
x = concatenate([x1, x2], axis=1)
# output dimension = (None, 75, 100)
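If it helps, here is one way to finish this into a trainable model. This is only a sketch: the LSTM(64)/Dense(1) head, the optimizer, and the loss are placeholders for whatever your actual task needs.
from keras.models import Model

# continuing from the snippet above
x = LSTM(64)(x)                       # summarize the merged sequence -> (None, 64)
output = Dense(1)(x)                  # replace with your actual output layer
model = Model(inputs=[input1, input2], outputs=output)
model.compile(optimizer='adam', loss='mse')
model.summary()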
If you do not want to use the same latent dimension, what others do is add one more component, usually called a mapping layer, which consists of stacked dense layers. This approach has more parameters, which makes the model harder to train.
input1 = Input(shape=(50, 1))
x1 = LSTM(100, return_sequences=True)(input1)
input2 = Input(shape=(25, 1))
x2 = LSTM(50, return_sequences=True)(input2)
# normally we would stack more than one Dense layer here
map_x1 = Dense(75)(x1)
map_x2 = Dense(75)(x2)
x = concatenate([map_x1, map_x2], axis=1)
# output dimension = (None, 75, 75)
Or flatten both outputs:
input1 = Input(shape=(50, 1))
x1 = LSTM(100, return_sequences=True)(input1)
input2 = Input(shape=(25, 1))
x2 = LSTM(50, return_sequences=True)(input2)
flat_x1 = Flatten()(x1)   # (None, 50*100)
flat_x2 = Flatten()(x2)   # (None, 25*50)
x = concatenate([flat_x1, flat_x2], axis=1)
# output dimension = (None, 6250)
Method 2 (custom layer required)
Create a custom layer that uses an attention mechanism to produce an attention vector, and use that attention vector as the representation of your LSTM output tensor. What others do to achieve better performance is to combine the last hidden state of the LSTM (which is the only thing you currently use in your model) with the attention vector as the representation.
Note: according to research, different types of attention give almost the same performance, so I recommend "Scaled Dot-Product Attention" because it is faster to compute.
input1 = Input(shape=(50, 1))
x1 = LSTM(100, return_sequences=True)(input1)
input2 = Input(shape=(25, 1))
x2 = LSTM(50, return_sequences=True)(input2)
rep_x1 = custom_layer()(x1)   # your custom attention layer, sketched below
rep_x2 = custom_layer()(x2)
x = concatenate([rep_x1, rep_x2], axis=1)
# output = (None, length of rep_x1 + length of rep_x2)
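For reference, here is a minimal sketch of what such a custom layer could look like: scaled dot-product attention pooling against a learned query vector. The class name AttentionPooling and all implementation details are my own choices (assuming the Keras 2 API), not the only way to do it; adapt the scoring function to your needs.
from keras import backend as K
from keras.layers import Layer

class AttentionPooling(Layer):
    # Collapses (batch, timesteps, features) into (batch, features) by
    # scaled dot-product attention against a learned query vector.
    def build(self, input_shape):
        d = int(input_shape[-1])
        self.query = self.add_weight(name='query', shape=(d, 1),
                                     initializer='glorot_uniform',
                                     trainable=True)
        super(AttentionPooling, self).build(input_shape)

    def call(self, inputs):
        d = K.int_shape(inputs)[-1]
        # dot product of every hidden state with the query, scaled by sqrt(d)
        scores = K.squeeze(K.dot(inputs, self.query), axis=-1)
        scores = scores / (float(d) ** 0.5)
        # softmax over timesteps, then weighted sum of the hidden states
        weights = K.expand_dims(K.softmax(scores), axis=-1)
        return K.sum(inputs * weights, axis=1)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])
With this layer as custom_layer, rep_x1 = AttentionPooling()(x1) has shape (None, 100) and rep_x2 = AttentionPooling()(x2) has shape (None, 50), so the concatenated x is (None, 150).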