I have a machine learning model (an autoencoder) that learns a sparse representation of an input signal via a simple L1 penalty term added to the objective function. This does promote sparsity, in the sense that most elements of the learned representation are zero. However, I need the sparsity to be structured, such that the non-zero elements are "spread out" roughly uniformly over the vector. Concretely, for a given input signal, my model currently produces a sparse representation that looks like this:
Current sparse code:
[..., 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.2, 0.3, 0.5, 0.9, 0.3, 0.2, 0.1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...]
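To be explicit about the setup: the penalty I am currently using is just the L1 norm of the code added to the reconstruction loss, roughly like the following minimal PyTorch sketch (the architecture, dimensions, and penalty weight here are placeholders, not my exact model):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    def __init__(self, input_dim=784, code_dim=512):
        super().__init__()
        self.encoder = nn.Linear(input_dim, code_dim)
        self.decoder = nn.Linear(code_dim, input_dim)

    def forward(self, x):
        z = torch.relu(self.encoder(x))   # sparse code (the vector shown above)
        x_hat = self.decoder(z)
        return x_hat, z

model = SparseAutoencoder()
x = torch.randn(32, 784)                  # dummy input batch
x_hat, z = model(x)

l1_weight = 1e-3                          # placeholder penalty strength
loss = F.mse_loss(x_hat, x) + l1_weight * z.abs().mean()
loss.backward()
```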
As you can see in the current sparse code above, most elements are zero, but the non-zero elements appear in small clusters. Instead, I want the non-zero elements to "repel" each other, so that each non-zero element is surrounded by one or more zeros and few or no non-zero elements are adjacent in the vector. Concretely, the code should look more like this:
Desired sparse code:
[..., 0, 0, 0, 0, 0, 0.2, 0, 0, 0, 0, 0.9, 0, 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 0.7, 0, 0, 0, 0.4, 0, 0, 0.6, ...]
In the desired sparse code, the number of non-zero elements may be similar to the current one, but each non-zero element is separated from its neighbours by some number of zeros.
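For concreteness, the property I want could be quantified by counting adjacent non-zero pairs, as in this small NumPy sketch (the two arrays are made-up stand-ins for the codes shown above, not actual model output):

```python
import numpy as np

def adjacent_nonzero_pairs(code, tol=1e-8):
    """Count positions where two consecutive elements are both non-zero."""
    nz = np.abs(code) > tol
    return int(np.sum(nz[:-1] & nz[1:]))

current = np.array([0, 0, 0, 0.2, 0.3, 0.5, 0.9, 0.3, 0.2, 0.1, 0, 0, 0])
desired = np.array([0, 0.2, 0, 0, 0, 0.9, 0, 0, 0.5, 0, 0, 0.7, 0])

print(adjacent_nonzero_pairs(current))  # 6 -> non-zeros are clustered
print(adjacent_nonzero_pairs(desired))  # 0 -> non-zeros are isolated
```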
Is there a straightforward objective function penalty I can use to induce this form of sparsity?