0

For string variables in DigitalMicrograph, we can find the position of a particular pattern using the "find" function:

Number find( String str, String sub_str )

I would like to do the same but with image data. For example, I can create an image with

image img := exprsize(1024, icol);

and the pattern I want to find is

image pattern := exprsize( 15, icol+64 );

In the above case, we know that the pattern is offset from the data by 64 columns. In a real case we won't have such a simple pattern (i.e. a straight line). A brute-force approach with a "for" loop will certainly work, but it gets painfully slow as the data size grows. Does anyone have a better or more elegant suggestion? A 1D image may be easier, but how about a 2D image?

Many thanks!

KEVIVI
  • This is an interesting question, but the answer depends on what you mean by 'image data'. Do you mean real-world image data that includes noise? If so, then a pattern will never match exactly and something noise-tolerant like a cross-correlation will be needed. If you simply mean some sort of numeric data that happens to be stored in an image array, then you might be able to locate an exact match by converting the numeric data to a hex string representation and then using the Find function. Please provide more details about your problem. – Mike Kundmann Jan 17 '16 at 02:28
  • I'm looking for the exact match so it looks like the hex string is the way to go. Is there a fast way to convert binary to hex string (i.e. not converting one byte at a time) just using DM scripts? There are indeed many bin-hex conversion apps out there. – KEVIVI Jan 17 '16 at 12:42
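
The hex-string idea from the comments above can be sketched outside DigitalMicrograph. The following is a minimal Python illustration, not DM script; the helper name `find_pattern_hex` and the choice of little-endian 4-byte floats are assumptions made for this example. It also shows one pitfall of the approach: a string match that does not start on an element boundary is a false hit and has to be skipped.

```python
import struct

def find_pattern_hex(data, pattern, fmt="<f"):
    """Return the element index of the first exact match of pattern in data, or -1.

    fmt is the struct format of one element (assumed: little-endian 4-byte float).
    """
    item_hex_len = 2 * struct.calcsize(fmt)  # hex characters per element
    data_hex = b"".join(struct.pack(fmt, v) for v in data).hex()
    pattern_hex = b"".join(struct.pack(fmt, v) for v in pattern).hex()

    pos = data_hex.find(pattern_hex)
    # A hit that straddles element boundaries is not a real match: keep looking.
    while pos != -1 and pos % item_hex_len != 0:
        pos = data_hex.find(pattern_hex, pos + 1)
    return -1 if pos == -1 else pos // item_hex_len

# Mirrors the question: a 1024-element ramp and a 15-element pattern offset by 64.
data = [float(i) for i in range(1024)]
pattern = [float(i + 64) for i in range(15)]
print(find_pattern_hex(data, pattern))  # prints 64
```

The string search itself is fast, so the cost is dominated by the conversion to hex, which is the step the comment above asks about.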

3 Answers

2

Given that you are effectively looking for an exact match to numeric data, then judicious use of image expressions may be the most efficient path to a solution. Roughly following your example, we begin by setting up source data and target pattern:

Image sourceData := RealImage("Source data", 4, 4096);
sourceData = Random();

Image targetPattern := RealImage("Target pattern", 4, 15);
targetPattern = sourceData.Index(icol + 1733, 0);

Then we prepare a carefully arranged search buffer with a single image expression:

Number targetSize = targetPattern.ImageGetDimensionSize(0);
Number searchBufferW = sourceData.ImageGetDimensionSize(0) - targetSize + 1; // + 1 so the last possible offset is also searched
Image searchBuffer := RealImage("Search buffer", 4, searchBufferW, targetSize);
searchBuffer = sourceData.Index(icol + irow, 0);

This arranges all potential matching subsets of the source data in vertical columns of a 2D image. Finally we do a little image math to locate the match to the target pattern, if one exists:

searchBuffer = Abs(searchBuffer - targetPattern.Index(irow, 0));
Image projectionVector := targetPattern.ImageClone();
projectionVector = 1.0;
Image searchResult := projectionVector.MatrixMultiply(searchBuffer);

Number posX, posY;
Number wasFound = (searchResult.Min(posX, posY) == 0);
String resultMsg = (wasFound) ? "Pattern found at " + posX : "Pattern not found";
OKDialog(resultMsg);

The first line replaces each search buffer pixel with its absolute difference from the target pattern, so a column that matches the pattern contains only exact zeros. Vertically summing the search buffer and using the Min() function to look for a zero then locates a match quickly.

Note the use of MatrixMultiply() to do a rapid vertical sum projection. This will only work for type Real (4-byte floating point) source data. There are, however, slightly more complex approaches to rapid data projection that will also give a fairly quick result for any numeric data type.

Although illustrated for a 1D pattern in a 1D data set, this approach can probably be extended to 1D and 2D patterns in 2D and 3D data sets by using a multi-dimensioned search buffer and more advanced indexing using ImageDataSlice objects, but that would be a subject for another question.
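
For readers without DigitalMicrograph at hand, the logic of the search buffer can be checked with a small Python sketch. This is an illustration of the idea only, not the DM implementation: the function name `find_exact_match` is made up for this example, and the nested loops stand in for the image expressions and the MatrixMultiply() projection.

```python
import random

def find_exact_match(source, target):
    """Return the first offset at which target occurs exactly in source, or -1."""
    # + 1 so that the final possible offset is included.
    n_windows = len(source) - len(target) + 1
    if n_windows <= 0:
        return -1
    # "Column" c of the search buffer holds source[c : c + len(target)].
    # Summing |window - target| down each column gives one score per offset.
    scores = [
        sum(abs(source[c + r] - target[r]) for r in range(len(target)))
        for c in range(n_windows)
    ]
    best = min(range(n_windows), key=scores.__getitem__)
    # Only an exact zero counts as a match.
    return best if scores[best] == 0 else -1

random.seed(0)
source = [random.random() for _ in range(4096)]
target = source[1733:1733 + 15]
print(find_exact_match(source, target))
```

The memory cost is visible here as well: the scores need one entry per offset, but the full search buffer holds len(target) values for each of them.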

Mike Kundmann
  • This is an elegant solution for 1D data case. The only catch is the search buffer will get really big if the data to be searched has a large size and the search pattern has a small size. Thanks! – KEVIVI Jan 18 '16 at 15:16
2

As Mike has pointed out, a cross-correlation is a good way to search for a pattern in the presence of noise. However, it works even better (if not perfectly) in the absence of noise! It can be scripted for both 1D and 2D data. See below:

number sx = 1024
number sy = 1024
number pw = 32
number ph = 32
number px = 100 // trunc( random()*(sx-pw) )
number py = 200 // trunc( random()*(sy-ph) )

image test := RealImage("Data",4,sx,sy)
test = random()
image pattern := test[py,px,py+ph,px+pw].ImageClone()
//test.showimage()
//pattern.showimage()
image patternSearch = test*0
patternSearch[0,0,ph,pw] = pattern
//patternSearch.ShowImage()

image corr := CrossCorrelate(test,patternSearch)
corr.ShowImage()
number mx,my,mv
mv = max(corr,mx,my)
mx -= trunc(sx/2)       // because we've placed the pattern in the 
my -= trunc(sy/2)       // top/left of the search-mask
Result("\n Pattern = " + px + " / " + py )
Result("\n max = " + mv + " at " + mx + "/" + my )

image found = test*0
found[my,mx,my+ph,mx+pw]=pattern
rgbImage overlay = RGB((test-found)*256,found*256,0)
overlay.ShowImage()

If your problem is only 1D and your data is very large, then an alternative approach might be quicker. I would then suggest trying raw-data streaming (via the TagGroup streaming commands) and using any additional information you have to narrow the search, e.g. search only for the beginning of a pattern in the stream and verify the rest only on a hit.

Notes added here to address the issue with searching for a pattern in a 1D image. If we run the following script a few times, we find that it fails to locate the pattern correctly about 50% of the time.

number sx = 1024
number sy = 0
number pw = 16
number ph = 0
number px = trunc( random()*(sx-pw) )
number py = 0 // trunc( random()*(sy-ph) )

image test := RealImage("Data",4,sx );
test = random();
image patternSearch := exprsize( sx, icol<pw? test[icol+px, irow]: 0 );
// test.ShowImage();
// patternSearch.ShowImage();
patternSearch.SetName( "PatternSearch" );
//

image corr := CrossCorrelate(test,patternSearch)
// corr.ShowImage()
number mx,my,mv
mv = max(corr,mx,my)
mx -= trunc(sx/2)       // because we've placed the pattern in the 
my -= trunc(sy/2)       // top/left of the search-mask
if( mx <= 0 ) mx += sx;
Result("\n\n Pattern = " + px + " / " + py )
Result("\n max = " + mv + " at " + mx + "/" + my )
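
One possible explanation for the 1D failures, offered as an assumption rather than something verified against DigitalMicrograph's CrossCorrelate: if the correlation is not normalised, a window that merely contains large values can score higher than the true match. The sum of squared differences, sum((d - p)^2) = sum(d^2) - 2*sum(d*p) + sum(p^2), is zero only at an exact match, and the cross-correlation supplies just the middle term. A short pattern (16 values here versus 32 x 32 in the 2D script) makes such accidents more likely. The Python sketch below (not DM script, with made-up helper names) shows a tiny case where the raw correlation peak lands on the wrong offset.

```python
def raw_correlation(data, pattern, s):
    """Unnormalised correlation of pattern with the window of data starting at s."""
    return sum(d * p for d, p in zip(data[s:s + len(pattern)], pattern))

def squared_difference(data, pattern, s):
    """Sum of squared differences; exactly zero only where the window equals pattern."""
    return sum((d - p) ** 2 for d, p in zip(data[s:s + len(pattern)], pattern))

data = [0.9, 0.9, 0.9, 0.1, 0.5, 0.2, 0.3]
pattern = data[3:6]  # true offset is 3
offsets = range(len(data) - len(pattern) + 1)

corr_peak = max(offsets, key=lambda s: raw_correlation(data, pattern, s))
ssd_min = min(offsets, key=lambda s: squared_difference(data, pattern, s))
print(corr_peak, ssd_min)  # prints: 0 3
```

If this is indeed the cause, verifying the candidate position with an exact comparison, and falling back to the next-highest peak on a mismatch, would be one way to make the search reliable.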
KEVIVI
BmyGuest
  • The cross-correlation approach works great for 2D pattern search, with one catch. We need to handle the case where the value of mx -= trunc(sx/2) or my -= trunc(sy/2) is negative by adding the following two lines: if( mx <= 0 ) mx += sx; if( my <= 0 ) my += sy; On the other hand, this cross-correlation approach does not work well with a 1D image. It sometimes fails to find the pattern and I'm not sure why. I'm also not sure how to search a raw-data stream. Can you provide a simple example? Many thanks! – KEVIVI Jan 18 '16 at 15:01
  • Can you post an example where the CC for 1D fails? I'm curious why this would be the case - or if something is unexpected with using CC for 1D data... And thanks for the hint with the width/2 or height/2 shift. Feel free to edit my post directly :c) – BmyGuest Jan 18 '16 at 15:04
2

As requested, here is a snippet showing how one could do a search in a "raw" data stream. I'm not claiming that the script below is the fastest or most elegant solution; it just shows how the relevant commands work. (You can find them documented in the "File Input and Output" section of the online F1 help.)

The idea I've put into it: search only for occurrences of the last value of your search pattern in the stream. Only when one is found, check whether the start value at the corresponding distance also matches. Only in that case, check the whole pattern. This should be a useful method for long search patterns, but it may be less than optimal for very short ones.

{
    number patternSize = 8
    number dataSize = 24000
    number patternPos = trunc( random() * ( dataSize - patternSize ) )

    number const = 200
    number dataTypeSizeByte  = 4
    number stream_byte_order = 0

    // Prepare test-Dummies
        image searchSet := IntegerImage( "search", dataTypeSizeByte, 0, patternSize )
        searchSet = const * sin( icol/iwidth *  Pi() )
        // searchSet.ShowImage()

        image dataSet := IntegerImage( "data", dataTypeSizeByte, 0, dataSize ) 
        dataSet = const * random() * 0.3
        dataSet.Slice1( patternPos, 0, 0, 0, patternSize, 1 ) = searchSet
        // dataSet.ShowImage()

    // Prepare Data as RawStream
        object buffer = NewMemoryBuffer( dataSize * dataTypeSizeByte )
        object stream = NewStreamFromBuffer(buffer)
        dataSet.ImageWriteImageDataToStream( stream, stream_byte_order )
        stream.StreamSetPos(0,0)

    // Prepare aux. Tags for streaming
        TagGroup tg = NewTagGroup();
        tg.TagGroupSetTagAsUInt32( "UInt32_0", 0 )

    // Prepare values to search for 
        number startValue = searchSet.GetPixel(0,0)
        number lastValue =  searchSet.GetPixel(patternSize-1,0)

    // search for the pattern
        // Search for the LAST value of the pattern only.
        // If found, check if the FIRST value at the appropriate distance also matches
        // Only then compare whole pattern.

        number value
        number streamEndPos = stream.StreamGetSize() 
        number streamPos = (patternSize-1) * dataTypeSizeByte // we can skip the first few tests
        stream.StreamSetPos(0, streamPos )  
        while( streamPos < streamEndPos )
        {
            tg.TagGroupReadTagDataFromStream( "UInt32_0", stream, stream_byte_order )
            streamPos = stream.StreamGetPos()

            tg.TagGroupGetTagAsUInt32( "UInt32_0", value )  // use appropriate data type!
            if ( lastValue == value )
            {
                result("\n Pattern might end at: "+streamPos/dataTypeSizeByte)

                // shift to start-value (relative) to check first value!
                stream.StreamSetPos(1, -1 * patternSize * dataTypeSizeByte )    
                tg.TagGroupReadTagDataFromStream( "UInt32_0", stream, stream_byte_order )
                tg.TagGroupGetTagAsUInt32( "UInt32_0", value )  
                if ( startValue == value )
                {
                    result("\t (Start also fits!) " )

                    // Now check all of it!
                    stream.StreamSetPos(1, -1 * dataTypeSizeByte )  
                    image compTemp := IntegerImage( "SectionData", dataTypeSizeByte, 0, patternSize )
                    compTemp.ImageReadImageDataFromStream( stream, stream_byte_order )

                    if ( 0 == sum( abs(compTemp - searchSet) ) )
                    {
                        number foundPos = (stream.StreamGetPos()/dataTypeSizeByte - patternSize)
                        Result("\n Correct starting position: " + patternPos )
                        Result("\n Found starting position  : " + foundPos )
                        OKDialog( "Found subset at position : " + foundPos )
                        exit(0)
                    }       
                }
                stream.StreamSetPos(0, streamPos )  
            }   
        }
    OKDialog("Nothing found.")
}
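
The same three-stage check (last value, then first value, then the whole pattern) can be tried out in Python on an in-memory byte stream. This is a sketch of the search logic only, not of the DM streaming commands; the name `find_in_stream` and the fixed little-endian unsigned 32-bit element type are assumptions for this example, and the pattern must not be empty.

```python
import io
import struct

ITEM = struct.Struct("<I")  # assumed element type: little-endian unsigned 32-bit

def find_in_stream(stream, pattern):
    """Return the element index where pattern starts in stream, or -1."""
    n = len(pattern)
    whole = struct.Struct(f"<{n}I")
    pos = (n - 1) * ITEM.size  # earliest byte offset at which the pattern can end
    while True:
        stream.seek(pos)
        chunk = stream.read(ITEM.size)
        if len(chunk) < ITEM.size:
            return -1  # reached the end of the stream
        # Stage 1: does this element equal the LAST value of the pattern?
        if ITEM.unpack(chunk)[0] == pattern[-1]:
            start = pos - (n - 1) * ITEM.size
            stream.seek(start)
            # Stage 2: does the FIRST value match at the corresponding distance?
            if ITEM.unpack(stream.read(ITEM.size))[0] == pattern[0]:
                stream.seek(start)
                # Stage 3: compare the whole pattern.
                if list(whole.unpack(stream.read(whole.size))) == list(pattern):
                    return start // ITEM.size
        pos += ITEM.size

data = [i % 7 for i in range(1000)]
pattern = [200, 141, 0, 141, 200]
data[321:321 + len(pattern)] = pattern
stream = io.BytesIO(struct.pack("<1000I", *data))
print(find_in_stream(stream, pattern))  # prints 321
```

In this test data the value 200 also occurs at the start of the pattern, so the first stage fires once on a false candidate that the second stage then rejects.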
BmyGuest
  • The script will become slower if there are *many* values in the data which match the "search-end-value". It is reasonably fast though. Depending on what exactly you're trying to locate, any search-algorithm might be adjustable for more speed... Raw-streaming will in all cases beat a simple "GetPixel" type extraction. However, I still think the CC method should work generally better. – BmyGuest Jan 18 '16 at 15:55