
I have been given a set of (approved) requirements and an already approved solution to implement Google Custom Search into an existing website.

This website has the following:

  • Jobs

    • Category 1
    • Category 2
    • Category 3
  • Normal pages

    • Category 1
    • Category 2
    • Category 3

The requirement of the search functionality is that people can use CheckBoxes to filter results. So if the below was true:

[x] Category 1
[ ] Category 2
[x] Category 3

Then no pages would be shown from Category 2. However, there is also:

[x] Show jobs only

How can I implement this via Google Custom Search? I've read about PageMap, using <meta> tags, and so on, but I cannot work out how to filter results based on them.

I looked here: Google custom search API - sorting / filter

However it doesn't appear to answer my concerns. I'm still a bit lost in the documentation.

Is this sort of thing possible? Does anyone have any links to some more thorough examples?

I had a thought to try in-memory filtering. However, if Google happens to return only 1 job page in 10 results while the [x] Show jobs only checkbox is checked, then the user will get only 1 result on the page.

I am leaning towards the XML-based result set using the Custom Search Engine. However, if that needs to change, I'm open to suggestions.

Any advice appreciated.

Simon Whitehead
  • All of the content is managed by a custom CMS on this particular site, so using an XML file is really out of the question. Ideally we would dump whatever information Google needs into the pages themselves. However, I want to filter the results of the XML response BEFORE I get the XML. Does that make sense? – Simon Whitehead Feb 06 '13 at 00:36

1 Answer


I have managed to figure this out, though only after a lot of trial and error.

To begin, an example PageMap element in the XML response:

<PageMap>
    <DataObject type="metatags">
        <Attribute name="creationdate" value="D:20100902144455+10'00'"/>
        <Attribute name="creator" value="Adobe InDesign CS5 (7.0)"/>
        <Attribute name="moddate" value="D:20100902144510+10'00'"/>
        <Attribute name="producer" value="Adobe PDF Library 9.9"/>
    </DataObject>
</PageMap>
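To see which type/name/value triples a given result exposes, the PageMap element can be read with any XML parser. The following is only a rough sketch using Python's standard library; the function name pagemap_attributes and the trimmed SAMPLE string are made up for illustration, and the real response wraps the PageMap in more elements than shown here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the PageMap shown above (hypothetical sample input).
SAMPLE = """<PageMap>
    <DataObject type="metatags">
        <Attribute name="creator" value="Adobe InDesign CS5 (7.0)"/>
        <Attribute name="producer" value="Adobe PDF Library 9.9"/>
    </DataObject>
</PageMap>"""


def pagemap_attributes(xml_text):
    """Return {DataObject type: {Attribute name: Attribute value}}."""
    root = ET.fromstring(xml_text)
    result = {}
    for data_object in root.iter("DataObject"):
        attrs = result.setdefault(data_object.get("type"), {})
        for attribute in data_object.iter("Attribute"):
            attrs[attribute.get("name")] = attribute.get("value")
    return result


if __name__ == "__main__":
    print(pagemap_attributes(SAMPLE))
```

The keys of the outer dictionary are the "type" values and the keys of the inner dictionaries are the "name" values that the filter syntax below refers to.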

Google's filtering will only match individual words separated by spaces, special characters, etc. So, if I wanted to search for a "creator" with "CS5" in it, I would use this query string:

?q=My+Search+Text+Here+more:pagemap:metatags-creator:CS5
                                    ^^^^^^^^ ^^^^^^^
                                      type    name

In the above, "type" is the value of the DataObject element's type attribute and "name" is the value of the Attribute element's name attribute. The last part is the word you want to filter by.
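As a small sketch of how that query string could be assembled in code (the helper names pagemap_filter and search_query_string are my own, not part of any Google library):

```python
from urllib.parse import quote_plus


def pagemap_filter(obj_type, attr_name, value):
    """Build one filter term in the form more:pagemap:TYPE-NAME:VALUE."""
    return f"more:pagemap:{obj_type}-{attr_name}:{value}"


def search_query_string(text, *filters):
    """Join the search text and filter terms into a ?q=... query string."""
    terms = " ".join([text, *filters])
    # Spaces become '+'; colons are kept literal so the operator stays readable.
    return "?q=" + quote_plus(terms, safe=":")


if __name__ == "__main__":
    print(search_query_string("My Search Text Here",
                              pagemap_filter("metatags", "creator", "CS5")))
```

This only builds the string; sending it to the search endpoint and any other required parameters are left out.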

So now I should be able to dump the following to a page in Category 1:

<PageMap>
    <DataObject type="metatags">
        <Attribute name="category" value="Category1"/>
    </DataObject>
</PageMap>

Or, for a job:

<PageMap>
    <DataObject type="metatags">
        <Attribute name="IsJobPage" value="Yes"/>
    </DataObject>
</PageMap>
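Since the content comes from a CMS, these blocks would presumably be generated per page rather than written by hand. A minimal sketch, assuming the CMS can call something like the hypothetical build_pagemap below when rendering a page (as far as I understand Google's documentation, the resulting PageMap is placed inside an HTML comment in the page's head, but check that against the current docs):

```python
import xml.etree.ElementTree as ET


def build_pagemap(attributes, obj_type="metatags"):
    """Serialize a PageMap with one DataObject holding the given attributes."""
    pagemap = ET.Element("PageMap")
    data_object = ET.SubElement(pagemap, "DataObject", {"type": obj_type})
    for name, value in attributes.items():
        ET.SubElement(data_object, "Attribute", {"name": name, "value": value})
    return ET.tostring(pagemap, encoding="unicode")


if __name__ == "__main__":
    # A job page that also belongs to Category 1 (illustrative values only).
    print(build_pagemap({"category": "Category1", "IsJobPage": "Yes"}))
```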

Then use a query such as one of these:

?q=My+Search+Text+Here+more:pagemap:metatags-category:Category1,Category3
?q=My+Search+Text+Here+more:pagemap:metatags-IsJobPage:Yes

The first example returns any pages with a meta tag name of "category" that contains the value "Category1" OR "Category3". The second returns only pages whose "IsJobPage" value contains "Yes".
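To tie this back to the checkboxes in the question, here is a rough sketch that turns the checked boxes into a query string. build_filtered_query is a made-up name, and appending both filter terms when "Show jobs only" is ticked alongside categories is an assumption on my part; I only tested the two queries above separately.

```python
from urllib.parse import quote_plus


def build_filtered_query(text, categories, jobs_only=False):
    """Build a ?q=... string from the search text and the checked boxes."""
    terms = [text]
    if categories:
        # Comma-separated values act as OR, as in the first example query.
        terms.append("more:pagemap:metatags-category:" + ",".join(categories))
    if jobs_only:
        # Assumption: adding a second filter term narrows the results further.
        terms.append("more:pagemap:metatags-IsJobPage:Yes")
    # Keep ':' and ',' literal so the filter syntax is not percent-encoded.
    return "?q=" + quote_plus(" ".join(terms), safe=":,")


if __name__ == "__main__":
    # [x] Category 1, [ ] Category 2, [x] Category 3
    print(build_filtered_query("My Search Text Here", ["Category1", "Category3"]))
    # [x] Show jobs only
    print(build_filtered_query("My Search Text Here", [], jobs_only=True))
```

Because the filtering happens on Google's side, each page of results should already contain only matching pages, which avoids the in-memory filtering problem described in the question.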

Hopefully this answer saves someone from tearing their hair out, like I almost did.

Simon Whitehead