
If you aren't familiar with the technologies involved but would still like to try answering the question, here are some useful links:

https://github.com/tallstreet/Whoosh-AppEngine

http://packages.python.org/Whoosh/quickstart.html#a-quick-introduction

http://code.google.com/appengine/docs/whatisgoogleappengine.html

My Google App Engine website has pages that are created dynamically. I want to use Whoosh to run a full-text search over everything on these pages, most importantly the dynamic content.

Here is one of my dynamic pages. It uses Django templates, as explained in the GAE documentation. If anyone knows how to properly index the dynamic content in this page using the Whoosh libraries, please let me know.

<!DOCTYPE HTML>
<html>

<head>
  <title>World Crises</title>
  <meta name="description" content="website description" />
  <meta name="keywords" content="website keywords, website keywords" />
  <meta http-equiv="content-type" content="text/html; charset=windows-1252" />
  <link rel="stylesheet" type="text/css" href="style/style.css" title="style" />
  <link rel="stylesheet" type="text/css" href="style/wc.css" title="style" />
  <script type="text/javascript" src="scripts/jquery.js"></script>
  <script type="text/javascript" src="scripts/gallery.js"></script>
</head>

<body>
  <div id="main">
    <div id="header">
      <div id="logo">
        <div id="logo_text">
          <!-- class="logo_colour", allows you to change the colour of the text -->
          <h1><a href="index">World<span class="logo_colour">Crises</span></a></h1>
          <h2>Reporting World Crises since Oct 16th, 2011.</h2>
        </div>
      </div>
      <div id="menubar">
        <ul id="menu">
          <!-- put class="selected" in the li tag for the selected page - to highlight which page you're on -->
          <li><a href="index">Index</a></li>
          <li class="selected"><a href="crises">Crises</a></li>
          <li><a href="organizations">Organizations</a></li>
          <li><a href="people">People</a></li>
        </ul>
      </div>
    </div>
    <div id="site_content">
      <div id="content">
        <!-- insert the page content here -->
        {% if crisis %}
          <h1>{{ crisis.name }}</h1>
          <p></p>

          <div>
            <h2>Information</h2>
            <ul>

                {% if crisis.kind_ %} <li>Kind: {{ crisis.kind_ }}</li> {% endif %}

                {% if crisis.city or crisis.region or crisis.state or crisis.country or crisis.continent %}
                    <li>Location:
                        {% if crisis.city %}      {{ crisis.city }}      {% endif %}
                        {% if crisis.region %}    {{ crisis.region }}    {% endif %}
                        {% if crisis.state %}     {{ crisis.state }}     {% endif %}
                        {% if crisis.country %}   {{ crisis.country }}   {% endif %}
                        {% if crisis.continent %} {{ crisis.continent }} {% endif %}
                    </li>
                {% endif %}

                {% if crisis.startedOn or crisis.endedOn %}
                    <li>Date and time:
                        {% if crisis.startedOn and crisis.endedOn %}
                            {{ crisis.startedOn }} - {{ crisis.endedOn }}
                        {% else %}
                            {% if crisis.startedOn %} {{ crisis.startedOn }} {% endif %}
                            {% if crisis.endedOn %}   {{ crisis.endedOn }}   {% endif %}
                        {% endif %}
                    </li>
                {% endif %}

                {% if crisis.deaths or crisis.missing or crisis.numberDisplaced or crisis.groupsAffected %}
                    <li>Human impact:<ul>
                        {% if crisis.deaths %}          <li>{{ crisis.deaths }} deaths</li>             {% endif %}
                        {% if crisis.missing %}         <li>{{ crisis.missing }} missing</li>           {% endif %}
                        {% if crisis.numberDisplaced %} <li>{{ crisis.numberDisplaced }} displaced</li> {% endif %}
                        {% if crisis.groupsAffected %}
                            <li>Groups Affected:<ul>
                                {% for group in crisis.groupsAffected %}
                                    <li>{{ group }}</li>
                                {% endfor %}
                            </ul></li>
                        {% endif %}
                    </ul></li>
                {% endif %}

                {% if crisis.damageCost or crisis.unemployment or crisis.collectedDonations or crisis.pledgedDonations or crisis.economicSectorsAffected %}
                    <li>Economic impact:<ul>
                        {% if crisis.damageCost %}         <li>Cost: {{ crisis.damageCost }}</li>                      {% endif %}
                        {% if crisis.unemployment %}       <li>Jobs lost: {{crisis.unemployment }}</li>                {% endif %}
                        {% if crisis.collectedDonations %} <li>Donations collected: {{crisis.collectedDonations}}</li> {% endif %}
                        {% if crisis.pledgedDonations %}   <li>Donations pledged: {{crisis.pledgedDonations}}</li>     {% endif %}
                        {% if crisis.economicSectorsAffected %}
                            <li>Economic sectors affected:<ul>
                                {% for sector in crisis.economicSectorsAffected %}
                                    <li>{{ sector }}</li>
                                {% endfor %}
                            </ul></li>
                        {% endif %}
                    </ul></li>
                {% endif %}

                {% if crisis.resourcesNeeded %}
                    <li>Resources needed:<ul>
                        {% for resource in crisis.resourcesNeeded %}
                            <li>{{ resource }}</li>
                        {% endfor %}
                    </ul></li>
                {% endif %}

                {% if crisis.waysToHelp %}
                    <li>Ways to help:<ul>
                        {% for way in crisis.waysToHelp %}
                            <li>{{ way }}</li>
                        {% endfor %}
                    </ul></li>
                {% endif %}

                {% if links.videos %}
                    <li>Videos: <ul>
                        {% for video in links.videos %}
                            {% if video.url %}
                                <li><a href="{{video.url}}">
                                    {% if video.title %}
                                        {{video.title}}
                                    {% else %}
                                        {{video.url}}
                                    {% endif %}
                                </a>
                                {% if video.description %} - {{ video.description }} {% endif %}</li>
                            {% endif %}
                        {% endfor %}
                    </ul></li>
                {% endif %}

                {% if links.social %}
                    <li>Social networks: <ul>
                        {% for social in links.social %}
                            {% if social.url %}
                                <li><a href="{{social.url}}">
                                    {% if social.title %}
                                        {{social.title}}
                                    {% else %}
                                        {{social.url}}
                                    {% endif %}
                                </a>
                                {% if social.description %} - {{ social.description }} {% endif %}</li>
                            {% endif %}
                        {% endfor %}
                    </ul></li>
                {% endif %}

                {% if links.external %}
                    <li>External links: <ul>
                        {% for link in links.external %}
                            {% if link.url %}
                                <li><a href="{{link.url}}">
                                    {% if link.title %}
                                        {{link.title}}
                                    {% else %}
                                        {{link.url}}
                                    {% endif %}
                                </a>
                                {% if link.description %} - {{ link.description }} {% endif %}</li>
                            {% endif %}
                        {% endfor %}
                    </ul></li>
                {% endif %}

                {% if connections.orgs %}
                    <li>Organizations:<ul>
                        {% for org in connections.orgs %}
                            <li><a href="organization?id={{org.uniqueId}}">{{org.name}}</a></li>
                        {% endfor %}
                    </ul></li>
                {% endif %}

                {% if connections.people %}
                    <li>People:<ul>
                        {% for person in connections.people %}
                            <li><a href="person?id={{person.uniqueId}}">{{person.name}}</a></li>
                        {% endfor %}
                    </ul></li>
                {% endif %}
            </ul>
          </div>
        {% else %}
          <h1>404 Crisis Not Found</h1>
        {% endif %}
      </div>

      {% if links.images %}
        <ul class="gallery">
            {% for image in links.images %}
                {% if image.url %}
                    <li><a href="#"><img src="{{ image.url }}" {% if image.description %} alt="{{ image.description }}" {% endif %}></a></li>
                {% endif %}
            {% endfor %}
        </ul>
      {% endif %}

    </div>
    <div id="content_footer"></div>
    <div id="footer">
    </div>
  </div>
</body>
</html>
CODe

1 Answer

Use the Whoosh API to add documents to the search index, as documented in your second link, whenever new content is added. You can index either the rendered template or the input data. Indexing the input data is the more usual choice when you're rolling your own custom search; if you only want to index the rendered pages, you may as well just use a Google Custom Search Engine.
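As a rough illustration of indexing the input data rather than the rendered page, here is a minimal sketch. It assumes each crisis is available as a plain dict whose keys mirror the template variables (`name`, `kind_`, `country`, `waysToHelp`, and so on) plus a `uniqueId` key, which is an assumption borrowed from the organization and person links in the template. The helper names `crisis_to_fields`, `build_index` and `search` are hypothetical. It uses Whoosh's in-memory `RamStorage` to stay self-contained; on App Engine you would presumably swap in the datastore-backed storage that the Whoosh-AppEngine project provides.

```python
def crisis_to_fields(crisis):
    """Flatten one crisis record (assumed to be a dict) into index fields."""
    parts = [crisis.get("name", u""), crisis.get("kind_", u"")]
    # Single-valued location fields used by the template.
    for key in ("city", "region", "state", "country", "continent"):
        parts.append(crisis.get(key, u""))
    # List-valued fields that the template loops over.
    for key in ("groupsAffected", "economicSectorsAffected",
                "resourcesNeeded", "waysToHelp"):
        parts.extend(crisis.get(key, []))
    content = u" ".join(p for p in parts if p)
    return {
        "uid": crisis["uniqueId"],  # assumed unique key per crisis
        "name": crisis.get("name", u""),
        "content": content,
    }


def build_index(crises):
    """Create an in-memory Whoosh index and add every crisis to it."""
    # Imported here so crisis_to_fields can be used without Whoosh installed.
    from whoosh.fields import Schema, ID, TEXT
    from whoosh.filedb.filestore import RamStorage

    schema = Schema(
        uid=ID(stored=True, unique=True),
        name=TEXT(stored=True),
        content=TEXT,
    )
    ix = RamStorage().create_index(schema)
    writer = ix.writer()
    for crisis in crises:
        # update_document replaces any earlier document with the same uid,
        # so the same call works for new and edited content.
        writer.update_document(**crisis_to_fields(crisis))
    writer.commit()
    return ix


def search(ix, text):
    """Return the uids of crises whose content matches the query text."""
    from whoosh.qparser import QueryParser

    with ix.searcher() as searcher:
        query = QueryParser("content", ix.schema).parse(text)
        return [hit["uid"] for hit in searcher.search(query)]
```

In a request handler that saves a crisis, you would call the same `update_document` step for just that record, then render the search results by looking up each returned `uid`.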

Your data appears to be highly structured, though, and largely numerical; full-text indexing may not be the best way to search it.
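To make the contrast concrete, here is a small sketch of searching the same kind of data by field instead of by text. It assumes crisis records are plain dicts with the `country` and `deaths` keys seen in the template; `filter_crises` is a hypothetical helper, and in a real App Engine app the equivalent equality and range filters would more likely be expressed as datastore queries.

```python
def filter_crises(crises, country=None, min_deaths=None):
    """Return the crises that match the given structured criteria."""
    matches = []
    for crisis in crises:
        # Exact match on a categorical field.
        if country is not None and crisis.get("country") != country:
            continue
        # Range test on a numeric field; a missing value counts as 0.
        if min_deaths is not None and (crisis.get("deaths") or 0) < min_deaths:
            continue
        matches.append(crisis)
    return matches
```

A query such as "crises in one country with at least N deaths" is answered exactly this way, whereas a full-text index would treat the numbers as mere tokens.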

Nick Johnson