This article explains how Happeo's search engine works, including how results are indexed, scored, sorted, ranked, and boosted, how partial searches and suggestions function, and how search analytics are calculated.

How search results are indexed

Search query scores are determined by keyword matching using Elasticsearch. Result types are weighted differently. From the highest weight to the lowest, they are:

  • Page groups
  • Pages
  • Channels
  • People
  • Articles
  • Posts

Keyword weighting is determined by us, not by Elasticsearch. Results from third-party applications such as Jira and Drive are handled differently: we do not store or rank them ourselves. Instead, we use each application's search API to retrieve the results and display them exactly as provided. This is why results from these apps appear in a separate column on the right side of the search results page rather than mixed in with Happeo's own results.

Scoring of search queries

Specific results are boosted depending on where the search query appears. Boosts range from 1 to 3, with 1 being the lowest boost and 3 the highest. We use the following boosts:

  • 1
  • 1.3
  • 1.5
  • 2
  • 3

For instance, if you search for “Sales,” page groups including the word “Sales” will appear first, whereas posts will appear last. The following content elements affect scoring:

  • Boost 3
    • Page name – The keyword is included in the Page title.
    • User name – The keyword is included in the user’s name.
  • Boost 2
    • Page group (collection) – The keyword is included in the page group's name.
    • Channel name – The keyword is included in the channel’s name.
  • Boost 1.5
    • Page group description – The keyword is included in the page group description.
    • Page description – The keyword is included in the page description.
    • Channel description – The keyword is included in the channel description.
    • Article title – The keyword is included in the article title.
  • Boost 1.3
    • Article subtitle – The keyword is included in the article subtitle.
    • Hashtags – The keyword is included in hashtags (pages, channels, articles, posts, comments – anywhere hashtags are used).

Note: While page groups are generally weighted higher than channels in search results, a channel might appear higher if it contains the search keyword in more places. For example, a page group might only have the keyword in its title, while a channel could have it in the title, description, and hashtags, giving it a higher overall relevance score.
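
The note above can be illustrated with a small sketch. The boost values come from the list in this section, but the field names and the idea of simply adding up the boosts of every matching field are assumptions made for illustration; the exact way Happeo and Elasticsearch combine them is not described here.

```python
# Boost values from the list above. The field names are hypothetical labels.
FIELD_BOOSTS = {
    "page_name": 3.0,
    "user_name": 3.0,
    "page_group": 2.0,
    "channel_name": 2.0,
    "page_group_description": 1.5,
    "page_description": 1.5,
    "channel_description": 1.5,
    "article_title": 1.5,
    "article_subtitle": 1.3,
    "hashtags": 1.3,
}


def boosted_score(fields, keyword):
    """Add up the boost of every field that contains the keyword.

    Summing is an assumption for illustration, not Happeo's documented formula.
    """
    needle = keyword.lower()
    return sum(
        FIELD_BOOSTS.get(name, 1.0)
        for name, text in fields.items()
        if needle in text.lower()
    )


# A page group that only mentions the keyword in one place.
page_group = {
    "page_group": "Sales",
    "page_group_description": "Team handbook and templates",
}

# A channel that mentions the keyword in its name, description, and hashtags.
channel = {
    "channel_name": "Sales updates",
    "channel_description": "News for the sales team",
    "hashtags": "#sales #pipeline",
}

page_group_score = boosted_score(page_group, "Sales")
channel_score = boosted_score(channel, "Sales")
```

Under these assumptions the channel collects three boosts while the page group collects one, so the channel ends up with the higher total even though its individual boosts are not higher.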

How results are sorted

In the Elasticsearch cluster, the search query is parsed into terms, which are then run against the relevant index, such as posts.

Scoring algorithms assign each document (in this index, each post) a score, and the results are sorted by that score.

These algorithms take into account:

  • Term match in the document.
  • Term frequency in the document.
  • Term frequency overall.
  • Term importance in the document.
    • Does the term appear once in a 300-word document or once in a 10-word document?
  • Term location in the document.
  • Partial searches / matches.
  • Document age (older documents are less relevant).
    • This is used more heavily in the case of posts and comments, less in the case of pages, and not used at all in the case of users and groups.
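
To make the first few factors more concrete, here is a toy TF-IDF sketch in Python. It only illustrates how term frequency in a document, term frequency overall, and document length can interact. It is not Happeo's configuration: Elasticsearch's default similarity is BM25, which builds on the same ideas, and the function names and sample documents below are made up.

```python
import math


def inverse_document_frequency(term, corpus):
    """Rarer terms across the corpus get a higher value."""
    containing = sum(1 for words in corpus if term in words)
    return math.log((1 + len(corpus)) / (1 + containing)) + 1.0


def tf_idf(term, doc_words, corpus):
    """Toy score: term density in the document times rarity in the corpus."""
    if not doc_words:
        return 0.0
    term = term.lower()
    density = doc_words.count(term) / len(doc_words)
    return density * inverse_document_frequency(term, corpus)


# Three tiny, made-up documents, already lowercased and split into words.
corpus = [
    "sales targets for the quarter".split(),
    "the sales team met the new hire".split(),
    "the office is closed on friday".split(),
]

# Sort documents from the highest score to the lowest, as a search would.
ranked = sorted(
    corpus,
    key=lambda words: tf_idf("sales", words, corpus),
    reverse=True,
)
```

The first document ranks highest because "sales" makes up a larger share of its five words than of the seven words in the second document. A common word such as "the" gets a lower rarity value than "sales" because it appears in every document.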

Note: Results retrieved from the Google APIs are shown exactly as Google returns them.

How results are ranked

Happeo results are ranked by 3 elements:

  1. Keyword relevance (simplified) – The number of times the search terms appear in the overall text matters. If you search for “product” and find a post with the word “product” listed twice in 20 words, relevance will be 2/20 = 10%.
  2. Custom attributes weighting – Search terms carry more weight depending on where they appear in the content, such as in a page title. If the keyword you search for is in a page title, relevance is multiplied by 2.
  3. Search logic – We use logic equations to apply additional relevance modifiers to the content based on other conditions. For instance, the relevance of posts in results decreases by 50% if they are over a year old.
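
The three elements can be combined in a short worked example. This is a simplified sketch that only uses the numbers given in the list above (a multiplier of 2 for a page title match and a 50% reduction for posts over a year old). The function and parameter names are hypothetical, and the real scoring involves more factors.

```python
def simplified_relevance(text, term, in_page_title=False, post_age_days=0):
    """Simplified relevance using only the three elements listed above."""
    words = text.lower().split()
    if not words:
        return 0.0

    # 1. Keyword relevance: share of the words that match the search term.
    relevance = words.count(term.lower()) / len(words)

    # 2. Custom attributes weighting: a page title match doubles relevance.
    if in_page_title:
        relevance *= 2

    # 3. Search logic: posts over a year old lose 50% of their relevance.
    if post_age_days > 365:
        relevance *= 0.5

    return relevance


# A 20-word post that mentions "product" twice.
post_text = "product " * 2 + "word " * 18

fresh_post = simplified_relevance(post_text, "product")
old_post = simplified_relevance(post_text, "product", post_age_days=400)
title_match = simplified_relevance(post_text, "product", in_page_title=True)
```

Here the fresh post scores 2/20 = 10%, the same post at 400 days old scores 5%, and content with the keyword in a page title scores 20%.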

Partial searches and suggestions

Partial searches

Typing part of a word or sentence correctly returns matching results. Misspelling a word or sentence, however, returns no results.
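
The sketch below illustrates this behavior with simple prefix matching on words. Prefix matching is an assumption chosen for illustration, and the function name and sample titles are made up. It shows why a correctly typed fragment finds results while a misspelling does not: there is no typo correction step.

```python
def partial_matches(query, titles):
    """Return the titles that contain a word starting with the typed fragment."""
    fragment = query.lower().strip()
    if not fragment:
        return []
    return [
        title
        for title in titles
        if any(word.startswith(fragment) for word in title.lower().split())
    ]


titles = ["Sales handbook", "Salary policy", "Office guide"]

correct_fragment = partial_matches("sal", titles)
misspelled = partial_matches("slaes", titles)
```

With these sample titles, "sal" matches "Sales handbook" and "Salary policy", while the misspelled "slaes" matches nothing.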

Suggestions

A suggestion is a hashtag that matches your search term. The search lists the hashtags it finds for the keyword, and you can click any of them.

Example case

When you search for "Sales," the algorithm prioritizes results as follows:

  1. Exact matches: First, it looks for pages, channels, or user groups with "Sales" in their name. Page groups are weighted highest, so they appear first.
  2. Description matches: If multiple results have "Sales" in their name, the algorithm checks descriptions. Pages and channels with "Sales" in their description are prioritized over those without. Pages have a slightly higher weight than channels in this step.
  3. Hashtag matches: If multiple pages and channels have "Sales" in their descriptions, the algorithm looks at associated hashtags.
  4. Content matches: If hashtags still result in a tie (e.g., two pages both have the hashtag "Sales"), the algorithm analyzes the content of the page, channel, article, or post, counting how many times "Sales" is mentioned.

Users and groups do not have descriptions or hashtags, so if there is a tie between them, they will not be prioritized over pages and channels. This is because Happeo’s Search is intended for content, whereas Happeo’s People Directory is the recommended way to search for people.
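
The tie-breaking steps in this example can be sketched as a sort on a tuple of signals. This is a simplification under stated assumptions: the `Item` class, its fields, and the numeric type order are hypothetical, and the real algorithm uses weighted scores rather than strict step-by-step comparison.

```python
from dataclasses import dataclass

# Type order from highest to lowest weight. The numbers are placeholders.
KIND_ORDER = {"page_group": 6, "page": 5, "channel": 4, "people": 3, "article": 2, "post": 1}


@dataclass
class Item:
    kind: str
    name: str
    description: str = ""
    hashtags: tuple = ()
    content: str = ""


def rank_key(item, keyword):
    """Build the comparison key: each later entry only breaks earlier ties."""
    needle = keyword.lower()
    return (
        needle in item.name.lower(),                           # step 1: name
        needle in item.description.lower(),                    # step 2: description
        any(needle in tag.lower() for tag in item.hashtags),   # step 3: hashtags
        item.content.lower().split().count(needle),            # step 4: content mentions
        KIND_ORDER.get(item.kind, 0),                          # final tie-breaker: type
    )


def rank(items, keyword):
    return sorted(items, key=lambda item: rank_key(item, keyword), reverse=True)


items = [
    Item("channel", "Sales news"),
    Item("page", "Sales playbook", description="How the sales team works"),
    Item("people", "Sally Sales"),
    Item("page", "Sales FAQ", description="Sales questions", hashtags=("#sales",)),
]

ordered_names = [item.name for item in rank(items, "Sales")]
```

In this sample, the page with a matching hashtag comes first, the page with only a matching description second, and the user last, because users have no description or hashtags to break the tie with.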

Best practices

To learn how to optimize your content for Happeo's search and improve its visibility, read "Best Practices for Using Happeo's Search," which covers keyword strategies, information preservation, improving search for embedded documents, and more!

How search analytics are calculated

Search analytics categorizes user searches into two main groups: successful and failed. These categories represent all user interactions with the search system. Refined searches, where users modify their initial query, are calculated separately and do not contribute to the success or failure metrics.

  • A refined search occurs when a user starts a query and then changes it. Changing the query to a completely different word does not count as a refined search.
  • A failed search, also known as a "gave up" search, occurs when a user opens the search bar, types in a query, does not click on any results, and leaves the search interface.
  • A "gave up" search can also be triggered if the user remains inactive for two minutes or longer after entering a query.

The Overview section in Search Analytics is based on search sessions, while data for individual search terms is based on search queries. A search session includes all the queries a user enters to find their desired result. This means the average search time might be longer, but it provides greater accuracy because it reflects the complete search process.

Glossary

Search session – One search session can have multiple queries. When a user opens the search bar, a search session is started.

  • Success – A user typed a query (or multiple) into the search bar, found what they were looking for, and clicked on the result.
  • Failure – A user typed a query (or multiple) into the search bar, didn’t find what they were looking for, and left.

Search query – Any term that you look for in the search.

  • Success / opened item – A user typed one query and clicked on the result.
  • Refined query – A user typed a query into the search bar, didn’t find what they were looking for, and changed the query (following an algorithm to determine the similarity of the queries).
  • Failed / gave up query – A user typed a query and either left the Search without clicking on anything or was inactive for two minutes.
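
The glossary definitions can be sketched as a small classifier. The two-minute inactivity limit comes from this article. Everything else is an assumption for illustration: the article does not name the similarity algorithm, so Python's `difflib.SequenceMatcher` stands in for it, and the 0.5 threshold, the function names, and the outcome labels are hypothetical.

```python
from difflib import SequenceMatcher

INACTIVITY_LIMIT_SECONDS = 120  # two minutes, as described in this article
SIMILARITY_THRESHOLD = 0.5      # hypothetical cut-off for "similar enough"


def classify_query(query, clicked, next_query=None, idle_seconds=0.0):
    """Classify one query as 'success', 'refined', or 'gave_up'."""
    if clicked:
        return "success"
    if next_query is not None and idle_seconds < INACTIVITY_LIMIT_SECONDS:
        similarity = SequenceMatcher(None, query.lower(), next_query.lower()).ratio()
        if similarity >= SIMILARITY_THRESHOLD:
            return "refined"
    # No click, and either no follow-up, an unrelated follow-up, or inactivity.
    return "gave_up"


def session_outcome(query_outcomes):
    """A session succeeds if any of its queries ended in a click."""
    return "success" if "success" in query_outcomes else "failure"


outcomes = [
    classify_query("sales deck", clicked=False, next_query="sales deck 2024"),
    classify_query("sales deck 2024", clicked=True),
]
session = session_outcome(outcomes)
```

In this sample session, the first query counts as refined because the follow-up query is similar to it, the second counts as a success, and the session as a whole is a success. A follow-up with an unrelated word, such as "holiday" after "sales", falls below the assumed threshold and is not treated as a refinement.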

 

 
