Create a search tool in under 5 minutes using Azure Search

Pretty much every website and mobile app we use nowadays has the same feature: search. It’s so common that users simply take it for granted. At work, we are busy building the new version of our platform, and of course it has to support search. But it can be quite tricky to implement a good search tool, one that is not only capable of actually finding stuff but also scales to serve potentially millions of users. I mean, look at Google: they literally conquered the internet after building a (great) search engine, and it was not easy.

During today’s planning meeting, our Product team asked us to estimate the cost of building this feature, and much to their sorrow, we told them that it could be quite tricky. But we left with the promise that we would sleep on the problem and think about alternatives.

Thing is, we didn’t even have to sleep. We are hosting our platform on Microsoft Azure, and one of the reasons we chose it is that it has many cool services we can play with. After a little bit of reading, writing no code whatsoever and devoting literally 5 minutes to setup, boom, our search tool was implemented. The secret? Azure Search.

What we needed

We wanted to let our users search for (and find) Services:

  • Near them, so we needed geo-based search.
  • By typing words, so we needed text-based search:
    • in the title
    • in the description
    • …and supporting different languages!
  • By selecting categories from a list.
  • And of course, they should be able to sort results by date, distance or rating.
  • While we’re at it, we’d like to support pagination – because nobody should have to download 11 thousand results using their 3G every time they search for something.

Setting up

The first step is to create an Azure Search instance in the Azure Portal (select New > Azure Search). After that, importing your data is as easy as clicking a few buttons: we’re using DocumentDB, so I just chose which collection I wanted to work with (Azure SQL is supported too).

You must define a query that is used to import fresh data. Mine looked like this:
SELECT * FROM c WHERE c._type="Service" AND c._ts >= @HighWaterMark ORDER BY c._ts

In DocumentDB, every document has an auto-generated field _ts, which is the timestamp of when the document was last updated. So that query returns all documents created or modified since the last time Azure Search imported data.
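
To make the high-water-mark idea concrete, here is a tiny in-memory model of what that query does on each import run. This is only a sketch of the concept in Python, not how Azure Search is implemented; the function name `fetch_changes` and the sample documents are made up for illustration.

```python
from typing import Any, Dict, List, Tuple

Document = Dict[str, Any]


def fetch_changes(docs: List[Document], high_water_mark: int) -> Tuple[List[Document], int]:
    """Mimic the import query: Service documents with _ts >= mark, ordered by _ts."""
    changed = sorted(
        (d for d in docs if d["_type"] == "Service" and d["_ts"] >= high_water_mark),
        key=lambda d: d["_ts"],
    )
    # The new mark is the largest _ts seen; keep the old mark if nothing changed.
    new_mark = changed[-1]["_ts"] if changed else high_water_mark
    return changed, new_mark


docs = [
    {"id": "1", "_type": "Service", "_ts": 100},
    {"id": "2", "_type": "User", "_ts": 150},     # ignored: not a Service
    {"id": "3", "_type": "Service", "_ts": 200},
]

first_batch, mark = fetch_changes(docs, 0)        # first run imports everything
docs.append({"id": "4", "_type": "Service", "_ts": 300})
second_batch, mark = fetch_changes(docs, mark)    # later run only sees recent changes
```

Because the comparison is `>=`, the document sitting exactly at the mark is picked up again on the next run. In this model that is harmless as long as importing the same id twice simply overwrites the earlier copy.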

A common practice is to never actually delete entries: you simply add an “isDeleted” flag and take it into account in your queries. Guess what? Azure Search supports that. You just have to tell it the name of your flag, and it’ll exclude flagged documents accordingly.
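
Conceptually, the soft-delete setting boils down to splitting each imported batch in two: documents to keep in the index and ids to drop. A minimal sketch, assuming the flag is a boolean field called `isDeleted` (the helper name `split_by_soft_delete` is hypothetical, and the real service does this for you):

```python
from typing import Any, Dict, List, Tuple

Document = Dict[str, Any]


def split_by_soft_delete(
    docs: List[Document], flag: str = "isDeleted"
) -> Tuple[List[Document], List[str]]:
    """Return (documents to index, ids to remove) based on the soft-delete flag."""
    to_index = [d for d in docs if not d.get(flag, False)]
    to_remove = [d["id"] for d in docs if d.get(flag, False)]
    return to_index, to_remove


batch = [
    {"id": "1", "name": "Teste L3"},
    {"id": "2", "name": "Testando", "isDeleted": True},
    {"id": "3", "name": "Another", "isDeleted": False},
]
to_index, to_remove = split_by_soft_delete(batch)
```

A document without the flag is treated as not deleted here, which is an assumption of this sketch.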

Finally, you should define your index. The wizard lists the fields returned by the query you defined, and you can mark each one as:

  • Searchable.
  • Filterable.
  • Sortable.
  • Facetable.
  • And Retrievable, if you want the field to be returned in results.
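
One way to think about these checkboxes is as a small table of per-field switches. The sketch below models that table in Python; it is not the service’s index schema format. The field names `id`, `name`, `description`, `category` and `location` are the ones used in the example request later in this post, `rating` is an assumed name, and the chosen switches are a guess at a sensible setup rather than our actual configuration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Field:
    name: str
    searchable: bool = False   # included in full-text search
    filterable: bool = False   # usable in filter expressions
    sortable: bool = False     # usable for ordering results
    facetable: bool = False    # usable for category counts
    retrievable: bool = True   # returned in results


FIELDS = [
    Field("id"),
    Field("name", searchable=True),
    Field("description", searchable=True),
    Field("category", filterable=True, facetable=True),
    Field("location", filterable=True, sortable=True),
    Field("rating", sortable=True),  # assumed field name
]


def fields_with(attribute: str) -> List[str]:
    """List the names of the fields that have the given switch turned on."""
    return [f.name for f in FIELDS if getattr(f, attribute)]


searchable = fields_with("searchable")
sortable = fields_with("sortable")
```

Mapping our requirements onto the switches: text search needs Searchable on the title and description, the category list needs Filterable, and sorting by distance or rating needs Sortable.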

In our case, it looked like this:

That’s it. Congratulations! You have just implemented your search feature.

Using it

You can test your implementation in the Search Explorer. To use it in your system, there are client libraries available for a few programming languages, but they’re just wrappers around a REST API. This means you can paste a URL in your browser and watch the magic happen! For instance, this request:

https://your-name.search.windows.net/indexes/your-index/docs
?$select=id,category,name,description
&$filter=geo.distance(location, geography'POINT(-46.7 -23.54)') le 10
         and category eq 'Category 1'
&$orderby=geo.distance(location, geography'POINT(-46.7 -23.54)')
&$top=2
&$skip=0
&$count=true
&search=test*

  • $select defines which fields you want in your results.
  • $filter lets you, well, filter your results. In the example above, we want only Services at most 10 km away from latitude -23.54 and longitude -46.7 (notice that the correct order is longitude then latitude, as in the GeoJSON format), and from Category 1.
  • $orderby sorts your results, in this case by distance from that GeoPoint.
  • $top and $skip are used for pagination. In this case, we’re getting the first page, with 2 results per page.
  • $count gives you the total number of results. If this number is large, it’ll be an approximation. It is false by default.
  • search (note: no $ prefix) is the textual search you’ll be doing. The * is a wildcard, and there’s support for pretty elaborate constructions.
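
Pasting that URL by hand works, but in code you will want the spaces, quotes and parentheses escaped properly. Here is a small sketch using only Python’s standard library; `build_query_url` is a made-up helper, and the full-text parameter is spelled `search` without the `$`, which is how I understand the REST API expects it.

```python
from typing import Any, Dict
from urllib.parse import quote, urlencode

POINT = "geography'POINT(-46.7 -23.54)'"  # longitude first, then latitude
FILTER = f"geo.distance(location, {POINT}) le 10 and category eq 'Category 1'"


def build_query_url(service: str, index: str, params: Dict[str, Any]) -> str:
    """Assemble the docs endpoint URL with percent-encoded query parameters."""
    base = f"https://{service}.search.windows.net/indexes/{index}/docs"
    # quote_via=quote encodes spaces as %20 instead of '+'; '$' stays readable.
    query = urlencode(params, safe="$", quote_via=quote)
    return f"{base}?{query}"


url = build_query_url(
    "your-name",
    "your-index",
    {
        "$select": "id,category,name,description",
        "$filter": FILTER,
        "$orderby": f"geo.distance(location, {POINT})",
        "$top": 2,
        "$skip": 0,
        "$count": "true",
        "search": "test*",
    },
)
```

The result is one long line, which is less pretty than the version above but safe to hand to any HTTP client.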

You’ll also have to include an api-version and your api-key, which you’ll get from the dashboard. That request returned this:

{
    "@odata.count": 10,
    "value": [
        {
            "@search.score": 1,
            "id": "1",
            "category": "Category 1",
            "name": "Teste L3",
            "description": "Teste"
        },
        {
            "@search.score": 1,
            "id": "2",
            "category": "Category 1",
            "name": "Testando",
            "description": "Testing everything"
        }
    ]
}
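
To tie the request and the response together, here is a hedged sketch that attaches the api-version and api-key and then reads the reply. It assumes the key is sent in an `api-key` request header; the key and version strings are placeholders, the helper names are invented, and the request is only built, never sent, so the snippet runs offline.

```python
import json
import math
from typing import Any, Dict, List, Tuple
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_search_request(
    service: str, index: str, api_key: str, api_version: str, params: Dict[str, Any]
) -> Request:
    """Build (but do not send) a GET request for the docs endpoint."""
    query = urlencode({**params, "api-version": api_version}, safe="$", quote_via=quote)
    url = f"https://{service}.search.windows.net/indexes/{index}/docs?{query}"
    # Assumption: the key travels in an "api-key" header.
    return Request(url, headers={"api-key": api_key})


def skip_for_page(page: int, page_size: int) -> int:
    """Translate a 1-based page number into a $skip value."""
    return (page - 1) * page_size


def summarize(body: str, page_size: int) -> Tuple[int, int, List[str]]:
    """Return (total hits, number of pages, names on this page)."""
    data = json.loads(body)
    total = data["@odata.count"]
    pages = math.ceil(total / page_size)
    names = [hit["name"] for hit in data["value"]]
    return total, pages, names


request = build_search_request(
    "your-name",
    "your-index",
    api_key="YOUR-API-KEY",          # placeholder
    api_version="YOUR-API-VERSION",  # placeholder
    params={"search": "test*", "$top": 2, "$skip": skip_for_page(1, 2), "$count": "true"},
)
# urllib.request.urlopen(request) would actually send it.

# Trimmed copy of the response shown above.
SAMPLE = '{"@odata.count": 10, "value": [{"name": "Teste L3"}, {"name": "Testando"}]}'
total, pages, names = summarize(SAMPLE, page_size=2)
```

With 10 matches and 2 results per page, that works out to 5 pages, and page 3 would use a $skip of 4.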

And this is just the tip of the iceberg. We definitely still have a lot to discover, but I can’t wait to see the surprised faces of my colleagues when they see this tomorrow 🙂

Author: João Marcos Barguil

Brazilian, loves Croatia, went to Finland and had a detour to Uganda. Who knows what's coming? The future is here to be written.
