- Frederick Giasson’s Weblog - http://fgiasson.com/blog -

New UMBEL Concept Noun Tagger Web Service & Other Improvements

Last week, we released the UMBEL Concept Plain Tagger web service endpoint [1]. Today we are releasing the UMBEL Concept Noun Tagger.

umbel_ws [2]

This noun tagger uses UMBEL reference concepts to tag an input text, and is based on the plain tagger [1], except as noted below.

The noun tagger matches the plain labels of the reference concepts against the nouns of the input text. With this tagger, no manipulations are performed on the reference concept labels or on the input text unless you enable the stemmer. Also, the tagger performs NO disambiguation if multiple concepts are tagged for a given keyword.
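As a rough illustration of this matching behavior, here is a minimal Python sketch. It assumes the nouns have already been extracted from the input text (that step is not shown), and the label index and concept identifiers are made up for the example. It is not the service's actual implementation.

```python
# Hypothetical toy index: plain label -> identifiers of the concepts using it.
LABEL_INDEX = {
    "bank": ["Bank-Financial", "RiverBank"],
    "river": ["River"],
}


def tag_nouns(nouns, label_index=LABEL_INDEX):
    """Tag each noun with every concept whose label equals it exactly.

    Labels and nouns are compared as-is (no stemming, no case folding),
    and all candidate concepts are kept: nothing is disambiguated.
    """
    tags = {}
    for noun in nouns:
        concepts = label_index.get(noun, [])
        if concepts:
            # Copy the list so callers cannot mutate the index by accident.
            tags[noun] = list(concepts)
    return tags
```

Calling `tag_nouns(["bank", "river"])` on this toy index keeps both candidate concepts for "bank", which is the kind of ambiguity a later disambiguation step would have to resolve.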

Intended Users

This tool is intended for those who want to focus on UMBEL and do not care about more complicated matches. The output of the tagger can be used as-is, but it is intended to be the input to more sophisticated reference concept matching and disambiguation methods. Expect additional tagging methods to follow.

Stemming Option

This web service endpoint has a stemming option. If the option is specified, the input text is stemmed and the matches are made against an index in which all the preferred and alternative labels have been stemmed as well. Once the matches occur, the tagger recomposes the text so that unstemmed versions of the input text and the tagged reference concepts are presented to the user.

Depending on the use case, users may prefer to turn the stemming option on or off on this web service endpoint.
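The stem-then-recompose idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `toy_stem` is a crude stand-in for a real stemmer (the post does not say which one the service uses), and the concept names and result fields are hypothetical.

```python
def toy_stem(word):
    """Crude stand-in stemmer: lower-case and strip a few common suffixes."""
    word = word.lower()
    for suffix in ("ing", "es", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_stemmed_index(labels_by_concept):
    """Map each stemmed label to the (concept, original label) pairs behind it."""
    index = {}
    for concept, labels in labels_by_concept.items():
        for label in labels:
            index.setdefault(toy_stem(label), []).append((concept, label))
    return index


def tag_with_stemming(tokens, index):
    """Match on stems, but report the unstemmed token and the original label."""
    results = []
    for position, token in enumerate(tokens):
        for concept, label in index.get(toy_stem(token), []):
            results.append(
                {"token": token, "position": position,
                 "concept": concept, "label": label}
            )
    return results
```

With an index built from `{"Dog": ["dog"], "Painting": ["painting"]}`, the tokens "Dogs" and "paints" both match through their stems, while the results still show the words exactly as they appeared in the input.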

The Web Service Endpoint

The web service endpoint [3] is freely available. It can return its resultset in JSON, Clojure code or EDN [4] (Extensible Data Notation).

This endpoint will return a list of matches on the preferred and alternative labels of the UMBEL reference concepts that match the noun tokens of an input text. It will also return the number of matches and the position of the tokens that match the concepts.
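To make the kind of information returned more concrete, here is a small Python sketch that groups matches by concept and label type, with a count and the token positions. The dictionary layout is a hypothetical shape chosen for the example, not the endpoint's actual resultset format, and the sample labels are invented.

```python
def summarize_matches(noun_tokens, pref_labels, alt_labels):
    """Group label matches by (concept, label type).

    noun_tokens: list of (position, token) pairs for the nouns of the text.
    pref_labels / alt_labels: dicts mapping a label to a list of concepts.
    Returns {(concept, label_type): {"count": int, "positions": [int, ...]}}.
    """
    summary = {}
    label_sets = (("preferred", pref_labels), ("alternative", alt_labels))
    for position, token in noun_tokens:
        for label_type, labels in label_sets:
            for concept in labels.get(token, []):
                entry = summary.setdefault(
                    (concept, label_type), {"count": 0, "positions": []}
                )
                entry["count"] += 1
                entry["positions"].append(position)
    return summary
```

Keeping preferred and alternative matches under separate keys mirrors the distinction the endpoint makes between the two label types.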

The Online Tool

We also provide an online tagging tool [5] that people can use to try out the web service.

The results are presented in two sections depending on whether the preferred or alternative label(s) were matched. Multiple matches, either by concept or label type, are coded by color. Source words with matches and multiple source occurrences are ranked first; thereafter, all source words are presented alphabetically.

Clicking a tagged concept opens its full description.

umbel_tagger_noun [6]

Other UMBEL Website Improvements

We also made some more improvements to the UMBEL website.

Search Autocompletion Mode

First, we created a new autocomplete option on the UMBEL Search web service endpoint [7]. Often people know the concept they want to look at, but they don’t want to go to a search results page to select that concept. What they want is to get concept suggestions instantly based on the letters they are typing in a search box.

Such a feature requires a special kind of search which we call an “autocompletion search”. We added that special mode to the existing UMBEL search web service endpoint. Such a search query takes about 30 ms to process. Most of that time is due to network latency, since the actual search function takes about 0.5 milliseconds to complete.

To use that new mode, you only have to append /autocomplete to the base search web service endpoint URL.
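As a minimal sketch, building such a URL in Python could look like the following. The base URL is a placeholder, and passing the query as a trailing, URL-encoded path segment is an assumption made for the example; check the endpoint documentation [7] for the actual request format.

```python
from urllib.parse import quote

# Placeholder only: substitute the real base search endpoint URL.
BASE_SEARCH_URL = "http://example.org/ws/search"


def autocomplete_url(base_search_url, query):
    """Append /autocomplete to the base search URL, then the encoded query.

    Assumption: the query follows as a path segment; this is not confirmed
    by the post.
    """
    base = base_search_url.rstrip("/")
    return base + "/autocomplete/" + quote(query, safe="")
```

For example, `autocomplete_url(BASE_SEARCH_URL, "rock band")` yields a URL ending in `/autocomplete/rock%20band`.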

Search Autocompletion Widget

Now that we have this new autocomplete mode for the Search endpoint, we also leveraged it to add autocompletion behavior on the top navigation search box on the UMBEL website [8].

Now, when you start typing characters in the top search box, you will get a list of possible reference concept matches based on the preferred labels of the concepts. If you select one of them, you will be redirected to its description page.

concept_autocomplete [9]
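The suggestion behavior can be approximated with a simple prefix lookup over sorted preferred labels. This Python sketch only illustrates the idea; it is not how the UMBEL search endpoint is implemented, and the sample labels are invented.

```python
from bisect import bisect_left


def suggest(prefix, sorted_labels, limit=5):
    """Return up to `limit` labels that start with `prefix`.

    `sorted_labels` must be sorted; matching is case-sensitive, so callers
    would typically lower-case both the labels and the typed prefix.
    """
    # Every label with this prefix sits in one contiguous run that begins
    # at the insertion point of the prefix itself.
    start = bisect_left(sorted_labels, prefix)
    suggestions = []
    for label in sorted_labels[start:start + limit]:
        if not label.startswith(prefix):
            break
        suggestions.append(label)
    return suggestions
```

Because the labels are sorted, each keystroke costs one binary search plus a short scan, which is consistent with lookups that are fast enough to run as the user types.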

Tagged Concepts Within Concept Descriptions

Finally, we improved the quality of the concept description reading experience by linking concepts that were mentioned in the descriptions to their respective concept pages. You will now see hyperlinks in the concept descriptions that link to other concepts.

linked_concepts [10]
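One way to produce such links is to scan a description for known concept labels and wrap each mention in a hyperlink. The Python sketch below is an assumption-laden illustration, not the method used on the UMBEL website: the labels, the URL paths and the exact-match rule are all hypothetical.

```python
import re
from html import escape


def link_concepts(description, concept_urls):
    """Wrap each known concept label found in `description` in a hyperlink.

    concept_urls maps a label to the (hypothetical) URL of its concept page.
    Text outside the links is HTML-escaped.
    """
    if not concept_urls:
        return escape(description)
    # Try longer labels first so "rock band" wins over "rock".
    labels = sorted(concept_urls, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(label) for label in labels) + r")\b"
    )
    parts = []
    last = 0
    for match in pattern.finditer(description):
        label = match.group(0)
        parts.append(escape(description[last:match.start()]))
        parts.append(
            '<a href="%s">%s</a>' % (escape(concept_urls[label]), escape(label))
        )
        last = match.end()
    parts.append(escape(description[last:]))
    return "".join(parts)
```

Matching longer labels first avoids linking only part of a multi-word concept name.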