{"id":3564,"date":"2017-01-24T14:19:06","date_gmt":"2017-01-24T19:19:06","guid":{"rendered":"http:\/\/fgiasson.com\/blog\/?p=3564"},"modified":"2017-01-24T15:26:46","modified_gmt":"2017-01-24T20:26:46","slug":"disambiguating-kbpedia-knowledge-graph-concepts","status":"publish","type":"post","link":"https:\/\/fgiasson.com\/blog\/index.php\/2017\/01\/24\/disambiguating-kbpedia-knowledge-graph-concepts\/","title":{"rendered":"Disambiguating KBpedia Knowledge Graph Concepts"},"content":{"rendered":"<p>One of the most important natural language processing tasks is to &#8220;tag&#8221; concepts in text. Tagging a concept means determining whether words or phrases in a text document match any of the concepts that exist in some kind of a knowledge structure (such as a knowledge graph, an ontology, a taxonomy, a vocabulary, etc.). (BTW, a similar definition and process apply to tagging an entity.) Usually, the input text is first parsed and normalized in some manner. Then all of the surface forms of the concepts within the input knowledge structure (based on their preferred and alternative labels) are matched against the words within the text. &#8220;Tagging&#8221; is when a match occurs between a concept in the knowledge structure and one of its surface forms in the input text.<\/p>\n<p>But here is the problem. Given the ambiguous world we live in, often this surface form, which after all is only a word or phrase, may be associated with multiple different concepts. When we identify the surface form of &#8220;bank&#8221;, does that term refer to a financial institution, the shore of a river, a plane turning, or a pool shot? Identical surface forms may refer to totally different concepts. 
Further, sometimes a single concept will be identified but it won&#8217;t be the right concept, possibly because the right one is missing from the knowledge structure or because of other issues.<\/p>\n<p>A good way to view this problem of ambiguity is to analyze a random Web page using the <a href=\"http:\/\/cognonto.com\/\">Cognonto Demo online application<\/a>. The demo uses the Cognonto Concept Tagger service to tag all of the existing KBpedia knowledge graph concepts found in the target Web page. Often, when you analyze what has been tagged by the demo, you may see some of these ambiguities or wrongly tagged concepts yourself. For instance, check out <a href=\"http:\/\/cognonto.com\/analyze\/?url=http%3A%2F%2Fwww.cnn.com%2F2016%2F09%2F20%2Fpolitics%2Fsyria-convoy-strike-us-conclusion-russia%2F\">this example<\/a>. If you mouse over the tagged concepts, you will notice that many of the individual &#8220;tagged&#8221; terms refer to multiple KBpedia concepts. Clearly, in its basic form, <i>this Cognonto demo is <b>not<\/b> disambiguating the concepts<\/i>.<\/p>\n<p>The purpose of this article is thus to explain how we can &#8220;disambiguate&#8221; (that is, suggest the proper concept from an ambiguous list) the concepts that have been tagged. We will show how we can leverage the KBpedia knowledge graph structure as-is to perform this disambiguation. First, we will create graph embeddings for each of the KBpedia concepts using the DeepWalk algorithm. Then we will perform simple linear algebra operations on the graph embeddings to determine whether each tagged concept is the right one given its context. 
We will test multiple different algorithms and strategies to analyze the impact on the overall disambiguation performance of the system.<\/p>\n<p><!--more--><\/p>\n<p>[extoc]<\/p>\n<div id=\"outline-container-org3b4f4f8\" class=\"outline-2\">\n<h2 id=\"org3b4f4f8\">Different Situations<\/h2>\n<div id=\"text-org3b4f4f8\" class=\"outline-text-2\">\n<p>Before starting to work on the problem, let&#8217;s discuss the issues we may encounter with any such concept tagging and disambiguation task. The three possibilities we may encounter are:<\/p>\n<ol class=\"org-ol\">\n<li>A word in the text refers to a concept, but the concept hasn&#8217;t been identified (is not in the graph)<\/li>\n<li>A word in the text has been tagged with a single concept, but that concept is not the right one (the right one is not in the graph)<\/li>\n<li>A word in the text has been tagged with multiple concepts, so the right concept needs to be selected amongst the options offered.<\/li>\n<\/ol>\n<p>What we present in this article is how we can automatically disambiguate the situations 2 and 3. 
(To fix the first situation, not addressed further herein, we would have to add additional surface forms to the knowledge graph by adding new alternative labels to existing concepts, or by adding new concepts with their own surface forms.)<\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-orgff7696d\" class=\"outline-2\">\n<h2 id=\"orgff7696d\">Disambiguation Process<\/h2>\n<div id=\"text-orgff7696d\" class=\"outline-text-2\">\n<p>The disambiguation process is composed of two main tasks: the creation of the graph embeddings for each concept in the KBpedia knowledge graph, and then the evaluation of each tagged concept to disambiguate them.<\/p>\n<\/div>\n<div id=\"outline-container-orgd9d2631\" class=\"outline-3\">\n<h3 id=\"orgd9d2631\">Introducing DeepWalk<\/h3>\n<div id=\"text-orgd9d2631\" class=\"outline-text-3\">\n<p><a href=\"https:\/\/arxiv.org\/abs\/1403.6652\">DeepWalk<\/a> was created to learn <i>social representations<\/i> of a graph&#8217;s vertices that capture neighborhood similarity and community membership. DeepWalk generalizes neural language models to process a special language composed of a set of randomly-generated walks.<\/p>\n<p>With Cognonto, we want to use DeepWalk not to learn <i>social representations<\/i> but to learn the relationship (that is, the similarity) between all of the concepts existing in a knowledge graph given different kinds of relationships such as <code>sub-class-of<\/code>, <code>super-class-of<\/code>, <code>equivalent-class<\/code> or other relationships such as KBpedia&#8217;s <code>80 aspects relationships<\/code>. (Note, we also discussed and used DeepWalk in another Cognonto use case <a href=\"http:\/\/cognonto.com\/use-cases\/extending-kbpedia-with-kbpedia-categories\/\">extending KBpedia<\/a>.)<\/p>\n<p>For this use case we use the DeepWalk algorithm to select concepts from the broader Wikipedia corpus. 
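<\/p>\n<p>To make the idea of a &#8220;language&#8221; of randomly-generated walks more concrete, here is a minimal sketch in plain Clojure. It is only an illustration and not the Deeplearning4j implementation used in this article: the <code>toy-graph<\/code> adjacency map and the <code>random-walk<\/code> and <code>walks<\/code> functions are hypothetical names introduced for this example.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\">;; Hypothetical toy graph (vertex -&gt; neighbors); this is not KBpedia data.\n(def toy-graph {:a [:b :c] :b [:a :c] :c [:a :b :d] :d [:c]})\n\n(defn random-walk\n  \"Walks at most walk-length vertices from start, stepping to a random neighbor.\"\n  [graph start walk-length]\n  (-&gt;&gt; (iterate #(when-let [nbrs (seq (get graph %))] (rand-nth nbrs)) start)\n       (take walk-length)\n       (take-while some?)))\n\n(defn walks\n  \"Generates walks-per-vertex random walks starting from every vertex.\"\n  [graph walk-length walks-per-vertex]\n  (for [_ (range walks-per-vertex)\n        v (keys graph)]\n    (random-walk graph v walk-length)))\n\n;; 3 walks for each of the 4 vertices, each 5 vertices long.\n(walks toy-graph 5 3)\n<\/pre>\n<\/div>\n<p>Each walk plays the role of a sentence and each vertex the role of a word, which is what lets a Skip-gram style model learn one vector per vertex.<\/p>\n<p>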
Other tasks that could be performed using DeepWalk in a similar manner are:<\/p>\n<ol class=\"org-ol\">\n<li>Content recommendation,<\/li>\n<li>Anomaly detection [in the knowledge graph], or<\/li>\n<li>Missing link prediction [in the knowledge graph].<\/li>\n<\/ol>\n<p>Note that we randomly walk the graphs as stated in DeepWalk&#8217;s original paper <sup><a id=\"fnr.1\" class=\"footref\" href=\"#fn.1\">1<\/a><\/sup>. However, more experiments could be performed to replace the random walk with other graph walk strategies such as depth-first or breadth-first walks.<\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-org5ff7cfb\" class=\"outline-3\">\n<h3 id=\"org5ff7cfb\">Create Graph Embedding Vectors<\/h3>\n<div id=\"text-org5ff7cfb\" class=\"outline-text-3\">\n<p>We first create a <code>graph embedding<\/code> for each of the Wikipedia categories. What we have to do is to use the Wikipedia category structure along with the linked KBpedia knowledge graph. Then we have to generate the graph embedding for each of the Wikipedia categories that exist in the structure.<\/p>\n<p>The graph embeddings are generated using the <a href=\"https:\/\/arxiv.org\/abs\/1403.6652\">DeepWalk<\/a> algorithm over that linked structure. It randomly walks the linked graph hundreds of times to generate the graph embeddings for each category.<\/p>\n<\/div>\n<div id=\"outline-container-org1701165\" class=\"outline-4\">\n<h4 id=\"org1701165\">Create Deeplearning4j Graph<\/h4>\n<div id=\"text-org1701165\" class=\"outline-text-4\">\n<p>To generate the graph embeddings, we use <a href=\"https:\/\/deeplearning4j.org\/\">Deeplearning4j&#8217;s<\/a> DeepWalk implementation. This step creates a Deeplearning4j <code>graph<\/code> structure that is used by its DeepWalk implementation to generate the embeddings.<\/p>\n<p>The graph we have to create is composed of the latest version of the KBpedia knowledge graph, version <code>1.20<\/code>. 
Since the only concepts we want to disambiguate are the KBpedia reference concepts as tagged by the Cognonto Concept Tagger, then this is the only knowledge structure we have to load to create the graph embeddings.<\/p>\n<p>Then we generate the initial Deeplearning4j graph structure composed of the <i>inferred<\/i> KBpedia knowledge graph.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>use '<span style=\"color: #66d9ef;\">[<\/span><span style=\"color: #66d9ef;\">cognonto-deepwalk.core<\/span><span style=\"color: #66d9ef;\">]<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<span style=\"color: #ae81ff;\">(<\/span>require '<span style=\"color: #66d9ef;\">[<\/span><span style=\"color: #66d9ef;\">clojure.string<\/span> <span style=\"color: #ae81ff;\">:as<\/span> string<span style=\"color: #66d9ef;\">]<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<span style=\"color: #ae81ff;\">(<\/span>require '<span style=\"color: #66d9ef;\">[<\/span><span style=\"color: #66d9ef;\">clojure.data.csv<\/span> <span style=\"color: #ae81ff;\">:as<\/span> csv<span style=\"color: #66d9ef;\">]<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<span style=\"color: #ae81ff;\">(<\/span>require '<span style=\"color: #66d9ef;\">[<\/span><span style=\"color: #66d9ef;\">clojure.java.io<\/span> <span style=\"color: #ae81ff;\">:as<\/span> io<span style=\"color: #66d9ef;\">]<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<span style=\"color: #ae81ff;\">(<\/span>require '<span style=\"color: #66d9ef;\">[<\/span><span style=\"color: #66d9ef;\">cognonto-owl.query<\/span> <span style=\"color: #ae81ff;\">:as<\/span> query<span style=\"color: #66d9ef;\">]<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defonce<\/span> <span style=\"color: #fd971f;\">knowledge-graph<\/span> <span style=\"color: #66d9ef;\">(<\/span>load-knowledge-graph <span style=\"color: 
#e6db74;\">\"file:\/d:\/cognonto-git\/kbpedia-generator\/kbpedia\/1.10\/target\/kbpedia_reference_concepts_linkage_inferrence_extended_2.n3.owl\"<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"outline-container-org055a6e1\" class=\"outline-4\">\n<h4 id=\"org055a6e1\">Train DeepWalk<\/h4>\n<div id=\"text-org055a6e1\" class=\"outline-text-4\">\n<p>Once the Deeplearning4j graph is created, the next step is to create and train the DeepWalk algorithm. What the <code>(create-deep-walk)<\/code> function does is to create and initialize a <code>DeepWalk<\/code> object with the <code>Graph<\/code> we created above and with some hyperparameters.<\/p>\n<p>The <code>:window-size<\/code> hyperparameter is the size of the window used by the continuous <a href=\"https:\/\/en.wikipedia.org\/wiki\/N-gram#Skip-gram\">Skip-gram<\/a> algorithm used in DeepWalk. The <code>:vector-size<\/code> hyperparameter is the size of the embedding vectors we want the DeepWalk to generate (it is the number of dimensions of our model). 
The <code>:learning-rate<\/code> is the initial learning rate of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Stochastic_gradient_descent\">stochastic gradient descent<\/a>.<\/p>\n<p>For this task, we initially use a window of <code>15<\/code>, <code>3<\/code> dimensions (to make visualizations simpler to interpret), and an initial learning rate of <code>0.025<\/code>.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">graph<\/span> <span style=\"color: #66d9ef;\">(<\/span>create-deepwalk-graph knowledge-graph <span style=\"color: #ae81ff;\">:directed?<\/span> <span style=\"color: #ae81ff;\">true<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">deep-walk<\/span> <span style=\"color: #66d9ef;\">(<\/span>create-deep-walk graph \n                                 <span style=\"color: #ae81ff;\">:window-size<\/span> 15\n                                 <span style=\"color: #ae81ff;\">:vector-size<\/span> 3\n                                 <span style=\"color: #ae81ff;\">:learning-rate<\/span> 0.025<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>Once the DeepWalk object is created and initialized with the graph, the next step is to train that model to generate the embedding vectors for each vertex in the graph.<\/p>\n<p>The training is performed using a random walk iterator. The two hyperparameters related to the training process are the <code>walk-length<\/code> and the <code>walks-per-vertex<\/code>. The <code>walk-length<\/code> is the number of vertices we want to visit for each iteration. 
The <code>walks-per-vertex<\/code> is the number of random walks we want to start from each vertex in the graph.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">train<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #a6e22e;\">[<\/span>deep-walk iterator<span style=\"color: #a6e22e;\">]<\/span>\n   <span style=\"color: #a6e22e;\">(<\/span>train deep-walk iterator 1<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #a6e22e;\">[<\/span>deep-walk iterator walks-per-vertex<span style=\"color: #a6e22e;\">]<\/span>\n   <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">.fit<\/span> deep-walk iterator<span style=\"color: #a6e22e;\">)<\/span>\n   <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">dotimes<\/span> <span style=\"color: #e6db74;\">[<\/span>n walks-per-vertex<span style=\"color: #e6db74;\">]<\/span>\n     <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">.reset<\/span> iterator<span style=\"color: #e6db74;\">)<\/span>\n     <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">.fit<\/span> deep-walk iterator<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #a6e22e;\">[<\/span>deep-walk graph walk-length walks-per-vertex<span style=\"color: #a6e22e;\">]<\/span>\n   <span style=\"color: #a6e22e;\">(<\/span>train deep-walk <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">new<\/span> <span style=\"color: #66d9ef;\">RandomWalkIterator<\/span> graph walk-length<span style=\"color: #e6db74;\">)<\/span> walks-per-vertex<span style=\"color: 
#a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>For the initial setup, we want to have a <code>walk-length<\/code> of <code>15<\/code> and we want to iterate the process <code>175<\/code> times per vertex.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>train deep-walk graph 15 175<span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"outline-container-org83c165a\" class=\"outline-3\">\n<h3 id=\"org83c165a\">Concept Disambiguation<\/h3>\n<div id=\"text-org83c165a\" class=\"outline-text-3\">\n<p>Now that all of the KBpedia reference concepts are characterized using the graph embeddings, how can we disambiguate each of these concepts in a text? The first thing is to observe what is happening. When we tag a sentence such as <i>&#8220;A sandstorm blows over damaged buildings in rebel held area of Douma.&#8221;<\/i> the first step that happens is that the surface forms (preferred labels and alternative labels) of the KBpedia concepts are identified within the sentence. As we discussed above, it is possible that a word, or a group of words, may get tagged with multiple KBpedia reference concepts.<\/p>\n<p>Because we characterized each of the concepts with its graph embeddings, it means that what we are dealing with is a series of vectors that represent the &#8220;meaning&#8221; of each concept in the knowledge graph structure. 
What we end up with is the following:<\/p>\n<div class=\"figure\">\n<p><a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/sentence.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-3567\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/sentence.png\" alt=\"\" width=\"638\" height=\"110\" srcset=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/sentence.png 638w, https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/sentence-300x52.png 300w\" sizes=\"auto, (max-width: 638px) 100vw, 638px\" \/><\/a><\/p>\n<\/div>\n<p>The next step is to use that information to try to disambiguate each of these concepts. To do so, we have to make a few assumptions. First, we have to assume that the graph embeddings created by the DeepWalk algorithm when walking the KBpedia knowledge graph represent the <i>&#8220;meaning&#8221;<\/i> of the concept within the knowledge structure. Then, we have to assume that the sentence where the concept has been identified creates a <i>context<\/i> that we can use to disambiguate the identified concepts based on their <i>&#8220;meaning&#8221;<\/i>.<\/p>\n<p>In the example above, the context is the sentence, and four concepts have been identified. The big assumption we are making here is that each identified concept is &#8220;tightly&#8221; related to the others in the knowledge graph. 
Given this assumption, what we have to do is to calculate the relatedness of each concept, within the context, and keep the ones that are closest to each other.<\/p>\n<\/div>\n<div id=\"outline-container-orge43a25c\" class=\"outline-4\">\n<h4 id=\"orge43a25c\">Gold Standard Creation<\/h4>\n<div id=\"text-orge43a25c\" class=\"outline-text-4\">\n<p>Before digging into how we will perform the disambiguation of these concepts, another step is to create a gold standard that we will use to evaluate the performance of our model and to check the impact of different hyperparameters on the disambiguation process. We created this gold standard manually by using random sentences found in online news articles. We first tagged each of these sentences using the Cognonto Concept Tagger. Then, we manually disambiguated each of the tagged concepts. This gives us the properly labeled training set for our exercise. The result is the following <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/disambiguation-gold-standard.csv\">gold standard file<\/a>.<\/p>\n<p>Each of the annotations looks like this:<\/p>\n<p><a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7c24ba70dbba722f3a2b8cfcf3afea2446f60105.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-medium_large wp-image-3568\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7c24ba70dbba722f3a2b8cfcf3afea2446f60105-768x18.png\" alt=\"\" width=\"768\" height=\"18\" srcset=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7c24ba70dbba722f3a2b8cfcf3afea2446f60105-768x18.png 768w, https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7c24ba70dbba722f3a2b8cfcf3afea2446f60105-300x7.png 300w, 
https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7c24ba70dbba722f3a2b8cfcf3afea2446f60105.png 921w\" sizes=\"auto, (max-width: 768px) 100vw, 768px\" \/><\/a><\/p>\n<p>This markup first presents the surface forms for the related concept(s) in the knowledge graph within the double brackets, followed by the concept(s) URI endings between the double parentheses. The double colon <code>::<\/code> designator provides the suggested disambiguated concept. If nothing follows the double colon, it means that the correct concept does not exist in the knowledge graph.<\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-org6bbe958\" class=\"outline-4\">\n<h4 id=\"org6bbe958\">Disambiguation Method<\/h4>\n<div id=\"text-org6bbe958\" class=\"outline-text-4\">\n<p>The actual disambiguation method is based on some simple linear algebra formulas.<\/p>\n<p>Each tagged word&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-3569 size-full\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\" width=\"19\" height=\"12\" \/><\/a> is related to a sliding context&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7d61ea454e712ce09c7f75d3140788acf53ae5fe.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3570\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7d61ea454e712ce09c7f75d3140788acf53ae5fe.png\" alt=\"\" width=\"14\" height=\"20\" \/><\/a> where <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_9783bbde9fb6923e6ceddef20e7a918b54a2f992.png\"><img 
loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3571\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_9783bbde9fb6923e6ceddef20e7a918b54a2f992.png\" alt=\"\" width=\"147\" height=\"20\" \/><\/a>. Each tagged word has&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_ef038ba123ebef4319d569d54ec40dc5f67bc0c5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3572\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_ef038ba123ebef4319d569d54ec40dc5f67bc0c5.png\" alt=\"\" width=\"18\" height=\"14\" \/><\/a> graph embedding vectors (i.e. <i>&#8220;sense&#8221;<\/i> vectors)&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_a5f0e1561072b96080a9c732794bd18fe00f1a57.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3573\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_a5f0e1561072b96080a9c732794bd18fe00f1a57.png\" alt=\"\" width=\"26\" height=\"20\" \/><\/a> where <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_72cf9483351da664caaca6cbde55432080d6c202.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3574\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_72cf9483351da664caaca6cbde55432080d6c202.png\" alt=\"\" width=\"96\" height=\"18\" \/><\/a>. What we want the disambiguation method to do is to calculate a score for each embedding vector of each concept identified for a word, given that word&#8217;s sliding context. 
The score is calculated by performing the dot product between the graph embedding vector&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_6b526e3a714ac2c7be46661934ef653eac7e7fd8.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3575\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_6b526e3a714ac2c7be46661934ef653eac7e7fd8.png\" alt=\"\" width=\"11\" height=\"14\" \/><\/a> and the sliding context vector <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_774d0389f3a45245568f0233190a445a7ad65f53.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3576\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_774d0389f3a45245568f0233190a445a7ad65f53.png\" alt=\"\" width=\"12\" height=\"14\" \/><\/a>. The sliding context vector is calculated by summing the graph embedding vector&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_6b526e3a714ac2c7be46661934ef653eac7e7fd8.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3575\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_6b526e3a714ac2c7be46661934ef653eac7e7fd8.png\" alt=\"\" width=\"11\" height=\"14\" \/><\/a> of each word&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_ccb5e9230532a6990c88b620742d233da163504d.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3577\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_ccb5e9230532a6990c88b620742d233da163504d.png\" alt=\"\" width=\"14\" height=\"9\" \/><\/a> within the 
window:<\/p>\n<p><a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_73e561775d087190d01c276973ae7b6941ab72e7.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3578\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_73e561775d087190d01c276973ae7b6941ab72e7.png\" alt=\"\" width=\"343\" height=\"256\" srcset=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_73e561775d087190d01c276973ae7b6941ab72e7.png 343w, https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_73e561775d087190d01c276973ae7b6941ab72e7-300x224.png 300w\" sizes=\"auto, (max-width: 343px) 100vw, 343px\" \/><\/a><\/p>\n<p>Finally the score is calculated using:<\/p>\n<p><a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_15bde81280c85f78a2f16bd40d9ac16d62b8fb41.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3579\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_15bde81280c85f78a2f16bd40d9ac16d62b8fb41.png\" alt=\"\" width=\"58\" height=\"20\" \/><\/a><\/p>\n<p>This is the first definition of the score we want to use to begin to disambiguate concepts from the knowledge graph associated with words in a sentence. 
As you can read from the formula, only the first sense of each word&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3569\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\" alt=\"\" width=\"19\" height=\"12\" \/><\/a> within the sliding window is selected to create the context. (Other variants are explored later in this article.) However, as we will see below, we will change that formula a bit so that the score becomes easier for a human to understand by expressing it in degrees. What is important to understand here is that we use two vectors to calculate the score: the graph embedding of the concept we want to disambiguate, and the context vector that sums the graph embedding vectors of the concepts within that window.<\/p>\n<p>Now let&#8217;s put this method into code. 
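<\/p>\n<p>Before wiring this into Deeplearning4j, a minimal sketch with plain Clojure vectors may help to see what the formula computes. It is an illustration only: the <code>dot<\/code>, <code>context-vector<\/code> and <code>score<\/code> functions and the numbers in <code>window<\/code> are hypothetical, and it assumes that the window includes the word being disambiguated.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\">;; Assumption: each tagged word is a list of candidate sense vectors.\n(defn dot [a b] (reduce + (map * a b)))\n\n(defn context-vector\n  \"Sums the first sense vector of every word in the window.\"\n  [window]\n  (apply mapv + (map first window)))\n\n(defn score\n  \"Dot product between one candidate sense and the context vector.\"\n  [sense window]\n  (dot sense (context-vector window)))\n\n;; Made-up 3-dimensional embeddings, for illustration only.\n(def window [[[1.0 0.0 0.0] [0.0 1.0 0.0]]  ; a word with two candidate senses\n             [[0.5 0.5 0.0]]])              ; a word with a single sense\n\n(map #(score % window) (first window))\n;; =&gt; (1.5 0.5), so the first sense gets the highest score in this context\n<\/pre>\n<\/div>\n<p>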
The first thing we have to do is to create a lookup table that is composed of the graph embeddings for each of the concepts that exist in the KBpedia knowledge graph as we calculated with the DeepWalk algorithm above.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">index<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #66d9ef;\">query<\/span><span style=\"color: #66d9ef;\">\/<\/span>get-classes knowledge-graph<span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>map-indexed <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #fd971f;\">[<\/span>i class<span style=\"color: #fd971f;\">]<\/span>\n                               <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">if-not<\/span> <span style=\"color: #f92672;\">(<\/span>string? 
class<span style=\"color: #f92672;\">)<\/span>\n                                 <span style=\"color: #f92672;\">{<\/span><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">.toString<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">.getIRI<\/span> class<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span> <span style=\"color: #ae81ff;\">(<\/span>inc i<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">}<\/span>\n                                 <span style=\"color: #f92672;\">{<\/span>class <span style=\"color: #ae81ff;\">(<\/span>inc i<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">}<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>apply merge<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> ^<span style=\"color: #ae81ff;\">:dynamic<\/span> <span style=\"color: #fd971f;\">vertex-vectors<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">.getVertexVectors<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">.lookupTable<\/span> deep-walk<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>Getting the vector&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_a5f0e1561072b96080a9c732794bd18fe00f1a57.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3573\" 
src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_a5f0e1561072b96080a9c732794bd18fe00f1a57.png\" alt=\"\" width=\"26\" height=\"20\" \/><\/a> for the word&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3569\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\" alt=\"\" width=\"19\" height=\"12\" \/><\/a> is as simple as getting it from the lookup table we created above:<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">get-vector<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>uri-ending<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when-let<\/span> <span style=\"color: #a6e22e;\">[<\/span>concept <span style=\"color: #e6db74;\">(<\/span>get index <span style=\"color: #fd971f;\">(<\/span>str <span style=\"color: #e6db74;\">\"http:\/\/kbpedia.org\/kko\/rc\/\"<\/span> uri-ending<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">.getRow<\/span> vertex-vectors concept<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>To get the sense&nbsp;<a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_e58c29910eb8c832fb24c34d0ac56af7b0892c2e.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full 
wp-image-3580\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_e58c29910eb8c832fb24c34d0ac56af7b0892c2e.png\" alt=\"\" width=\"28\" height=\"20\" \/><\/a> of the word <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_f1a1bc341aaedf18366e6daed56662ba6789bae3.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3581\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_f1a1bc341aaedf18366e6daed56662ba6789bae3.png\" alt=\"\" width=\"21\" height=\"12\" \/><\/a> (which is <code>SandstormAsObject<\/code>) in our example sentence above, we only have to:<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">.toString<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">.data<\/span> <span style=\"color: #a6e22e;\">(<\/span>get-vector <span style=\"color: #e6db74;\">\"SandstormAsObject\"<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">[ 0.08586349,-0.06854561,-0.14704005 ]\n<\/pre>\n<p>To get the vector <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_774d0389f3a45245568f0233190a445a7ad65f53.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3576\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_774d0389f3a45245568f0233190a445a7ad65f53.png\" alt=\"\" width=\"12\" height=\"14\" \/><\/a> of the context of the word <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\"><img 
loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3569\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\" alt=\"\" width=\"19\" height=\"12\" \/><\/a> we have to do some more work. We have to create a function that will calculate the <code>dot product<\/code> between two vectors. Then we will have to create another function that will calculate the <code>sum<\/code> of <code>x<\/code> number of vectors.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">dot-product-clj<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>x y<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> <span style=\"color: #a6e22e;\">(<\/span>interleave x y<span style=\"color: #a6e22e;\">)<\/span>\n       <span style=\"color: #a6e22e;\">(<\/span>partition 2 2<span style=\"color: #a6e22e;\">)<\/span>\n       <span style=\"color: #a6e22e;\">(<\/span>map #<span style=\"color: #e6db74;\">(<\/span>apply * <span style=\"color: #fd971f;\">%<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n       <span style=\"color: #a6e22e;\">(<\/span>reduce +<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">angle-clj<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>a b<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #66d9ef;\">Math<\/span><span style=\"color: #66d9ef;\">\/<\/span><span style=\"color: #f92672;\">toDegrees<\/span>\n   <span style=\"color: #a6e22e;\">(<\/span><span 
style=\"color: #66d9ef;\">Math<\/span><span style=\"color: #66d9ef;\">\/<\/span>acos\n    <span style=\"color: #e6db74;\">(<\/span>\/ <span style=\"color: #fd971f;\">(<\/span>dot-product-clj a b<span style=\"color: #fd971f;\">)<\/span>\n       <span style=\"color: #fd971f;\">(<\/span>* <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">Math<\/span><span style=\"color: #66d9ef;\">\/<\/span>sqrt <span style=\"color: #ae81ff;\">(<\/span>dot-product-clj a a<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">Math<\/span><span style=\"color: #66d9ef;\">\/<\/span>sqrt <span style=\"color: #ae81ff;\">(<\/span>dot-product-clj b b<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">disambiguate<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>line<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>tags <span style=\"color: #e6db74;\">(<\/span>re-seq #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: 
bold;\">(<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> line<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #66d9ef;\">clojure.pprint<\/span><span style=\"color: #66d9ef;\">\/<\/span>pprint tags<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>println<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">loop<\/span> <span style=\"color: #e6db74;\">[<\/span>i 0\n           tag <span style=\"color: #fd971f;\">(<\/span>first tags<span style=\"color: #fd971f;\">)<\/span>\n           rtags <span style=\"color: #fd971f;\">(<\/span>rest tags<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">]<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>word <span style=\"color: #f92672;\">(<\/span>second tag<span style=\"color: #f92672;\">)<\/span>\n            concepts <span style=\"color: #f92672;\">(<\/span>last tag<span style=\"color: #f92672;\">)<\/span>\n            concept <span style=\"color: #f92672;\">(<\/span>get-tag-concept concepts<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println word <span style=\"color: #e6db74;\">\" --&gt; \"<\/span> concepts<span style=\"color: 
#fd971f;\">)<\/span>\n\n        <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Disambiguate concepts<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #f92672;\">[<\/span>ambiguous-concepts <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split <span style=\"color: #66d9ef;\">(<\/span>first <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split concepts #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span> #<span style=\"color: #e6db74;\">\" \"<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">]<\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #ae81ff;\">[<\/span>ambiguous-concept ambiguous-concepts<span style=\"color: #ae81ff;\">]<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span>println <span style=\"color: #e6db74;\">\"a-vector: \"<\/span> <span style=\"color: #66d9ef;\">(<\/span>get-vector ambiguous-concept<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span>println <span style=\"color: #e6db74;\">\"b-vector: \"<\/span> <span style=\"color: #66d9ef;\">(<\/span>get-context i tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>                \n            <span style=\"color: #ae81ff;\">(<\/span>println <span style=\"color: #e6db74;\">\"dot product: \"<\/span> <span style=\"color: #66d9ef;\">(<\/span>dot-product <span style=\"color: #a6e22e;\">(<\/span>get-vector 
ambiguous-concept<span style=\"color: #a6e22e;\">)<\/span> <span style=\"color: #a6e22e;\">(<\/span>get-context i tags<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span>println <span style=\"color: #e6db74;\">\"angle: \"<\/span> <span style=\"color: #66d9ef;\">(<\/span>angle <span style=\"color: #a6e22e;\">(<\/span>get-vector ambiguous-concept<span style=\"color: #a6e22e;\">)<\/span> <span style=\"color: #a6e22e;\">(<\/span>get-context i tags<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span>println<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span>\n\n\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">when-not<\/span> <span style=\"color: #f92672;\">(<\/span>empty? 
rtags<span style=\"color: #f92672;\">)<\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">recur<\/span> <span style=\"color: #ae81ff;\">(<\/span>inc i<span style=\"color: #ae81ff;\">)<\/span>\n                 <span style=\"color: #ae81ff;\">(<\/span>first rtags<span style=\"color: #ae81ff;\">)<\/span>\n                 <span style=\"color: #ae81ff;\">(<\/span>rest rtags<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">dot-product<\/span> \n  <span style=\"color: #75715e;\">\"Calculate the dot product of two vectors (NDArray)\"<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>v1 v2<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>first <span style=\"color: #a6e22e;\">(<\/span>read-string <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">.toString<\/span> <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">.data<\/span> <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">.mmul<\/span> v1 <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">.transpose<\/span> v2<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: 
#a6e22e;\">sum-vectors<\/span>\n  <span style=\"color: #75715e;\">\"Sum any number of vectors (NDArray)\"<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>&amp; args<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>args <span style=\"color: #e6db74;\">(<\/span>remove nil? args<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">loop<\/span> <span style=\"color: #e6db74;\">[<\/span>result <span style=\"color: #fd971f;\">(<\/span>first args<span style=\"color: #fd971f;\">)<\/span>\n           args <span style=\"color: #fd971f;\">(<\/span>rest args<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">]<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #fd971f;\">(<\/span>empty? 
args<span style=\"color: #fd971f;\">)<\/span>\n        result\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">recur<\/span> <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">.add<\/span> result <span style=\"color: #ae81ff;\">(<\/span>first args<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n               <span style=\"color: #f92672;\">(<\/span>rest args<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>The next step is to create the function that calculates the context vector, which is computed over a sliding window of the concepts associated with <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_82ab83e730d76d23ce3732d5b1705d391cb72c45.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3582\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_82ab83e730d76d23ce3732d5b1705d391cb72c45.png\" alt=\"\" width=\"38\" height=\"12\" \/><\/a>, <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3569\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_0d1399fad7c1911e6241538c7b4fdfb50329d273.png\" alt=\"\" width=\"19\" height=\"12\" \/><\/a> and <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_aa3602038506d03eb84e703eb2db703d74794a9c.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3583\" 
src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_aa3602038506d03eb84e703eb2db703d74794a9c.png\" alt=\"\" width=\"38\" height=\"14\" \/><\/a>.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">get-tag-concept<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>concepts<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #a6e22e;\">(<\/span>&gt; <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">.indexOf<\/span> concepts <span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">)<\/span> -1<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>second <span style=\"color: #e6db74;\">(<\/span>re-find #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> concepts<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n    concepts<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>Let&#8217;s see how that works. 
To calculate <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_774d0389f3a45245568f0233190a445a7ad65f53.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3576\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_774d0389f3a45245568f0233190a445a7ad65f53.png\" alt=\"\" width=\"12\" height=\"14\" \/><\/a> when <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_99f89040e96fd7145a771f6ea53e64f14a07bfbc.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3584\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_99f89040e96fd7145a771f6ea53e64f14a07bfbc.png\" alt=\"\" width=\"41\" height=\"14\" \/><\/a>, we have to calculate <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_3e70cb55d508d5cd9276f01daa169dce44fb7bcb.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3585\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_3e70cb55d508d5cd9276f01daa169dce44fb7bcb.png\" alt=\"\" width=\"93\" height=\"59\" \/><\/a>, which can be done with the following code:<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">s{1,1}<\/span>\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">sandstormasobject<\/span> <span style=\"color: #66d9ef;\">(<\/span>get-vector <span style=\"color: #e6db74;\">\"SandstormAsObject\"<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #75715e; 
font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">s{2,1}<\/span>\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">blowingair<\/span> <span style=\"color: #66d9ef;\">(<\/span>get-vector <span style=\"color: #e6db74;\">\"BlowingAir\"<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">s{3,1}<\/span>\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">building<\/span> <span style=\"color: #66d9ef;\">(<\/span>get-vector <span style=\"color: #e6db74;\">\"Building\"<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">c<\/span> <span style=\"color: #66d9ef;\">(<\/span>sum-vectors sandstormasobject blowingair building<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span>println <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">.toString<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">.data<\/span> c<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">[ 0.24941699,-0.0940429,-0.11848262 ]\n<\/pre>\n<p>Finally, we have to calculate the score used to disambiguate the two senses of the word <code>blows<\/code> by performing the dot product between <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_1776d587df81749698f57cb9f0873c69870fe6ce.png\"><img loading=\"lazy\" decoding=\"async\" 
class=\"alignnone size-full wp-image-3586\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_1776d587df81749698f57cb9f0873c69870fe6ce.png\" alt=\"\" width=\"61\" height=\"20\" \/><\/a> and <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_34ba6b04114515f0170ff209540475e3fcdf7ddb.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3587\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_34ba6b04114515f0170ff209540475e3fcdf7ddb.png\" alt=\"\" width=\"61\" height=\"20\" \/><\/a> where <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_eafcc43344ccdb500c279acd0b8a17187992c261.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3588\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_eafcc43344ccdb500c279acd0b8a17187992c261.png\" alt=\"\" width=\"89\" height=\"20\" \/><\/a>, which can be done using the following code:<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">s{2,2}<\/span>\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">windprocess<\/span> <span style=\"color: #66d9ef;\">(<\/span>get-vector <span style=\"color: #e6db74;\">\"WindProcess\"<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span>println <span style=\"color: #e6db74;\">\"BlowingAir score:\"<\/span> <span style=\"color: #66d9ef;\">(<\/span>dot-product blowingair c<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<span style=\"color: 
#ae81ff;\">(<\/span>println <span style=\"color: #e6db74;\">\"WindProcess score:\"<\/span> <span style=\"color: #66d9ef;\">(<\/span>dot-product windprocess c<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">BlowingAir score: 0.011248392\nWindProcess score: 0.005993685\n<\/pre>\n<p>What the scores suggest is that, given the context, the right concept associated with the word <code>blows<\/code> is <code>BlowingAir<\/code> since its score is higher. This is the right answer. We can see how simple linear algebra manipulations can be used to help us automatically disambiguate such concepts. In fact, the crux of the problem is not to perform these operations but to create a coherent and consistent knowledge graph such as KBpedia and then to create the right graph embeddings for each of its concepts. The coherent graph structure gives us this disambiguation capability for &#8220;free&#8221;.<\/p>\n<p>However, these scores, as is, are hard to interpret and understand. What we want to do next is to transform these numbers into an angle, in degrees, between <code>0<\/code> and <code>180<\/code>. 
The angle between the word sense&#8217;s vector <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_a5f0e1561072b96080a9c732794bd18fe00f1a57.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3573\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_a5f0e1561072b96080a9c732794bd18fe00f1a57.png\" alt=\"\" width=\"26\" height=\"20\" \/><\/a> and <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7d61ea454e712ce09c7f75d3140788acf53ae5fe.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3570\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7d61ea454e712ce09c7f75d3140788acf53ae5fe.png\" alt=\"\" width=\"14\" height=\"20\" \/><\/a> represents how close the two vectors are to each other. An angle of <code>0<\/code> would mean that both vectors are identical in terms of relationship, and an angle of <code>180<\/code> would mean they are quite dissimilar. 
What we will see later is that we can use scores such as this to drop senses that score above some similarity degree threshold.<\/p>\n<p>The angle between the two vectors can easily be calculated with the following formula:<\/p>\n<p><a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_8fa9e31902fcb39732a13f13ac5105eaa36976ee.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3590\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_8fa9e31902fcb39732a13f13ac5105eaa36976ee.png\" alt=\"\" width=\"241\" height=\"47\" \/><\/a><\/p>\n<p>The following code will calculate the angle between two vectors using this formula:<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">angle<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>a b<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #66d9ef;\">Math<\/span><span style=\"color: #66d9ef;\">\/<\/span><span style=\"color: #f92672;\">toDegrees<\/span>\n   <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #66d9ef;\">Math<\/span><span style=\"color: #66d9ef;\">\/<\/span>acos\n    <span style=\"color: #e6db74;\">(<\/span>\/ <span style=\"color: #fd971f;\">(<\/span>dot-product a b<span style=\"color: #fd971f;\">)<\/span>\n       <span style=\"color: #fd971f;\">(<\/span>* <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">Math<\/span><span style=\"color: #66d9ef;\">\/<\/span>sqrt <span style=\"color: #ae81ff;\">(<\/span>dot-product a a<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">Math<\/span><span style=\"color: #66d9ef;\">\/<\/span>sqrt <span 
style=\"color: #ae81ff;\">(<\/span>dot-product b b<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>OK, so let&#8217;s get the angles for the example we created above:<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>println <span style=\"color: #e6db74;\">\"BlowingAir degree score:\"<\/span> <span style=\"color: #66d9ef;\">(<\/span>angle blowingair c<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<span style=\"color: #ae81ff;\">(<\/span>println <span style=\"color: #e6db74;\">\"WindProcess degree score:\"<\/span> <span style=\"color: #66d9ef;\">(<\/span>angle windprocess c<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">BlowingAir degree score: 76.77807342393672\nWindProcess degree score: 82.80469144165336\n<\/pre>\n<p>Since the angle between the context and <code>BlowingAir<\/code> is smaller than the angle between the context and <code>WindProcess<\/code>, we keep <code>BlowingAir<\/code> as the disambiguated concept for the word <code>blows<\/code>.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"outline-container-org89c890f\" class=\"outline-3\">\n<h3 id=\"org89c890f\">Evaluate Model<\/h3>\n<div id=\"text-org89c890f\" class=\"outline-text-3\">\n<p>The last step is to evaluate the model we created. What we have to do is to create a function called <code>(evaluate-disambiguation-model)<\/code> that will read the gold standard file, parse the markup and then perform the disambiguation. 
Then the <code>precision<\/code>, <code>recall<\/code>, <code>accuracy<\/code> and <code>F1<\/code> metrics will be calculated to measure the performance of this disambiguation system and the models we created for it.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">get-first-sense-vector<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>concepts<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>get-vector\n   <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #e6db74;\">(<\/span>&gt; <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">.indexOf<\/span> concepts <span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #fd971f;\">)<\/span> -1<span style=\"color: #e6db74;\">)<\/span>\n     <span style=\"color: #e6db74;\">(<\/span>first <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split <span style=\"color: #f92672;\">(<\/span>second <span style=\"color: #ae81ff;\">(<\/span>re-find #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span> concepts<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span> #<span style=\"color: #e6db74;\">\" \"<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n     concepts<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: 
#f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">get-context<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>i tags<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #a6e22e;\">(<\/span>= <span style=\"color: #e6db74;\">(<\/span>count tags<span style=\"color: #e6db74;\">)<\/span> 1<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">There is only one tag in the sentence<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>get-first-sense-vector <span style=\"color: #e6db74;\">(<\/span>last <span style=\"color: #fd971f;\">(<\/span>first tags<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #e6db74;\">(<\/span>= <span style=\"color: #fd971f;\">(<\/span>count tags<span style=\"color: #fd971f;\">)<\/span> 2<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">There are only 2 tags in the sentence<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>sum-vectors <span style=\"color: #fd971f;\">(<\/span>get-first-sense-vector <span style=\"color: #f92672;\">(<\/span>last <span style=\"color: #ae81ff;\">(<\/span>first tags<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span>\n                   <span style=\"color: #fd971f;\">(<\/span>get-first-sense-vector <span style=\"color: #f92672;\">(<\/span>last <span style=\"color: #ae81ff;\">(<\/span>second tags<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: 
#e6db74;\">)<\/span>\n      <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Target concept is the first one of the sentence<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #fd971f;\">(<\/span>= i 0<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>sum-vectors <span style=\"color: #f92672;\">(<\/span>get-first-sense-vector <span style=\"color: #ae81ff;\">(<\/span>last <span style=\"color: #66d9ef;\">(<\/span>first tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n                     <span style=\"color: #f92672;\">(<\/span>get-first-sense-vector <span style=\"color: #ae81ff;\">(<\/span>last <span style=\"color: #66d9ef;\">(<\/span>second tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n                     <span style=\"color: #f92672;\">(<\/span>get-first-sense-vector <span style=\"color: #ae81ff;\">(<\/span>last <span style=\"color: #66d9ef;\">(<\/span>nth tags <span style=\"color: #a6e22e;\">(<\/span>+ i 2<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Target concept is the last one of the sentence<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #f92672;\">(<\/span>= i <span style=\"color: #ae81ff;\">(<\/span>- <span style=\"color: #66d9ef;\">(<\/span>count tags<span style=\"color: #66d9ef;\">)<\/span> 1<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: 
#f92672;\">)<\/span>\n          <span style=\"color: #f92672;\">(<\/span>sum-vectors <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags <span style=\"color: #ae81ff;\">(<\/span>- i 2<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                       <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags <span style=\"color: #ae81ff;\">(<\/span>- i 1<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                       <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags i<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n          <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Target is in-between<\/span>\n          <span style=\"color: #f92672;\">(<\/span>sum-vectors <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags <span style=\"color: #ae81ff;\">(<\/span>- i 1<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                       <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags i<span style=\"color: 
#a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                       <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags <span style=\"color: #ae81ff;\">(<\/span>+ i 1<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">predict-label<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>ambiguous-concepts i tags max-angle<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>second\n   <span style=\"color: #a6e22e;\">(<\/span>first\n    <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> ambiguous-concepts\n         <span style=\"color: #fd971f;\">(<\/span>map <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #ae81ff;\">[<\/span>ambiguous-concept<span style=\"color: #ae81ff;\">]<\/span>\n                <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">if-let<\/span> <span style=\"color: #66d9ef;\">[<\/span>a <span style=\"color: #a6e22e;\">(<\/span>get-vector ambiguous-concept<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n                  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">if-let<\/span> <span style=\"color: #a6e22e;\">[<\/span>b <span style=\"color: #ae81ff;\">(<\/span>get-context i tags<span style=\"color: 
#ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n                    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #ae81ff;\">(<\/span>&gt; <span style=\"color: #66d9ef;\">(<\/span>angle a b<span style=\"color: #66d9ef;\">)<\/span> max-angle<span style=\"color: #ae81ff;\">)<\/span>\n                      <span style=\"color: #ae81ff;\">{<\/span><span style=\"color: #ae81ff;\">}<\/span>\n                      <span style=\"color: #ae81ff;\">{<\/span><span style=\"color: #66d9ef;\">(<\/span>angle a b<span style=\"color: #66d9ef;\">)<\/span> ambiguous-concept<span style=\"color: #ae81ff;\">}<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                    <span style=\"color: #a6e22e;\">{<\/span><span style=\"color: #a6e22e;\">}<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                  <span style=\"color: #66d9ef;\">{<\/span><span style=\"color: #66d9ef;\">}<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span>\n         <span style=\"color: #fd971f;\">(<\/span>apply merge<span style=\"color: #fd971f;\">)<\/span>\n         sort<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">predict-label-angle<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>ambiguous-concepts i tags<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>first\n   <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> ambiguous-concepts\n        <span style=\"color: #e6db74;\">(<\/span>map <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #f92672;\">[<\/span>ambiguous-concept<span 
style=\"color: #f92672;\">]<\/span>\n               <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">if-let<\/span> <span style=\"color: #ae81ff;\">[<\/span>a <span style=\"color: #66d9ef;\">(<\/span>get-vector ambiguous-concept<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">]<\/span>\n                 <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">if-let<\/span> <span style=\"color: #66d9ef;\">[<\/span>b <span style=\"color: #a6e22e;\">(<\/span>get-context i tags<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n                   <span style=\"color: #66d9ef;\">{<\/span><span style=\"color: #a6e22e;\">(<\/span>angle a b<span style=\"color: #a6e22e;\">)<\/span> ambiguous-concept<span style=\"color: #66d9ef;\">}<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                 <span style=\"color: #ae81ff;\">{<\/span><span style=\"color: #ae81ff;\">}<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n        <span style=\"color: #e6db74;\">(<\/span>apply merge<span style=\"color: #e6db74;\">)<\/span>\n        sort<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">evaluate-disambiguation-model<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>gold-standard-file &amp; <span style=\"color: #a6e22e;\">{<\/span><span style=\"color: #ae81ff;\">:keys<\/span> <span style=\"color: #e6db74;\">[<\/span>max-angle<span style=\"color: #e6db74;\">]<\/span>\n                         <span style=\"color: #ae81ff;\">:or<\/span> <span style=\"color: #e6db74;\">{<\/span>max-angle 90.0<span style=\"color: #e6db74;\">}<\/span><span style=\"color: #a6e22e;\">}<\/span><span style=\"color: 
#66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>sentences <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">with-open<\/span> <span style=\"color: #fd971f;\">[<\/span>in-file <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">io<\/span><span style=\"color: #66d9ef;\">\/<\/span>reader gold-standard-file<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n                    <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">doall<\/span>\n                     <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">csv<\/span><span style=\"color: #66d9ef;\">\/<\/span>read-csv in-file <span style=\"color: #ae81ff;\">:separator<\/span> <span style=\"color: #e6db74;\">\\tab<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n        true-positive <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        false-positive <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        true-negative <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        false-negative <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #e6db74;\">[<\/span><span style=\"color: #fd971f;\">[<\/span>sentence<span style=\"color: #fd971f;\">]<\/span> sentences<span style=\"color: #e6db74;\">]<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>tags <span style=\"color: #f92672;\">(<\/span>re-seq #<span 
style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> sentence<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">loop<\/span> <span style=\"color: #f92672;\">[<\/span>i 0\n               tag <span style=\"color: #ae81ff;\">(<\/span>first tags<span style=\"color: #ae81ff;\">)<\/span>\n               rtags <span style=\"color: #ae81ff;\">(<\/span>rest tags<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">]<\/span>\n\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">when-not<\/span> <span style=\"color: #ae81ff;\">(<\/span>= i <span style=\"color: #66d9ef;\">(<\/span>count tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n            <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Disambiguate 
concepts<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #66d9ef;\">[<\/span>word <span style=\"color: #a6e22e;\">(<\/span>second tag<span style=\"color: #a6e22e;\">)<\/span>\n                  concepts <span style=\"color: #a6e22e;\">(<\/span>last tag<span style=\"color: #a6e22e;\">)<\/span>\n                  concept <span style=\"color: #a6e22e;\">(<\/span>get-tag-concept concepts<span style=\"color: #a6e22e;\">)<\/span>\n                  label <span style=\"color: #a6e22e;\">(<\/span>second <span style=\"color: #ae81ff;\">(<\/span>re-find #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> concepts<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                  ambiguous-concepts <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split <span style=\"color: #ae81ff;\">(<\/span>first <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split concepts #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span> #<span style=\"color: #e6db74;\">\" \"<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                  predicted-label <span style=\"color: #a6e22e;\">(<\/span>predict-label ambiguous-concepts i tags max-angle<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span><span 
style=\"color: #f92672;\">and<\/span> <span style=\"color: #ae81ff;\">(<\/span>= label predicted-label<span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>empty? label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>empty? predicted-label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>swap! true-positive inc<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>= label predicted-label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>empty? predicted-label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>swap! false-positive inc<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #ae81ff;\">(<\/span>empty? label<span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>empty? 
predicted-label<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>swap! true-negative inc<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>empty? label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>empty? predicted-label<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>swap! false-negative inc<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">recur<\/span> <span style=\"color: #a6e22e;\">(<\/span>inc i<span style=\"color: #a6e22e;\">)<\/span>\n                     <span style=\"color: #a6e22e;\">(<\/span>first rtags<span style=\"color: #a6e22e;\">)<\/span>\n                     <span style=\"color: #a6e22e;\">(<\/span>rest rtags<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"True positive: \"<\/span> @true-positive<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"False positive: \"<\/span> @false-positive<span style=\"color: #a6e22e;\">)<\/span>\n    <span 
style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"True negative: \"<\/span> @true-negative<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"False negative: \"<\/span> @false-negative<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>println<span style=\"color: #a6e22e;\">)<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #e6db74;\">(<\/span>= 0 @true-positive<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>precision 0\n            recall 0\n            accuracy 0\n            f1 0<span style=\"color: #fd971f;\">]<\/span>\n\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Precision: \"<\/span> precision<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Recall: \"<\/span> recall<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Accuracy: \"<\/span> accuracy<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"F1: \"<\/span> f1<span style=\"color: #fd971f;\">)<\/span>\n        \n        <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #ae81ff;\">:precision<\/span> precision\n         <span style=\"color: #ae81ff;\">:recall<\/span> recall\n         <span style=\"color: #ae81ff;\">:accuracy<\/span> accuracy\n         <span style=\"color: #ae81ff;\">:f1<\/span> f1<span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span 
style=\"color: #fd971f;\">[<\/span>precision <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ @true-positive <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-positive<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            recall <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ @true-positive <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-negative<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            accuracy <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @true-negative<span style=\"color: #66d9ef;\">)<\/span> <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-negative @false-positive @true-negative<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            f1 <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>* 2 <span style=\"color: #66d9ef;\">(<\/span>\/ <span style=\"color: #a6e22e;\">(<\/span>* precision recall<span style=\"color: #a6e22e;\">)<\/span> <span style=\"color: #a6e22e;\">(<\/span>+ precision recall<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Precision: \"<\/span> precision<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Recall: \"<\/span> recall<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: 
(<\/span>println <span style=\"color: #e6db74;\">\"Accuracy: ">
#fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Accuracy: \"<\/span> accuracy<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"F1: \"<\/span> f1<span style=\"color: #fd971f;\">)<\/span>\n        \n        <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #ae81ff;\">:precision<\/span> precision\n         <span style=\"color: #ae81ff;\">:recall<\/span> recall\n         <span style=\"color: #ae81ff;\">:accuracy<\/span> accuracy\n         <span style=\"color: #ae81ff;\">:f1<\/span> f1<span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<div id=\"outline-container-orgd3e35a0\" class=\"outline-4\">\n<h4 id=\"orgd3e35a0\">Evaluation Baseline: Random Sense<\/h4>\n<div id=\"text-orgd3e35a0\" class=\"outline-text-4\">\n<p>Before evaluating the initial model described above, we should first check what value each metric takes when a concept is selected at random. To do this, we will modify the <code>(evaluate-disambiguation-model)<\/code> function created above so that it selects the &#8220;disambiguated&#8221; sense at random. 
This will provide the baseline to evaluate the other algorithms we will test.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">evaluate-disambiguation-model-random<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>gold-standard-file &amp; <span style=\"color: #a6e22e;\">{<\/span><span style=\"color: #ae81ff;\">:keys<\/span> <span style=\"color: #e6db74;\">[<\/span>max-angle<span style=\"color: #e6db74;\">]<\/span>\n                         <span style=\"color: #ae81ff;\">:or<\/span> <span style=\"color: #e6db74;\">{<\/span>max-angle 90.0<span style=\"color: #e6db74;\">}<\/span><span style=\"color: #a6e22e;\">}<\/span><span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>sentences <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">with-open<\/span> <span style=\"color: #fd971f;\">[<\/span>in-file <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">io<\/span><span style=\"color: #66d9ef;\">\/<\/span>reader gold-standard-file<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n                    <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">doall<\/span>\n                     <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">csv<\/span><span style=\"color: #66d9ef;\">\/<\/span>read-csv in-file <span style=\"color: #ae81ff;\">:separator<\/span> <span style=\"color: #e6db74;\">\\tab<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n        true-positive <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        false-positive <span style=\"color: #e6db74;\">(<\/span>atom 
0<span style=\"color: #e6db74;\">)<\/span>\n        true-negative <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        false-negative <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #e6db74;\">[<\/span><span style=\"color: #fd971f;\">[<\/span>sentence<span style=\"color: #fd971f;\">]<\/span> sentences<span style=\"color: #e6db74;\">]<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>tags <span style=\"color: #f92672;\">(<\/span>re-seq #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> sentence<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span 
style=\"color: #f92672;\">loop<\/span> <span style=\"color: #f92672;\">[<\/span>i 0\n               tag <span style=\"color: #ae81ff;\">(<\/span>first tags<span style=\"color: #ae81ff;\">)<\/span>\n               rtags <span style=\"color: #ae81ff;\">(<\/span>rest tags<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">]<\/span>\n\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">when-not<\/span> <span style=\"color: #ae81ff;\">(<\/span>= i <span style=\"color: #66d9ef;\">(<\/span>count tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n            <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Disambiguate concepts<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #66d9ef;\">[<\/span>word <span style=\"color: #a6e22e;\">(<\/span>second tag<span style=\"color: #a6e22e;\">)<\/span>\n                  concepts <span style=\"color: #a6e22e;\">(<\/span>last tag<span style=\"color: #a6e22e;\">)<\/span>\n                  concept <span style=\"color: #a6e22e;\">(<\/span>get-tag-concept concepts<span style=\"color: #a6e22e;\">)<\/span>\n                  label <span style=\"color: #a6e22e;\">(<\/span>second <span style=\"color: #ae81ff;\">(<\/span>re-find #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> concepts<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                  ambiguous-concepts <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split <span style=\"color: #ae81ff;\">(<\/span>first <span style=\"color: 
#66d9ef;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split concepts #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span> #<span style=\"color: #e6db74;\">\" \"<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                  predicted-label <span style=\"color: #a6e22e;\">(<\/span>rand-nth <span style=\"color: #ae81ff;\">(<\/span>into ambiguous-concepts <span style=\"color: #66d9ef;\">[<\/span><span style=\"color: #ae81ff;\">nil<\/span><span style=\"color: #66d9ef;\">]<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #ae81ff;\">(<\/span>= label predicted-label<span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>empty? label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>empty? predicted-label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>swap! 
true-positive inc<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>= label predicted-label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>empty? predicted-label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>swap! false-positive inc<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #ae81ff;\">(<\/span>empty? label<span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>empty? predicted-label<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>swap! true-negative inc<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #ae81ff;\">(<\/span>not <span style=\"color: #66d9ef;\">(<\/span>empty? label<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>empty? 
predicted-label<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span>swap! false-negative inc<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">recur<\/span> <span style=\"color: #a6e22e;\">(<\/span>inc i<span style=\"color: #a6e22e;\">)<\/span>\n                     <span style=\"color: #a6e22e;\">(<\/span>first rtags<span style=\"color: #a6e22e;\">)<\/span>\n                     <span style=\"color: #a6e22e;\">(<\/span>rest rtags<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #e6db74;\">(<\/span>= 0 @true-positive<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>precision 0\n            recall 0\n            accuracy 0\n            f1 0<span style=\"color: #fd971f;\">]<\/span>\n        \n        <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #ae81ff;\">:precision<\/span> precision\n         <span style=\"color: #ae81ff;\">:recall<\/span> recall\n         <span style=\"color: #ae81ff;\">:accuracy<\/span> accuracy\n         <span style=\"color: #ae81ff;\">:f1<\/span> f1<span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>precision <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: 
#ae81ff;\">(<\/span>\/ @true-positive <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-positive<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            recall <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ @true-positive <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-negative<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            accuracy <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @true-negative<span style=\"color: #66d9ef;\">)<\/span> <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-negative @false-positive @true-negative<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            f1 <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>* 2 <span style=\"color: #66d9ef;\">(<\/span>\/ <span style=\"color: #a6e22e;\">(<\/span>* precision recall<span style=\"color: #a6e22e;\">)<\/span> <span style=\"color: #a6e22e;\">(<\/span>+ precision recall<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        \n        <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #ae81ff;\">:precision<\/span> precision\n         <span style=\"color: #ae81ff;\">:recall<\/span> recall\n         <span style=\"color: #ae81ff;\">:accuracy<\/span> accuracy\n         <span style=\"color: #ae81ff;\">:f1<\/span> f1<span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: 
#66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>To establish the random baseline, we run <code>(evaluate-disambiguation-model-random)<\/code> one thousand times and then average the score obtained for each metric.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #66d9ef;\">[<\/span>f1 <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span>\n      precision <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span>\n      accuracy <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span>\n      recall <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span>\n      nb <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">dotimes<\/span> <span style=\"color: #a6e22e;\">[<\/span>i 1000<span style=\"color: #a6e22e;\">]<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #e6db74;\">[<\/span>results <span style=\"color: #fd971f;\">(<\/span>evaluate-disambiguation-model-random <span style=\"color: #e6db74;\">\"resources\/disambiguation-gold.standard.csv\"<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">]<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>swap! nb inc<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>swap! f1 + <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:f1<\/span> results<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>swap! 
precision + <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:precision<\/span> results<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>swap! recall + <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:recall<\/span> results<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>swap! accuracy + <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:accuracy<\/span> results<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: #e6db74;\">\"Average precision: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>\/ @precision @nb<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: #e6db74;\">\"Average recall: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>\/ @recall @nb<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: #e6db74;\">\"Average accuracy: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>\/ @accuracy @nb<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: #e6db74;\">\"Average F1: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>\/ @f1 @nb<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">Average precision:  0.6080861368775368\nAverage recall:  0.49945064118504523\nAverage accuracy:  0.42075299778580666\nAverage F1:  
0.5480979214012622\n<\/pre>\n<p>This gives us a baseline <code>precision<\/code> score of <code>0.608<\/code>, <code>recall<\/code> score of <code>0.499<\/code>, <code>accuracy<\/code> score of <code>0.420<\/code> and <code>F1<\/code> score of <code>0.548<\/code>.<\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-org31396d8\" class=\"outline-4\">\n<h4 id=\"org31396d8\">Evaluation: First Sense Context<\/h4>\n<div id=\"text-org31396d8\" class=\"outline-text-4\">\n<p>Now that we have our baseline in place, let&#8217;s see how the initial <code>first-sense-context<\/code> disambiguation algorithm described above performs compared to a random process. Note that this initial run uses an unoptimized DeepWalk graph with initial general hyperparameters.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>evaluate-disambiguation-model <span style=\"color: #e6db74;\">\"resources\/disambiguation-gold.standard.csv\"<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">True positive:  246\nfalse positive:  126\nTrue negative:  9\nFalse negative:  36\n\nPrecision:  0.66129035\nRecall:  0.87234044\nAccuracy:  0.6115108\nF1:  0.7522936\n<\/pre>\n<p>As you can see, even using unoptimized default hyperparameters, we calculate an <code>F1<\/code> score of <code>0.752<\/code>, which is an increase of <code>37.26%<\/code> over the baseline. Other measures improve, too: <code>Recall<\/code> (increased by <code>74.67%<\/code>), <code>accuracy<\/code> (increased by <code>45.35%<\/code>) and <code>precision<\/code> (increased by <code>8.75%<\/code>).<\/p>\n<p>We properly identified <code>246<\/code> of <code>417<\/code> examples. 
<code>126<\/code> were wrongly identified, <code>9<\/code> were properly discarded and <code>36<\/code> were discarded when they shouldn&#8217;t have been.<\/p>\n<p>These results are really promising, but can we do better?<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"outline-container-orge823645\" class=\"outline-3\">\n<h3 id=\"orge823645\">Hyperparameter Optimization<\/h3>\n<div id=\"text-orge823645\" class=\"outline-text-3\">\n<p>Now that we have a disambiguation workflow in place with standards by which to compute performance statistics, the final step is to try to optimize the system by testing multiple different values for some of the hyperparameters that can impact the performance of the system. The parameters we want to optimize are the:<\/p>\n<ol class=\"org-ol\">\n<li>Size of the DeepWalk vectors<\/li>\n<li>Window size of the skip-gram<\/li>\n<li>Depth of the walks<\/li>\n<li>Number of iterations per concept, and the<\/li>\n<li>Angle threshold<\/li>\n<\/ol>\n<p>We then perform a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hyperparameter_optimization#Grid_search\">grid search<\/a> to find the hyperparameter values that optimize each of the metrics.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">svm-grid-search<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>graph &amp; <span style=\"color: #a6e22e;\">{<\/span><span style=\"color: #ae81ff;\">:keys<\/span> <span style=\"color: #e6db74;\">[<\/span>grid-parameters<span style=\"color: #e6db74;\">]<\/span>\n            <span style=\"color: #ae81ff;\">:or<\/span> <span style=\"color: #e6db74;\">{<\/span>grid-parameters <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> <span style=\"color: #f92672;\">[<\/span>10 15<span style=\"color: #f92672;\">]<\/span>\n                                  <span style=\"color: 
#ae81ff;\">:vector-size<\/span> <span style=\"color: #f92672;\">[<\/span>3 16 64 128 256 512<span style=\"color: #f92672;\">]<\/span>\n                                  <span style=\"color: #ae81ff;\">:walk-length<\/span> <span style=\"color: #f92672;\">[<\/span>5 10 15<span style=\"color: #f92672;\">]<\/span>\n                                  <span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> <span style=\"color: #f92672;\">[<\/span>32 64 128 256<span style=\"color: #f92672;\">]<\/span>\n                                  <span style=\"color: #ae81ff;\">:angle<\/span> <span style=\"color: #f92672;\">[<\/span>30 60 90 120<span style=\"color: #f92672;\">]<\/span>\n                                  <span style=\"color: #ae81ff;\">:selection-metrics<\/span> <span style=\"color: #f92672;\">[<\/span><span style=\"color: #ae81ff;\">:precision<\/span> <span style=\"color: #ae81ff;\">:accuracy<\/span> <span style=\"color: #ae81ff;\">:recall<\/span> <span style=\"color: #ae81ff;\">:f1<\/span><span style=\"color: #f92672;\">]<\/span><span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">}<\/span><span style=\"color: #a6e22e;\">}<\/span><span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>best <span style=\"color: #e6db74;\">(<\/span>atom <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> <span style=\"color: #f92672;\">(<\/span><span style=\"color: #ae81ff;\">:selection-metrics<\/span> grid-parameters<span style=\"color: #f92672;\">)<\/span>\n                        <span style=\"color: #f92672;\">(<\/span>map <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #66d9ef;\">[<\/span>selection-metric<span style=\"color: #66d9ef;\">]<\/span>\n                               <span style=\"color: #66d9ef;\">{<\/span>selection-metric <span 
style=\"color: #a6e22e;\">{<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> <span style=\"color: #ae81ff;\">nil<\/span>\n                                                  <span style=\"color: #ae81ff;\">:vector-size<\/span> <span style=\"color: #ae81ff;\">nil<\/span>\n                                                  <span style=\"color: #ae81ff;\">:walk-length<\/span> <span style=\"color: #ae81ff;\">nil<\/span>\n                                                  <span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> <span style=\"color: #ae81ff;\">nil<\/span>\n                                                  <span style=\"color: #ae81ff;\">:angle<\/span> <span style=\"color: #ae81ff;\">nil<\/span>\n                                                  <span style=\"color: #ae81ff;\">:score<\/span> 0.0<span style=\"color: #a6e22e;\">}<\/span><span style=\"color: #66d9ef;\">}<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n                        <span style=\"color: #f92672;\">(<\/span>apply merge<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n        parameters grid-parameters<span style=\"color: #a6e22e;\">]<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #e6db74;\">[<\/span>window-size <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> parameters<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">]<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #fd971f;\">[<\/span>vector-size <span style=\"color: #f92672;\">(<\/span><span style=\"color: #ae81ff;\">:vector-size<\/span> parameters<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span 
style=\"color: #f92672;\">let<\/span> <span style=\"color: #f92672;\">[<\/span>deep-walk <span style=\"color: #ae81ff;\">(<\/span>create-deep-walk graph \n                                          <span style=\"color: #ae81ff;\">:window-size<\/span> window-size\n                                          <span style=\"color: #ae81ff;\">:vector-size<\/span> vector-size\n                                          <span style=\"color: #ae81ff;\">:learning-rate<\/span> 0.025<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">]<\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #ae81ff;\">[<\/span>walk-length <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #ae81ff;\">:walk-length<\/span> parameters<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">]<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #66d9ef;\">[<\/span>walks-per-vertex <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> parameters<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span>train deep-walk graph walk-length walks-per-vertex<span style=\"color: #66d9ef;\">)<\/span>\n\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #a6e22e;\">[<\/span>angle <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #ae81ff;\">:angle<\/span> parameters<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #ae81ff;\">[<\/span>results <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">binding<\/span> <span style=\"color: 
#a6e22e;\">[<\/span>vertex-vectors <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">.getVertexVectors<\/span> <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">.lookupTable<\/span> deep-walk<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n                                <span style=\"color: #a6e22e;\">(<\/span>evaluate-disambiguation-model <span style=\"color: #e6db74;\">\"resources\/disambiguation-gold.standard.csv\"<\/span>\n                                                               <span style=\"color: #ae81ff;\">:max-angle<\/span> angle<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">]<\/span>\n\n                  <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #66d9ef;\">[<\/span>selection-metric <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #ae81ff;\">:selection-metrics<\/span> parameters<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n                    <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>score <span style=\"color: #e6db74;\">(<\/span>get results selection-metric<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n                      \n                      <span style=\"color: #a6e22e;\">(<\/span>println<span style=\"color: #a6e22e;\">)<\/span>\n                      <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"Window size: \"<\/span> window-size<span style=\"color: #a6e22e;\">)<\/span>\n                      <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"Vector size: \"<\/span> vector-size<span style=\"color: #a6e22e;\">)<\/span>\n                      <span 
style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"Walk length: \"<\/span> walk-length<span style=\"color: #a6e22e;\">)<\/span>\n                      <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"Walks per vertex: \"<\/span> walks-per-vertex<span style=\"color: #a6e22e;\">)<\/span>\n                      <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"Angle: \"<\/span> angle<span style=\"color: #a6e22e;\">)<\/span>\n                      <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"Score (\"<\/span>selection-metric<span style=\"color: #e6db74;\">\"): \"<\/span> score<span style=\"color: #a6e22e;\">)<\/span>\n                      <span style=\"color: #a6e22e;\">(<\/span>println<span style=\"color: #a6e22e;\">)<\/span>\n                      <span style=\"color: #a6e22e;\">(<\/span>println<span style=\"color: #a6e22e;\">)<\/span>\n\n                      <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #e6db74;\">(<\/span>&gt; score <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:score<\/span> <span style=\"color: #f92672;\">(<\/span>get @best selection-metric<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n                        <span style=\"color: #e6db74;\">(<\/span>reset! 
best <span style=\"color: #fd971f;\">(<\/span>assoc-in @best <span style=\"color: #f92672;\">[<\/span>selection-metric<span style=\"color: #f92672;\">]<\/span> <span style=\"color: #f92672;\">{<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> window-size\n                                                                         <span style=\"color: #ae81ff;\">:vector-size<\/span> vector-size\n                                                                         <span style=\"color: #ae81ff;\">:walk-length<\/span> walk-length\n                                                                         <span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> walks-per-vertex\n                                                                         <span style=\"color: #ae81ff;\">:angle<\/span> angle\n                                                                         <span style=\"color: #ae81ff;\">:score<\/span> score<span style=\"color: #f92672;\">}<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n  @best<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>svm-grid-search graph <span style=\"color: #ae81ff;\">:grid-parameters<\/span> <span style=\"color: #66d9ef;\">{<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>10 15 20<span style=\"color: #a6e22e;\">]<\/span>\n                                      
   <span style=\"color: #ae81ff;\">:vector-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>64 128 256<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walk-length<\/span> <span style=\"color: #a6e22e;\">[<\/span>5 10 15<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> <span style=\"color: #a6e22e;\">[<\/span>32 64 128<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:angle<\/span> <span style=\"color: #a6e22e;\">[<\/span>60 90 120<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:selection-metrics<\/span> <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #ae81ff;\">:precision<\/span> <span style=\"color: #ae81ff;\">:accuracy<\/span> <span style=\"color: #ae81ff;\">:recall<\/span> <span style=\"color: #ae81ff;\">:f1<\/span><span style=\"color: #a6e22e;\">]<\/span><span style=\"color: #66d9ef;\">}<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">{:precision\n {:window-size 10,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 60,\n  :score 0.6812866},\n :accuracy\n {:window-size 10,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.676259},\n :recall\n {:window-size 10,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.99640286},\n :f1\n {:window-size 10,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.80406386}}\n<\/pre>\n<p>For the same <code>first-sense-context<\/code> disambiguation algorithm, once the key hyperparameters of the pipeline are optimized, we end up with an <code>F1<\/code> score of <code>0.8040<\/code>, which is an increase of <code>46.72%<\/code> over the 
baseline. <code>Recall<\/code> has now increased by <code>99.52%<\/code>, <code>accuracy<\/code> increased by <code>60.73%<\/code> and <code>precision<\/code> increased by <code>12.04%<\/code>.<\/p>\n<p>Keep in mind, however, that many of the hyperparameters we are trying to optimize with this grid search belong to the DeepWalk algorithm. Because of the nature of the algorithm (when the model is trained, the paths are randomly created and the learning starts with random seeds) we can&#8217;t reproduce exactly the same results every time we re-create a <code>DeepWalk<\/code> instance. However, the results tend to be similar, with a difference of <code>+\/- 0.02<\/code> for most of the metrics.<\/p>\n<p>Depending on the task at hand, we don&#8217;t necessarily want to keep the hyperparameters of the best score. If model creation and execution speed is also a governing consideration, we may want to sacrifice <code>0.03<\/code> of the <code>F1<\/code> score to keep <code>:window-size<\/code>, <code>:vector-size<\/code>, <code>:walk-length<\/code> and <code>:walks-per-vertex<\/code> as small as possible.<\/p>\n<p>But in any case, these numbers indicate the kind of performance we may expect from this disambiguation process.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">deep-walk<\/span> <span style=\"color: #66d9ef;\">(<\/span>create-deep-walk graph \n                                 <span style=\"color: #ae81ff;\">:window-size<\/span> 10\n                                 <span style=\"color: #ae81ff;\">:vector-size<\/span> 128\n                                 <span style=\"color: #ae81ff;\">:learning-rate<\/span> 0.025<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span>train deep-walk graph 10 128<span style=\"color: 
#ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">binding<\/span> <span style=\"color: #66d9ef;\">[<\/span>vertex-vectors <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">.getVertexVectors<\/span> <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">.lookupTable<\/span> deep-walk<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>evaluate-disambiguation-model <span style=\"color: #e6db74;\">\"resources\/disambiguation-gold.standard.csv\"<\/span>\n                                 <span style=\"color: #ae81ff;\">:max-angle<\/span> 150<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">True positive:  229\nfalse positive:  123\nTrue negative:  5\nFalse negative:  1\n\nPrecision:  0.6505682\nRecall:  0.9956522\nAccuracy:  0.65363127\nF1:  0.7869416\n<\/pre>\n<\/div>\n<\/div>\n<div id=\"outline-container-org7b637fc\" class=\"outline-3\">\n<h3 id=\"org7b637fc\">Different Sliding Window Strategies<\/h3>\n<div id=\"text-org7b637fc\" class=\"outline-text-3\">\n<p>Another part of this disambiguation process that has an impact on results is how the &#8220;context&#8221; of a word is being created. There are multiple ways to define a context, such as extending the window size or letting window size extend over sentence boundaries. 
For clarity&#8217;s sake, we initially defined the context <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_774d0389f3a45245568f0233190a445a7ad65f53.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3576\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_774d0389f3a45245568f0233190a445a7ad65f53.png\" alt=\"\" width=\"12\" height=\"14\" \/><\/a> as:<\/p>\n<p><a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_73e561775d087190d01c276973ae7b6941ab72e7.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3578\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_73e561775d087190d01c276973ae7b6941ab72e7.png\" alt=\"\" width=\"343\" height=\"256\" srcset=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_73e561775d087190d01c276973ae7b6941ab72e7.png 343w, https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_73e561775d087190d01c276973ae7b6941ab72e7-300x224.png 300w\" sizes=\"auto, (max-width: 343px) 100vw, 343px\" \/><\/a><\/p>\n<p>One of the issues with this definition of &#8220;context&#8221; is that, so far, <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_70280f6b9061479517c21b3204e4d4aaaccf740f.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3591\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_70280f6b9061479517c21b3204e4d4aaaccf740f.png\" alt=\"\" width=\"9\" height=\"14\" \/><\/a> (the senses for each word) is always <code>1<\/code>, which means that the context is always defined with the first sense of a word, whatever it is. 
This is why we called this method the <code>first-sense-context<\/code> disambiguation. Biasing the context toward the first sense ignores the other senses that could have been summed into the context vector. However, this initial definition is simple and, as we saw above, it still produces adequate results. What other strategies could we try?<\/p>\n<\/div>\n<div id=\"outline-container-org80d87d9\" class=\"outline-4\">\n<h4 id=\"org80d87d9\">Ignore Target Word Senses In Context<\/h4>\n<div id=\"text-org80d87d9\" class=\"outline-text-4\">\n<p>The first test we can run is to remove the target word&#8217;s sense (the word we are trying to disambiguate) from the context. That way, we make sure that we don&#8217;t bias the context toward that sense. The context would be defined as:<\/p>\n<p><a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_727c862fa4ba91e94a62e87da99e9a4131c26a83.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3592\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_727c862fa4ba91e94a62e87da99e9a4131c26a83.png\" alt=\"\" width=\"389\" height=\"212\" srcset=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_727c862fa4ba91e94a62e87da99e9a4131c26a83.png 389w, https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_727c862fa4ba91e94a62e87da99e9a4131c26a83-300x163.png 300w\" sizes=\"auto, (max-width: 389px) 100vw, 389px\" \/><\/a><\/p>\n<p>We are still picking the first sense of each word that belongs to the context, but we are not considering that sense for the target word when we calculate the context vector. Let&#8217;s see how this new definition of the context affects the results. 
To test this new context, we have to change the <code>(get-context)<\/code> function to reflect the new mechanism, then re-test and re-optimize the hyperparameters.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">get-context<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>i tags<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #a6e22e;\">(<\/span>= <span style=\"color: #e6db74;\">(<\/span>count tags<span style=\"color: #e6db74;\">)<\/span> 1<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">There is only one tag in the sentence<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>get-first-sense-vector <span style=\"color: #e6db74;\">(<\/span>last <span style=\"color: #fd971f;\">(<\/span>first tags<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #e6db74;\">(<\/span>= <span style=\"color: #fd971f;\">(<\/span>count tags<span style=\"color: #fd971f;\">)<\/span> 2<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">There are only 2 tags in the sentence<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #fd971f;\">(<\/span>= i 0<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>sum-vectors <span style=\"color: #f92672;\">(<\/span>get-first-sense-vector <span style=\"color: #ae81ff;\">(<\/span>last <span style=\"color: 
#66d9ef;\">(<\/span>second tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>sum-vectors <span style=\"color: #f92672;\">(<\/span>get-first-sense-vector <span style=\"color: #ae81ff;\">(<\/span>last <span style=\"color: #66d9ef;\">(<\/span>first tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Target concept is the first one of the sentence<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #fd971f;\">(<\/span>= i 0<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>sum-vectors <span style=\"color: #f92672;\">(<\/span>get-first-sense-vector <span style=\"color: #ae81ff;\">(<\/span>last <span style=\"color: #66d9ef;\">(<\/span>second tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n                     <span style=\"color: #f92672;\">(<\/span>get-first-sense-vector <span style=\"color: #ae81ff;\">(<\/span>last <span style=\"color: #66d9ef;\">(<\/span>nth tags <span style=\"color: #a6e22e;\">(<\/span>+ i 2<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Target concept is the last one of the sentence<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span 
style=\"color: #f92672;\">if<\/span> <span style=\"color: #f92672;\">(<\/span>= i <span style=\"color: #ae81ff;\">(<\/span>- <span style=\"color: #66d9ef;\">(<\/span>count tags<span style=\"color: #66d9ef;\">)<\/span> 1<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n          <span style=\"color: #f92672;\">(<\/span>sum-vectors <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags <span style=\"color: #ae81ff;\">(<\/span>- i 2<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                       <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags <span style=\"color: #ae81ff;\">(<\/span>- i 1<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n          <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Target is in-between<\/span>\n          <span style=\"color: #f92672;\">(<\/span>sum-vectors <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags <span style=\"color: #ae81ff;\">(<\/span>- i 1<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                       <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>last <span style=\"color: #a6e22e;\">(<\/span>nth tags <span style=\"color: #ae81ff;\">(<\/span>+ i 1<span 
style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>Now let&#8217;s re-optimize and re-evaluate the impact of this modification on our gold standard:<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>svm-grid-search graph <span style=\"color: #ae81ff;\">:grid-parameters<\/span> <span style=\"color: #66d9ef;\">{<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>10 15 20<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:vector-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>64 128 256<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walk-length<\/span> <span style=\"color: #a6e22e;\">[<\/span>5 10 15<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> <span style=\"color: #a6e22e;\">[<\/span>32 64 128<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:angle<\/span> <span style=\"color: #a6e22e;\">[<\/span>60 90 120<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:selection-metrics<\/span> <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #ae81ff;\">:precision<\/span> <span style=\"color: #ae81ff;\">:accuracy<\/span> <span style=\"color: #ae81ff;\">:recall<\/span> <span style=\"color: #ae81ff;\">:f1<\/span><span 
style=\"color: #a6e22e;\">]<\/span><span style=\"color: #66d9ef;\">}<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">{:precision\n {:window-size 20,\n  :vector-size 128,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 120,\n  :score 0.6666667},\n :accuracy\n {:window-size 20,\n  :vector-size 128,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 120,\n  :score 0.66906476},\n :recall\n {:window-size 20,\n  :vector-size 128,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 120,\n  :score 0.99636364},\n :f1\n {:window-size 20,\n  :vector-size 128,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 120,\n  :score 0.79883385}}\n<\/pre>\n<p>With this modification to the context creation algorithm, every metric slightly dropped compared to our previous version of the <code>first-sense<\/code> disambiguation algorithm. However, this modification of the algorithm has little impact on the final results if we consider the margin of error we have for the <code>F1<\/code> score incurred by DeepWalk&#8217;s random walks and training process.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">deep-walk<\/span> <span style=\"color: #66d9ef;\">(<\/span>create-deep-walk graph \n                                 <span style=\"color: #ae81ff;\">:window-size<\/span> 15\n                                 <span style=\"color: #ae81ff;\">:vector-size<\/span> 128\n                                 <span style=\"color: #ae81ff;\">:learning-rate<\/span> 0.025<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span>train deep-walk graph 15 128<span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">binding<\/span> <span style=\"color: #66d9ef;\">[<\/span>vertex-vectors 
<span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">.getVertexVectors<\/span> <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">.lookupTable<\/span> deep-walk<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>evaluate-disambiguation-model <span style=\"color: #e6db74;\">\"resources\/disambiguation-gold.standard.csv\"<\/span>\n                                 <span style=\"color: #ae81ff;\">:max-angle<\/span> 120<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"outline-container-org87bde19\" class=\"outline-4\">\n<h4 id=\"org87bde19\">Consider Multiple Contexts<\/h4>\n<div id=\"text-org87bde19\" class=\"outline-text-4\">\n<p>Another modification we will test is to define not a single context but multiple contexts and to keep the most relevant one. So far, we only used the top sense for each word to define the context. The truth is that most of the words have multiple possible senses, which means that we are currently ignoring them. 
Let&#8217;s again revisit our initial example:<\/p>\n<div class=\"figure\">\n<p><a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/sentence.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-3567 size-full\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/sentence.png\" width=\"638\" height=\"110\" srcset=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/sentence.png 638w, https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/sentence-300x52.png 300w\" sizes=\"auto, (max-width: 638px) 100vw, 638px\" \/><\/a><\/p>\n<\/div>\n<p>If the target word is <code>buildings<\/code> then the contexts we created so far only considered the senses <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_d65b40ecc3113c0303a621e3360a8e2f51df152f.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3593\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_d65b40ecc3113c0303a621e3360a8e2f51df152f.png\" alt=\"\" width=\"84\" height=\"20\" \/><\/a>, <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7986902a6150fca4666ab80489c5e19f57d3c27c.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3594\" src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_7986902a6150fca4666ab80489c5e19f57d3c27c.png\" alt=\"\" width=\"69\" height=\"20\" \/><\/a> and <a href=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_77b1bb8b1889bda6d392cbafb767fa4f6c24fc97.png\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-3595\" 
src=\"https:\/\/fgiasson.com\/blog\/wp-content\/uploads\/2017\/01\/kbpedia-concepts-disambiguation_mkb_77b1bb8b1889bda6d392cbafb767fa4f6c24fc97.png\" alt=\"\" width=\"84\" height=\"20\" \/><\/a> but what about the other two? What we want to do here is to create multiple contexts and then compare each of the senses of the target word against each of these contexts, by checking the angle between the sense vector and the context vector. We then keep the sense that is most closely related to one of the context vectors.<\/p>\n<p>The contexts are simply created out of all the possible combinations between each of the senses of each word within the window. With the example above, the possible combinations are:<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">for<\/span> <span style=\"color: #66d9ef;\">[<\/span>x <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #e6db74;\">[<\/span>1 -9 -2<span style=\"color: #e6db74;\">]<\/span> <span style=\"color: #e6db74;\">[<\/span>2 3 -7<span style=\"color: #e6db74;\">]<\/span><span style=\"color: #a6e22e;\">]<\/span> \n      y <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #e6db74;\">[<\/span>-8 1 1<span style=\"color: #e6db74;\">]<\/span><span style=\"color: #a6e22e;\">]<\/span>\n      z <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #e6db74;\">[<\/span>-1 -9 0<span style=\"color: #e6db74;\">]<\/span> <span style=\"color: #e6db74;\">[<\/span>0 3 -1<span style=\"color: #e6db74;\">]<\/span><span style=\"color: #a6e22e;\">]<\/span><span style=\"color: #66d9ef;\">]<\/span> \n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #66d9ef;\">clojure.pprint<\/span><span style=\"color: #66d9ef;\">\/<\/span>pprint <span style=\"color: #a6e22e;\">(<\/span>vector x y z<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span 
style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">[[1 -9 -2] [-8 1 1] [-1 -9 0]]\n[[1 -9 -2] [-8 1 1] [0 3 -1]]\n[[2 3 -7] [-8 1 1] [-1 -9 0]]\n[[2 3 -7] [-8 1 1] [0 3 -1]]\n<\/pre>\n<p>To compare this larger number of combinations, we need to modify the evaluation procedure. We have to modify how the contexts are created and how all possible contexts are generated. Then we have to modify how we predict the label for an example by using the most relevant context.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">disambiguate<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>line<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>tags <span style=\"color: #e6db74;\">(<\/span>re-seq #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: 
bold;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> line<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">loop<\/span> <span style=\"color: #e6db74;\">[<\/span>i 0\n           tag <span style=\"color: #fd971f;\">(<\/span>first tags<span style=\"color: #fd971f;\">)<\/span>\n           rtags <span style=\"color: #fd971f;\">(<\/span>rest tags<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">]<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>word <span style=\"color: #f92672;\">(<\/span>second tag<span style=\"color: #f92672;\">)<\/span>\n            concepts <span style=\"color: #f92672;\">(<\/span>last tag<span style=\"color: #f92672;\">)<\/span>\n            concept <span style=\"color: #f92672;\">(<\/span>get-tag-concept concepts<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println word <span style=\"color: #e6db74;\">\" --&gt; \"<\/span> concepts<span style=\"color: #fd971f;\">)<\/span>\n\n        <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Disambiguate concepts<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #f92672;\">[<\/span>ambiguous-concepts <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split <span style=\"color: #66d9ef;\">(<\/span>first <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split concepts #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span><span 
style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span> #<span style=\"color: #e6db74;\">\" \"<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">]<\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #ae81ff;\">[<\/span>ambiguous-concept ambiguous-concepts<span style=\"color: #ae81ff;\">]<\/span>\n            <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">(println \"a-vector: \" (get-vector ambiguous-concept))<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #66d9ef;\">[<\/span>unambiguous <span style=\"color: #a6e22e;\">(<\/span>first <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> <span style=\"color: #66d9ef;\">(<\/span>get-contexts i tags<span style=\"color: #66d9ef;\">)<\/span>\n                                          <span style=\"color: #66d9ef;\">(<\/span>map <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #e6db74;\">[<\/span>context<span style=\"color: #e6db74;\">]<\/span>\n                                                 <span style=\"color: #e6db74;\">{<\/span><span style=\"color: #fd971f;\">(<\/span>angle <span style=\"color: #f92672;\">(<\/span>get-vector ambiguous-concept<span style=\"color: #f92672;\">)<\/span> context<span style=\"color: #fd971f;\">)<\/span> ambiguous-concept<span style=\"color: #e6db74;\">}<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                                          <span style=\"color: #66d9ef;\">(<\/span>apply merge<span style=\"color: #66d9ef;\">)<\/span>\n                                          sort<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: 
#66d9ef;\">]<\/span>\n\n              <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">(println \"b-vector: \" (get-context i tags))                <\/span>\n              <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: #e6db74;\">\"Unambiguous concept: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>second unambiguous<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n              <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: #e6db74;\">\"angle: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>first unambiguous<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n              <span style=\"color: #66d9ef;\">(<\/span>println<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n\n\n        <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">when-not<\/span> <span style=\"color: #fd971f;\">(<\/span>empty? 
rtags<span style=\"color: #fd971f;\">)<\/span>\n          <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">recur<\/span> <span style=\"color: #f92672;\">(<\/span>inc i<span style=\"color: #f92672;\">)<\/span>\n                 <span style=\"color: #f92672;\">(<\/span>first rtags<span style=\"color: #f92672;\">)<\/span>\n                 <span style=\"color: #f92672;\">(<\/span>rest rtags<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>require '<span style=\"color: #66d9ef;\">[<\/span><span style=\"color: #66d9ef;\">clojure.math.combinatorics<\/span> <span style=\"color: #ae81ff;\">:as<\/span> combo<span style=\"color: #66d9ef;\">]<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">get-word-senses<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>i tags<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>senses <span style=\"color: #e6db74;\">(<\/span>last <span style=\"color: #fd971f;\">(<\/span>nth tags i<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #e6db74;\">(<\/span>&gt; <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">.indexOf<\/span> senses <span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span><span 
style=\"color: #fd971f;\">)<\/span> -1<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split <span style=\"color: #fd971f;\">(<\/span>second <span style=\"color: #f92672;\">(<\/span>re-find #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span> senses<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span> #<span style=\"color: #e6db74;\">\" \"<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">No :: marker: split the whole string so we still return a list of senses<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split senses #<span style=\"color: #e6db74;\">\" \"<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">get-contexts<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>i tags<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #a6e22e;\">(<\/span>= <span style=\"color: #e6db74;\">(<\/span>count tags<span style=\"color: #e6db74;\">)<\/span> 1<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">There is only one tag in the sentence: wrap the single context in a vector<\/span>\n    <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #e6db74;\">(<\/span>get-first-sense-vector <span style=\"color: #fd971f;\">(<\/span>last <span style=\"color: #f92672;\">(<\/span>first tags<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #e6db74;\">(<\/span>= <span style=\"color: 
#fd971f;\">(<\/span>count tags<span style=\"color: #fd971f;\">)<\/span> 2<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">There are only 2 tags in the sentence<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>contexts <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">combo<\/span><span style=\"color: #66d9ef;\">\/<\/span>cartesian-product <span style=\"color: #ae81ff;\">(<\/span>get-word-senses 0 tags<span style=\"color: #ae81ff;\">)<\/span>\n                                              <span style=\"color: #ae81ff;\">(<\/span>get-word-senses 1 tags<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> contexts\n             <span style=\"color: #f92672;\">(<\/span>mapv <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #66d9ef;\">[<\/span>context<span style=\"color: #66d9ef;\">]<\/span>\n                     <span style=\"color: #66d9ef;\">(<\/span>sum-vectors <span style=\"color: #a6e22e;\">(<\/span>get-first-sense-vector <span style=\"color: #ae81ff;\">(<\/span>first context<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                                  <span style=\"color: #a6e22e;\">(<\/span>get-first-sense-vector <span style=\"color: #ae81ff;\">(<\/span>second context<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: 
#75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Target concept is the first one of the sentence<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #fd971f;\">(<\/span>= i 0<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #f92672;\">[<\/span>contexts <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #66d9ef;\">combo<\/span><span style=\"color: #66d9ef;\">\/<\/span>cartesian-product <span style=\"color: #66d9ef;\">(<\/span>get-word-senses 0 tags<span style=\"color: #66d9ef;\">)<\/span>\n                                                <span style=\"color: #66d9ef;\">(<\/span>get-word-senses 1 tags<span style=\"color: #66d9ef;\">)<\/span>\n                                                <span style=\"color: #66d9ef;\">(<\/span>get-word-senses 2 tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">]<\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> contexts\n               <span style=\"color: #ae81ff;\">(<\/span>mapv <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #a6e22e;\">[<\/span>context<span style=\"color: #a6e22e;\">]<\/span>\n                       <span style=\"color: #a6e22e;\">(<\/span>sum-vectors <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>first context<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                                    <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>second context<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                     
               <span style=\"color: #ae81ff;\">(<\/span>get-first-sense-vector <span style=\"color: #66d9ef;\">(<\/span>nth context 2<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Target concept is the last one of the sentence<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #f92672;\">(<\/span>= i <span style=\"color: #ae81ff;\">(<\/span>- <span style=\"color: #66d9ef;\">(<\/span>count tags<span style=\"color: #66d9ef;\">)<\/span> 1<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #ae81ff;\">[<\/span>contexts <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #66d9ef;\">combo<\/span><span style=\"color: #66d9ef;\">\/<\/span>cartesian-product <span style=\"color: #a6e22e;\">(<\/span>get-word-senses <span style=\"color: #ae81ff;\">(<\/span>- i 2<span style=\"color: #ae81ff;\">)<\/span> tags<span style=\"color: #a6e22e;\">)<\/span>\n                                                  <span style=\"color: #a6e22e;\">(<\/span>get-word-senses <span style=\"color: #ae81ff;\">(<\/span>- i 1<span style=\"color: #ae81ff;\">)<\/span> tags<span style=\"color: #a6e22e;\">)<\/span>\n                                                  <span style=\"color: #a6e22e;\">(<\/span>get-word-senses i tags<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">]<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: 
#f92672;\">-&gt;&gt;<\/span> contexts\n                 <span style=\"color: #66d9ef;\">(<\/span>mapv <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #ae81ff;\">[<\/span>context<span style=\"color: #ae81ff;\">]<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>sum-vectors <span style=\"color: #66d9ef;\">(<\/span>get-first-sense-vector <span style=\"color: #a6e22e;\">(<\/span>first context<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                                      <span style=\"color: #66d9ef;\">(<\/span>get-first-sense-vector <span style=\"color: #a6e22e;\">(<\/span>second context<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                                      <span style=\"color: #66d9ef;\">(<\/span>get-first-sense-vector <span style=\"color: #a6e22e;\">(<\/span>nth context 2<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n          <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Target is in-between <\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #ae81ff;\">[<\/span>contexts <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #66d9ef;\">combo<\/span><span style=\"color: #66d9ef;\">\/<\/span>cartesian-product <span style=\"color: #a6e22e;\">(<\/span>get-word-senses <span style=\"color: #ae81ff;\">(<\/span>- i 1<span style=\"color: #ae81ff;\">)<\/span> tags<span style=\"color: #a6e22e;\">)<\/span>\n                                                  <span style=\"color: #a6e22e;\">(<\/span>get-word-senses i 
tags<span style=\"color: #a6e22e;\">)<\/span>\n                                                  <span style=\"color: #a6e22e;\">(<\/span>get-word-senses <span style=\"color: #ae81ff;\">(<\/span>+ i 1<span style=\"color: #ae81ff;\">)<\/span> tags<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">]<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> contexts\n                 <span style=\"color: #66d9ef;\">(<\/span>mapv <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #ae81ff;\">[<\/span>context<span style=\"color: #ae81ff;\">]<\/span>\n                         <span style=\"color: #ae81ff;\">(<\/span>sum-vectors <span style=\"color: #66d9ef;\">(<\/span>get-first-sense-vector <span style=\"color: #a6e22e;\">(<\/span>first context<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                                      <span style=\"color: #66d9ef;\">(<\/span>get-first-sense-vector <span style=\"color: #a6e22e;\">(<\/span>second context<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                                      <span style=\"color: #66d9ef;\">(<\/span>get-first-sense-vector <span style=\"color: #a6e22e;\">(<\/span>nth context 2<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span 
style=\"color: #a6e22e;\">predict-label<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>ambiguous-concepts i tags max-angle<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>second\n   <span style=\"color: #a6e22e;\">(<\/span>first\n    <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> ambiguous-concepts\n         <span style=\"color: #fd971f;\">(<\/span>map <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #ae81ff;\">[<\/span>ambiguous-concept<span style=\"color: #ae81ff;\">]<\/span>\n                <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #66d9ef;\">[<\/span>unambiguous <span style=\"color: #a6e22e;\">(<\/span>first <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> <span style=\"color: #66d9ef;\">(<\/span>get-contexts i tags<span style=\"color: #66d9ef;\">)<\/span>\n                                              <span style=\"color: #66d9ef;\">(<\/span>map <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #e6db74;\">[<\/span>context<span style=\"color: #e6db74;\">]<\/span>\n                                                     <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">if-let<\/span> <span style=\"color: #fd971f;\">[<\/span>a <span style=\"color: #f92672;\">(<\/span>get-vector ambiguous-concept<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n                                                       <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">if-not<\/span> <span style=\"color: #f92672;\">(<\/span>nil? 
context<span style=\"color: #f92672;\">)<\/span>\n                                                         <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #ae81ff;\">(<\/span>&gt; <span style=\"color: #66d9ef;\">(<\/span>angle a context<span style=\"color: #66d9ef;\">)<\/span> max-angle<span style=\"color: #ae81ff;\">)<\/span>\n                                                           <span style=\"color: #ae81ff;\">{<\/span><span style=\"color: #ae81ff;\">}<\/span>\n                                                           <span style=\"color: #ae81ff;\">{<\/span><span style=\"color: #66d9ef;\">(<\/span>angle a context<span style=\"color: #66d9ef;\">)<\/span> ambiguous-concept<span style=\"color: #ae81ff;\">}<\/span><span style=\"color: #f92672;\">)<\/span>\n                                                         <span style=\"color: #f92672;\">{<\/span><span style=\"color: #f92672;\">}<\/span><span style=\"color: #fd971f;\">)<\/span>\n                                                       <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                                              <span style=\"color: #66d9ef;\">(<\/span>apply merge<span style=\"color: #66d9ef;\">)<\/span>\n                                              sort<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n                  <span style=\"color: #66d9ef;\">{<\/span><span style=\"color: #a6e22e;\">(<\/span>first unambiguous<span style=\"color: #a6e22e;\">)<\/span> <span style=\"color: #a6e22e;\">(<\/span>second unambiguous<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">}<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: 
#fd971f;\">)<\/span>\n         <span style=\"color: #fd971f;\">(<\/span>apply merge<span style=\"color: #fd971f;\">)<\/span>\n         sort<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">predict-label-angle<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>ambiguous-concepts i tags<span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>first\n   <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> ambiguous-concepts\n        <span style=\"color: #e6db74;\">(<\/span>map <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #f92672;\">[<\/span>ambiguous-concept<span style=\"color: #f92672;\">]<\/span>\n               <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #ae81ff;\">[<\/span>unambiguous <span style=\"color: #66d9ef;\">(<\/span>first <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">-&gt;&gt;<\/span> <span style=\"color: #ae81ff;\">(<\/span>get-contexts i tags<span style=\"color: #ae81ff;\">)<\/span>\n                                             <span style=\"color: #ae81ff;\">(<\/span>map <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">fn<\/span> <span style=\"color: #a6e22e;\">[<\/span>context<span style=\"color: #a6e22e;\">]<\/span>\n                                                     <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if-let<\/span> <span style=\"color: #e6db74;\">[<\/span>a <span style=\"color: #fd971f;\">(<\/span>get-vector ambiguous-concept<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">]<\/span>\n                                     
                  <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">if-not<\/span> <span style=\"color: #fd971f;\">(<\/span>nil? context<span style=\"color: #fd971f;\">)<\/span>\n                                                         <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #f92672;\">(<\/span>angle a context<span style=\"color: #f92672;\">)<\/span> ambiguous-concept<span style=\"color: #fd971f;\">}<\/span>\n                                                         <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span>\n                                                       <span style=\"color: #e6db74;\">{<\/span><span style=\"color: #e6db74;\">}<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n                                             <span style=\"color: #ae81ff;\">(<\/span>apply merge<span style=\"color: #ae81ff;\">)<\/span>\n                                             sort<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">]<\/span>\n                 <span style=\"color: #ae81ff;\">{<\/span><span style=\"color: #66d9ef;\">(<\/span>first unambiguous<span style=\"color: #66d9ef;\">)<\/span> <span style=\"color: #66d9ef;\">(<\/span>second unambiguous<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">}<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n        <span style=\"color: #e6db74;\">(<\/span>apply merge<span style=\"color: #e6db74;\">)<\/span>\n        sort<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<p>Now, using this changed disambiguation method, let&#8217;s optimize the hyperparameters of the 
pipeline to see how this method can perform compared to the baseline.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>svm-grid-search graph <span style=\"color: #ae81ff;\">:grid-parameters<\/span> <span style=\"color: #66d9ef;\">{<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>10 15 20<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:vector-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>64 128 256<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walk-length<\/span> <span style=\"color: #a6e22e;\">[<\/span>5 10 15<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> <span style=\"color: #a6e22e;\">[<\/span>32 64 128<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:angle<\/span> <span style=\"color: #a6e22e;\">[<\/span>60 90 120<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:selection-metrics<\/span> <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #ae81ff;\">:precision<\/span> <span style=\"color: #ae81ff;\">:accuracy<\/span> <span style=\"color: #ae81ff;\">:recall<\/span> <span style=\"color: #ae81ff;\">:f1<\/span><span style=\"color: #a6e22e;\">]<\/span><span style=\"color: #66d9ef;\">}<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">{:precision\n {:window-size 15,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 60,\n  :score 0.69970846},\n :accuracy\n {:window-size 15,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.676259},\n :recall\n {:window-size 15,\n  
:vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.98566306},\n :f1\n {:window-size 15,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.8029197}}\n<\/pre>\n<p>These results are nearly the same as the <code>first-sense-context<\/code> method described above. Our first intuition is that this more complicated method may have little or no impact on the overall performance. Yet, as we will see below, this is actually not the case in all situations.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">def<\/span> <span style=\"color: #fd971f;\">deep-walk<\/span> <span style=\"color: #66d9ef;\">(<\/span>create-deep-walk graph \n                                 <span style=\"color: #ae81ff;\">:window-size<\/span> 15\n                                 <span style=\"color: #ae81ff;\">:vector-size<\/span> 128\n                                 <span style=\"color: #ae81ff;\">:learning-rate<\/span> 0.025<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span>train deep-walk graph 15 128<span style=\"color: #ae81ff;\">)<\/span>\n\n<span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">binding<\/span> <span style=\"color: #66d9ef;\">[<\/span>vertex-vectors <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">.getVertexVectors<\/span> <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">.lookupTable<\/span> deep-walk<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>evaluate-disambiguation-model <span style=\"color: #e6db74;\">\"resources\/disambiguation-gold.standard.csv\"<\/span>\n                                 <span style=\"color: #ae81ff;\">:max-angle<\/span> 120<span 
style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"outline-container-org21e39d8\" class=\"outline-3\">\n<h3 id=\"org21e39d8\">Focus Evaluation Of Multiple Senses<\/h3>\n<div id=\"text-org21e39d8\" class=\"outline-text-3\">\n<p>Now that we have evaluated how the different strategies perform on the entire gold standard, we will next evaluate how they perform when we focus only on the examples that have multiple possible senses. To run this evaluation, we modify the <code>(evaluate-disambiguation-model)<\/code> function so that it predicts the disambiguated word sense only when multiple senses have been tagged for that word.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">evaluate-disambiguation-model<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>gold-standard-file &amp; <span style=\"color: #a6e22e;\">{<\/span><span style=\"color: #ae81ff;\">:keys<\/span> <span style=\"color: #e6db74;\">[<\/span>max-angle<span style=\"color: #e6db74;\">]<\/span>\n                         <span style=\"color: #ae81ff;\">:or<\/span> <span style=\"color: #e6db74;\">{<\/span>max-angle 90.0<span style=\"color: #e6db74;\">}<\/span><span style=\"color: #a6e22e;\">}<\/span><span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>sentences <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">with-open<\/span> <span style=\"color: #fd971f;\">[<\/span>in-file <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">io<\/span><span style=\"color: #66d9ef;\">\/<\/span>reader gold-standard-file<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n                    
<span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">doall<\/span>\n                     <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">csv<\/span><span style=\"color: #66d9ef;\">\/<\/span>read-csv in-file <span style=\"color: #ae81ff;\">:separator<\/span> <span style=\"color: #e6db74;\">\\tab<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n        true-positive <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        false-positive <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        true-negative <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        false-negative <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #e6db74;\">[<\/span><span style=\"color: #fd971f;\">[<\/span>sentence<span style=\"color: #fd971f;\">]<\/span> sentences<span style=\"color: #e6db74;\">]<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>tags <span style=\"color: #f92672;\">(<\/span>re-seq #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74; 
font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> sentence<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">loop<\/span> <span style=\"color: #f92672;\">[<\/span>i 0\n               tag <span style=\"color: #ae81ff;\">(<\/span>first tags<span style=\"color: #ae81ff;\">)<\/span>\n               rtags <span style=\"color: #ae81ff;\">(<\/span>rest tags<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">]<\/span>\n\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">when-not<\/span> <span style=\"color: #ae81ff;\">(<\/span>= i <span style=\"color: #66d9ef;\">(<\/span>count tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n            <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Disambiguate concepts<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #66d9ef;\">[<\/span>word <span style=\"color: #a6e22e;\">(<\/span>second tag<span style=\"color: #a6e22e;\">)<\/span>\n                  concepts <span style=\"color: #a6e22e;\">(<\/span>last tag<span style=\"color: #a6e22e;\">)<\/span>\n                  concept <span style=\"color: #a6e22e;\">(<\/span>get-tag-concept concepts<span style=\"color: #a6e22e;\">)<\/span>\n                  label <span style=\"color: 
#a6e22e;\">(<\/span>second <span style=\"color: #ae81ff;\">(<\/span>re-find #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> concepts<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                  ambiguous-concepts <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split <span style=\"color: #ae81ff;\">(<\/span>first <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split concepts #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span> #<span style=\"color: #e6db74;\">\" \"<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span>&gt; <span style=\"color: #ae81ff;\">(<\/span>count ambiguous-concepts<span style=\"color: #ae81ff;\">)<\/span> 1<span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #ae81ff;\">[<\/span>predicted-label <span style=\"color: #66d9ef;\">(<\/span>predict-label ambiguous-concepts i tags max-angle<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">]<\/span>\n                  <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #a6e22e;\">(<\/span>= label 
predicted-label<span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>empty? label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>empty? predicted-label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                    <span style=\"color: #66d9ef;\">(<\/span>swap! true-positive inc<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n                  <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>= label predicted-label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>empty? predicted-label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                    <span style=\"color: #66d9ef;\">(<\/span>swap! false-positive inc<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n                  <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #a6e22e;\">(<\/span>empty? label<span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>empty? 
predicted-label<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                    <span style=\"color: #66d9ef;\">(<\/span>swap! true-negative inc<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n                  <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>empty? label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>empty? predicted-label<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                    <span style=\"color: #66d9ef;\">(<\/span>swap! false-negative inc<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">recur<\/span> <span style=\"color: #a6e22e;\">(<\/span>inc i<span style=\"color: #a6e22e;\">)<\/span>\n                     <span style=\"color: #a6e22e;\">(<\/span>first rtags<span style=\"color: #a6e22e;\">)<\/span>\n                     <span style=\"color: #a6e22e;\">(<\/span>rest rtags<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"True positive: \"<\/span> @true-positive<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"False 
positive: \"<\/span> @false-positive<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"True negative: \"<\/span> @true-negative<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>println <span style=\"color: #e6db74;\">\"False negative: \"<\/span> @false-negative<span style=\"color: #a6e22e;\">)<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span>println<span style=\"color: #a6e22e;\">)<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #e6db74;\">(<\/span>= 0 @true-positive<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>precision 0\n            recall 0\n            accuracy 0\n            f1 0<span style=\"color: #fd971f;\">]<\/span>\n\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Precision: \"<\/span> precision<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Recall: \"<\/span> recall<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Accuracy: \"<\/span> accuracy<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"F1: \"<\/span> f1<span style=\"color: #fd971f;\">)<\/span>\n        \n        <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #ae81ff;\">:precision<\/span> precision\n         <span style=\"color: #ae81ff;\">:recall<\/span> recall\n         <span style=\"color: #ae81ff;\">:accuracy<\/span> accuracy\n         <span style=\"color: #ae81ff;\">:f1<\/span> f1<span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span>\n      
<span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>precision <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ @true-positive <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-positive<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            recall <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ @true-positive <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-negative<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            accuracy <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @true-negative<span style=\"color: #66d9ef;\">)<\/span> <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-negative @false-positive @true-negative<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            f1 <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>* 2 <span style=\"color: #66d9ef;\">(<\/span>\/ <span style=\"color: #a6e22e;\">(<\/span>* precision recall<span style=\"color: #a6e22e;\">)<\/span> <span style=\"color: #a6e22e;\">(<\/span>+ precision recall<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Precision: \"<\/span> precision<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: 
#e6db74;\">\"Recall: \"<\/span> recall<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"Accuracy: \"<\/span> accuracy<span style=\"color: #fd971f;\">)<\/span>\n        <span style=\"color: #fd971f;\">(<\/span>println <span style=\"color: #e6db74;\">\"F1: \"<\/span> f1<span style=\"color: #fd971f;\">)<\/span>\n        \n        <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #ae81ff;\">:precision<\/span> precision\n         <span style=\"color: #ae81ff;\">:recall<\/span> recall\n         <span style=\"color: #ae81ff;\">:accuracy<\/span> accuracy\n         <span style=\"color: #ae81ff;\">:f1<\/span> f1<span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<div id=\"outline-container-org4402a5c\" class=\"outline-4\">\n<h4 id=\"org4402a5c\">Evaluation At Random<\/h4>\n<div id=\"text-org4402a5c\" class=\"outline-text-4\">\n<p>To put the next evaluations into context, we first check what each metric would score if we picked the concept at random. We modify the <code>(evaluate-disambiguation-model)<\/code> function to pick the &#8220;disambiguated&#8221; concept at random, then run that evaluation one thousand times and display the average of these runs for each metric. 
This will give us the foundation to evaluate the performance of each of the strategies below.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">defn<\/span> <span style=\"color: #a6e22e;\">evaluate-disambiguation-model-random<\/span>\n  <span style=\"color: #66d9ef;\">[<\/span>gold-standard-file &amp; <span style=\"color: #a6e22e;\">{<\/span><span style=\"color: #ae81ff;\">:keys<\/span> <span style=\"color: #e6db74;\">[<\/span>max-angle<span style=\"color: #e6db74;\">]<\/span>\n                         <span style=\"color: #ae81ff;\">:or<\/span> <span style=\"color: #e6db74;\">{<\/span>max-angle 90.0<span style=\"color: #e6db74;\">}<\/span><span style=\"color: #a6e22e;\">}<\/span><span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #a6e22e;\">[<\/span>sentences <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">with-open<\/span> <span style=\"color: #fd971f;\">[<\/span>in-file <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">io<\/span><span style=\"color: #66d9ef;\">\/<\/span>reader gold-standard-file<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n                    <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #f92672;\">doall<\/span>\n                     <span style=\"color: #f92672;\">(<\/span><span style=\"color: #66d9ef;\">csv<\/span><span style=\"color: #66d9ef;\">\/<\/span>read-csv in-file <span style=\"color: #ae81ff;\">:separator<\/span> <span style=\"color: #e6db74;\">\\tab<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n        true-positive <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        false-positive <span style=\"color: 
#e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        true-negative <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span>\n        false-negative <span style=\"color: #e6db74;\">(<\/span>atom 0<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">]<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">doseq<\/span> <span style=\"color: #e6db74;\">[<\/span><span style=\"color: #fd971f;\">[<\/span>sentence<span style=\"color: #fd971f;\">]<\/span> sentences<span style=\"color: #e6db74;\">]<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>tags <span style=\"color: #f92672;\">(<\/span>re-seq #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\[<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\]<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">(<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*?<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74; font-weight: bold;\">\\<\/span><span style=\"color: #e6db74; font-weight: bold;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> sentence<span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        <span style=\"color: 
#fd971f;\">(<\/span><span style=\"color: #f92672;\">loop<\/span> <span style=\"color: #f92672;\">[<\/span>i 0\n               tag <span style=\"color: #ae81ff;\">(<\/span>first tags<span style=\"color: #ae81ff;\">)<\/span>\n               rtags <span style=\"color: #ae81ff;\">(<\/span>rest tags<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">]<\/span>\n          <span style=\"color: #f92672;\">(<\/span><span style=\"color: #f92672;\">when-not<\/span> <span style=\"color: #ae81ff;\">(<\/span>= i <span style=\"color: #66d9ef;\">(<\/span>count tags<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n            <span style=\"color: #75715e; font-style: italic;\">;; <\/span><span style=\"color: #75715e; font-style: italic;\">Disambiguate concepts<\/span>\n            <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #66d9ef;\">[<\/span>word <span style=\"color: #a6e22e;\">(<\/span>second tag<span style=\"color: #a6e22e;\">)<\/span>\n                  concepts <span style=\"color: #a6e22e;\">(<\/span>last tag<span style=\"color: #a6e22e;\">)<\/span>\n                  concept <span style=\"color: #a6e22e;\">(<\/span>get-tag-concept concepts<span style=\"color: #a6e22e;\">)<\/span>\n                  label <span style=\"color: #a6e22e;\">(<\/span>second <span style=\"color: #ae81ff;\">(<\/span>re-find #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">(<\/span><span style=\"color: #e6db74;\">.*<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #e6db74;\">\"<\/span> concepts<span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                  ambiguous-concepts <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split <span style=\"color: #ae81ff;\">(<\/span>first <span 
style=\"color: #66d9ef;\">(<\/span><span style=\"color: #66d9ef;\">string<\/span><span style=\"color: #66d9ef;\">\/<\/span>split concepts #<span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #e6db74;\">::<\/span><span style=\"color: #e6db74;\">\"<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span> #<span style=\"color: #e6db74;\">\" \"<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #a6e22e;\">(<\/span>&gt; <span style=\"color: #ae81ff;\">(<\/span>count ambiguous-concepts<span style=\"color: #ae81ff;\">)<\/span> 1<span style=\"color: #a6e22e;\">)<\/span>\n                <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #ae81ff;\">[<\/span>predicted-label <span style=\"color: #66d9ef;\">(<\/span>rand-nth <span style=\"color: #a6e22e;\">(<\/span>into ambiguous-concepts <span style=\"color: #e6db74;\">[<\/span><span style=\"color: #ae81ff;\">nil<\/span><span style=\"color: #e6db74;\">]<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">]<\/span>\n                  <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #a6e22e;\">(<\/span>= label predicted-label<span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>empty? label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>empty? 
predicted-label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                    <span style=\"color: #66d9ef;\">(<\/span>swap! true-positive inc<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n                  <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>= label predicted-label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>empty? predicted-label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                    <span style=\"color: #66d9ef;\">(<\/span>swap! false-positive inc<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n                  <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #a6e22e;\">(<\/span>empty? label<span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>empty? predicted-label<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                    <span style=\"color: #66d9ef;\">(<\/span>swap! 
true-negative inc<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n\n                  <span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">when<\/span> <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">and<\/span> <span style=\"color: #a6e22e;\">(<\/span>not <span style=\"color: #e6db74;\">(<\/span>empty? label<span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n                             <span style=\"color: #a6e22e;\">(<\/span>empty? predicted-label<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n                    <span style=\"color: #66d9ef;\">(<\/span>swap! false-negative inc<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n              <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">recur<\/span> <span style=\"color: #a6e22e;\">(<\/span>inc i<span style=\"color: #a6e22e;\">)<\/span>\n                     <span style=\"color: #a6e22e;\">(<\/span>first rtags<span style=\"color: #a6e22e;\">)<\/span>\n                     <span style=\"color: #a6e22e;\">(<\/span>rest rtags<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span>\n\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">if<\/span> <span style=\"color: #e6db74;\">(<\/span>= 0 @true-positive<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>precision 0\n            recall 0\n            accuracy 0\n            f1 0<span style=\"color: 
#fd971f;\">]<\/span>\n        <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #ae81ff;\">:precision<\/span> precision\n         <span style=\"color: #ae81ff;\">:recall<\/span> recall\n         <span style=\"color: #ae81ff;\">:accuracy<\/span> accuracy\n         <span style=\"color: #ae81ff;\">:f1<\/span> f1<span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #fd971f;\">[<\/span>precision <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ @true-positive <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-positive<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            recall <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ @true-positive <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-negative<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            accuracy <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>\/ <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @true-negative<span style=\"color: #66d9ef;\">)<\/span> <span style=\"color: #66d9ef;\">(<\/span>+ @true-positive @false-negative @false-positive @true-negative<span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span>\n            f1 <span style=\"color: #f92672;\">(<\/span>float <span style=\"color: #ae81ff;\">(<\/span>* 2 <span style=\"color: #66d9ef;\">(<\/span>\/ <span style=\"color: #a6e22e;\">(<\/span>* precision recall<span style=\"color: #a6e22e;\">)<\/span> <span style=\"color: #a6e22e;\">(<\/span>+ precision recall<span style=\"color: #a6e22e;\">)<\/span><span 
style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span><span style=\"color: #f92672;\">)<\/span><span style=\"color: #fd971f;\">]<\/span>\n        <span style=\"color: #fd971f;\">{<\/span><span style=\"color: #ae81ff;\">:precision<\/span> precision\n         <span style=\"color: #ae81ff;\">:recall<\/span> recall\n         <span style=\"color: #ae81ff;\">:accuracy<\/span> accuracy\n         <span style=\"color: #ae81ff;\">:f1<\/span> f1<span style=\"color: #fd971f;\">}<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #66d9ef;\">[<\/span>f1 <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span>\n      precision <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span>\n      recall <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span>\n      accuracy <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span>\n      nb <span style=\"color: #a6e22e;\">(<\/span>atom 0<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">]<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span><span style=\"color: #f92672;\">dotimes<\/span> <span style=\"color: #a6e22e;\">[<\/span>i 1000<span style=\"color: #a6e22e;\">]<\/span>\n    <span style=\"color: #a6e22e;\">(<\/span><span style=\"color: #f92672;\">let<\/span> <span style=\"color: #e6db74;\">[<\/span>results <span style=\"color: #fd971f;\">(<\/span>evaluate-disambiguation-model-random <span style=\"color: #e6db74;\">\"resources\/disambiguation-gold.standard.csv\"<\/span><span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">]<\/span>\n      
<span style=\"color: #e6db74;\">(<\/span>swap! nb inc<span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>swap! f1 + <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:f1<\/span> results<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>swap! precision + <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:precision<\/span> results<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>swap! recall + <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:recall<\/span> results<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span>\n      <span style=\"color: #e6db74;\">(<\/span>swap! accuracy + <span style=\"color: #fd971f;\">(<\/span><span style=\"color: #ae81ff;\">:accuracy<\/span> results<span style=\"color: #fd971f;\">)<\/span><span style=\"color: #e6db74;\">)<\/span><span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: #e6db74;\">\"Average precision: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>\/ @precision @nb<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: #e6db74;\">\"Average recall: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>\/ @recall @nb<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: #e6db74;\">\"Average accuracy: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>\/ @accuracy @nb<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span>\n  <span style=\"color: #66d9ef;\">(<\/span>println <span style=\"color: 
#e6db74;\">\"Average F1: \"<\/span> <span style=\"color: #a6e22e;\">(<\/span>\/ @f1 @nb<span style=\"color: #a6e22e;\">)<\/span><span style=\"color: #66d9ef;\">)<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">Average precision:  0.3583671303540468\nAverage recall:  0.4984975547492504\nAverage accuracy:  0.295875000089407\nAverage F1:  0.4158918097615242\n<\/pre>\n<p>This gives us a baseline <code>precision<\/code> score of <code>0.358<\/code>, a <code>recall<\/code> score of <code>0.498<\/code>, an <code>accuracy<\/code> score of <code>0.295<\/code> and an <code>F1<\/code> score of <code>0.4158<\/code>. Compared with the random baseline computed on the full gold standard, these scores show that the problem becomes much harder when we focus only on the words that have multiple possible senses.<\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-org0bc6f34\" class=\"outline-4\">\n<h4 id=\"org0bc6f34\">Evaluation Using First Sense Context<\/h4>\n<div id=\"text-org0bc6f34\" class=\"outline-text-4\">\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>svm-grid-search graph <span style=\"color: #ae81ff;\">:grid-parameters<\/span> <span style=\"color: #66d9ef;\">{<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>10 15 20<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:vector-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>64 128 256<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walk-length<\/span> <span style=\"color: #a6e22e;\">[<\/span>5 10 15<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> <span style=\"color: #a6e22e;\">[<\/span>32 64 128<span style=\"color: #a6e22e;\">]<\/span>\n                   
                      <span style=\"color: #ae81ff;\">:angle<\/span> <span style=\"color: #a6e22e;\">[<\/span>60 90 120<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:selection-metrics<\/span> <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #ae81ff;\">:precision<\/span> <span style=\"color: #ae81ff;\">:accuracy<\/span> <span style=\"color: #ae81ff;\">:recall<\/span> <span style=\"color: #ae81ff;\">:f1<\/span><span style=\"color: #a6e22e;\">]<\/span><span style=\"color: #66d9ef;\">}<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">{:precision\n {:window-size 15,\n  :vector-size 128,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 60,\n  :score 0.4397163},\n :accuracy\n {:window-size 15,\n  :vector-size 256,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 60,\n  :score 0.41875},\n :recall\n {:window-size 10,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 1.0},\n :f1\n {:window-size 15,\n  :vector-size 256,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 60,\n  :score 0.58295965}}\n<\/pre>\n<p>For the same <code>first-sense-context<\/code> disambiguation algorithm, once the key hyperparameters of the pipeline are optimized, we end up with an <code>F1<\/code> score of <code>0.5829<\/code>, which is an increase of <code>40.19%<\/code> over the baseline. 
<code>Recall<\/code> increases by <code>100.64%<\/code>, <code>accuracy<\/code> increases by <code>41.55%<\/code> and <code>precision<\/code> increases by <code>22.72%<\/code>.<\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-org8afcf4a\" class=\"outline-4\">\n<h4 id=\"org8afcf4a\">Evaluation Without Target Word&#8217;s Senses<\/h4>\n<div id=\"text-org8afcf4a\" class=\"outline-text-4\">\n<p>Now let&#8217;s re-run the procedure when we don&#8217;t focus on the target word that we want to disambiguate to create the context vector.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>svm-grid-search graph <span style=\"color: #ae81ff;\">:grid-parameters<\/span> <span style=\"color: #66d9ef;\">{<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>10 15 20<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:vector-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>64 128 256<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walk-length<\/span> <span style=\"color: #a6e22e;\">[<\/span>5 10 15<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> <span style=\"color: #a6e22e;\">[<\/span>32 64 128<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:angle<\/span> <span style=\"color: #a6e22e;\">[<\/span>60 90 120<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:selection-metrics<\/span> <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #ae81ff;\">:precision<\/span> <span style=\"color: #ae81ff;\">:accuracy<\/span> <span style=\"color: #ae81ff;\">:recall<\/span> <span style=\"color: 
#ae81ff;\">:f1<\/span><span style=\"color: #a6e22e;\">]<\/span><span style=\"color: #66d9ef;\">}<\/span><span style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">{:precision\n {:window-size 10,\n  :vector-size 128,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.38931298},\n :accuracy\n {:window-size 10,\n  :vector-size 128,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 120,\n  :score 0.38125},\n :recall\n {:window-size 10,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 120,\n  :score 1.0},\n :f1\n {:window-size 10,\n  :vector-size 128,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 120,\n  :score 0.55203617}}\n<\/pre>\n<p>In this case, the <code>F1<\/code> score increased by <code>32.76%<\/code> over the baseline. <code>Recall<\/code> increases by <code>100.64%</code> (it reaches <code>1.0<\/code>, the same score as in the previous run), <code>accuracy<\/code> increases by <code>28.87%<\/code> and <code>precision<\/code> increases by <code>8.65%<\/code>. As we saw previously, the results decreased compared to the version of the algorithm that included the senses of the target word in the calculation of the context vector; only <code>recall<\/code>, already at <code>1.0<\/code>, is unchanged. 
This option does not improve matters and should be dropped from further consideration.<\/p>\n<\/div>\n<\/div>\n<div id=\"outline-container-org40fa095\" class=\"outline-4\">\n<h4 id=\"org40fa095\">Evaluation With Multiple Contexts<\/h4>\n<div id=\"text-org40fa095\" class=\"outline-text-4\">\n<p>Finally, let&#8217;s evaluate the other modification we made, which keeps the context vector that best matches one of the senses of the target word.<\/p>\n<div class=\"org-src-container\">\n<pre class=\"src src-clojure\"><span style=\"color: #ae81ff;\">(<\/span>svm-grid-search graph <span style=\"color: #ae81ff;\">:grid-parameters<\/span> <span style=\"color: #66d9ef;\">{<\/span><span style=\"color: #ae81ff;\">:window-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>10 15 20<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:vector-size<\/span> <span style=\"color: #a6e22e;\">[<\/span>64 128 256<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walk-length<\/span> <span style=\"color: #a6e22e;\">[<\/span>5 10 15<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:walks-per-vertex<\/span> <span style=\"color: #a6e22e;\">[<\/span>32 64 128<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:angle<\/span> <span style=\"color: #a6e22e;\">[<\/span>60 90 120<span style=\"color: #a6e22e;\">]<\/span>\n                                         <span style=\"color: #ae81ff;\">:selection-metrics<\/span> <span style=\"color: #a6e22e;\">[<\/span><span style=\"color: #ae81ff;\">:precision<\/span> <span style=\"color: #ae81ff;\">:accuracy<\/span> <span style=\"color: #ae81ff;\">:recall<\/span> <span style=\"color: #ae81ff;\">:f1<\/span><span style=\"color: #a6e22e;\">]<\/span><span style=\"color: #66d9ef;\">}<\/span><span 
style=\"color: #ae81ff;\">)<\/span>\n<\/pre>\n<\/div>\n<pre class=\"example\">{:precision\n {:window-size 15,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 60,\n  :score 0.4566929},\n :accuracy\n {:window-size 15,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.44375},\n :recall\n {:window-size 15,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.9583333},\n :f1\n {:window-size 15,\n  :vector-size 64,\n  :walk-length 5,\n  :walks-per-vertex 32,\n  :angle 90,\n  :score 0.6079295}}\n<\/pre>\n<p>In this case, the <code>F1<\/code> score increased by <code>46.2%<\/code> over the baseline. <code>Recall<\/code> increases by about <code>92%<\/code> (from <code>0.498<\/code> to <code>0.958<\/code>), <code>accuracy<\/code> increases by <code>50%<\/code>, and <code>precision<\/code> increases by <code>27.44%<\/code>. Though we previously found no apparent improvement from this method over the <code>first-sense-context<\/code> algorithm, when we look only at the words that have multiple senses, every metric improves except <code>recall<\/code>, which is slightly lower. This suggests that when multiple words have multiple possible senses, this method should outperform the <code>first-sense-context<\/code> one.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div id=\"outline-container-org75d0d0a\" class=\"outline-2\">\n<h2 id=\"org75d0d0a\">Conclusion<\/h2>\n<div id=\"text-org75d0d0a\" class=\"outline-text-2\">\n<p>Disambiguating tagged concepts from a knowledge graph such as KBpedia is not an easy task. 
However, if a coherent and consistent knowledge graph such as KBpedia is used as the starting point, along with existing machine learning algorithms such as DeepWalk, it is possible to obtain good disambiguation performance in a short period of time without the need to create extensive training sets.<\/p>\n<p>With the tests we performed in this article, we gained insight into the kinds of things that may strongly influence the disambiguation process. We also learned how different strategies could be tested to get the best results for a specific disambiguation task.<\/p>\n<p>Results from these disambiguation processes can easily become the input into other tasks such as document classification. Disambiguated tagged concepts should be a better basis for classifying the documents. If the concepts are not disambiguated, then much noise is added, leading to lower performance of the classification task.<\/p>\n<p>There are two main advantages of using such a disambiguation system. The first advantage is that we have no need to create any kind of training set, which would obviously be quite expensive to create considering that hundreds of contexts for each concept within the knowledge graph would need to be disambiguated manually. With a knowledge graph of the size of KBpedia, that would mean a training set of about 3.8 million examples. But since the training takes place on the structure of the knowledge graph that already exists, there is no need to create such a training set. Also, different parts of the knowledge graph could be used to create different kinds of knowledge structures (for example, in this article we used the sub-class-of paths within the knowledge structure, but we could have used different aspects, different relationship paths, etc.). 
Alternate knowledge structures may result in different graph embeddings that may perform better in different situations.<\/p>\n<p>The second advantage is that since we only use simple linear algebra to disambiguate the concepts, disambiguation is very fast. On average, we can disambiguate a single word in about <code>0.65<\/code> milliseconds, which means that we can disambiguate about 1,538 words per second with this method on a commodity desktop computer.<\/p>\n<p>Many more experiments could be performed to increase the performance of these methods, while retaining the ease of model creation and the disambiguation speed. New context algorithms, such as bigger sliding windows, could be tested. New graph embedding creation algorithms could be tested. Other graph walk strategies for DeepWalk, such as breadth-first and depth-first searches, could also be tried.<\/p>\n<\/div>\n<\/div>\n<div id=\"footnotes\">\n<h2 class=\"footnotes\">Footnotes<\/h2>\n<div id=\"text-footnotes\">\n<div class=\"footdef\">\n<p><sup><a id=\"fn.1\" class=\"footnum\" href=\"#fnr.1\">1<\/a><\/sup>Perozzi, B., Al-Rfou, R., &amp; Skiena, S. (2014, August). DeepWalk: Online learning of social representations. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 701-710). ACM.<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>One of the most important natural language processing tasks is to &#8220;tag&#8221; concepts in text. Tagging a concept means determining whether words or phrases in a text document matches any of the concepts that exist in some kind of a knowledge structure (such as a knowledge graph, an ontology, a taxonomy, a vocabulary, etc.). 
(BTW, [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[293,251,287],"tags":[263,252,299,300,296,289],"class_list":["post-3564","post","type-post","status-publish","format-standard","hentry","category-artificial-intelligence","category-clojure","category-cognonto","tag-ai","tag-clojure-2","tag-deeplearning4j","tag-disambiguation","tag-knowledgegraph","tag-machinelearning"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3564","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=3564"}],"version-history":[{"count":9,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3564\/revisions"}],"predecessor-version":[{"id":3602,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/3564\/revisions\/3602"}],"wp:attachment":[{"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=3564"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=3564"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=3564"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}