In my previous blog post, Create a Domain Text Classifier Using Cognonto, I explained how one can use the KBpedia Knowledge Graph to automatically create positive and negative training corpuses for different machine learning tasks. I explained how SVM classifiers could be trained and used to check if an input text belongs to the defined domain or not.
This article is the first of two. In this first part, I will extend this idea to explain how the KBpedia Knowledge Graph can be used, along with other machine learning techniques, to cope with different situations and use cases. I will cover the concepts of feature selection, hyperparameter optimization, and ensemble learning (in part 2 of this series). The emphasis here is on testing and refining machine learners, rather than on the setup and configuration time that dominates other approaches.
Depending on the domain of interest, and depending on the required recall, different strategies and techniques can lead to better predictions. More often than not, multiple training corpuses, learners and hyperparameters need to be tested before arriving at an initial best prediction model. This is why I will strongly emphasize that the KBpedia Knowledge Graph and Cognonto can be used to fully automate the creation of a wide range of different training corpuses, to create models, to optimize their hyperparameters, and to evaluate those models.