Knorex R&D Team Publishes Research Paper on Data-less Machine Learning Algorithm for Text Classification

We are thrilled to announce that a research paper on contextual targeting by our Research & Prototyping (R&P) team, titled “Learning from noisy out-of-domain corpus using data-less classification”, has been accepted for publication in Natural Language Engineering (NLE), a journal published by Cambridge University Press. NLE is one of the most reputable and longest-standing journals in Natural Language Processing (NLP).

If you are a data scientist or work with machine learning, you know how painstaking and tedious it is to label documents, even though labeled documents form the foundation of any proper training dataset. Text classification models often suffer from a lack of accurately labeled documents. The labeled documents that are available may also come from a different domain, so a model trained on them may perform poorly in the target domain.

Our team has invented a novel technique for building cross-domain contextual targeting models without any labeled data. The paper explains how data-less classification methods can learn from automatically selected keywords and unlabeled in-domain data. Our model achieved state-of-the-art accuracy and outperformed methods trained on thousands of manually labeled web pages.

Kudos to Yiping Jin & Phu Le from our R&P team!