Why is data clustering important?
Data clustering is vital for making sense of a large set of patent data. It also helps in discovering new business areas and understanding the competition in them. Whether you're trying to find the "whitespace" (areas that appear unpatented), understand which areas or jurisdictions competitors are working in, or gauge how active one technology area is relative to another, it is essential to clearly define the project goals and apply them to the categorization process.
Experts must create the relevant taxonomy after studying the technology area in detail and understanding the objective of the analysis, so that the results can be classified into more manageable and useful categories. The taxonomy should:
Each cluster/category in the taxonomy is weighted based on patent fields such as Title, Abstract, Claims, and Description. Depending on the objectives, users can select one or more of these fields. Normally, a patent is assigned to a cluster if the information in the selected fields properly matches that cluster.
A single patent may be assigned to several categories, since certain clusters overlap.
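The field-weighted, multi-category assignment described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the field weights, the taxonomy keywords, the threshold, and the function names are all hypothetical, and simple keyword matching stands in for whatever scoring a real tool would use.

```python
# Hypothetical weights: how much a hit in each patent field counts.
FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0, "claims": 2.0, "description": 1.0}

# Hypothetical taxonomy: each cluster is described by a few keywords.
TAXONOMY = {
    "battery_chemistry": ["lithium", "electrolyte", "cathode"],
    "thermal_management": ["cooling", "heat", "thermal"],
}

# Hypothetical minimum score for a patent to be assigned to a cluster.
THRESHOLD = 3.0

ALL_FIELDS = ("title", "abstract", "claims", "description")


def score_patent(patent, fields=ALL_FIELDS):
    """Return a weighted keyword score per cluster for the selected fields."""
    scores = {}
    for cluster, keywords in TAXONOMY.items():
        total = 0.0
        for field in fields:
            text = patent.get(field, "").lower()
            # Naive substring matching; a real system would tokenize properly.
            hits = sum(1 for kw in keywords if kw in text)
            total += FIELD_WEIGHTS[field] * hits
        scores[cluster] = total
    return scores


def assign_clusters(patent, fields=ALL_FIELDS, threshold=THRESHOLD):
    """Return every cluster whose score reaches the threshold (may be several)."""
    scores = score_patent(patent, fields)
    return sorted(c for c, s in scores.items() if s >= threshold)


patent = {
    "title": "Lithium battery pack with liquid cooling",
    "abstract": "A cathode material and a thermal plate are disclosed.",
    "claims": "",
    "description": "",
}

print(assign_clusters(patent))                        # both clusters overlap here
print(assign_clusters(patent, fields=("abstract",)))  # fewer fields, lower scores
```

Because every cluster is scored independently, a patent that touches two overlapping technology areas lands in both, and restricting the selected fields changes which assignments clear the threshold.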
Graphical visualizations that relate technology clusters (the taxonomy) to other data points, such as taxonomy-based portfolio comparison, technology cluster mapping, and trend analysis on clusters, can generate numerous insights.
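As a small illustration of the data behind a cluster trend chart, the sketch below counts patents per cluster per year. The records, the field names `year` and `clusters`, and the function name are hypothetical; the resulting counts could be passed to any charting tool.

```python
from collections import Counter

# Hypothetical categorized patents: each record carries a filing year
# and the clusters it was assigned to (possibly more than one).
records = [
    {"year": 2020, "clusters": ["battery_chemistry"]},
    {"year": 2021, "clusters": ["battery_chemistry", "thermal_management"]},
    {"year": 2021, "clusters": ["thermal_management"]},
]


def cluster_trend(records):
    """Count patents per (cluster, year) pair."""
    counts = Counter()
    for rec in records:
        for cluster in rec["clusters"]:
            counts[(cluster, rec["year"])] += 1
    return counts


trend = cluster_trend(records)
for (cluster, year), n in sorted(trend.items()):
    print(cluster, year, n)
```

A patent assigned to several clusters is counted once in each of them, so the per-cluster totals can add up to more than the number of patents.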
Traditionally, experts manually go through each patent to categorize it into the technology clusters. This process is very time-consuming and complex. Most patent analytics tools now offer clustering engines based on ML and NLP that automatically categorize patents into technology clusters. However, these engines are not transparent. Most AI-based clustering engines take a black-box approach, in which the user has no idea what is happening at the backend or how and why the engine has assigned patents to their respective categories.
XLSCOUT has developed a glass-box approach that brings explainability to patent landscapes by making the patent categorization explainable to users. The explainable AI-based patent categorization has a validated accuracy of 95%. We believe that this transparency supports the adoption of this new-age technology by giving our users more confidence.
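To make the idea of an explainable categorization concrete, here is a generic sketch of a categorizer that returns its evidence along with its decision. It is not XLSCOUT's implementation; the weights, cluster terms, and function name are hypothetical, and keyword matching is used only because it is easy to inspect.

```python
import re

# Hypothetical field weights and cluster vocabulary.
FIELD_WEIGHTS = {"title": 3.0, "abstract": 2.0}
CLUSTER_TERMS = {"thermal_management": ["cooling", "thermal"]}


def explain_assignment(patent, cluster):
    """Score a patent for one cluster and record which terms in which fields matched."""
    evidence = []
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        # Whole-word matching on lowercase alphabetic tokens.
        words = set(re.findall(r"[a-z]+", patent.get(field, "").lower()))
        for term in CLUSTER_TERMS[cluster]:
            if term in words:
                evidence.append({"field": field, "term": term, "weight": weight})
                score += weight
    return {"cluster": cluster, "score": score, "evidence": evidence}


patent = {
    "title": "Battery pack with liquid cooling",
    "abstract": "A thermal plate removes heat.",
}

result = explain_assignment(patent, "thermal_management")
print(result["score"])
for item in result["evidence"]:
    print(item["field"], item["term"], item["weight"])
```

The point of the design is that a user can trace each assignment back to the specific terms and fields that produced it, rather than receiving only a category label.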