Data scientists at Google Cloud recently published a paper extolling the virtues of the company's open-source algorithm, which they say can perform more extensive and accurate patent searches faster and improve patent quality.
“Today, we are excited to release a white paper that outlines a methodology to train a BERT (bidirectional encoder representation from transformers) model on over 100 million patent publications from the U.S. and other countries using open-source tooling,” said Google Patents data scientists Rob Srebrovic and Jay Yonamine.
“The paper describes how to use the trained model for a number of use cases, including how to more effectively perform prior art searching to determine the novelty of a patent application, automatically generate classification codes to assist with patent categorization, and autocomplete.”
Google released the BERT model in 2018 (paper, original blog post). The company said that it marked a major advancement in natural language processing by “dramatically outperforming existing state-of-the-art frameworks across a swath of language modeling tasks.”
Natural language processing using cloud AI is the new state of the art for many types of searches.
Google Cloud analysis of BERT vs. other search algorithms
Transformer-based approaches like BERT have supplanted other models across a number of language prediction tasks. As a result, say Google researchers, BERT and its extensions have quickly become widely adopted.
In the white paper, Srebrovic and Yonamine provide the first application of the BERT algorithm “trained exclusively on patent text, focusing primarily on the use case of synonym generation but also highlighting additional use cases for general classification and autocomplete.”
“Our hope,” conclude the researchers, “is that this can help practitioners in corporations, academia, and governmental patent offices get started with the BERT framework and apply it to new use cases and research initiatives.”
The USPTO is among those working on AI tools to conduct patent searches. WIPO lists 65 AI initiatives. Many private businesses are also developing AI search and analysis tools, including Clear Access IP.
There are also several paid patent databases that operate on deep-learning AI and natural language processing, from companies such as Questel, Derwent Innovation, Clarivate and PatSnap, which provide a comprehensive guide to portfolio analysis, document comparison and high-performance searches across a large number of countries.
With the goal of faster, more accurate patent examinations and better quality and reliability, one hopes that technology companies, large and small, will respect the improved output from AI patent search, as well as the increased prospect of non-contentious licensing.
Image source: Google Cloud; voxafrica.com