We’ve been closely monitoring the USPTO’s progress in using effective API design to provide free and open APIs for its vast store of patent information. Though it’s still early days, in general, we’re impressed. Michelle Lee, Director of the US Patent and Trademark Office (USPTO), has set out some positive goals for open data and API management and testing. The USPTO Open Data Portal, though still designated as beta, has been up and running for several months. It currently includes seven different APIs in different stages of completion, with roughly one or two more being added each month.
The seven APIs are: Cancer Moonshot Patent Data, Patent Assignment Search, PatentsView, Bulk Data Storage System (BDSS), Unified Event API, Patent Application Information Retrieval (PAIR) Bulk Data, and Patent Trial and Appeal Board (PTAB).
PatentsView is the most complete of the seven APIs. It already has a dedicated website, called USPTO Patents Website (beta), with a searchable interface and multiple settings. The PatentsView patent data is current through July 15, 2016, and, besides the website, is available through the API and bulk downloads. PatentsView was included in a White House blog post earlier this year on Open Data initiatives.
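To give a feel for what querying PatentsView through its API might look like, here is a minimal Python sketch that only builds a request URL and sends nothing over the network. The endpoint URL, the `q` and `f` parameter names, the `_gte` operator, and the field names are assumptions based on how the PatentsView query API has been documented, not details taken from this article, so check them against the current PatentsView documentation before relying on them. The helper `build_query_url` is a hypothetical name used for illustration.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; verify against the current PatentsView documentation.
BASE_URL = "https://www.patentsview.org/api/patents/query"


def build_query_url(criteria, fields, base_url=BASE_URL):
    """Return a GET URL carrying JSON search criteria and a JSON field list."""
    params = {
        # "q" (criteria) and "f" (fields to return) are assumed parameter names.
        "q": json.dumps(criteria, separators=(",", ":")),
        "f": json.dumps(fields, separators=(",", ":")),
    }
    # urlencode percent-escapes the JSON so it is safe inside a query string.
    return f"{base_url}?{urlencode(params)}"


# Hypothetical search: patents granted on or after 2016-01-01.
url = build_query_url(
    {"_gte": {"patent_date": "2016-01-01"}},
    ["patent_number", "patent_title", "patent_date"],
)
print(url)
```

Keeping URL construction separate from the HTTP call makes it easy to inspect or log a query before sending it with whatever HTTP client you prefer.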
Bulk Data Storage System (BDSS) allows the public to discover, search by date, and download patent and trademark data in bulk form. API syntax and documentation are provided. Examples include grabbing the 2.46 GB XML Patent Grant Data file, which contains the full text, images/drawings, and complex work units (tables, mathematical expressions, chemical structures, and genetic sequence data) of each patent grant issued weekly (Tuesdays) in CY2016. Another is the 3.75 GB XML Patent Application Data file, which contains the same kinds of content for each patent application (non-provisional utility and plant) published weekly (Thursdays) in CY2016.
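Because grants are issued on Tuesdays and applications are published on Thursdays, a script that searches BDSS by date needs the list of those weekly dates. The sketch below computes them for calendar year 2016 using only the Python standard library. It does not call BDSS, and the helper name `weekly_dates` is hypothetical.

```python
from datetime import date, timedelta


def weekly_dates(year, weekday):
    """Return every date in `year` falling on `weekday` (Monday=0 ... Sunday=6)."""
    day = date(year, 1, 1)
    # Advance to the first occurrence of the requested weekday in the year.
    day += timedelta(days=(weekday - day.weekday()) % 7)
    dates = []
    while day.year == year:
        dates.append(day)
        day += timedelta(days=7)
    return dates


TUESDAY, THURSDAY = 1, 3
grant_dates = weekly_dates(2016, TUESDAY)         # weekly patent grant issue dates
application_dates = weekly_dates(2016, THURSDAY)  # weekly application publication dates

print(grant_dates[0], grant_dates[-1], len(grant_dates))
print(application_dates[0], application_dates[-1], len(application_dates))
```

Each date can then be used as the search value when looking up that week's data in BDSS.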
The Unified Event API is a comprehensive listing of USPTO related events, including public events, speeches, training, and much more.
PAIR Bulk Data (PBD) allows searching and downloading of USPTO patent applications. The PBD API contains the bibliographic, published document and patent term extension data tabs in PBD from 1981 to present. It’s possible to download the entire dataset covering over 9.4 million records.
The Patent Trial and Appeal Board (PTAB) API allows access to the board’s publicly available documents data. PTAB requires no API identification or authentication keys. Currently, 7% of the public America Invents Act (AIA) documents filed prior to Aug. 31, 2016, are not available. A live search link for PTAB Bulk Data is available where you can search by Trial Number, Document Number, Trial Date and more. Currently 387,086 items are listed.
Patent Assignment Search (Beta) API is used to retrieve patent assignment information. It contains all recorded Patent Assignment information from August 1980 to the present. An XML file with search results contains a list of found documents and their categorization. The Advanced Search tab allows filtering by Application number, Assignee name, Patent number, Publication number and much more.
Cancer Moonshot Patent Data Set API is new on the USPTO site. This curated dataset consists of 269,353 patent documents (published patent applications and granted patents) spanning the 1976 to 2016 period and is intended to help identify promising R&D on the horizon in diagnostics, therapeutics, data analytics, and model biological systems.