Below are issues I’ve noticed with the uspto’s web site, apis and bulk data. They do a lot of good things but, in the spirit of tough love, here is a list of opportunities for improvement.
- Fixed: Recently reissued patents do not seem to have cpc assignments.
- The bulk cpc files on https://bulkdata.uspto.gov/ only contain cpc assignments for utility patents. Notably absent are cpc assignments for plant patents and reissued patents. This affects more than just me: the patentsview api is missing these same cpcs, since its source is the bulk file that only contains cpcs for utility patents.
- Listed here are patents that are not online at uspto.gov but are here on my website.
- Ditto for a little over 10,000 dead trademarks. They are not in the uspto’s tsdr system but are available on this site.
- Their PEDS web service has a number of what I consider flaws, as listed here.
- The patentsview web service they support (sponsor?) also has issues, as listed here.
- patft has the document_number associated with each patent but the bulk granted patent xml files do not contain this field. This makes it difficult to determine if patent applications referenced by their document number ever went on to become granted patents. The field is present in appft and is in the patent application xml.
The bulk application xml has the document number and serial number for each application.
The bulk grant xml has the patent number, the serial number, and referenced applications identified by document number. If a patent’s own document number were included in the grant xml, it would be possible to determine if a referenced application was granted without having to download the multiple gigabytes of application xml.¹
- In patft the series code is displayed with the serial number, but you cannot search using the series code. You can search by serial number, but it’s not unique.
- Plant patents issued after November 1976 are generally in the fully searchable database in patft. For some reason, there are three modern plant patents that are not fully searchable. I also found a reissued patent, design patent and utility patent that are not fully searchable. These patents don’t appear to be in the bulk xml files the uspto produces. The patentsview api uses the xml files to build its database and their database is missing these patents. In all, I found 265 patents that aren’t fully searchable in patft.
- There are 40 modern patents that are in patft but are not in the bulk xml files.
- PP6893 was filed and issued on the same day, while PP6724 was issued before its application date!
- There are 305 patents not in the bulk xml files. They are the 265 patents from #9 plus the 40 patents from #10.
- Fixed: There were 62 plant patents assigned to three cpcs which have been deleted. cpc/A01H5/0205 or cpc/A01H5/0222 or cpc/A01H5/0277
CPC Notice of Changes 501-RP0484 (A01H)
- PP1817’s second page is blank. Similarly PP1818’s first two pages are blank and PP3209’s second page is also blank. PP1015, PP1016, PP1225
¹The patentsview database is built from the roughly 2,000 bulk grant xml files that are available. It cannot determine if referenced applications went on to be granted patents since the document number field is not included in the bulk grant xml files. The grant xml files are available from 1976 on while application xml is only available from 2001.
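To illustrate the workaround described above, here is a minimal sketch of the join that is currently needed to find out whether a referenced application was granted. It assumes the application and grant records have already been parsed out of the bulk xml into simple tuples. The names `Application`, `Grant` and `build_lookup` are hypothetical, the sample values are made up, and it assumes the series code is available next to the serial number, since the serial number alone is not unique.

```python
from typing import Dict, Iterable, NamedTuple, Optional


class Application(NamedTuple):
    """One record parsed from the bulk application xml (hypothetical shape)."""
    document_number: str
    series_code: str
    serial_number: str


class Grant(NamedTuple):
    """One record parsed from the bulk grant xml (hypothetical shape)."""
    patent_number: str
    series_code: str
    serial_number: str


def build_lookup(
    applications: Iterable[Application], grants: Iterable[Grant]
) -> Dict[str, Optional[str]]:
    """Map each application's document number to its patent number, or None."""
    # Key on (series code, serial number) because serial numbers repeat
    # across series codes.
    patent_by_serial = {
        (g.series_code, g.serial_number): g.patent_number for g in grants
    }
    return {
        a.document_number: patent_by_serial.get((a.series_code, a.serial_number))
        for a in applications
    }


# Made-up sample values, for illustration only.
applications = [
    Application("20010000001", "09", "123456"),
    Application("20010000002", "09", "654321"),
    Application("20010000003", "10", "123456"),  # same serial, different series
]
grants = [Grant("7000001", "09", "123456")]

lookup = build_lookup(applications, grants)
print(lookup["20010000001"])  # 7000001
print(lookup["20010000002"])  # None
print(lookup["20010000003"])  # None
```

The point of the sketch is that the application xml has to be loaded only to translate document numbers into serial numbers. If the grant xml carried the document number, the lookup could be built from the grant records alone.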