Patents for a Particular City

Say someone asks for help getting a list of utility patents from Los Angeles. What would you do? You can’t use the USPTO’s peds api, as it does not offer a search by location. You could use the USPTO’s patft, where a search for ic/"Los Angeles" and is/CA and apt/1 would do the trick (ic/ matches the inventor’s city, is/ matches the inventor’s state, and apt/1 matches the application type of utility patents). The problem is that it only returns 50 patents at a time (29,148 patents met those conditions at the time this was written). There isn’t a download option, so your mouse finger would get quite tired from all the clicking you’d have to do.
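To put a number on all that clicking, here is a quick back-of-the-envelope calculation. Python is only being used as a calculator here; the two figures are the ones quoted above.

```python
import math

RESULTS_PER_PAGE = 50      # patft page size mentioned above
MATCHING_PATENTS = 29_148  # matches at the time this was written

# Round up, since a partly filled last page still has to be visited.
pages = math.ceil(MATCHING_PATENTS / RESULTS_PER_PAGE)
print(pages)  # 583
```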

One solution would be to use the query tool provided by the patentsview api! Like the USPTO’s patft and peds searches, it will return data on patents issued in 1976 onward. It takes a couple of screens to enter what you want, but if you stick with it, it will email you a link to a csv file (or json file) of the patents that met your criteria.

Here’s what I just did: Click on the Advanced Search link on the query tool page. Under the Patents section set Patent Type equals Utility and then click on the +Add to Search link. Under the Inventors section set the fields to “Inventor Location At Issue equals United States (country) California (state) Los Angeles (City)” and then click the +Add to Search link. Then click the Submit Search link.

On the subsequent screen you get to pick the fields you want returned. I selected my favorite fields as the request for help didn’t specify any.

Click on the Preview Query link when you’ve selected all the fields you want. You can then specify a sort order, enter your email address, prove you are not a robot and click on the Submit Query link. That’s it! A short while later an email arrived with a link to my csv file.

A couple of things to point out about my field choices: I included the inventor sequence. There can be, and usually are, multiple inventors on a patent, potentially from different cities or even different countries. Each inventor will be on a separate row in the csv file, so a patent with five inventors will have five rows in the csv file. A sequence of 0 indicates the first inventor on the patent. I could have added that to my search criteria as shown below. Again, what constitutes a patent from Los Angeles was not specified (any inventor from there, only the first inventor from there, or possibly an assignee from there). I also selected the pre-disambiguated names, as sometimes the api massages the names in an effort to be helpful. (They try to figure out if John Doe on one patent is the same person as John Q Doe on another patent; if they think so, they change the data and use one name consistently.)
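Since each inventor gets its own row, the two readings of "a patent from Los Angeles" can be told apart after the download. Here is a minimal sketch of that, not something the query tool does for you. The column names (patent_number, inventor_sequence, inventor_city) and the sample rows are made up for illustration, so check the header of your own csv file.

```python
import csv
import io

# Made-up sample data; the column names are assumptions, not the tool's real header.
SAMPLE_CSV = """patent_number,inventor_sequence,inventor_city
1000001,0,Los Angeles
1000001,1,Pasadena
1000002,0,San Diego
1000002,1,Los Angeles
"""

def load_rows(text):
    """Parse csv text into a list of dicts, one per inventor row."""
    return list(csv.DictReader(io.StringIO(text)))

def patents_with_any_inventor_from(rows, city):
    """Patents where at least one inventor row has the given city."""
    return sorted({r["patent_number"] for r in rows if r["inventor_city"] == city})

def patents_with_first_inventor_from(rows, city):
    """Patents where the first inventor (sequence 0) has the given city."""
    return sorted({
        r["patent_number"]
        for r in rows
        if r["inventor_city"] == city and r["inventor_sequence"] == "0"
    })

rows = load_rows(SAMPLE_CSV)
print(patents_with_any_inventor_from(rows, "Los Angeles"))
print(patents_with_first_inventor_from(rows, "Los Angeles"))
```

Collecting the patent numbers into a set also takes care of the one-row-per-inventor duplication when all you want is a count of patents.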

Another thing to point out is that this only includes inventors where the USPTO’s city field is Los Angeles (the patentsview database is built from bulk patent files the USPTO makes available to anyone). There are patents with a city of “late of Los Angeles” (to indicate a deceased inventor) or Los Angeles County that would not be included in the query I made. I’d have to do separate queries and merge the csv files to include these patents.
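Merging the csv files from those separate queries is easy enough to script. This is a rough sketch that assumes every file was requested with the same fields (so the headers match) and that a row can be identified by patent number plus inventor sequence. Those two column names are my assumption; adjust them to whatever your header says.

```python
import csv

def merge_csv_files(paths, out_path,
                    key_fields=("patent_number", "inventor_sequence")):
    """Concatenate csv files with identical headers, skipping repeated rows.

    key_fields are assumed column names that together identify a row.
    Returns the number of data rows written.
    """
    seen = set()
    writer = None
    written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        for path in paths:
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.DictReader(f)
                if reader.fieldnames is None:
                    continue  # empty file, nothing to merge
                if writer is None:
                    # The first non-empty file decides the output header.
                    writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
                    writer.writeheader()
                for row in reader:
                    key = tuple(row[k] for k in key_fields)
                    if key in seen:
                        continue
                    seen.add(key)
                    writer.writerow(row)
                    written += 1
    return written
```

A call would look like merge_csv_files(["los_angeles.csv", "late_of_los_angeles.csv"], "merged.csv"), where the file names are placeholders for whatever you saved the emailed files as.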

One important api limitation to point out is that it will only return 100,000 patents; a larger result set will be silently truncated. We’re fine in this case, as we were safely under the limit. If you search for Tokyo, Japan, however, you would reach the limit. You’d have to make multiple queries, using the patent’s grant date for example (one query adding that the issue date was before a specific date, and a separate query adding that the issue date was greater than or equal to that date). You’d have to play around with each query to make sure that fewer than 100,000 patents are returned.
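The "play around with each query" part can be thought of as repeatedly halving a date range until every piece is under the limit. The sketch below shows only that idea. The count_in_range function is a stand-in you would have to supply yourself (for example, from the counts the query tool shows you), and fake_count is invented purely so the example runs.

```python
from datetime import date, timedelta

LIMIT = 100_000  # result cap described above

def split_date_range(start, end, count_in_range, limit=LIMIT):
    """Split [start, end) into windows that each hold fewer than `limit` patents.

    Each window means: issue date >= window start and issue date < window end.
    count_in_range(start, end) is a caller-supplied stand-in, not a real api call.
    """
    days = (end - start).days
    if days <= 1 or count_in_range(start, end) < limit:
        return [(start, end)]
    mid = start + timedelta(days=days // 2)
    return (split_date_range(start, mid, count_in_range, limit)
            + split_date_range(mid, end, count_in_range, limit))

def fake_count(start, end):
    """Invented counter for the demo: pretends 1,000 patents issue every day."""
    return (end - start).days * 1000

windows = split_date_range(date(2001, 1, 1), date(2002, 1, 1), fake_count)
for window_start, window_end in windows:
    print(window_start, "<= issue date <", window_end)
```

Using half-open windows (on or after the start, before the end) matches the before / greater-than-or-equal pairing described above, so no patent lands in two queries or falls between them.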

Also of note, the patentsview database is updated roughly quarterly while the USPTO’s patft is updated weekly. Your patentsview results may not contain the most recently issued patents. Oh, and as one last, slightly troubling caveat, the patentsview database contains all the patents in the bulk files just mentioned, even around 8,000 patents which were withdrawn after issue. This means that there would be a small chance (8,000 withdrawn patents are included in the database of roughly five million patents) that some of the patents in your csv file have been withdrawn. The USPTO’s patft does not return data for patents that have been withdrawn. I’ve pointed this out to the patentsview team but they haven’t taken any action yet.

X Patents

There was a fire at the patent office in 1836 that destroyed a lot of patents and patent models. The only other copies were with the inventors. At the time, patents were not numbered as they are today. The patents known to have been issued, whether the surviving documentation was recovered or not, were assigned an X patent number. Only around 2,600 of the 10,000 X Patents have been recovered.

Astonishingly, from time to time an unrecovered X Patent turns up, even after 185 years! datamp.org has been involved with two such findings, as information on X Patents can be found there. In both cases a great-great-grandson of the original patent holder found datamp while looking for information about the patent documents he had. They are 5,125X and 6,037X.

Even more astonishing, someone I know has been actively involved in the recovery of X Patents and, get this, has documented her efforts! Be sure to check out the link below.

Hampton, Barbara J. (2021) “Stalking the Wild X Patent,” Journal of the Patent and Trademark Resource Center Association: Vol. 31, Article 4. Available at: https://tigerprints.clemson.edu/jprca/vol31/iss1/4

Google Patents

Google patents is a good thing! It can be difficult to find the patent associated with a tool marked with a patent date. Sometimes the digits of the date or year are not as clear as you’d like. Is that a zero, eight, or nine? Even worse is the dreaded “Patent Applied For”, where you have no idea if or when the patent was issued. My success rate is not great but I was able to find one once, using google patents.

The pictured item is a fillet cutter, marked only “Pat Applied For”. In this case I knew what the item was; it can be even more difficult to find a patent if you don’t know what the item is.

I could have done classification searches on the patent office’s web site but I would have had to look at a lot of patents. Instead, I searched google patents for fillet cutter and was successful after only viewing a few other fillet cutters, which wasn’t unpleasant at all!

Bulk Data Problems

The USPTO (United States Patent and Trademark Office) has made some, but not all, of its data available to anyone who wants to download it. The first thing to be aware of is that some of the patents in the bulk grant xml files were subsequently withdrawn. The last time I checked there were around 8,000 withdrawn patents in the bulk grant xml files. The second thing to be aware of is that there are around 300 granted patents whose data is inexplicably absent from the bulk grant files. Percentagewise, it’s a minuscule oversight, but shouldn’t 100% of the granted patents be present in the bulk grant files? Hey USPTO, how about producing a catch-up file that contains the missing granted patents?

The other flaws I have noticed have to do with the bulk classification files. The patent office stopped producing the bulk United States Patent Classification (USPC) file even though USPCs are still in use. It is true that since June 2015 utility patents are not assigned USPCs, but plant patents and design patents still receive them (as do reissued plant or design patents). The last bulk USPC file produced ended with assignments for PP29260 and D816289, both issued April 24, 2018. (See the image above.)

The other classification in use is the Cooperative Patent Classification (CPC). There is a bulk CPC file but it only contains CPC assignments for utility patents. CPC assignments for reissued patents and plant patents are not included in the bulk data file. It seems to be a well-guarded secret, or at least not widely publicized, that plant patents receive CPC assignments. I’ve found that roughly half of all plant patents have received one or more CPC assignments, as shown on this page of mine.

The real problem is that the patentsview api uses the bulk data files to build their database. From the above, this means that they are missing the ~300 patents that are not present in the bulk grant files, that plant patents after PP29260 and design patents after D816289 don’t have USPC assignments, and that approximately half of the 30,000 plant patents are missing their CPC assignments, as are all reissued patents. By choice, they load all the data from the bulk grant xml files, so their database contains around 8,000 withdrawn patents (see my previous post on withdrawn patents). Take a look at the Data Collection Phase that shows how they process the bulk files. The diagram above shows corrections that should be made to their loading process.

If you have been paying particular attention, you will have noticed that in the patentsview database plant patents after PP29260 are only searchable by their at-issue International Patent Classification (IPC). It’s the only classification system they have data for, as they lack a bulk post-PP29260 USPC file and a non-utility-patent bulk CPC file. There are lots of things you could do with patentsview, but one thing you cannot do is effectively search plant patents by USPC or CPC.

The USPTO’s patft, its online search page, has all of the above with the exception of four granted, non-withdrawn patents that are also missing from the bulk grant files (6,287,179; 6,392,191; 6,394,333; and 6,558,580). patft has CPC assignments for plant and reissued patents, and USPC assignments for plant and design patents after PP29260 and D816289, but the USPTO does not make this data available as bulk download files.

Bulk data is great, depending on what you do with it, but the USPTO still has a monopoly on some of its data. I have been unsuccessful in my efforts to have this rectified and would appreciate any help!

Notations in the diagram above

  1. Patentsview should not load withdrawn patents into their database
  2. The USPTO should resume producing the bulk USPC file
  3. The bulk CPC file should contain all patents, not just utility patents

Withdrawn Patents

Patents get withdrawn from time to time. Some are never issued but some are withdrawn after being issued. In the latter case, data for the withdrawn patent can be found in the wild. The patent office maintains a list of withdrawn patents at http://www.uspto.gov/patents-application-process/patent-search/withdrawn-patent-number. Separately, the patentsview api team processes the bulk grant patent xml files and makes their files available for download. If one compares the patentsview patent.tsv file to the patent office’s withdrawn patent list, one finds (or found at the time this was written) 7,930 patents in both files. The patent office removes withdrawn patents from its web site, so they are not returned by searches, but this is not the case with the patentsview api. It will return withdrawn patents, which is pretty bizarre. I don’t know of another patent platform that does that. I raised a git issue to point this out to the otherwise fine patentsview folks but nothing has changed. (Two take-aways here: one is that there is data for withdrawn patents in the grant xml files, and the other is that patentsview loads them into their database.)
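The comparison itself boils down to a set intersection. Here is a minimal sketch of it, with some loud assumptions: that the withdrawn list has been saved as a text file with one patent number per line, that patent.tsv has a header row, and that the patent number column is called "number" (pass a different column name if yours differs). The normalization is deliberately simple and may need adjusting to how each source writes its numbers.

```python
import csv

def normalize(number):
    """Strip commas and surrounding whitespace, and uppercase prefixes like PP."""
    return number.replace(",", "").strip().upper()

def load_patentsview_numbers(tsv_path, column="number"):
    """Read patent numbers from a tab-separated file with a header row."""
    with open(tsv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f, delimiter="\t")
        return {normalize(row[column]) for row in reader}

def load_withdrawn_numbers(txt_path):
    """Read one patent number per line, ignoring blank lines."""
    with open(txt_path, encoding="utf-8") as f:
        return {normalize(line) for line in f if line.strip()}

def withdrawn_in_patentsview(tsv_path, txt_path, column="number"):
    """Patent numbers that appear in both files."""
    return (load_patentsview_numbers(tsv_path, column)
            & load_withdrawn_numbers(txt_path))
```

The length of the set returned by withdrawn_in_patentsview is the kind of count quoted above, though the exact figure will depend on when each file was downloaded.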

Another source of data for withdrawn patents is the USPat dvds once produced by the patent office. The data is available for download as thousands of zip files containing tiff images of patents, both withdrawn ones and ones that were not withdrawn. In the zip files I have analyzed, I have found 5,191 withdrawn patents among the millions of patents that have not been withdrawn.

The last source that I know of for data on withdrawn patents is the Official Gazettes (OGs) produced by the patent office each week. Some patents appear in the OGs that are subsequently withdrawn. An example would be PP31,892, which would have been issued on June 23, 2020. That patent wasn’t in the grant xml for the patents granted on June 23, 2020 but it did appear in the OG for that date. It is also listed on the patent office’s withdrawn patent page. Interestingly, PP31,893 was also withdrawn but it is not present in the xml file for June 23, 2020 and the OG says “Patent Not Issued For This Number”. Above is an image that shows the OG entries for these two withdrawn patents.

A possible source, that I haven’t fully investigated, is Hathi Trust. They have scanned many of the OGs that were physically published. The last printed OG was September 24, 2002; more recent ones are only published electronically.

So if you are interested in withdrawn patents, they are out there! (That is, there may be xml data, tiffs and/or OG html and images available.) Oh, and another trick to finding which patents are withdrawn is to do a search in patft for ccl/WITHDRAWN, slightly nonsensical syntax but it works!

NYPL/UMD Plant Patent Project

One of the more surprising elements of plant patents is that their online images are in black and white! Patent and Trademark Resource Centers (PTRC) scattered across the US receive color copies of them but the online community is left guessing what each patented plant looks like in color. A few years ago, Ken Johnson at the PTRC in New York City’s Public Library (NYPL) began scanning the color copies they received. He put them online with the giant caveat that they cannot be used for legal purposes; only the official color copies can be used legally. One of the libraries at the University of Maryland (UMD) is also a PTRC and they have taken up scanning plant patents not scanned by the New York Public Library. So, if you are wondering what a particular plant patent looks like in color, head over to https://www.lib.umd.edu/plantpatents or http://www.nypl.org/collections/nypl-recommendations/guides/plant-patents-2012. Not all of the nearly 33,000 plant patents have been scanned, but they are working on it. Be sure to check out the UMD project’s credits page, I might be mentioned on it. Oh, and if you are curious what the rose plant above looks like, unofficially of course, it’s here.

Datamp

DATAMP = Directory of American Tool and Machinery Patents


If you are looking for the patent associated with an antique tool, head over to datamp.org. It’s quite possible you’ll be able to find the tool patent you are looking for among the 70,000 or so patents there. I’m a developer of the site and one of the data stewards that enters patent data so I highly recommend the site!

Here’s the most recent patent that was entered into datamp:

  • US Patent: 38,520
    Improvement in tools for cutting and beveling barrel-heads
    Patentee: William Watkins - Crete IL
    Granted: 1863-05-12
    This plane "planes and bevels the edge of barrel heads". An example was sold by Martin Donnelly in 2016.
    This is one of 71,784 patents currently in the database at datamp.org

When apis fail you

Sometimes there isn’t a way other than screen scraping to get the data you want, which is unfortunate. I’d like to programmatically retrieve classification fields for the plant patents issued each Tuesday. I can’t use the patentsview api since its data lags behind: it’s updated roughly quarterly while the patent office’s site is updated each Tuesday. Plus the api does not return uspc classifications on newer plant patents, as the patent office has stopped producing the bulk file of them (the last file produced stopped with PP29260, issued April 24, 2018). The api also does not return the cpcs that are now assigned to about half of the plant patents, as there is no bulk source of them (the bulk cpc file only contains utility patents, so fans of reissued patents are also out of luck). See this page if you don’t believe me that some plant patents do get cpc assignments!

Similarly, I could try to use the PEDS (Patent Examination Data System) api, but it only returns one uspc classification per patent when multiples are allowed [1], and it also does not return cpcs. So, having no other free option, you can’t blame a guy if he makes requests weekly to patft and scrapes the page of data that is returned!

[1] If you want to check for yourself, these plant patents each had 4 uspc assignments when I scraped them: PP23484, PP23723, PP23924, PP24080, PP24201, PP24521, PP24634, PP24828. Compare peds and patentsview to patft to see the disparity.