Patents for a Particular City

Say someone asks for help getting a list of utility patents from Los Angeles. What would you do? You can’t use the USPTO’s peds api, as it does not offer a search by location. You could use the USPTO’s patft: a search for ic/"Los Angeles" and is/CA and apt/1 would do the trick (ic/ matches the inventor’s city, is/ matches the inventor’s state and apt/1 matches the application type of utility patents). The problem is that it only returns 50 patents at a time (29,148 patents met those conditions at the time this was written). There isn’t a download option, so your mouse finger would get quite tired from all the clicking you’d have to do.
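To put a number on all that clicking, here is a quick back-of-the-envelope sketch. The search expression, the page size of 50 and the hit count of 29,148 come from the paragraph above; the function names are just mine for illustration.

```python
import math

RESULTS_PER_PAGE = 50       # patft page size, as described above
MATCHING_PATENTS = 29_148   # hit count at the time of writing


def patft_query(city, state, application_type=1):
    """Build the patft search expression for an inventor city and state."""
    return f'ic/"{city}" and is/{state} and apt/{application_type}'


def pages_needed(total_hits, per_page=RESULTS_PER_PAGE):
    """Number of result pages you would have to click through."""
    return math.ceil(total_hits / per_page)


print(patft_query("Los Angeles", "CA"))  # ic/"Los Angeles" and is/CA and apt/1
print(pages_needed(MATCHING_PATENTS))    # 583
```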

One solution would be to use the query tool provided by the patentsview api! Like the USPTO’s patft and peds searches, it will return data on patents issued in 1976 onward. It takes a couple of screens to enter what you want, but if you stick with it, it will email you a link to a csv file (or json file) of the patents that met your criteria.

Here’s what I just did: Click on the Advanced Search link on the query tool page. Under the Patents section set Patent Type equals Utility and then click on the +Add to Search link. Under the Inventors section set the fields to “Inventor Location At Issue equals United States (country) California (state) Los Angeles (City)” and then click the +Add to Search link. Then click the Submit Search link.

On the subsequent screen you get to pick the fields you want returned. I selected my favorite fields as the request for help didn’t specify any.

Click on the Preview Query link when you’ve selected all the fields you want. You can then specify a sort order, enter your email address, prove you are not a robot and click on the Submit Query link. That’s it! A short while later an email arrived with a link to my csv file.

A couple of things to point out about my field choices: I included the inventor sequence. There can be, and usually are, multiple inventors on a patent from potentially different cities or even different countries. Each inventor will be on a separate row in the csv file, so a patent with five inventors will have five rows in the csv file. A sequence of 0 indicates the first inventor on the patent. I could have added that to my search criteria as shown below. Again, what constitutes a patent from Los Angeles was not specified (any inventor from there, or only if the first inventor hails from there, or possibly an assignee from there). I also selected the pre-disambiguated names, as sometimes the api massages the names in an effort to be helpful. (They try to figure out if John Doe on one patent is the same person as John Q Doe on another patent; if they think they are, they change the data and use one name consistently.)
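If you’d rather decide after the fact what counts as a patent from Los Angeles, the one-row-per-inventor layout makes that easy. This is only a sketch: the column names (patent_number, inventor_sequence, inventor_city) are my assumptions, so use whatever headers your csv file actually has, and the sample rows are made up for illustration.

```python
import csv
import io

# Made-up rows for illustration only; the column names are assumptions.
SAMPLE_CSV = """patent_number,inventor_sequence,inventor_city
1000001,0,Los Angeles
1000001,1,Tokyo
1000002,0,Paris
1000002,1,Los Angeles
"""


def patents_from(csv_text, city, first_inventor_only=False):
    """Return the patent numbers that have an inventor from `city`.

    With first_inventor_only=True, only rows with sequence 0 count.
    """
    numbers = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["inventor_city"] != city:
            continue
        if first_inventor_only and row["inventor_sequence"] != "0":
            continue
        numbers.add(row["patent_number"])
    return numbers


print(sorted(patents_from(SAMPLE_CSV, "Los Angeles")))        # ['1000001', '1000002']
print(sorted(patents_from(SAMPLE_CSV, "Los Angeles", True)))  # ['1000001']
```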

Another thing to point out is that this only includes inventors where the USPTO’s city field is Los Angeles (the patentsview database is built from bulk patent files the USPTO makes available to anyone). There are patents with a city of “late of Los Angeles” (to indicate a deceased inventor) or Los Angeles County that would not be included in the query I made. I’d have to do separate queries and merge the csv files to include these patents.
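Merging the csv files from those separate queries is straightforward. Here is a minimal sketch, assuming every file has the same columns and that a patent number plus an inventor sequence identifies a row (both are assumptions on my part, as are the column names and the made-up sample rows).

```python
import csv
import io


def merge_results(csv_texts, key_cols=("patent_number", "inventor_sequence")):
    """Combine rows from several csv downloads, dropping duplicate rows."""
    merged = {}
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            key = tuple(row[col] for col in key_cols)
            merged.setdefault(key, row)  # keep the first copy of a duplicate
    return list(merged.values())


# Made-up rows standing in for two separate query results.
los_angeles = (
    "patent_number,inventor_sequence,inventor_city\n"
    "1000001,0,Los Angeles\n"
)
late_of_los_angeles = (
    "patent_number,inventor_sequence,inventor_city\n"
    "1000002,0,late of Los Angeles\n"
    "1000001,0,Los Angeles\n"
)

rows = merge_results([los_angeles, late_of_los_angeles])
print(len(rows))  # 2
```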

One important api limitation to point out is that it will only return 100,000 patents; the result set will be silently truncated. We’re fine in this case, as we were safely under the limit. If you search for Tokyo, Japan, however, you would reach the limit. You’d have to make multiple queries using the patent’s grant date, for example (one query adding that the issue date was before a specific date, and a separate query adding that the issue date was greater than or equal to that date). You’d have to play around with each query to make sure that fewer than 100,000 patents are returned.
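The playing around can be automated by halving the date range until every piece is under the limit. The sketch below assumes you have some way of finding out how many patents fall in a date window; count_fn is a hypothetical stand-in for that, not part of any api. Windows include their start date and exclude their end date, matching the before / greater-than-or-equal split described above.

```python
from datetime import date, timedelta

LIMIT = 100_000  # the truncation limit described above


def split_windows(start, end, count_fn, limit=LIMIT):
    """Split [start, end) into windows that each hold fewer than `limit` patents.

    count_fn(start, end) is a hypothetical callback returning the number
    of patents issued on or after `start` and before `end`.
    """
    if count_fn(start, end) < limit:
        return [(start, end)]
    span = (end - start).days
    if span <= 1:
        raise ValueError(f"a single day ({start}) already reaches the limit")
    middle = start + timedelta(days=span // 2)
    return (split_windows(start, middle, count_fn, limit)
            + split_windows(middle, end, count_fn, limit))


def fake_count(start, end):
    """Pretend exactly 10 patents issue every day (made up for the demo)."""
    return 10 * (end - start).days


windows = split_windows(date(2000, 1, 1), date(2000, 1, 11), fake_count, limit=30)
for window_start, window_end in windows:
    print(window_start, window_end, fake_count(window_start, window_end))
```

Each window then becomes its own query, and the resulting csv files get stitched together afterwards.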

Also of note, the patentsview database is updated roughly quarterly while the USPTO’s patft is updated weekly. Your patentsview results may not contain the most recently issued patents. Oh, and as one last, slightly troubling caveat, the patentsview database contains all the patents in the bulk files just mentioned, even around 8,000 patents which were withdrawn after issue. This means that there would be a small chance (8,000 withdrawn patents are included in the database of roughly five million patents) that some of the patents in your csv file have been withdrawn. The USPTO’s patft does not return data for patents that have been withdrawn. I’ve pointed this out to the patentsview team but they haven’t taken any action yet.

Bulk Data Problems

The USPTO (United States Patent and Trademark Office) made some, but not all, of its data available to anyone who wanted to download it. The first thing to be aware of is that some of the patents in the bulk grant xml files were subsequently withdrawn. The last time I checked there were around 8,000 withdrawn patents in the bulk grant xml files. The second thing to be aware of is that there are around 300 granted patents whose data is inexplicably absent from the bulk grant files. Percentagewise, it’s a minuscule oversight, but shouldn’t 100% of the granted patents be present in the bulk grant files? Hey USPTO, how about producing a catchup file that contains the missing granted patents?

The other flaws I have noticed have to do with the bulk classification files. The patent office stopped producing the bulk United States Patent Classification (USPC) file even though USPCs are still in use. It is true that since June 2015 utility patents are not assigned USPCs, but plant patents and design patents still receive them (as do reissued plant or design patents). The last bulk USPC file produced ended with assignments for PP29260 and D816289, both issued April 24, 2018. (See the image above.)
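If you work with the bulk USPC file, a small check like this can flag the patents it will never cover. It is only a sketch built on the two cutoff numbers quoted above. It looks at plant (PP) and design (D) numbers only and gives no answer for anything else, including reissues, since their numbers don’t say what kind of patent was reissued.

```python
import re

# Last patents covered by the final bulk USPC file, per the text above.
LAST_BULK_USPC = {"PP": 29260, "D": 816289}


def issued_after_last_bulk_uspc(patent_number):
    """True if a plant or design patent is newer than the last bulk USPC file.

    Returns None for numbers that are not plant (PP) or design (D) patents.
    """
    match = re.fullmatch(r"(PP|D)(\d+)", patent_number.strip().upper())
    if match is None:
        return None
    prefix, digits = match.groups()
    return int(digits) > LAST_BULK_USPC[prefix]


print(issued_after_last_bulk_uspc("PP29260"))  # False
print(issued_after_last_bulk_uspc("PP29261"))  # True
print(issued_after_last_bulk_uspc("D816290"))  # True
```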

The other classification in use is the Cooperative Patent Classification (CPC). There is a bulk CPC file but it only contains CPC assignments for utility patents. CPC assignments for reissued patents and plant patents are not included in the bulk data file. It seems to be a well-guarded secret, or at least not widely publicized, that plant patents receive CPC assignments. I’ve found that roughly half of all plant patents have received one or more CPC assignments, as shown on this page of mine.

The real problem is that the patentsview api uses the bulk data files to build their database. From the above, this means that they are missing the ~300 patents that are not present in the bulk grant files; plant patents after PP29260 and design patents after D816289 don’t have USPC assignments; and approximately half of the 30,000 plant patents are missing their CPC assignments, as are all reissued patents. By choice, they load all the data from the bulk grant xml files, so this means their database contains around 8,000 withdrawn patents (see my previous post on withdrawn patents). Take a look at the Data Collection Phase that shows how they process the bulk files. The diagram above shows corrections that should be made to their loading process.

If you have been paying particular attention, you will have noticed that in the patentsview database plant patents after PP29260 are only searchable by their at-issue International Patent Classification (IPC). It’s the only classification system they have data for, as they lack a bulk post-PP29260 USPC file and a non-utility-patent bulk CPC file. There are lots of things you could do with patentsview, but one thing you cannot do is effectively search plant patents by USPC or CPC.

The USPTO’s patft, its online search page, has all of the above with the exception of four missing granted, non-withdrawn patents that are not in the bulk grant files (6,287,179; 6,392,191; 6,394,333; and 6,558,580). patft has CPC assignments for plant and reissued patents, and USPC assignments for plant and design patents after PP29260 and D816289, but the USPTO does not make this data available as bulk download files.

Bulk data is great, depending on what you do with it, but the USPTO still has a monopoly on some of its data. I have been unsuccessful in my efforts to have this rectified and would appreciate any help!

Notations in the diagram above

  1. Patentsview should not load withdrawn patents into their database
  2. The USPTO should resume producing the bulk USPC file
  3. The bulk CPC file should contain all patents, not just utility patents

When apis fail you

Sometimes there isn’t a way other than screen scraping to get the data you want, which is unfortunate. I’d like to programmatically retrieve classification fields for the plant patents issued each Tuesday. I can’t use the patentsview api since its data lags behind: it’s updated roughly quarterly while the patent office’s site is updated each Tuesday. Plus, the api does not return uspc classifications on newer plant patents, as the patent office has stopped producing the bulk file of them (the last file produced stopped with PP29260, issued April 24, 2018). The api also does not return the cpcs that are now coming back on about half of the plant patents, as there is no bulk source of them (the bulk cpc file only contains utility patents; fans of reissued patents are also out of luck). See this page if you don’t believe me that some plant patents do get cpc assignments!

Similarly, I could try to use the PEDS (Patent Examination Data System) api, but it only returns one uspc classification per patent when multiples are allowed [1], and it also does not return cpcs. So, having no other free option, you can’t blame a guy if he makes requests weekly to patft and scrapes the page of data that is returned!

[1] If you want to check for yourself, these plant patents each had 4 uspc assignments when I scraped them: PP23484, PP23723, PP23924, PP24080, PP24201, PP24521, PP24634, PP24828. Compare peds and patentsview to patft to see the disparity.

The Python Wrapper

This has nothing to do with a snake in a hoodie, laying down rhythmic rhymes, that would be Python The Rapper 🙂 The same people who wrote the Patentsview api also wrote a python wrapper that produces a csv file for you. All you need to do is download the code and dependencies (instructions provided in the link) and write a configuration file for your query or queries (more than one query can be specified in a configuration file). I realized that the queries it makes can be chained together, where the output of one becomes the input of another. I posted about it here, in the patentsview forum.
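The chaining idea boils down to turning one column of the first query’s csv output into the input for the next query. Here is a rough sketch of that hand-off on its own, without the wrapper. I’m assuming the next query wants a plain text file with one value per line, so check the wrapper’s instructions for the exact format it expects. The column name and sample rows are made up.

```python
import csv
import io


def column_values(csv_text, column):
    """Unique values of one csv column, in first-seen order."""
    values = (row[column] for row in csv.DictReader(io.StringIO(csv_text)))
    return list(dict.fromkeys(values))


def as_input_text(values):
    """One value per line (an assumed input format for the next query)."""
    return "\n".join(values) + "\n"


# Made-up output of a first query: one row per inventor.
first_output = (
    "patent_number,inventor_sequence\n"
    "1000001,0\n"
    "1000001,1\n"
    "1000002,0\n"
)

print(as_input_text(column_values(first_output, "patent_number")), end="")
```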

Patents not in bulk xml files

I nearly fell out of my chair when the patent office announced that they would be giving away their data for free! How cool is that? Free data, where is the catch? It turned out there isn’t one, well maybe a very small one, percentagewise that is.

Each quarter the Patentsview api team processes all the bulk grant xml files the patent office makes available, something like a zillion of them (actually closer to 2,000 at the time of this writing). That’s how they create the database that their api uses. After processing the files and updating their database, they then make their data available for download, and get this, it’s also free!

Separately, the patent office constantly updates a list of withdrawn patents. I thought hey, I wonder if there are any missing patents in Patentsview’s patent.tsv file? I’m that sort of guy, you know. Often interesting things can be found when you examine the gaps and overlaps of related data files or data sets. In theory, any patent numbers not in their patent file should correspond to withdrawn patents, right? It turned out to be only partially true, which was really unexpected.

First, I found that there are around 8,000 patents in both files, patents that were issued and whose data was included in bulk grant xml files but later they were withdrawn! This is problematic for the Patentsview database, as I tried to point out to them. They really should exclude withdrawn patents from their database but they don’t see it that way. In other words, a search using their api could contain patents that have been withdrawn. As a user of the api I find this unacceptable.

Second, I did find unexplained gaps! There are 306 patent numbers not in the Patentsview file that do not correspond to withdrawn patents. I double checked a few and the patents are indeed missing from the bulk xml files. Percentagewise it’s minuscule, 306 patents out of 7.5M patents, but shouldn’t it be 0%? Again, this is problematic for the Patentsview database. Searches using their api cannot include these patents as they were not part of their underlying source. That’s another serious flaw with their api, even though it isn’t their fault that the patents are not part of the bulk xml files. I pointed this out to them but an alternate source of this data has not been found.
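The gaps-and-overlaps check is really just set arithmetic. This sketch uses tiny made-up numbers rather than real patent numbers, and it assumes you have already loaded the patent numbers from Patentsview’s patent file and from the withdrawn list into plain collections.

```python
def compare_files(expected_numbers, patentsview_numbers, withdrawn_numbers):
    """Find the overlap and the unexplained gaps between the two files.

    Returns (withdrawn_but_present, unexplained_gaps):
    - withdrawn_but_present: numbers in both the Patentsview file and the
      withdrawn list
    - unexplained_gaps: expected numbers in neither file
    """
    present = set(patentsview_numbers)
    withdrawn = set(withdrawn_numbers)
    withdrawn_but_present = present & withdrawn
    unexplained_gaps = set(expected_numbers) - present - withdrawn
    return withdrawn_but_present, unexplained_gaps


# Toy data: patents 1 through 10 were issued, 3 and 4 were later withdrawn,
# and the database holds everything except 4 and 7.
overlap, gaps = compare_files(range(1, 11), {1, 2, 3, 5, 6, 8, 9, 10}, {3, 4})
print(sorted(overlap))  # [3]
print(sorted(gaps))     # [7]
```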

I think the patent office should produce a “catchup” xml file containing the 306 valid patents that are not in the bulk xml grant files. The Patentsview people could then add them to their database and other bulk data consumers could do whatever they want to do with them. If the Patentsview people also excluded withdrawn patents, I’d be a lot happier as a consumer of their api. There are other flaws with the api, but this would be a step in the right direction.

(Some of the missing patents are listed on this page, and some listed on this page, for a total of 306 patents.)

TSDR and the api key

The T in USPTO (United States Patent and Trademark Office), you should have just learned, stands for Trademark. Their TSDR (Trademark Status & Document Retrieval) api deals with trademarks. If you’ve used it recently, you probably noticed that they now require an api key, which you can get by registering with them. Their Swagger-UI page is at but it doesn’t allow you to enter your now-required api key and it has a number of other omissions (listed in my github repo). Further, their api does not accept browser requests coming from domains other than their own (they’re blocked by CORS policy), which is why the Swagger-UI page I created does not work (though the generated curl commands work and my modified swagger object can be imported into postman from I emailed them to point out these problems but they have not updated their Swagger-UI page or allowed CORS requests.

I’ve been down this road before: I cannot get the patentsview people to adopt the Swagger-UI page I created for their api. Using an api’s Swagger-UI page can be a great way to learn the ins and outs of an api, but it takes a little cooperation from the api provider! By contrast, the Swagger-UI page for the USPTO’s PEDS (Patent Examination Data System) api works as it should, without my involvement.

Developer Candy: Swagger UI for the patentsview api

Russ Allen, developer and patent enthusiast, created a Swagger UI json object and explains its usefulness.

Pretend for a moment that you are a developer working on something cool that needs to call a web service. If you are lucky, the web service provider will have made a Swagger UI web page available for you. It’s an opensource project that generates a web page that lets users make calls to the web service by filling in form fields. It’s similar to Postman with a lot of setup work done for you. At the heart of Swagger UI is a json object that specifies all the api does or will do if you play by its rules (properly use its verbs and endpoints by passing what it expects in the formats it accepts). All an api provider needs to do is to create the json object and plug it into the Swagger UI package they’ve downloaded from That’s nothing more than copying the json object and dist folder to their web site. Then they just need to update the index.html file with the url of the file containing their json object and boom, their web site has a Swagger UI web page for the whole world to use!
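To give a feel for what that json object looks like, here is a stripped-down Swagger 2.0 example built as a Python dict and dumped to json. It is an illustration, not Russ’s actual object: the host is a placeholder, the parameter descriptions are mine, and the q parameter is an assumption (the f parameter and the /patents/query endpoint are the ones discussed in this post).

```python
import json

# A minimal Swagger 2.0 object. Host, title and descriptions are placeholders.
swagger_object = {
    "swagger": "2.0",
    "info": {"title": "Example patent api (illustrative)", "version": "0.1"},
    "host": "api.example.org",  # placeholder, not the real host
    "basePath": "/",
    "schemes": ["https"],
    "paths": {
        "/patents/query": {
            "get": {
                "summary": "Query patents",
                "produces": ["application/json"],
                "parameters": [
                    {
                        "name": "q",  # assumed name of the query parameter
                        "in": "query",
                        "required": True,
                        "type": "string",
                        "description": "Search criteria as a json string",
                    },
                    {
                        "name": "f",
                        "in": "query",
                        "required": False,
                        "type": "string",
                        "description": "Fields to return, as a json array",
                    },
                ],
                "responses": {
                    "200": {"description": "Matching patents"},
                    "400": {"description": "Bad Request"},
                },
            }
        }
    },
}

print(json.dumps(swagger_object, indent=2))
```

Save the printed json to a file, point Swagger UI’s index.html at it, and the form fields for q and f appear on the page.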

Russ noticed that the patentsview api did not have a Swagger UI web page so he created the necessary json object. Below is an example that shows both the power of the patentsview api and of Swagger UI. We start by filling in the Swagger UI web page’s form fields that will issue a GET to patentsview’s /patents/query endpoint, but we intentionally made a mistake. Perhaps you’ll be able to spot it.

When we press the Execute button in the Swagger UI web page, the response is added to the UI page.

It seems we’ve made the api mad by requesting a field in the f parameter that it doesn’t yet support. Fortunately for us, the patentsview api developers thoughtfully return an x-status-reason response header explaining exactly why it returned a 400 or Bad Request response code. How cool is that? (Note that not all api providers go to this extent to be helpful.) If we correct our request and press the Execute button again, we are rewarded with the api’s data returned nicely formatted.
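The same header is just as handy from code as it is in Swagger UI. Below is a minimal sketch using only Python’s standard library. The base url you pass in and the q parameter name are assumptions on my part; the X-Status-Reason header and the 400 response are as described above.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


def build_url(base_url, q, f):
    """Encode the q (criteria) and f (fields) parameters into a GET url."""
    params = {"q": json.dumps(q), "f": json.dumps(f)}
    return base_url + "?" + urllib.parse.urlencode(params)


def status_reason(err):
    """Pull the api's explanation out of an HTTPError, if it sent one."""
    return err.headers.get("X-Status-Reason", "(no reason supplied)")


def fetch(url):
    """GET the url and return parsed json, surfacing the reason for a 400."""
    try:
        with urllib.request.urlopen(url) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 400:
            raise ValueError("Bad Request: " + status_reason(err)) from err
        raise
```

Calling fetch(build_url(...)) with a field the api doesn’t support should then raise a ValueError carrying the api’s own explanation instead of a bare 400.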

The Swagger UI web page and this api’s x-status-reason become a powerful tool for developers and interacting with the api is a great way to quickly learn the ins and outs of the api before writing any code. Try this very demo here!

The json object can also open doors in the opensource community. Several opensource projects use the json object as input and convert it to other formats or generate tests etc. Like many things in life, there are two standards. There’s the Swagger 2.0 specification and the newer Swagger 3.0, also known as the OpenAPI specification. Russ used one of these opensource projects to convert the Swagger 2.0 json object into its corresponding Swagger 3.0/OpenAPI object. Having both versions maximizes usefulness. Some opensource projects accept either version but there are ones looking for a specific version. There’s a nicely formatted list of these projects and which version(s) they accept at Oh, and if Russ hasn’t sold you on the power of the json object, he suggests that you try importing it into Postman to see what happens: Boom, nicely loaded Postman page just itching to hit the patentsview api endpoints for you! In Postman:
File -> Import -> Import From Link [Swagger (v1/v2)]
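If you ever need to tell the two flavors of json object apart (say, before handing one to a tool that only accepts a specific version), the top-level key gives it away: Swagger 2.0 objects carry a "swagger" key and OpenAPI 3 objects carry an "openapi" key. A small sketch:

```python
def spec_flavor(spec):
    """Report which specification a loaded json object claims to follow."""
    if str(spec.get("swagger", "")).startswith("2."):
        return "swagger-2.0"
    if str(spec.get("openapi", "")).startswith("3."):
        return "openapi-3"
    return "unknown"


print(spec_flavor({"swagger": "2.0"}))    # swagger-2.0
print(spec_flavor({"openapi": "3.0.0"}))  # openapi-3
```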

Russ would like to contribute the two json objects to the patentsview api project if its developers would care to host a Swagger UI page at Otherwise, the patentsview Swagger 2.0 UI page is available at and the Swagger 3.0/OpenAPI version is at The UI pages look the same, but the underlying json objects are distinct and correspond to their respective Swagger versions.

Note: Currently the X-Status-Reason header is not being displayed in either version of the UI (in chrome at least). I’ve opened an issue to address this.