Patents not in bulk xml files

I nearly fell out of my chair when the patent office announced that they would be giving away their data for free! How cool is that? Free data, so where is the catch? It turned out there isn’t one. Well, maybe a very small one, percentagewise that is.

Each quarter the Patentsview api team processes all the bulk grant xml files the patent office makes available, something like a zillion of them (actually closer to 2,000 at the time of this writing). That’s how they create the database that their api uses. After processing the files and updating their database, they then make their data available for download, and get this, it’s also free!

Separately, the patent office constantly updates a list of withdrawn patents. I thought hey, I wonder if there are any missing patents in Patentsview’s patent.tsv file? I’m that sort of guy, you know. Often interesting things can be found when you examine the gaps and overlaps of related data files or data sets. In theory, any patent numbers not in their patent file should correspond to withdrawn patents, right? It turned out to be only partially true, which was really unexpected.

First, I found that there are around 8,000 patents in both files: patents that were issued, and whose data was included in the bulk grant xml files, but that were later withdrawn! This is problematic for the Patentsview database, as I tried to point out to them. They really should exclude withdrawn patents from their database but they don’t see it that way. In other words, the results of a search using their api could contain patents that have been withdrawn. As a user of the api I find this unacceptable.

Second, I did find unexplained gaps! There are 306 patent numbers not in the Patentsview file that do not correspond to withdrawn patents. I double checked a few and the patents are indeed missing from the bulk xml files. Percentagewise it’s minuscule, 306 patents out of 7.5M patents, but shouldn’t it be 0%? Again, this is problematic for the Patentsview database. Searches using their api cannot include these patents as they were not part of their underlying source. That is another serious flaw with their api, even though it isn’t their fault that the patents are not part of the bulk xml files. I pointed this out to them but an alternate source of this data has not been found.
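For anyone who wants to try the same kind of gap and overlap check, here is a minimal Python sketch using set operations. It assumes the patent numbers have already been loaded into three collections. The function name and the toy numbers are made up for illustration, and it does not assume anything about the actual layout of patent.tsv or the withdrawn list.

```python
def compare_patent_sets(expected, in_file, withdrawn):
    """Split patent numbers into the overlap and gap groups described above.

    expected  -- every patent number that should exist (an assumption: a
                 continuous run of issued numbers)
    in_file   -- numbers found in the downloaded patent file
    withdrawn -- numbers on the withdrawn list
    """
    expected = set(expected)
    in_file = set(in_file)
    withdrawn = set(withdrawn)

    # Issued, included in the bulk data, and later withdrawn (the overlap).
    in_both = in_file & withdrawn
    # Absent from the file and explained by the withdrawn list.
    missing_withdrawn = (expected - in_file) & withdrawn
    # Absent from the file with no explanation (the real gaps).
    unexplained = expected - in_file - withdrawn
    return in_both, missing_withdrawn, unexplained


# Toy numbers only, not real patent data.
expected = range(1, 11)
in_file = [1, 2, 3, 5, 6, 8, 9, 10]
withdrawn = [3, 4]

in_both, missing_withdrawn, unexplained = compare_patent_sets(
    expected, in_file, withdrawn
)
print(sorted(in_both), sorted(missing_withdrawn), sorted(unexplained))
# prints: [3] [4] [7]

# The share of unexplained gaps quoted in the post: 306 out of 7.5M.
gap_share = 306 / 7_500_000
print(f"{gap_share:.4%}")  # prints: 0.0041%
```

With the real files, the first group would be the roughly 8,000 patents in both files and the third group would be the 306 unexplained gaps.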

I think the patent office should produce a “catchup” xml file containing the 306 valid patents that are not in the bulk xml grant files. The Patentsview people could then add them to their database and other bulk data consumers could do whatever they want to do with them. If the Patentsview people also excluded withdrawn patents, I’d be a lot happier as a consumer of their api. There are other flaws with the api, but this would be a step in the right direction.

(Some of the missing patents are listed on this page, and some listed on this page, for a total of 306 patents.)

A Rookie Mistake

There are around 11,000 registration certificates that are not online. They correspond to dead trademarks that have no legal standing. I was researching tool companies that held some of these missing trademarks. I requested copies from the patent office through a patent librarian I know. On one request I mixed up the serial number and registration number of the trademark I was interested in. The former number is not that useful and the latter is all important. On the other end of my request was an intern at the patent office. When I met her in person, I related the tale of how she schooled me on my rookie mistake. The group around us burst into laughter. It turned out she had been a teacher at some point and schooling people was nothing new to her!

Mike White’s excellent US Trademark number guide succinctly explains the trademark numbering peculiarities brought about by the Trademark Act of 1946. When my son was younger, he liked the cartoon Ben 10, which has a Null Void, somewhere in dimensional space that you don’t want to wind up. One of the side effects of the 1946 Trademark Act was that it created a Null Void of registration numbers (for reasons too complex to explain in parentheses, registration numbers 444,812 through 500,000 were never issued). The request I was schooled on was a request for a “registration number” 493,259, which was never issued. The registration certificate is online, where its serial and registration numbers can be seen if you don’t interchange them. (Some of the above is excerpted from this article.)
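Had I run my request through a check like the one below, I might have caught the mix-up myself. This is only a small Python sketch. The range comes straight from the paragraph above, while the function and constant names are made up for illustration.

```python
# Registration numbers in this range were never issued (range as stated above).
NEVER_ISSUED_FIRST = 444_812
NEVER_ISSUED_LAST = 500_000


def in_registration_gap(registration_number: int) -> bool:
    """Return True if the number falls in the never-issued range."""
    return NEVER_ISSUED_FIRST <= registration_number <= NEVER_ISSUED_LAST


# The number from my mistaken request.
print(in_registration_gap(493_259))  # prints: True
```

A number that lands in the gap is a good hint that it is really a serial number and not a registration number.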

Someone I know

I was adding the patents of Lee Valley Tools, a modern day tool manufacturer, to datamp.org when I came across a patentee who is a member of my tool club! The first name was different but the middle name, the name I know him by, and last name were the same, as was the small town in Georgia. I checked with him and it was indeed his patent! I assume he’s the only member of my tool club to have a tool patent. [Charles Paul Hamler 5,694,696] Stanley Rule & Level Co. had patented, but did not commercially sell, a similar insert, 837,978, issued ninety-one years earlier.

I purchased the tool from Lee Valley and I asked Paul to autograph it. I’m guessing I’m the only one on my block whose hand plane can be turned into a scraper plane! It will be a future chair moment if I come across another club member’s tool patent.

Patent Office Tie

A few years ago I spoke to a group of patent librarians who were at the patent office for a training seminar. Three things horrified me when I checked in and opened an otherwise innocent looking manila envelope. The first was the attendees list showing the number of people that I’d be speaking to. I knew roughly how many patent libraries there are but I hadn’t thought that some libraries would send more than one librarian. The second thing was the agenda showing that I would be the fourth and final speaker on the fourth and final day of the seminar. I had a deck of slides to present to people who would likely be powerpointed out by the time I would begin speaking. The third thing was a coupon for 15% off at the patent office’s gift shop. Yes, I know that doesn’t sound scary but it foiled my best laid plan. You see, I thought I was beyond clever when I found a neck tie on ebay that had patents on it, but there in everyone’s packet was a coupon showing the very same tie! What are the odds of that? Clearly I could not wear that tie. People would think that I was uncreative, unprepared and, worst of all, a cheapskate.

Luckily I was prepared and wore my backup tie that showed antique airplanes flying over a map making it look vaguely like mechanical drawings. Not even close on the appropriateness scale but no one could accuse me of being uncreative, unprepared or cheap. The really funny part was that in preparing for my speech I tried to find some of the patents on the tie, so I could incorporate them as examples in my speech. I tried, I mean I really tried in a pull-out-all-the-stops kind of way using every trick I knew, but I failed to find a single one of the “patents” on the tie. So there you have it in full exposé mode, the truth that couldn’t be suppressed; the patent office’s gift shop sells a tie with fake patents on it! If only I had Geraldo’s number…

Prior Art

I previously posted my origin story but it turns out to be not quite true as I found prior art! I thought finding an antique router plane was what triggered my interest in patents, that was until I visited my parents. They still live in the house I grew up in. On a shelf was this 1974 Parker Brothers game that I played as a kid.

The 12 game cards are real patents. The cards show their issue dates but not their patent numbers and most of their titles were changed, possibly to make them more amusing. “Massage apparatus”, for example, becomes “Pounds away weight reducer” on the game card. Finding the patents online became a game for me, a game within the board game. The USPTO web site only allows patent number, issue date and category searches for patents issued before 1976, and this was before google patents was a thing.

Here’s how the search for the Adjustable Clothes Pins went.

Step one: determine a likely uspto patent classification:

I searched the title field on the USPTO’s site for clothes pin in the “modern” patents (post 1975 patents are fully searchable). The search criterion was: ttl/"clothes pin" (in English, that’s searching the title field for the two word phrase clothes pin). At the time I did this, eleven patents were returned. I clicked on 4,945,613 to find its Current U.S. Classes: 24/501; 24/511

Step two: Issue a search in the non-modern (pre-1976) section where the only criteria allowed are patent number, classification and/or issue date. Fortunately wildcards are allowed on the advanced search page. My search was Query: “isd/11/9/1915 and ccl/24/$” Select years: “1976 to present [full-text]” (quotes not entered in the query box). In English that’s a patent issued on November 9, 1915 that is in classification 24 with any sub classification (there are usually a hundred or more sub classifications per class).

Fourteen patents meet these criteria. An image of each patent is displayed after clicking on the links returned from the USPTO site. The sixth one returned was the one I was looking for.
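The two steps above boil down to building two query strings. Here is a minimal Python sketch of that, assuming only the query syntax quoted above (ttl/, isd/, ccl/ and the $ wildcard). The function names are made up for illustration, and nothing here talks to the USPTO site.

```python
from datetime import date


def title_phrase_query(phrase: str) -> str:
    """Step one: search the title field for an exact phrase."""
    return f'ttl/"{phrase}"'


def main_class(classification: str) -> int:
    """Pull the main class out of a class/subclass string such as '24/501'."""
    return int(classification.split("/")[0])


def issue_date_class_query(issued: date, us_class: int) -> str:
    """Step two: issue date plus a class with any subclass ($ wildcard)."""
    return f"isd/{issued.month}/{issued.day}/{issued.year} and ccl/{us_class}/$"


print(title_phrase_query("clothes pin"))
# prints: ttl/"clothes pin"

# Classes found on 4,945,613 in step one.
classes = ["24/501", "24/511"]
us_class = main_class(classes[0])

print(issue_date_class_query(date(1915, 11, 9), us_class))
# prints: isd/11/9/1915 and ccl/24/$
```

The same two functions would work for any of the other game cards by swapping in that card’s issue date and a likely class.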

The game isn’t being made any more but it does appear on ebay pretty regularly. Below is the display of an rss feed that serves up a different playing card’s patent each day.

  • US Patent: 363,037
    Means and apparatus for propelling and guiding baloons Patentee: Charles Richard Edouard Wulff - Paris France Granted: 1887-05-17 1 of 12…

Ralph’s marking gauge

I’m not normally a devious person, or at least I usually don’t act on what the hilarious devil on my shoulder whispers in my ear. This story would be an exception. Ordinarily it is hard to find humor in intellectual property, but hopefully you will see it here.

My friend Ralph was into collecting patented marking gauges (a specialized woodworking tool used when parts need to interconnect or be of the same length, from a time when that was done using hand tools). He also led the creation of a web site to share the patent data associated with his collection and invited collectors of other patented tools to add their data to the site. This was before google patents was a thing, back when it was harder to find older patents.

The listing on the right in the picture was created by Ralph on the patented tool web site he helped launch (datamp.org). The tool on the left was being discussed on an antique tool listserver when an online tool dealer had it for sale on his web site. The tool dealer did not know what the tool was but the listing contained a patent date. I identified the tool by finding the patent even though the patent date in the listing contained a typo and then I looked to see if the patent was in datamp.

In Ralph’s listing it originally said ‘Not known to have been produced’ which I took to mean he didn’t own one or hadn’t ever seen one (something can be patented but never produced commercially). At the time Ralph owned so many marking gauges that it was nearly impossible to find one he didn’t already have. I saw an opportunity to both score a gauge for him as well as have a little fun. I figured buying it before posting about it was the right thing to do. It belonged in Ralph’s collection but, based on the advice I received, I neglected to mention the purchase in my post. To quote from it, “Quick, someone buy this thing before Ralph does :-)” as Ralph was also on the listserver and everyone knew of his collection. My devious side wanted him to see that it was sold when he viewed the listing on the tool dealer’s web site. I waited a half day or so until I let him know that I bought it and would sell it to him for what it cost me. Fun is fun but extortion is a line I wouldn’t cross for a friend. My shoulder devil and I thought it was pretty funny how we used Ralph’s own web site against him!

My Origin Story

Quite a few years ago now, I came across an antique router (the pictured woodworking tool, before ones with power cords were invented) with a patent date on it. I knew that my friend Ralph was into patent searching so how hard could it be? I tried to make sense of the US patent office’s categories in an attempt to find this patent. I gave up and in the end, I decided that the shotgun approach (can I trademark this?) would be best. I’d view all 1035 patents issued on the day in question. Thanks to a new feature (Jan 17, 2003) on the uspto site, I was able to generate urls to directly view the images I wanted.

After a while of shotgun(tm?) searching, I gave up on the notion of having a single favorite patent for this day. My favorite changed with nearly every image I retrieved. I was blown away by the diversity and peculiarity of the day. Seven different shoe related patents? Patented undergarments and hosiery? Two lawn mower patents? A book mark? A tonsil snare? Ouch. Perhaps even more painful, a patented scalp syringe. A patent assigned to Kodak (my first post-college employer) for film. Wicket and stake for indoor croquet? It might be easier to list what wasn’t patented on this day. Three tobacco pipes and a cap for one? A fountain pen? A neck tie? Seven game related patents? A Christmas tree holder? Honest, I’m not making this up, I’m not this creative. Four telegraph related patents? Three automobile turn signals? One of which involves a fake arm and hand dropping down to signal a turn. Ok, maybe this is my favorite of them all. A patented pitchfork attachment? Who knew there were pitchfork attachments, let alone patented ones? Where can I get one? What a day! A foldable golf club (golf stick)? A vehicle that looks like a city bus? Given the 46 pages you’d probably be able to build one for yourself without any infringement worries. A garment hanger? A stopping mechanism for roving and the like machines? We are fortunate to live in a time free from runaway roving machines, thanks to the patent of Mr Adonias D. Bolduc, of New Bedford, Mass. My hard drive runneth over with these images. Was every patent Tuesday like this? Fifty-four apparatuses (apparati)? It seems like including the word apparatus in the title was a surefire way to get your idea patented. I almost didn’t want to find the router patent, this was that interesting. For most of the patents, I viewed only the first image page. I was compelled to view more pages on some patents. How could I not read the description of crab wrapper and wrapping method?

Oh, I almost forgot, the patent I was looking for turned out to be for the thumb screw, not the router. I found it after viewing all but 164 of that day’s patents. See 1,541,518 or shotgun June 9, 1925 or shotgun another issue date (shameless plugs for datamp.org). Since finding the router and its patent, I’ve become hooked on creating tools to find patents, exploring patent apis and consuming bulk data. Had I known that the patent was for the thumb screw, I might have had success navigating the patent office’s classification system and none of what followed would have happened.

A slight misunderstanding

Here is another, hopefully funny, Intellectual Property related story. There is not a lot of humor in IP so when it does occur it should be shared, even if it puts its author in not the best light…

Hathi Trust is an online library that currently has over 17 million scanned books. It’s based at the University of Michigan but many libraries belong to the Trust and have contributed scanned books. For years I’ve been using three patent related collections: “llalwani’s Patent Indexes collection”, “llalwani’s Trademark Indexes collection” and “llalwani’s Official Gazette collection”. I had assumed that llalwani was an acronym for a contributing institution to Hathi Trust. A few years later I discovered more patent related books that weren’t part of a collection. The Hathi site said I could log in to create either public or private collections. I registered as rrjallen and created collections for the whole world to use: “rrjallen’s Canadian Patent Gazettes collection”, “rrjallen’s Patent Commissioner Decisions collection”, “rrjallen’s French Patents” and “rrjallen’s Great Britain Patent Office Publications”. The big advantage to a collection is that it can be searched as a whole rather than having to search each individual book. “llalwani’s Official Gazette collection” now contains 1,904 scanned gazettes. Searching the entire collection for something takes only one click. This is a very powerful advantage you wouldn’t have at a physical library containing the same books. I even created links into Leena’s patent gazette collection for easier navigation.

At some point, after creating my own collections, a light bulb went off inside my head. llalwani must be a user id, not an acronym! Over the years I had come across other patent related books that should be included in llalwani’s collections. The problem was that there wasn’t a way of contacting the person (or institution, as I thought at that time) who created the collection. When I figured out that llalwani was a person I tried googling for more information. Google found a University of Michigan web page that mentioned a Leena Lalwani and that her email address contained llalwani. I sent her an email asking if she was the llalwani that had created the patent collections at Hathi Trust. I said in the email if she wasn’t this person please disregard what has to be the weirdest email she’d ever receive! It turned out that she was indeed the person I was looking for. Further, I learned, she’s a patent librarian at the University of Michigan, not just a random person at the university. So in my mind llalwani went from the acronym of an institution to an ordinary person to a patent librarian!

What a great collaboration! It was much less work for me to ask her to add the books I discovered to her collections than for me to create similar collections of my own. Her patent index collection is one of the most used collections at Hathi Trust (it’s listed on the featured collection page at hathitrust.org). By having her update her three collections everyone in the world gains, most of all me! In my correspondence with Leena I had not mentioned that my degree is from a Big Ten rival, the University of Illinois. I wasn’t sure if she would be as cooperative if she knew that. It probably wouldn’t be a big deal unless I had gone to Ohio State (Michigan’s archrival)! I finally did divulge my alma mater during a presentation Leena was at, shortly after meeting her in person for the first time.

Chair Moments

I’ve mentioned my friend Ralph (seated in the photo) in a previous post or two. More than a few years ago, the two of us did a patent searching presentation for the antique tool collectors club that we both belonged to (Midwest Tool Collectors Association, mwtca.org). It was an hour long presentation during a three day conference and there were approximately 200 people in attendance. There was nothing else against it on the agenda so not all of the people were there by choice. In other words, not everyone was excited about learning how to search for patents. (The typical use case is finding the patent for an antique tool marked with a patent date.) It was right after lunch and I should mention that the average club member’s age is probably mid sixties.

Quick sidebar on Ralph: he was so into collecting antique marking gauges (a specialized woodworking hand tool used when parts need to interconnect or be of the same length or width) that he led the creation of a web site so he could share the patent data he had accumulated. He also invited other collectors to share their patent data for whatever tool they collected or researched. The site was launched in 2002, before google patents was a thing, and it currently has data for more than 68,000 patents. (Directory of American Tool and Machinery Patents, datamp.org)

As the presentation began, Ralph and I were introduced and we walked up to the lectern in a thundering silence. I did a brief introduction, then Ralph spoke and then I did the conclusion before opening up for questions. It was while Ralph was speaking that someone fell out of his chair and into the aisle! It remains a mystery which speaker put him into a post-lunch stupor but, to be clear, it was on Ralph’s watch that the event in question occurred! Since then I use the term “Chair Moment” when referring to something IP related that I found interesting or amusing. As in, I was so amused or excited by the discovery that I metaphorically fell out of my chair, to ironically twist what actually happened while my friend Ralph was presenting.

Vaughan family patents

In a previous post I mentioned my friend Ralph’s two patents and his great-great-grandfather’s purse lock patent. That’s three generations of patentless relatives bookended by two generations of patent holders. While researching a tool manufacturer, I found a family where all five generations held patents.

Alexander Vaughan, father
Sanford Vaughan, son
Howard Vaughan, grandson
Howard Vaughan, Jr., great-grandson
Charles Vaughan, great-great-grandson

Clearly this is an overachieving family! As further trivia, Alexander was a witness on Sanford’s patent D29,109 and Sanford was a witness on Alexander’s D29,767. See Vaughan Manufacturing’s web site for more information on the company’s history.