When apis fail you

Sometimes there isn’t a way other than screen scraping to get the data you want, which is unfortunate. I’d like to programmatically retrieve classification fields for the plant patents issued each Tuesday. I can’t use the patentsview api since its data lags behind, it’s updated roughly quarterly while the patent office’s site is updated each Tuesday. Plus the api does not return uspc classifications on newer plant patents as the patent offices has stopped producing the bulk file of them (the last file produced stopped with PP29260, issued April 24, 2018). The api also does not return cpcs that are now coming back on about half of the plant patents, as there is no bulk source of them (the bulk cpc file only contains utility patents, fans of reissued patents are also out of luck). See this page if you don’t believe me that some plant patents do get cpc assignments!

Similarly, I could use try to use the PEDS (Patent Examination Data System) api but only returns one uspc classification per patent when multiples are allowed1 and it also does not return cpcs. So, having no other free option, you can’t blame a guy if he makes requests weekly to patft and scrapes the page of data that is returned!

1If you want to check for yourself, these plant patents each had 4 uspc assignments when I scrapped them PP23484, PP23723, PP23924, PP24080, PP24201, PP24521, PP24634, PP24828. Compare peds and patentsview to patft to see the disparity.

Leave a Reply

Your email address will not be published. Required fields are marked *