Springer Publishing

Thursday 8 June 2017

International Patent Classification as a printable treemap

The IPC (International Patent Classification) can be downloaded in *.xml format!

XML is tree-structured, and so is the IPC itself: categories and subcategories of classes of patents. Tree-like data can be transformed into a treemap, which is a cool "square-ish" version of the same data: nested rectangles, each sized by how much it contains.

So I used R to do it.

I made it printable!
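
Before the real data, here is a toy sketch of the idea in base R. The three rows and their counts are made up for illustration (they are not real IPC entries), and the column names are my own. Each row is one leaf with its full path spelled out level by level. Summing the counts per level gives the totals that a treemap would draw as nested rectangle areas.

```r
# Toy hierarchy: one row per leaf, one column per level, plus a size.
# All values are invented for illustration only.
toy <- data.frame(
  section  = c("A", "A", "B"),
  class    = c("A.01", "A.01", "B.02"),
  subclass = c("A.01.B", "A.01.C", "B.02.D"),
  n        = c(3, 1, 2),
  stringsAsFactors = FALSE
)

# Roll the leaf sizes up to each parent level.
by_section <- tapply(toy$n, toy$section, sum)
by_class   <- tapply(toy$n, toy$class, sum)

print(by_section)
print(by_class)
```

The real code below builds the same kind of table, only with one row per IPC main group.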

First, download the IPC valid symbols XML file (the code below uses the 20170101 version).

library(XML)      # xmlTreeParse, xpathApply, xmlAttrs
library(treemap)  # treemap, treegraph

# Parse the file and collect the "symbol" attribute of every node that has one
symb<-xmlTreeParse("YOURDIRECTORY/ipc_valid_symbols_20170101/ipc_valid_symbols_20170101.xml",useInternalNodes=TRUE)
symb2<-unlist(lapply(xpathApply(symb,"//*",xmlAttrs),function(x) x["symbol"]))
# Keep only the symbols that are at least 14 characters long
symb3<-symb2[grep("^.{14}",symb2)]

#Treemapping IPC xml classes
# Dotted path per symbol: section.class.subclass.main group.subgroup
c4<-sapply(symb3,function(x) paste0(substr(x,1,1),".", substr(x,2,3),".", substr(x,4,4),".", substr(x,5,8),".",substr(x,9,14),collapse=""))
treemapready<-cbind(substr(c4,1,1),substr(c4,1,4),substr(c4,1,6),substr(c4,1,11),substr(c4,1,18))
# Count the entries under each main group; the labels come from the table itself,
# so labels and counts cannot drift out of order
counts<-table(treemapready[,4])
groups<-names(counts)
tmr4df<-data.frame(a=substr(groups,1,1),b=substr(groups,1,4),c=substr(groups,1,6),d=groups,e=as.numeric(counts))
treemap(tmr4df,index=c("a","b","c","d"),vSize="e")
treegraph(tmr4df,index=c("a","b","c","d"))
## Printable version: letter-size page, every rectangle filled white
pdf("ipc.pdf",width=8.5,height=11,paper='special')
treemap(cbind(tmr4df,f="#FFFFFF"),index=c("a","b","c","d"),vSize="e",type="color",vColor="f")
dev.off()

Check your working directory and go find the output PDF!
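
For anyone puzzling over the substr() calls above, here is a small sketch of how a symbol breaks down. It assumes the fixed-width layout that the slicing relies on: 1 character for the section, 2 for the class, 1 for the subclass, 4 for the main group and 6 for the subgroup. The helper name split_ipc and the two sample strings are my own, chosen only for illustration.

```r
# Hypothetical helper: split 14-character symbols into their hierarchy levels.
# Assumed layout: section (1), class (2), subclass (1), main group (4), subgroup (6).
split_ipc <- function(sym) {
  stopifnot(is.character(sym), all(nchar(sym) == 14))
  data.frame(
    section    = substr(sym, 1, 1),
    class      = substr(sym, 2, 3),
    subclass   = substr(sym, 4, 4),
    main_group = substr(sym, 5, 8),
    subgroup   = substr(sym, 9, 14),
    stringsAsFactors = FALSE
  )
}

# Two illustrative symbols in the assumed format.
parts <- split_ipc(c("A01B0001020000", "H04L0029060000"))
print(parts)
```

Each column corresponds to one nesting level of the treemap, from the outermost rectangle (section) inwards.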


As we say in Gaelic, "sin agad e!😀"
There you go!
