Springer Publishing

Friday 27 November 2015

Antisapstains, including chlorophenols!

Sapstain is the grey/blue stain in wood caused by wood-staining fungi such as Grosmannia clavigera.

Antisapstains are a small group of chemicals used in wood treatment and pulp mills. They are covered by law. Canadian Law (BC regulation) defines the following substances as antisapstains:
http://www.canlii.org/en/bc/laws/regu/bc-reg-300-90/latest/bc-reg-300-90.html

DDAC, IPBC, TCMTB, Cu-8

As well as chlorophenol, which is further defined as the sum of chlorophenols and chlorophenates:

2-chlorophenol, 3-chlorophenol, 4-chlorophenol and deprotonated forms.



Table of abbreviations, names, SMILES, CAS numbers and chemical formulae.

Abbreviation | Name | SMILES | CAS | Formula
DDAC | didecyldimethyl ammonium chloride | CCCCCCCCCC[N+](C)(C)CCCCCCCCCC.[Cl-] | 7173-51-5 | C22H48NCl
IPBC | 3-iodo-2-propynyl butylcarbamate | CCCCOC(=O)NCC#CI | 55406-53-6 | C8H12INO2
TCMTB | 2-(thiocyanomethylthio)-benzothiazole | N#CSCSC1=Nc2ccccc2S1 | 21564-17-0 | C9H6N2S3
Cu-8 | copper-8-quinolinolate | [O-]c1cccc2cccnc12.Cu | 10380-28-6 | C9H6NOCu
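
CAS numbers carry a built-in check digit, so a quick sanity check on a table like this one can be sketched in R. The function name cas_is_valid is my own and not part of any package; the rule it applies is the standard one (last digit = weighted sum of the other digits, counted from the right, modulo 10).

```r
# Validate a CAS Registry Number using its check digit.
# The check digit equals the sum of the other digits, each weighted by its
# position counted from the right (1, 2, 3, ...), taken modulo 10.
# cas_is_valid is a hypothetical helper name, not a library function.
cas_is_valid <- function(cas) {
  # Expect the usual pattern: 2-7 digits, 2 digits, 1 check digit
  if (!grepl("^[0-9]{2,7}-[0-9]{2}-[0-9]$", cas)) return(FALSE)
  digits <- as.integer(strsplit(gsub("-", "", cas), "")[[1]])
  check <- digits[length(digits)]
  body <- rev(digits[-length(digits)])
  sum(body * seq_along(body)) %% 10 == check
}

cas <- c(DDAC = "7173-51-5", IPBC = "55406-53-6",
         TCMTB = "21564-17-0", "Cu-8" = "10380-28-6")
sapply(cas, cas_is_valid)
```

A check like this only catches transcription errors; it says nothing about whether the number is assigned to the substance named next to it.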

Tuesday 27 October 2015

Open Science Working List

academia.edu www.academia.edu/ social network
altmetrics http://www.altmetric.com measuring scholarly impact
authorclaim http://authorclaim.org/ measuring researcher impact
citeulike http://www.citeulike.org citation bookmarking
crossref http://www.crossref.org article metadata search doi resolver
crowdometer http://crowdometer.org/
datacite https://www.datacite.org/ doi provider
depsy http://depsy.org measuring scholarly impact
faculty of 1000 http://f1000.com/
Fast Track Impact http://fasttrackimpact.com research impact
figshare http://figshare.com/ data repository
github https://github.com/ computer programming
Global Research Identifier Database (GRID) www.grid.ac database
google scholar scholar.google.com/ measuring researcher impact
hypothesis https://hypothes.is annotation organize collaborate
impactstory https://impactstory.org/ measuring scholarly impact
journal of brief ideas beta.briefideas.org/ journal
journalreview.org https://www.journalreview.org/
kudos www.growkudos.com research impact
mendeley https://www.mendeley.com/ citation bookmarking
microsoft academic search http://academic.research.microsoft.com/ measuring researcher impact
mozilla science lab https://www.mozillascience.org/
open access infrastructure for research in europe https://www.openaire.eu/
open knowledge https://okfn.org/ data repository
open researcher and contributor id http://orcid.org/ researcher identification
open science framework (OSF) http://osf.io research publishing framework
papercritic http://www.papercritic.com/ monitoring feedback and conversation
peerj https://peerj.com/
plos impact explorer http://altmetric.com/interface/plos.html scientific conversation impact
plum analytics http://plumanalytics.com/ measuring scholarly impact
public library of science https://www.plos.org/ library journal
publons https://publons.com/ scientific review
pubpeer https://pubpeer.com/
readermeter http://readermeter.org/ impact
researcherid www.researcherid.com/ researcher identification
researchgate http://www.researchgate.net/ social network
rio journal http://riojournal.com/ journal
Securing a Hybrid Environment for Research Preservation and Access, SHERPA http://www.sherpa.ac.uk repository development
sciencecard http://50.17.213.175/ measuring researcher impact
scienceopen https://www.scienceopen.com/ publishing network
scinote scinote.net electronic lab notebook
slideshare www.slideshare.net/
sparrho https://www.sparrho.com/ recommender search engine
the new reddit journal of science https://www.reddit.com/r/science/ journal
the winnower https://thewinnower.com/ journal
wikipathways www.wikipathways.org/
wikipedia https://www.wikipedia.org/
wiktionary https://en.wiktionary.org/
zenodo http://zenodo.org/ data repository

the content mine contentmine.org
journal of open humanities data
science.ai
Scihub http://sci-hub.io http://sci-hub.cc
The open journal http://theoj.org
Protocols.io https://www.protocols.io
Pubchase www.pubchase.com

Monday 19 October 2015

PubPeer - Scientific Conversation



PubPeer, "The Online Journal Club", is a platform for carrying on the conversation of science, mostly after work has been published.

Nuts and Bolts

Essentially, it appears to work by searching articles based on DOI or other unique identifier (e.g. PubMed ID) through the PubPeer interface. Once the article is found, you can provide comments on it.

Getting started

You become a member by inputting the DOI of a paper you published, selecting which author you are, then providing your institutional email address. ResearchGate is another service that requires an institutional email address to get started.

Providing Commentary

Of course there are guidelines on how to provide appropriate commentary through PubPeer.

The Browser Extension

I, as a good scientist, installed the browser extension. I tried it out searching the keyword "naphthenic acids". No PubPeer results on the first page. I also searched "cancer", "pubpeer" and "metabolomics" and there was no PubPeer commentary on any article on the first page.

Finally, I went to the PubMed featured comment for the day (Oct. 2, 2015) and saw the following page. The yellow bar above the article title shows how many comments are on PubPeer.



To access the PubPeer comment, you click on the white words "1 comment on PubPeer". You are then taken to PubPeer's webspace to explore the comment. The comment at PubPeer is pretty much the same comment below the article in PubMed Commons.






Commenting

I posted my first comment on PubPeer concerning an article about the synthesis of yaku'amide. This is how it looks!






Future

Will they permanently archive commentaries and/or commentary chains with DOIs? What is the difference between PubPeer and PubMed Commons? How is PubPeer different from Disqus?

I know there are subtle differences, but I am still waiting to hear back from those organizations. Until then, PubPeer remains another excellent tool for scientific commentary just waiting to explode!

Monday 5 October 2015

Sparrho - Scientific Recommendation


Sparrho ("sparrow") is a scientific recommendation service.


When I began playing with Sparrho, I got the feeling that it was similar to Google Scholar, but I knew it was different. I just couldn't tell how.


So I asked Sparrho myself!





Thanks so much! I also got an invitation to receive some "sparrhoswag". I'm not sure what it is, but it sounds good. Now I am trying to navigate sparrho.

--

Be it known that I am obsessed with naphthenic acids in oil sands process waters!

How can sparrho help me?

--

Well, the interface is sleek-looking, purple and starry! I am looking into a fascinating world. As I type in and save keywords, I am building a repertoire of articles that I can mark as relevant (checkmark) or irrelevant (X) for my purposes. It's like I am building a research topic channel. I can immediately share articles over a variety of networks and link to the location of the article online.

The only things that confuse me a bit are the 1-D, 2-D and 3-D network graph logos.

THIS is how Sparrho is different from Google Scholar.

Line: The 1-D graph image is a search which contains only the exact keywords you've entered for your channel.

Square: The 2-D graph image search includes keywords defined as a more general concept. Here is where you want more and more keywords to give a better overall context for the research.

Cube: How could it get better than that? Well, the 3-D graph image represents Sparrho recommending new articles you didn't think you would need!

Okay, let me go back to Sparrho then and play the game!



The Game



 A 1-D search for the keywords "naphthenic acids", "OSPW", and "oil sands" gives 75 hits. By the way, these 75 hits are mostly research that has been published in 2015. The oldest articles in this channel are 2012.


A 2-D search returns over 400 hits! Excellent. I tend to get hits concerning various aspects of naphthenic acids chemistry: biodegradation, toxicity, structure determination, etc. Extremely useful. If an article is currently considered "noise" to a channel, I can mark it as irrelevant. I can also dig up the articles I have marked by clicking the HISTORY button on the top right, and I can change the relevance status I have already applied to an article (for that particular channel).


A 3-D search produces 261 hits, which is fewer than the 403 from the 2-D search, but that doesn't mean the search has somehow failed. On the contrary, it succeeded by returning exactly the number of results it is supposed to return for the keywords and relevance scores supplied! Perhaps it suggests that my channel's keywords are quite 'directed' and do not have a diffuse set of connotations or definitions. I saw here some articles related to climate change and the Athabasca oil sands, as well as articles concerning oil sands soil nitrogen availability, and honouring indigenous treaty rights.

When you click on the information about an article, a pleasant green window called "People who read this also read" drops down underneath, listing related articles.



Summary

I am very excited to learn how to use Sparrho more effectively. I can envisage the 3-D search being extremely useful for academics who are charged with research that is very nebulous, publicly involved and has a lot of angles by which to approach. In fact, this describes all PhD projects, no matter how pessimistic you may feel! Using Sparrho can open you up to new research that is still directed toward your primary research interests and goals! Although I am developing analytical methods to characterize naphthenic acids, my efforts are directly related to policy and the wider industry. I believe it is my job to understand and be able to effectively manage the milieu in which my research is situated so I can have an impact there. Sparrho is helping.

P.S.: sparrhoswag?

Sunday 13 September 2015

Returning Google Search results in R - "mirex"

Introduction

When searching for specific information on the internet, the keywords we use often have multiple meanings. It is problematic when using statistical measures to gather information quickly: If you get 5 million results, how many of the results are directly related to the exact meaning you intend to search for? How many different meanings are there for a single word? Statistical metrics will easily lose sight of the range of meanings unless they are managed appropriately.

Think of the English word "love" (About 5.7 billion results on Google) and how many websites are dedicated to it. You may be looking for a detailed explication of the Greek notions of love as eros and agape, but end up on someone's careless Facebook post where 'love' is being used sarcastically. You may be directed to companies or people whose names include the word 'Love'.

Chemical nomenclature searching

In the world of chemistry, language is also extremely important and very complicated. IUPAC chemical nomenclature is a kind of agglutinative language, but additionally, many chemicals have their own trade names and traditional names. Some of these names are so old and common that they have acquired many different meanings and contexts over the years.

When searching for information about a chemical called "mirex", a prohibited pesticide, it is important to know that PubChem alone has amassed 121 "synonyms" and alternate names for this molecule. Using R, we can record the estimated number of hits returned in Google Search for each synonym of 'mirex'. The number of hits tells us something of the popularity of the word, but we cannot tell if there are other non-chemical meanings to the word that artificially inflate the results numbers.

R code and example

The following R code returns the approximate Google Search number of results for each entry in a vector.

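library(XML)
library(RCurl)
#LIST: a character vector (or matrix column) of identifiers, for example:
LIST<-c("mirex","HRS 1276")
vec<-numeric(length(LIST))
for(i in seq_along(LIST)){
#URL-encode each term so that spaces and symbols survive the request
url<-paste0("https://www.google.ca/search?q=",URLencode(LIST[i],reserved=TRUE))
page<-htmlTreeParse(getURL(url,ssl.verifyhost=FALSE,ssl.verifypeer=FALSE,
followlocation=TRUE),useInternalNodes=TRUE)
results<-unlist(xpathApply(page,"//div[@id='resultStats']",xmlValue))
#Keep the first run of digits and commas (the hit count), then drop the commas
vec[i]<-as.numeric(gsub(",","",regmatches(results,regexpr("[0-9][0-9,]*",results))))
}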

For the 121 synonyms listed in PubChem for the pesticide "mirex", the results can be displayed in a bar plot

barplot(vec,ylim=c(0,6e5))


The two tallest bars extend far beyond the top of the plot window, into the millions. The synonym for mirex which returned the most hits (about 16,400,000) was "HRS 1276" (without double quotation marks). One likely reason is that, when not enclosed in double quotation marks, HRS can match "hotel reservation service" or "hrs" as an abbreviation for 'hours', so the search effectively becomes '1276 hours'. When enclosed in double quotation marks, "HRS 1276" returns 973 results. Such a term can be said to have high "keyword search entropy", a concept I will explore at a later time.
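
The difference between a loose and an exact-phrase search comes down to how the query string is built. A minimal sketch, assuming a helper of my own naming (search_url) and relying only on URLencode() from base R:

```r
# Build a Google Search URL for a term, optionally as an exact phrase.
# search_url is a hypothetical helper; URLencode() comes with base R (utils).
# reserved = TRUE also encodes characters such as quotes, spaces and "&".
search_url <- function(term, exact = FALSE) {
  # Wrapping the term in double quotation marks requests the exact phrase
  if (exact) term <- paste0('"', term, '"')
  paste0("https://www.google.ca/search?q=", URLencode(term, reserved = TRUE))
}

search_url("HRS 1276")               # loose search
search_url("HRS 1276", exact = TRUE) # exact-phrase search
```

Running both forms for every synonym and comparing the two counts would give a rough feel for how ambiguous each term is.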

Summary

For a chemical with a high legal profile, such as mirex, it is important to provide appropriate search terms to find the information needed. Perhaps those search terms with the smallest number of hits are the most relevant terms. Perhaps "mirex" is the most popular synonym for the chemical, but what percentage of hits returned by Google Search of "mirex" relate to the pesticide and what portion relate to something else? More hits does not necessarily mean more popular.

Monday 7 September 2015

Cucumber + cherry

When eaten together, cherry and cucumber complement each other. I wouldn't say that they directly enhance each other's flavours, but they seem to produce a slightly unique and positive flavour. That unique flavour, however, is not as strong as the natural cherry flavour still present.

The cherry flavour appears to dominate the combination just slightly and the cucumber flavour is almost overpowered.

Cherry and cucumber are somewhat close in texture because they are both crunchy, so the combination is roughly cucumber-like in texture while still containing the full texture of both cucumber and cherry.

Thursday 3 September 2015

Using Google Books API and R to illustrate the general impact of a scientific work over time

In order to get a quick idea of how a book has affected scientific research over time, Google Books API provides that data and R provides the visual!

The book "The Carbohydrates", edited by Ward Pigman, is an example of a book that you might think has had a significant impact on the landscape of chemical science over the years. If another book cites this one, chances are Google Books will have a record. We can use the Google Books API to have a look.



R code:

library(XML)
library(RCurl)
library(RJSONIO)
result<-getURL("https://www.googleapis.com/books/v1/volumes?q=%22the%20carbohydrates%22%20pigman&startIndex=0",ssl.verifyhost=FALSE,ssl.verifypeer=FALSE,followlocation=TRUE)

#This returns a text object in R which consists of 10 results in JSON format.

parsed<-fromJSON(result) ##parse once and reuse (a variable named "list" would mask base R's list())

totalcount<-parsed[[2]] ##returns the total results number
parsed[[3]] ##returns all the listings for the 10 results
parsed[[3]][[1]]$volumeInfo$publishedDate ##returns the date the book was published for result number 1.

lapply(parsed[[3]],function(x) x$volumeInfo$publishedDate) ##returns the publishing date for all 10 books in the list.

##Again you will need to loop this with a new startIndex value each time until 440 is reached.
#Finally, categorize the book's impact over time by grouping the dates according to year (because
#this is most likely the only datum consistently available).
#The following loop will amass all the JSON returned.

totalcount<-fromJSON(result)[[2]] ##returns the total results number
list1<-list()
#Begin for loop
for(i in 0:floor(totalcount/10)){

#R lists are indexed from 1, so page i is stored at position i+1
list1[[i+1]]<-getURL(paste0("https://www.googleapis.com/books/v1/volumes?q=%22the%20carbohydrates%22%20pigman&startIndex=",(i*10)),ssl.verifyhost=F,ssl.verifypeer=F,followlocation=T)

}

#The following loop will amass only the published date of results. Less data to save and more time between calls (which is a good thing for the servers).

totalcount<-fromJSON(getURL("https://www.googleapis.com/books/v1/volumes?q=%22the%20carbohydrates%22%20pigman&startIndex=0",ssl.verifyhost=F,ssl.verifypeer=F,followlocation=T))[[2]] ##returns the total results number
vec<-c()
#Begin for loop
for(i in 0:floor(totalcount/10)){

vec<-c(vec,unlist(lapply(fromJSON(getURL(paste0("https://www.googleapis.com/books/v1/volumes?q=%22the%20carbohydrates%22%20pigman&startIndex=",(i*10)),ssl.verifyhost=F,ssl.verifypeer=F,followlocation=T))[[3]],function(x) x$volumeInfo$publishedDate)))

}

vec

#If you want to call quicker, use the URL to extract only the totalItems and publishedDate information by appending the following to the URL

#&fields=totalItems,items/volumeInfo/publishedDate

#This will return only the dates.

#Display vec in R as a kind of timeline graph using package igraph

#As a saveable function. Input your API key in double quotations and your query in double 
#quotations (URL-encoded).

GBapi<-function(query,key){
totalcount<-fromJSON(getURL(paste0("https://www.googleapis.com/books/v1/volumes?q=",query,"&startIndex=0&key=",key),ssl.verifyhost=F,ssl.verifypeer=F,followlocation=T))[[2]] ##returns the total results number

list1<-list()
#Begin for loop
for(i in 0:floor(totalcount/10)){

list1[[i+1]]<-fromJSON(getURL(paste0("https://www.googleapis.com/books/v1/volumes?q=",query,"&startIndex=",(i*10),"&key=",key),ssl.verifyhost=F,ssl.verifypeer=F,followlocation=T))

}

list1

}

Then use lapply() on the resulting list to extract the data.
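
That extraction step might look something like the sketch below. The function name published_years and the mock input are mine; the mock only imitates the shape I assume GBapi() returns (a list of parsed pages, each with an items element), so it runs without calling the API.

```r
# Pull the four-digit publication year out of every item in a list of
# parsed result pages (the shape returned by GBapi() is assumed).
# published_years is a hypothetical helper name.
published_years <- function(pages) {
  dates <- unlist(lapply(pages, function(page) {
    lapply(page$items, function(item) item$volumeInfo$publishedDate)
  }))
  # publishedDate may look like "1957", "1957-03" or "1957-03-01";
  # keep only the first four-digit run
  as.integer(unlist(regmatches(dates, regexpr("[0-9]{4}", dates))))
}

# Made-up stand-in for two parsed pages of results
mock <- list(
  list(totalItems = 3, items = list(
    list(volumeInfo = list(title = "A", publishedDate = "1957")),
    list(volumeInfo = list(title = "B", publishedDate = "1970-05-01")))),
  list(totalItems = 3, items = list(
    list(volumeInfo = list(title = "C", publishedDate = "1970-11"))))
)

years <- published_years(mock)
table(years)  # counts of books per year, ready for plot()
```

Items without a publishedDate simply drop out, which is worth remembering when comparing the number of dates to totalItems.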



And for comparison

#partition the plot space
par(mfrow=c(2,1))
#Plot one book first. xlim parameter makes sure the windows are the same size.
#gbapi is assumed to hold the list returned by GBapi() for this book.
dates<-unlist(lapply(gbapi,
function(x) lapply(x$items,function(y) y$volumeInfo$publishedDate)))

plot(table(unlist(regmatches(dates,gregexpr("[0-9]{4}",dates)))),
ylab="Number of Books on Google Books",xlim=c(1800,2015))

title(main="Some books published per year
relating to
'Computational Chemical Graph Theory' by Trinajstic")
#plot the other book below
plot(table(unlist(regmatches(vec,gregexpr("[0-9]{4}",vec)))),xlim=c(1800,2015),
ylab="Number of Books on Google Books")
title(main="Some books published per year
relating to
'The Carbohydrates' by Pigman")

Hopefully this kind of metric provides a useful way to approximate the scholarly impact a book has had on other books. In the sciences, textbooks and other books have always had an authoritative quality to them, so this metric may indicate a certain kind of scientific influence which may include teaching, information gathering and reputation all in one. 

Current difficulties are mostly related to the limits imposed on the user by the Google Books API. At a certain point, the number of books returned on a result page diminishes. A workaround for this is in the works.

Wednesday 2 September 2015

Using PubChem to match CAS numbers to identifiers

The PubChem REST API is the most straightforward option for new users because every request is simply a URL.
CAS registry numbers are ubiquitous chemical identifiers that have use in many areas of industry. It is important, therefore, to be able to connect other chemical identifiers to CAS RN, improving the visibility of chemicals on the internet.

1. Download data (containing CAS RN)

Domestic Substances List (Canada)
Non-Confidential TSCA Inventory (United States)

2. Search for each CAS RN individually through the PubChem REST API (leveraging R), returning the results as text.

"naphthenic acids"
https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/1338-24-5/synonyms/txt

R code:

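library(XML)
library(RCurl)
#LIST: your character vector of CAS RN, for example:
LIST<-c("1338-24-5")
#one URL per CAS RN; getURL() accepts a vector of URLs
synonyms<-getURL(paste0("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/",LIST,"/synonyms/txt"))

The txt output is plain text with one synonym per line. To work with XML instead, request /synonyms/XML and use xmlTreeParse() on each entry in the vector for slightly easier handling.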

3. Create list object in R of synonyms by dumping synonyms into list objects.

For each xml, get the value of each synonym node and save it as the i-th list entry.

OR

3b. Use REST API to make a call for identifiers

"naphthenic acids"
http://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/1338-24-5/property/InChI/TXT

3c. Append another column to the matrix which contains the identifier (InChI, SMILES, etc...)
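
Steps 2 to 3c can be sketched in a few lines of R. The helper names pubchem_url and parse_synonyms are my own, and the reply text below is made up so that the sketch runs offline; with RCurl loaded, the real replies would come from getURL(urls).

```r
# Build a PUG REST URL for a CAS RN looked up by name.
# pubchem_url and parse_synonyms are hypothetical helper names.
pubchem_url <- function(cas, what = "synonyms/txt") {
  paste0("https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/", cas, "/", what)
}

# Split a plain-text reply (one synonym per line) into a character vector.
parse_synonyms <- function(txt) {
  lines <- trimws(unlist(strsplit(txt, "\r?\n")))
  lines[nzchar(lines)]
}

LIST <- c("1338-24-5")
urls <- pubchem_url(LIST)

# Made-up reply standing in for getURL(urls)
replies <- c("Naphthenic acids\n1338-24-5\nNaphthenic acid\n")

# Step 3: one list entry of synonyms per CAS RN
synonyms <- lapply(replies, parse_synonyms)
names(synonyms) <- LIST

# Step 3c: a table with one row per CAS RN; an identifier column
# (InChI, SMILES, ...) could be added from pubchem_url(LIST, "property/InChI/TXT")
tab <- data.frame(CAS = LIST, n_synonyms = lengths(synonyms),
                  stringsAsFactors = FALSE)
tab
```

Keeping the URL building separate from the fetching makes it easy to pause between calls when the list of CAS RN is long.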



USEFUL POST:
http://depth-first.com/articles/2007/05/21/simple-cas-number-lookup-with-pubchem/


Monday 31 August 2015

Cucumber + blueberry

Cucumber and blueberry, when eaten together, produce a combination of taste which is overall positive.
Together, I wouldn't say they enhance each other directly, but they dull each other's stronger flavours. For instance, some of the distinctly cucumber notes are lessened at the same time the bitter tones of blueberry are lessened.

Noticeable is the buttery flavour of blueberries still present. Overall the total flavour is not strong, but smooth.

The texture is dominated by the crunch of the cucumber whereas the blueberry, almost creamy in the presence of cucumber, is barely noticeable from the point of texture.

When cucumber is eaten directly after blueberries, the flavour of the cucumber overpowers the blueberry flavour quickly but not immediately.

When blueberry is eaten after cucumber, the flavour of the blueberry dominates the cucumber flavour in much the same fashion as the cucumber did to the blueberry.

C + B = C + B + 0.5CB

Monday 24 August 2015

Carrot + nectarine (white flesh)

The acidity of the nectarine, though low compared to other fruits, seems to "kill off" the flavour of the carrot.

Raw carrot comes with its own rooty tang which has a cousin in parsnip and maybe ginger. Yet, when combined with nectarine, the carrot almost seems to have no unique taste of its own and all I can taste is nectarine with some tasteless crunch.

There's a huge difference in texture between a soft nectarine and a carrot stick. The carrot is hard, crunchy and dry while the nectarine is juicy, fleshy and soft.

Tuesday 18 August 2015

Cantaloupe + raw peanuts = slightly unpleasant

Cantaloupe, when eaten with raw peanuts, produces a distinct mixture of flavours which is on the neutral to slightly unpleasant side.


Peanuts are dry and crunchy with a slight chew and cantaloupe is wet and slightly firm. In terms of texture, the two do not mix very well.

Raw peanuts clash with cantaloupe aftertaste.

Cantaloupe clashes with raw peanut aftertaste, but to a lesser extent than the reverse.

For me, the collective aftertaste is generally slightly unpleasant.

Monday 10 August 2015

Canada Substance Groupings Initiative

There are currently nine groupings of substances which pose environmental and health risks to Canadians.

Aromatic Azo and Benzidine-based Substance Grouping
Boron-Containing Substances
Certain Organic Flame Retardants Substance Grouping
Cobalt-Containing Substance Grouping
Internationally Classified Substance Grouping
Methylenediphenyl Diisocyanate and Diamine (MDI/MDA) Substance Grouping
Phthalate Substance Grouping
Selenium-containing Substance Grouping
Substituted Diphenylamines Substance Grouping

A *.csv file containing all the substances is shown below. Copy and paste it and save it as *.csv.

Tuesday 4 August 2015

Recipe title word frequency for describing the food context of a searched keyword

A search for "adobo" using the Yummly API led to just over 6500 recipes containing the word "adobo".

Viewing the recipe titles then grouping the words according to frequency results in a table of the most frequent words associated with my searched keyword "adobo".
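
The grouping step could be sketched in R as follows. The function name title_word_freq and the three example titles are made up for illustration; real titles would come from the Yummly API results.

```r
# Count how often each word appears across a set of recipe titles and
# return the most frequent ones. title_word_freq is a hypothetical helper.
title_word_freq <- function(titles, top = 30) {
  # Lower-case, then split on anything that is not a letter
  words <- unlist(strsplit(tolower(titles), "[^a-z]+"))
  words <- words[nzchar(words)]
  freq <- sort(table(words), decreasing = TRUE)
  freq[seq_len(min(top, length(freq)))]
}

# Made-up titles standing in for the API results
titles <- c("Chicken Adobo", "Pork Adobo", "Slow Cooker Chicken Adobo")
freq <- title_word_freq(titles)
freq
# pie(freq) would draw the corresponding pie chart
```

The searched keyword itself will always top the table, so it may be worth dropping it before plotting.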

The top 30 most frequent words can be described by a pie chart:

Friday 24 July 2015

Tetra-tert-butylmethane and other chemical compounds that don't exist

I learned two things yesterday:

  1. You can buy tetra-tert-butylmethane from (at least) the following vendors:
    • Angene
    • Triveni
    • Hangzhou Sage Chemical Co., Ltd
    • Kingston Chemicals
    • Atomax Chemical Co., Ltd.
  2. Tetra-tert-butylmethane has never been made and likely cannot be made.

Aside from being a favourite high-school chemistry nomenclature problem, PubChem and ChemSpider both have entries for this compound. It has CAS number: 4103-17-7. And it does not exist.

So, I went searching.

Thursday 23 July 2015

Total synthesis of yaku'amide (again)

Total synthesis is always fantastic: You think of what you want to make and then you make it...that was a joke.

A few weeks ago, Inoue and group synthesized and "correctly" characterized the compound yaku'amide, a tridecapeptide named after the sample collection site at 屋久新曽根 (Yakushinsone), again.1,2

Friday 17 July 2015

On The Number of Phthalates


Recently, four cover stories of C&EN magazine were dedicated to the class of molecules called phthalates. Phthalate esters of fatty alcohols are generally used in industry as plasticizers, turning PVC into a more malleable form which makes up many toys for young children. There have been studies of the links between exposure to phthalate plasticizers and antiandrogenic effects in humans (do a PubMed search for phthalates). Reports are coming out about their reproductive effects on females. So, the race is on to develop the next "healthier" plasticizer.

Wednesday 15 July 2015

Turning spiroketals into teddy bears

In Tim Burton's film The Nightmare Before Christmas, there is a scene where the Oogie Boogie Man, upon his demise, splits open at the seams and out spill thousands of worms. Besides taxidermy, the Oogie Boogie Man is the only thing I thought of when I saw this graphical abstract on Twitter:

Wednesday 24 June 2015

SMILES notation: The Functional SMILES Perspective

SMILES Perspectives

SMILES notation is so much fun to play with! Another reason why SMILES is an appropriate acronym. Because SMILES is a graph/connectivity language in string format, there are many ways to enumerate bond paths and subgraphs in molecules.

Wednesday 20 May 2015

Chemical notations - working list

The following is a working list of chemical notations used in chemistry. It does not contain most commercial formats (SDF, MOL, Gaussian, XYZ, etc...) but instead focuses on formats which are more linear and human-readable (i.e. not heavily based on coordinates).

Saturday 11 April 2015

Chemicals in Canadian Law

Save the following data as .csv, or when importing into data software, use "," delimited. Next, replace all "(COMMA)" strings with a "," character as needed (especially when searching websites).

The R code outlining this action is below. You need to insert your own text where it says string or apply it to a character vector containing strings.
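
A minimal sketch of that replacement, assuming a wrapper of my own naming (restore_commas) around base R's gsub(); the example string is made up.

```r
# Replace every literal "(COMMA)" placeholder with a real comma.
# fixed = TRUE treats the pattern as plain text, so the parentheses
# are not interpreted as a regular-expression group.
# restore_commas is a hypothetical helper name.
restore_commas <- function(string) {
  gsub("(COMMA)", ",", string, fixed = TRUE)
}

# Works on a single string or on a whole character vector
restore_commas("Benzene(COMMA) 1-chloro-4-(trichloromethyl)-")
```

Because gsub() is vectorized, the same call can be applied to an entire column after the file has been read in.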