
I’ve written several times here about the Make Data Count project and its major output to date, the Data Citation Corpus, currently at version 4 (see “The fourth release of the Data Citation Corpus incorporates data citations from Europe PMC and additions to affiliation metadata”). In June, Make Data Count launched a Kaggle competition with the goal of developing a tool that will process articles (in either PDF or XML format), extract data