
Natural history museums have long been valuable repositories of data on species diversity. These data have been critical for fostering and shaping the development of fields such as biogeography and systematics.
A number of the APIs we interact with (e.g., the PLOS full-text API and USGS’s BISON API, via rplos and rbison, respectively) expose Solr endpoints. Solr is an Apache-hosted project: a powerful search server.
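As a rough illustration of what querying a Solr endpoint involves, here is a minimal sketch in Python that only builds the request URL. The parameters `q`, `fl`, `rows`, `start`, and `wt` are standard Solr query parameters. The base URL, the helper name `build_solr_query`, and the field names are made up for illustration and are not the actual PLOS or BISON endpoints.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, query, fields=None, rows=10, start=0):
    """Build a Solr select URL (hypothetical helper, not part of rplos or rbison)."""
    params = {
        "q": query,      # the search query, e.g. "title:ecology"
        "rows": rows,    # number of results to return
        "start": start,  # offset into the result set, used for paging
        "wt": "json",    # response format
    }
    if fields:
        # fl limits which fields come back for each document
        params["fl"] = ",".join(fields)
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, an assumption for this sketch only
url = build_solr_query(
    "https://example.org/solr/select",
    "title:ecology",
    fields=["id", "title"],
    rows=5,
)
print(url)
```

Packages like rplos and rbison presumably wrap this kind of URL construction so that you can pass the same parameters from R.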
rplos is an R package to facilitate easy search and full-text retrieval from all Public Library of Science (PLOS) articles, and we have a small feature that we aren’t sure is useful. I don’t actually do any text-mining for my research, so perhaps text-mining folks can give some feedback. You can quickly get a lot of results back using rplos, so perhaps it is useful to quickly browse what you got.
Upcoming Book on Open Science with R. We’re pleased to announce that the rOpenSci core team has just signed a contract with the CRC Press/Taylor and Francis R series to publish a new book on practical ways to implement open science in your own research using R. Given all the talk about the importance of open science, the discussion often lacks practical suggestions on how one might actually incorporate these practices into their
The Global Biodiversity Information Facility (GBIF) is a warehouse of species occurrence data, collected from many different sources. Our package rgbif allows you to interact with GBIF from R. We interact with GBIF via their Application Programming Interface, or API. Our last version on CRAN (v0.3) interacted with the older version of their API; this version interacts with the new version of their API.
We are building a taxonomic toolbelt for R called taxize, which gives you programmatic access to many sources of taxonomic data on the web. We just pushed a new version to CRAN (v0.1.5) with a lot of changes (see here for a rundown). Here are a few highlights of the changes.
We have previously written about creating interactive maps on the web from R, with the interactive maps hosted on GitHub. See here, here, here, and here. A different approach is to use CartoDB, a freemium service with a SQL interface to your data tables that provides a map to visualize the data in those tables.
Previously on this blog we have discussed making geojson maps and uploading them to GitHub for interactive visualization with USGS BISON data, and with GBIF data, and on my own personal blog. This is done using geojson, a file format based on JSON (JavaScript Object Notation) in which you can specify geographic data along with any other metadata.
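To make the geojson format a little more concrete, here is a minimal sketch in Python (not the R code used in the posts) that turns a list of occurrence-like records into a geojson FeatureCollection. The helper name `occurrences_to_geojson`, the record keys, and the single sample record are made up for illustration. The structure itself follows the geojson convention that point coordinates are given as longitude, then latitude.

```python
import json


def occurrences_to_geojson(records):
    """Convert records with latitude/longitude keys to a geojson FeatureCollection.

    Hypothetical helper for illustration; every key other than
    latitude/longitude is carried along as metadata in "properties".
    """
    features = []
    for rec in records:
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # geojson orders coordinates as [longitude, latitude]
                "coordinates": [rec["longitude"], rec["latitude"]],
            },
            "properties": {
                k: v for k, v in rec.items()
                if k not in ("longitude", "latitude")
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up sample record, not real BISON or GBIF data
records = [{"name": "Puma concolor", "latitude": 37.77, "longitude": -122.42}]
collection = occurrences_to_geojson(records)
print(json.dumps(collection, indent=2))
```

Saving that output to a file with a `.geojson` extension gives you the kind of file the posts describe uploading for interactive viewing.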
Open access week is here! We love open access, and think it’s extremely important to publish in open access journals. One of the many benefits of open access literature is that we likely can use the text of articles in OA journals for many things, including text-mining. What’s even more awesome is some OA publishers provide API (application programming interface) access to their full text articles.
I attended the recent ALM Workshop 2013 and data challenge hosted by Public Library of Science (PLOS) in San Francisco. The workshop covered various issues having to do with altmetrics, or article-level metrics (ALM). The same workshop last year definitely had a feeling of "we don’t know x, y, and z", while the workshop this year felt like "we know a lot more". There were many great talks - you can see the list of speakers here.
With the US government shut down, many of the data APIs provided by the federal government are down. We write R packages to interact with many of these APIs. We have been tweeting about which of the APIs used by our R packages are down, but we thought we would write up a proper blog post on the issue. NCBI services are still up! NCBI is within NIH, which is within the Department of Health and Human Services.