Open Tools and R Packages for Open Science
A while ago we onboarded an exciting package, codemetar, by Carl Boettiger. codemetar is an R-specific information collector and parser for the CodeMeta project. In particular, codemetar can digest metadata about an R package in order to fill the terms recognized by CodeMeta. This means extracting information from DESCRIPTION but also from e.g. continuous integration badges in the README!

You might have read my blog post analyzing the social weather of rOpenSci onboarding, based on a text analysis of GitHub issues. I extracted text out of Markdown-formatted threads with regular expressions. I basically hammered away at the issues using tools I was familiar with until it worked! Now I know there’s a much better and cleaner way, which I’ll present in this note. Read on if you want to extract insights about text, code, links, etc.

rgbif was seven years old yesterday! What is rgbif? rgbif gives you access to data from the Global Biodiversity Information Facility (GBIF) via their API.

This post is the 1st post of a series showcasing various rOpenSci packages as if Maëlle were a birder trying to make the most of R in general and rOpenSci in particular. Although the series use cases will mostly feature birds, it’ll be the occasion to highlight rOpenSci’s packages that are more widely applicable, so read on no matter what your field is! Moreover, each post should stand on its own.

This week version 2.0 of the mongolite package has been released to CRAN. Major new features in this release include support for MongoDB 4.0, GridFS, running database commands, and connection pooling. Mongolite is primarily an easy-to-use client to get data in and out of MongoDB. However, it supports increasingly many advanced features like aggregation, indexing, map-reduce, streaming, encryption, and enterprise authentication.

In this technote I will outline what phylotaR was developed for, how to install it and how to run it with some simple examples. What is phylotaR? In any phylogenetic analysis it is important to identify sequences that share the same orthology – homologous sequences separated by speciation events. This is often performed by simply searching an online sequence repository using sequence labels.

eBird is an online tool for recording bird observations. The eBird database currently contains over 500 million records of bird sightings, spanning every country and nearly every bird species, making it an extremely valuable resource for bird research and conservation. These data can be used to map the distribution and abundance of species, and assess how species’ ranges are changing over time. This dataset is available for download as a text file;

Motivation A few weeks ago, as part of the rOpenSci Unconference, a group of us (Sean Hughes, Malisa Smith, Angela Li, Ju Kim, and Ted Laderas) decided to work on making the UMAP algorithm accessible within R. UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction technique that allows the user to reduce high dimensional data (multiple columns) into a smaller number of columns for visualization purposes (github,