
rOpenSci - open tools for open science

Open Tools and R Packages for Open Science
Packages, Tesseract, Images, OCR, Tech Notes | Computer and Information Sciences | English
Author: Jeroen Ooms

Last week Google and friends released the new major version of their OCR system: Tesseract 4. This release builds upon 2+ years of hard work and has completely overhauled the internal OCR engine. We have now also updated the R package tesseract to ship with the new Tesseract 4 on macOS and Windows. It uses the new engine by default, and the results are extremely impressive!

Community, Events, Unconf, Icebreaker, Welcome | Computer and Information Sciences | English
Author: Stefanie Butland

While many people groan at the thought of participating in a group ice breaker activity, we’ve gotten consistently positive feedback on ours from people who have been to recent rOpenSci unconferences, along with lots of requests for a detailed description of how we do it. This post shares our recipe, including a script you can adapt, a reflection on its success, examples of how others have used it, and some tips to remember.

Community, Events, Community Call, Images, OCR | Computer and Information Sciences | English
Author: Stefanie Butland

rOpenSci’s software engineer / postdoc Jeroen Ooms will explain what images are, under the hood, and showcase several rOpenSci packages that form a modern toolkit for working with images in R, including opencv, av, tesseract, magick and pdftools.
🕘 Thursday, November 15, 2018, 10-11 AM PST; 7-8 PM CET (find your timezone)
☎️ Find all details for joining the call on our Community Calls page. Everyone is welcome. No RSVP needed.

Literature, XML, Parsing, Pubchunks, Fulltext | Computer and Information Sciences | English
Author: Scott Chamberlain

pubchunks is a package grown out of the fulltext package. fulltext provides a single interface to many sources of full text scholarly articles. As part of the user flow in fulltext there is an extraction step where fulltext::chunks() pulls parts of articles out of XML format article files.

Community, Data, Software Peer Review, Packages, Data Extraction | Computer and Information Sciences | English
Author: Thomas Klebel

Every R package has its story. Some packages are written by experts, some by novices. Some are developed quickly, others were long in the making. This is the story of jstor, a package which I developed during my time as a student of sociology, working on a research project about the scientific elite within sociology.

Orcid, Rorcid, Desc, Tech Notes | Computer and Information Sciences | English
Author: Maëlle Salmon

Proper identification of individuals is crucial for acknowledging and studying their scientific work, be it journal articles or pieces of software. In this tech note, one year after CRAN started supporting ORCIDs, we shall explain why and how to use unique author identifiers in DESCRIPTION files.

Why use ORCIDs on CRAN?

When analyzing the authorship of CRAN packages, one can look at authors’ names and email addresses.

Packages, Gif, Animation, Video, Av | Computer and Information Sciences | English
Author: Jeroen Ooms

At rOpenSci we are developing a suite of packages that expose powerful graphics and imaging libraries in R. Our latest addition is av, a new package for working with audio/video based on the FFmpeg AV libraries.

Community, Events, Community Call | Computer and Information Sciences | English
Author: Stefanie Butland

Do you have code that accompanies a research project or manuscript? How do you review and archive that code before you submit a paper? Our next Community Call will present different perspectives on this hot topic, with plenty of time for Q&A. What’s the culture of the group around feedback and code collaboration? What are the use cases? What are some practices that can be adopted?

Community, Software, Software Peer Review, Packages, Outcomerate | Computer and Information Sciences | English
Author: Rafael Pilliard Hellwig

Background

Surveys are ubiquitous in the social sciences, and the best of them are meticulously planned out. Statisticians often decide on a sample size based on a theoretical design, and then proceed to inflate this number to account for “sample losses”. This ensures that the desired sample size is achieved, even in the presence of non-response.

Markdown, R Markdown, Xml2, Commonmark, Tinkr | Computer and Information Sciences | English
Author: Maëlle Salmon

Remember our recent post showing that one can wrangle Markdown files programmatically without regex? That tech note showed how to convert Markdown bodies to XML in order to extract information from them.

Community, Software, Software Peer Review, Packages, Data Access | Computer and Information Sciences | English
Author: Max Joseph

Hundreds of thousands of people in east Africa have been displaced and hundreds have died as a result of torrential rains that ended a drought but saturated soils and engorged rivers, resulting in extreme flooding in 2018. This post will explore these events using the R package smapr, which provides access to global satellite-derived soil moisture data collected by the NASA Soil Moisture Active-Passive (SMAP) mission and abstracts away some of