Natural Sciences, English, Hugo

Donny Winston

Made as simple as possible, but not simpler.

Leave beacons in your code. I would have avoided a silly error if a variable named xgb_train_data had instead been named, for example, xgb_train_data_filepath. When you can’t leave globally unique, persistent, resolvable identifiers (GUPRIs), mind your beacons. References: F. Hermans, The Programmer’s Brain: What Every Programmer Needs to Know About Cognition, pp. 28-30. Shelter Island, NY: Manning, 2021.
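
As a minimal sketch of the naming point above: the variable names follow the post, while the helper read_rows, the file name train.csv, and the sample values are hypothetical and exist only to make the example runnable.

```python
import tempfile
from pathlib import Path


def read_rows(filepath: Path) -> list[list[str]]:
    """Read a comma-separated text file into a list of rows."""
    return [line.split(",") for line in filepath.read_text().splitlines() if line]


# Ambiguous: does this hold the data itself, or a path to it?
# xgb_train_data = Path("train.csv")

# With a beacon, the suffix states what each variable holds.
with tempfile.TemporaryDirectory() as tmp_dir:
    xgb_train_data_filepath = Path(tmp_dir) / "train.csv"  # a location on disk
    xgb_train_data_filepath.write_text("1.0,0\n2.5,1\n")   # made-up sample rows
    xgb_train_data_rows = read_rows(xgb_train_data_filepath)  # the loaded contents
```

Here the _filepath and _rows suffixes make it hard to pass a path where loaded data is expected, which is the kind of mix-up the post describes.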

Add a CITATION.cff file to your Git repository. The Citation File Format is automatically rendered on GitHub and usable by Zenodo and Zotero. Already have a DOI? Let’s see about a DOI-to-CFF tool. Looks like there’s doi2cff, but it’s currently restricted to DOIs on Zenodo that are tagged as software releases.
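
For a sense of what such a file contains, here is a rough sketch that assembles a minimal CITATION.cff by hand. It assumes the Citation File Format 1.2.0 keys (cff-version, message, title, authors, and the optional version and doi). The function build_citation_cff and the title and author values are hypothetical placeholders, and this is not how doi2cff works.

```python
import json


def build_citation_cff(title, family_names, given_names, version=None, doi=None):
    """Return the text of a minimal CITATION.cff file (CFF 1.2.0 keys)."""

    def quote(value):
        # JSON string quoting is also valid YAML double-quoted syntax.
        return json.dumps(value)

    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f"title: {quote(title)}",
        "authors:",
        f"  - family-names: {quote(family_names)}",
        f"    given-names: {quote(given_names)}",
    ]
    # Optional fields are written only when supplied.
    if version is not None:
        lines.append(f"version: {quote(version)}")
    if doi is not None:
        lines.append(f"doi: {quote(doi)}")
    return "\n".join(lines) + "\n"


# Placeholder metadata, not a real project.
cff_text = build_citation_cff("example-project", "Doe", "Jane", version="0.1.0")
print(cff_text)
```

Saving the returned text as CITATION.cff at the repository root is what makes it discoverable by the services mentioned above.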

Datasets are easier to reuse if they use standards that are well-established, particularly in a given domain. A first approach is to ask around – ask people with whom you coauthor, people you trust in your field, etc. A follow-on approach is to examine the “graph reputation” of relevant standards, particularly if they may be represented as resources with outbound links.

Lean manufacturing aims to reduce waste in production processes and to reduce response times to consumers from producers. Womack and Jones¹ authored five key principles for lean thinking in the context of manufacturing: Value: Identify the value of a product to a consumer. Value Stream: Identify the minimal process (steps, time, information, material) to produce the value.

The World Wide Web Consortium (W3C) publishes a range of specifications and guidelines which help move web standards forward. However, even when restricting scope to the Latest version of specifications with the status Recommendation and with the tag Data, there are currently 77 of them! See https://www.w3.org/TR/?tag=data&status=REC&version=latest

I noticed a pattern at the top of each case study listed by Stemma.ai, which provides data catalog software as a service based on the open-source Amundsen code. Each case study’s so-called “Data Stack” comprises up to four distinct categories of functionality – Data Catalog, Data Warehouse, ETL, and Business Intelligence.

For evolvable data exchange, you need to be able to continually add qualified references galore so that participants can reason by analogy – i.e., each new thing resembles something known before. This is FAIR principle I3, which depends on I1 and I2 for robustness. Subscribe to get short notes like this on Machine-Centric Science delivered to your email. M. Minsky, The Society of Mind. New York: Simon and Schuster, 1986, p.
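
One way to picture a qualified reference is as a link that names the kind of relationship, not just the target. The sketch below is illustrative only: the identifiers (dataset:A, paper:X), the relation names, and the helper targets_of are all hypothetical and not drawn from any particular vocabulary.

```python
# Each reference is a (subject, relation, target) triple.
# All identifiers and relation names are made up for illustration.
qualified_references = [
    ("dataset:B", "wasDerivedFrom", "dataset:A"),
    ("dataset:C", "isNewVersionOf", "dataset:B"),
    ("dataset:C", "isDescribedBy", "paper:X"),
]


def targets_of(subject, relation, references):
    """Return every target linked from `subject` by the named relation."""
    return [t for s, r, t in references if s == subject and r == relation]


# Because the relation is named, a consumer can ask a specific question
# ("what is dataset:C a new version of?") instead of following bare links.
previous_versions = targets_of("dataset:C", "isNewVersionOf", qualified_references)
```

A bare list of related identifiers would tell a machine that dataset:C and dataset:B are connected, but not how. The named relation is what lets a new thing be understood in terms of something already known.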

I’ve been recording introductions to each of the 15 FAIR Principles and releasing them as episodes of my Machine-Centric Science podcast (https://podcast.polyneme.xyz/). I just released the 13th one, featuring an overview of various data and code licenses. Listen here. Full transcript below (but also linked to via the episode landing page): ====== Hello, and welcome to Machine-Centric Science.