
Improvements for man and machine in scientific publishing

The need for research results to be more findable, accessible, interoperable and reusable (FAIR) has prompted researchers, database managers and publishers to keep looking for new and better ways to make information machine readable. An equally important goal is creating articles that readers can actively engage with, rather than passively taking in information by reading a published article. One tool that readily improves the machine readability of data is Frictionless Data, a data standard developed by the Open Knowledge Foundation. A study published in the open science journal GigaByte shows that Frictionless Data not only dramatically improves machine readability, it can also turn the normally static numbers in an article into dynamic entities that readers can interact with directly. The work demonstrates that Frictionless Data can address two important needs at once: enabling both humans and machines to use and directly engage with scientific results in a dynamic way.

The Frictionless Data integration was performed on a paper by a team of researchers from the University of Melbourne in Australia, led by Professor Anthony Papenfuss, whose lab has long been a proponent of open and reproducible research. The team ensures that data, source code and all other shareable components of their research are freely available to the community. This makes their work especially well suited to new tools that make published work dynamic and actively usable. The article introduces two new open source tools, svaRetro and svaNUMT, for interpreting difficult structural variations in genome analysis. These tools help to annotate genomic events that are missed by most genome assembly pipelines, such as retrotransposition events and insertions of mitochondrial DNA fragments into nuclear DNA, which add to the complexity of genome sequences and matter for understanding gene function and genome evolution.

The openness and availability of all the research components behind these tools and analyses created a perfect opportunity to implement Frictionless Data and make the article much more machine readable. In the process of adding it to the article, Raniere Silva of City University of Hong Kong, while on a FAIR data internship, made the serendipitous discovery that Frictionless Data might also play a role in improving human interaction with the article. For the first time, the figures have been regenerated interactively. In the example here, readers can not only view the summary information presented in the figure, they can also hover over the data points to see the exact numbers and information behind them, and manipulate the figure itself to view the specific components that interest them.

Silva says, “My biggest surprise was that the Frictionless Data Package specifications, together with the popular Plotly tool, have functions to convert a static visualization into a dynamic visualization. This greatly lowers the barrier for many researchers to produce dynamic data visualization, as they only need to add a line or two to their code. GigaByte took a huge leap forward in publishing dynamic data visualization, and I hope this inspires other journals to publish dynamic data visualization.”

When asked what they found most helpful in this process, the authors said, “The interactive figures are a great addition to the article. We found that the interactive features made the labels easier to read, especially for the label-rich figures, and we liked that the figures were accessible in SVG format, allowing them to be viewed and edited without losing detail in the figure information.”

To promote the use of Frictionless Data in more published articles, Silva has written a detailed handbook. It includes an introduction to the use of Frictionless Data, an introduction to the specifications, short working examples for creating an author's own data package, and longer examples, based on articles published in the journals GigaScience and GigaByte, that illustrate the creation and use of Frictionless Data. The goal is for the handbook to serve as a starting point for a conversation within the scientific community about how to embrace Frictionless Data. The handbook also provides resources and guidance that make it easier for data producers to submit articles with these packages to data publishers, such as GigaScience Press.
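To give a sense of what creating a data package involves, here is a minimal sketch in Python. It is not taken from the handbook or from the article: the file name, column names and values are hypothetical placeholders, and it uses only the Python standard library rather than the official Frictionless tooling. It writes a small CSV file alongside a `datapackage.json` descriptor that records each column's name and type, which is the kind of information that lets software read the table without guessing.

```python
import csv
import json
import tempfile
from pathlib import Path


def build_descriptor(name, csv_path, field_types):
    """Return a Data Package descriptor (a plain dict) for one CSV resource."""
    csv_path = Path(csv_path)
    return {
        "name": name,
        "resources": [
            {
                "name": csv_path.stem,
                "path": csv_path.name,  # path relative to datapackage.json
                "format": "csv",
                "schema": {
                    "fields": [
                        {"name": field, "type": ftype}
                        for field, ftype in field_types.items()
                    ]
                },
            }
        ],
    }


def write_example_package(directory):
    """Write a placeholder CSV and its descriptor; return the descriptor path."""
    directory = Path(directory)
    csv_path = directory / "event-counts.csv"
    # Hypothetical placeholder rows, not data from the article.
    rows = [
        ("sample", "event_type", "count"),
        ("placeholder_a", "retrotransposition", 3),
        ("placeholder_b", "numt", 1),
    ]
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)

    descriptor = build_descriptor(
        "example-package",
        csv_path,
        {"sample": "string", "event_type": "string", "count": "integer"},
    )
    descriptor_path = directory / "datapackage.json"
    descriptor_path.write_text(json.dumps(descriptor, indent=2), encoding="utf-8")
    return descriptor_path


def header_matches_schema(descriptor_path):
    """Check that the CSV header matches the field names in the descriptor."""
    descriptor_path = Path(descriptor_path)
    descriptor = json.loads(descriptor_path.read_text(encoding="utf-8"))
    resource = descriptor["resources"][0]
    expected = [field["name"] for field in resource["schema"]["fields"]]
    with (descriptor_path.parent / resource["path"]).open(
        newline="", encoding="utf-8"
    ) as handle:
        header = next(csv.reader(handle))
    return header == expected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = write_example_package(tmp)
        print(path.read_text(encoding="utf-8"))
        print("header matches schema:", header_matches_schema(path))
```

In practice, authors would more likely generate and validate the descriptor with the Frictionless tooling described in the handbook; the hand-built dictionary above is only meant to show how little metadata is needed to get started.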

Also of interest, beyond the inclusion of Frictionless Data, is that because the figures were regenerated interactively, the process was for the first time combined with a CODECHECK certificate of reproducible computation.

The use of Frictionless Data, and all the downstream elements it enables, serves as a transformative step in scientific publishing: it improves machine readability and reproducibility, and transforms scientific papers from their old static format into a 21st-century living document. These kinds of innovative, data-driven additions to the publishing process are part of the reason why GigaByte was the winner of the ALPSP Innovation in Publishing Award 2022, presented this month.

– This press release was originally posted on the GigaScience website