Faculty of Language: Scientific Publishing in a Modern World: A Thought Experiment

Wednesday, July 27, 2016

Scientific Publishing in a Modern World: A Thought Experiment

Norbert and regular readers of this prestigious blog may have seen me participate in some discussions about open access publishing, e.g. in the wake of the Lingua exodus or after Norbert's link to that article purportedly listing a number of arguments in favor of traditional publishers. One thing that I find frustrating about this debate is that pretty much everybody who participates in it thinks of this issues as how the current publishing model can be reconciled with open access. That is a very limiting perspective, in my opinion, just like every company that has approached free/libre and open source software (aka FLOSS) with the mindset of a proprietary business model has failed in that domain or is currently failing (look at what happened to OpenOffice and MySQL after Oracle took control of the projects). In that spirit, I'd like to conduct a thought experiment: what would academic publishing look like if it didn't have decades of institutional cruft to carry around? Basically, if academic publishing hadn't existed until a few years ago, what kind of system would a bunch of technically-minded academics be hacking away on?

As you might have already gleaned from the intro, this post is rather tech-heavy. That said, I'll try my best to keep things accessible for people who have made the much more reasonable decision of not spending many hours a week on Linux Today, HowtoForge, the Arch wiki, or the website of the Electronic Frontier Foundation. Putting aside technical matters, the publishing model we would see in my hypothetical scenario has three fundamental properties:

Individualistic Every scientist is a publisher. Every scientist is a reviewer. Every scientist is an editor. Instead of an infrastructure that locks scientists into specific roles with lots of institutional oversight, scientists directly share papers, review them, and curate them.
Crowd Sourced By making every scientist an active part of the publishing model, you open up the way for a 100% crowd-sourced publishing and archiving infrastructure. All the issues that are usually invoked to motivate the current model --- administrative overhead, copy-editing, hosting costs --- are taken care off collectively by the community.
Fully Open The current system involves lots of steps that are hidden away from the community at large. Authors do not share the production pipeline that produced the paper (software, source code, editable figures, raw data), only the final product. Publishers do not share the tools they use for administration, hosting and editing. Reviewers do not share their evaluation with the community, only editors. The hypothetical system makes all of this available to the community, which can learn from it, critically evaluate it, and improve it as needed.

To keep things concrete, I will (try to) explain how the system works from the perspective of two participants: Joe, who's an average guy and just wants to get his results out there, and Ned, the tech nerd. We'll go through the following steps: writing, distribution, reviewing.

Writing

Joe and Ned do not use the same tools in their writing, but both use systems that separate content from presentation. This means that the source format they write in allows for many different output files to be produced, depending on purpose: pdfs for printing and presentations, epub for e-readers, html for online publishing.

Joe likes things to be simple, so he chose a markdown dialect known as pandoc. Markdown allows Joe to throw out his word processor and instead replace it by a text editor of his choice. As a Windows user, Joe eventually wound up with Notepad++ and its pandoc plugin Joe is happy that he didn't have to pay for Notepad++ (he's sick of spending money on the newest version of MS Office every few years). Notepad++ also loads much faster than any word processor on his aging laptop. Learning markdown was also easy for Joe, as it turns out that its syntax is very close to his digital note taking habits anyways. With the extensions of the pandoc format, he can do almost everything he needs: headings, font formatting (bold, italic), paragraphs, lists, footnotes, figures, tables, links, all the basics are there. Semantic formulas could be easier to write, but it works. Trees are automatically produced from labeled bracketings, although the syntax for adding movement arrows took some getting used to. And for glossed examples he had to adapt an old trick from his word processor and use invisible tables to properly align all words. But Joe is confident that once more linguists use pandoc, these kinks will be quickly ironed out.

In contrast to Joe, Ned believes that a steeper learning curve and heavier time investment is often worth it in the long run. He has put many hours into learning Latex, in particular tikz, with the major payoff that he now has perfect control over his writing. With tikz he can handle even the most complicated trees and figures, typsetting semantic formulas is pure joy, glosses work effortlessly, and he can write mathematical formulas that would make every proprietary typesetting software sweat. Ned has to pay special attention to portability, though, since some Latex tricks do not work well with HTML and epub instead of pdf. Recently, Ned has also discovered that he can replace the Latex engine by Luatex, which allows him to extend Latex with Lua scripts. Overall, Ned is confident that his time investment has paid off and that Latex will still be around for many decades to come thanks to a large community that has been going strong since the mid 80s.

Both Joe and Ned also love that the source files for their papers are now in plain text and can be read by anybody with a text editor. Compilation to an output format increases readability, but the content of the paper is perfectly clear from the source code itself. This ensures that no special software is needed to open and read these files. Even 50 or a 100 years from now, researchers won't have a problem reading their papers --- for if civilization has forgotten how to read plain text files, it has forgotten how computers work.
Since everything is plain text, Joe and Ned can also put their papers under version control, e.g. git. This allows them to record the entire history of the paper, from the first few sentences to the final product. They can also easily sync the papers to an external server --- Ned rolled his own via cgit, Joe went with the more user-friendly github. With easy syncing and detailed version control, it is also very simple for the two of them to write papers collaboratively.

Joe and Ned now happily write their papers using their own workflow. There's no publisher telling them what software or template to use, how long their paper may be, nor whether they should use American or British spelling. They each produce the paper that they think conveys their ideas best while being as pleasant to read and look at as possible. They realize that this freedom also means that they cannot rely on somebody else to fix things for them, but with great power comes great responsibility.

Distribution

Both Joe and Ned now have several papers that they want to share with the world. They make them available on their personal websites in a variety of formats, but it occurs to them that this is not a particularly good way of going about things. Sure, by hosting them on their website they clearly signal that they wrote these papers and endorse them in the current form. But their website is not a good venue for promoting their work, nor is it a safe backup. What if their website goes down? What happens after they die? There has to be a better way of doing this.

Joe and Ned briefly consider other options, in particular paper repositories such as Lingbuzz and arXiv. But those have no guarantee of availability either, and they show just how easy it is for Joe and Ned's work to get lost in an uncurated sea of papers. And then there's of course the issue of cost: if a few servers get hammered by thousands of downloads and uploads every hour, whoever has to keep those servers running needs to pay big bucks for hardware and system administration. This makes it hard for volunteers to shoulder the burden, and for-profit repositories cannot be relied on in the long run. It seems that any solution with a single point of failure is no solution at all.

Ned quickly realizes, though, that distributing a paper is not the same as hosting a paper. Taking a hint from pirates and Linux distros, he decides to use peer-to-peer file sharing. He creates a torrent of his paper (all output formats + the whole version control history) and puts a magnet link to it on his website. He also uploads these magnet links to a number of archives. Ned makes sure that his paper is always seeded by running a torrent client on his home router, and he asks everybody who downloads the torrent to keep seeding it. So now Ned can rely on the community to keep his paper available even if his website and all the repositories go offline.

Because Ned is lucky enough to live in a hypothetical world where ideas succeed based on their merits, other researchers follow Ned in adopting peer-to-peer distribution of their papers. As the number of users grows, so does the number of seeders, and the network becomes more and more resilient to failure. Moreover, paper repositories no longer host pdfs but just magnet links, which only take up a few bytes each instead of several megabytes. This reduces bandwidth usage by a factor of 1,000, and the size of the repositories shrinks from terabytes to megabytes. All of a sudden, everybody with a little bit of hard drive space can make a full backup of these repositories or run a mirror. Tech-savvy researchers do so across the field, greatly improving resilience. It is now impossible for a paper repository to disappear --- if one server gets shut down, hundreds of mirrors are ready to take its place. Even if all servers were to magically disappear over night, many researchers would have local backups on their computers that they could share online to get a new server started. Since the actual distribution of files is completely decoupled from the distribution of magnet links, even a total loss of servers and backups would not mean a loss of papers --- researchers would just have to reshare the magnet links for the torrents, which are still alive and well in the torrent network.

Libraries and professional societies also take notice and start creating dedicated backup torrents that collect entire years or decades of publications under a single magnet link (actually, they create a script to scrawl the web and share it with the community, so there is little actual work involved). At first, these much larger torrents (several GB per year) are shared only by professional institutions and power users like Ned with lots of network storage. But as the prices for mass storage keep plummeting, even Joe finds that he can pay his debt to the scientific community by purchasing a NAS with several TB for 200 bucks. The NAS comes with a torrent client built in, so all he has to do is turn it on and seed these backup torrents. Over the years, seeding becomes an expected part of academic life, just like reviewing and administrative duties are in our world --- in contrast to the latter, though, the time commitment is a few minutes per year at most.

But Ned, eternal tinkerer that he is, still isn't completely happy with the system. Recently some trolls played a prank on him: they downloaded one of his papers, replaced all figures by 4chan memes, and shared the modified version of the paper as a very similarly named torrent. Many of his colleagues fell for the scam and wrote him dismayed emails about his homophobic agenda. In order to prevent such abuse in the future, Ned decides that torrents need something like a certificate of authenticity. Since Ned already has a PGP key for encrypting email, he decides to sign all his torrents with his key --- torrents that aren't signed with his key clearly aren't his. Again people like the idea, and everybody starts signing their torrents. Professional societies jump on the train and offer double signing, the service to sign torrents with their society key that have been signed by one of their members. This makes it very easy to design a filter that only accepts torrents signed with one of these society keys. Some nifty programmers also develop a tool that allows other academics to sign torrents they downloaded and verified for correctness, creating a distributed web of trust in addition to the central verification via professional societies.

Another thing that irks Ned is that the papers are shared in a distributed manner, while magnet links are not. If the evil dictator of Tropico wanted to limit access to scientific papers, they could DNS block all paper repositories. Tech-savvy users would get around a DNS block in minutes, of course, and mirrors can be created faster than they can be blocked. But it still makes it much harder to access papers for most researchers. So Ned looks around a bit and learns about Freenet and Zeronet, which extend the peer-to-peer concept to websites. Ned starts a distributed magnet link repository that can be shared by the community just like any other torrent, without any central servers or centralized DNS records. Now only deep packet inspection could restrict access to these repositories, but since the traffic is encrypted this isn't possible, either. The result is a network that is completely hosted by the scientific community, which guarantees that it is freely accessible around the globe as long as this community exists.
Joe and Ned now both live in a world where they can easily create and distribute papers. They can rest safely in the knowledge that the burden of paper distribution and archiving is shouldered by the whole community. But while papers are easier to share than ever before, it is also harder than ever before to find good papers. In a sea of thousands of papers, it is hard to separate the wheat from the chaff.

Evaluation and Review

What Joe and Ned's world still lacks is a system to indicate the quality of a paper. That is not the same as missing reviews. Reviews are easy, because they do not differ from any other paper: any academic can write a review and distribute it as a torrent. If the torrent isn't signed (and the author didn't include their name in the review), the review will automatically be anonymous. But if a researcher wants to know whether a paper is worth an hour of their time, the answer cannot be to spend several hours tracking down reviews and reading them. There must be a system to quickly gauge the quality of a paper within a few seconds, and to easily find more in-depth reviews.

The typical, user-based review system of Amazon, Netflix and Co will not do. It requires complicated software (at least a full LAMP stack), is tied to specific paper repositories, makes backups and mirrors much more complex, and does not work with Freenet or Zeronet. Again we need something that is platform independent, community-hosted, easy to backup, and built on robust technology that will be around for many years to come.

Note that these are all properties of the paper distribution system in Ned and Joes' world, so the best choice is to directly integrate the review system into the existing infrastructure. We want two levels of review: shallow review, similar to facebook likes, and deep review, which mirrors modern peer review.

Shallow review amounts to adding up how many people like a paper. In other words, we want to know how trusted a paper is by the community, which takes us to a point we already mentioned above: the web of trust. Even good ol' Joe now understands that torrents can be signed to indicate their authenticity via a web of trust, and the same web of trust can be used to indicate the quality of a paper. Instead of just a verification key, researchers have three types of keys: a verification key (to guarantee that they authored the paper), a yea key (for papers they like), and a nay key (for bad papers). After reading a paper, Joe can sign a paper to indicate its quality, or just do nothing --- e.g. if he felt the paper is neither particularly good nor particularly bad, or he doesn't qualify to judge, and so on. By tallying up the positive and negative signatures of a paper, one can compute an overall score of quality: 87%, 4 out of 5 stars, whatever you want. Readers are free to define their own metric, all they have to do is count the positive and negative signatures and weigh them in some manner. Ned will happily define his own metrics, while Joe goes with a few that have been designed by other people and seem to be well-liked in the community. So the shallow review is neutral in the sense that it only creates raw data, rather than a compound score.

Having three keys instead of one slightly complicates things since one now has to distinguish between keys signing for authenticity and keys signing for quality. But that is something for people even more technically minded than Ned to figure out. Once a standard has been established, the torrent clients for end-users like Joe just need an option to indicate what key should be used for signing. Joe picks the right option in the GUI and thereby casts his digital vote. Paper repositories are updated at fixed intervals with the web-of-trust scores for each torrent. Some dedicated services will be started by the community or commercial enterprises to analyze the readily accessible raw data and convert it into something more insightful, for instance what kind of researchers like your papers. With a minor extension of the infrastructure, Joe and Ned now enjoy a system where each paper has quality data attached to it that can be converted into whatever metric is most useful to authors, readers, universities, and so on.

But a score still isn't exactly a good way of assessing or promoting a paper. Nobody goes through a paper repository to look for the most recent papers with 4+ stars.And that one paper has much more likes than another says little about their relative scientific merit. Ned is worried about this, but before he even has time to come up with a solution, somebody else does it for him: many researchers are already blogging, and many of them use their blogs to review and promote papers. This is the seed for a much more elaborate and professional system for deep review.

In order to attract an audience, blogs need to specialize. Over time, then, some of them start to focus on paper reviews, garnering them a devout following. The blogger writes a review of a paper, taking great care to indicate the reviewed version in that papers version control history (we'll see in a second why that matters). He or she also signs the paper with a positive or negative key, depending on overall evaluation. If the blogger is well-known, metrics that turn signatures into scores may take this into account and assign this signature more weight than than others, which we might consider a reflection of the blogger's impact factor. Some tools will also be able to pick out the signatures of prominent bloggers and do a reverse search to automatically load the review of the paper. Other tools keep track of signatures of torrents and notify an author when a signature by a prominent reviewer has been added to one of their papers. They can then take this review into account, revise the paper, and incorporate the new version into the torrent (that's why it matters that reviews link to specific commits in the version history). The reviewer, in turn, can revise their review if they're so inclined. Since blog posts are produced like any other paper, they too can be put under version control and distributed in a peer-to-peer fashion. This creates a reviewing ecosystem where reviewers and authors can interact in a dynamic fashion, all changes and modifications are preserved for history, both papers and reviews are readily accessible to the community, and readers can define their own metrics to determine if a paper is worth their time based on who signed it.

But things do not stop here. To improve their coverage and grow their audience, some of the blogs join forces and form reviewing networks, which they run under flowery names such as Linguistic Inquiry and Natural Language and Linguistic Theory. These review networks are very prestigious entities, so their signatures can affect scores a lot. Since they have a large readership, they are also essential in getting many people interested in your paper. Many reviewing blogs also seek out experts in the community for reviews, who can do so anonymously if they want. Reviewers are paid by having their signing key signed by the reviewing blog. Since those are prestigious keys, metrics can once again decided to value the signature of an academic more if it is signed by a reviewing network. Consequently, many academics are eager to review for these blogs in order to increase their own prestige and the influence they wield over the reception of other papers. Without any intervention by Ned, a reviewing system has naturally developed that is incredibly detailed and comprehensive while being easy enough for Joe to participate in.

Wrapping Up

So there you have it, my pipe dream of what publishing would look like if it wasn't tied down by existing conventions and instead was newly designed from the ground up with modern technology in mind. I skipped some points that I consider minor:

copy editing: author should read carefully or hire a copy editor if they absolutely want one; dedicated community members can suggest paper revisions via forks and pull requests on git repositories (one of the many advantages of plain text + version control)
doi: bad solution since it's centralized, and not really needed if you have magnet links; but if you absolutely want to you can register each magnet link under a doi
page numbers: make no sense for HTML and epub, so switch to paragraph numbering
aggregators: just like blog aggregators, these provide meta-reviews and promote the best promotional blog entries
conference proceedings: dedicated signing key for validation and review + a separate blog, which may also be just a category on some bigger blog

And then there's also some systemic issues that are hard to fix, in particular the Matthew effect (the rich get richer, the poor get poorer) and ratings hysteria. However, those issues also exist in the current system. Eradicating them is hopeless, I think, though one would like to have some mitigating strategies in place. The fact that signatures are metric neutral is helpful, as every reader can define their own metrics, and professional organizations can give recommendations on what a good metric should like. If you have any other suggestions, or you feel the need to burst my naive tech bubble, the comments section is all yours.

13 comments:

TracyJuly 28, 2016 at 8:02 AM
This comment has been removed by the author.
ReplyDelete
Replies
TracyJuly 29, 2016 at 8:48 AM
it might be the case that I missed this in my reading, but is there any space for double blind reviews in this system? Nothing's 100% effective, but double blind is a good way to mitigate some worries like subconscious or conscious gender biases at the review stage. The system proposed seems to have more points where biases like that could have an effect
ReplyDelete
Replies
ewanJuly 30, 2016 at 3:25 PM
I feel like there's something missing from the deep review system, and the comment I'm writing now is a case in point. There's no incentive to actually change the paper in response to the reviews. There's every incentive not to - thinking is hard, and doing the ensuing work takes time. In the current system, in contrast, you (1) don't get published, and then (2) don't get jobs/funding if you don't appropriately respond to reviewers (which ideally amounts to "improve the paper").

Thought experiment: what kind of response are you inclined to give to this comment? My impression is that everyone's default response to blog comments is to give a reply that ranges between thoughtless rage and sort-of-thought-out deflections. The chances are pretty low that someone turns around and says, "Ah, I see your point. I take back point 2 completely, and replace it with this new, completely re-thought solution." Once you've thought through and written the damn thing, it's over with, and you have no reason to go back and think through and write any part of it again.

The only potential incentive system here is maybe the paper's popularity, mediated maybe by its likes, and the likes themselves. I'm skeptical as to whether that's enough to make people improve their papers, both because I just don't think it's enough and because I have a feeling the likes and/or the popularity are not going to be responsive to the revision. If the first version was bad, no one's going to go back and read the revision I'd wager.
ReplyDelete
Replies

Add comment

Faculty of Language

Comments