Comments on Faculty of Language: Right sizing ling papers

Norbert (2016-10-31 16:09):

I screwed up and put Mark de Vries' comment on the wrong thread. I deleted it from there and am putting it up here. It seems that Google doesn't like him and is preventing him from posting. I am sorry. I am sorry that I don't know how to read. Here is Mark's comment:

When -- a long time ago -- I apologized to the late Hans den Besten for the many footnotes in a chapter of my PhD thesis, citing a writing advisor according to whom the use of footnotes means that the text is ill-structured, Hans just shrugged his shoulders and said reassuringly that those guys simply don't understand the complexity of the matter.

Having said that, yes, we all agree that the average syntax paper tends to be too long and convoluted (obviously not one size fits all). This is counterproductive for all parties: readers, reviewers, and authors themselves. I suppose most of us, including me, are 'guilty' of participating in a culture where long papers are the norm. But there is no reason for individual blame, or to state that the field is anti-theoretical or insufficiently sophisticated. It is not so strange that authors prefer to be visibly recognizable as experts on the topic rather than risk the accusation of being ignorant of various related matters. But clearly things have gotten out of hand.
Reviewers and editors play a key role here:
-- Reviewers are somehow tempted by the system to come up with every possible counterargument or potentially problematic data point (from any language) they can think of.
-- Authors are expected to cite and to some extent discuss every work ever published on the topic. Evidently, this is untenable in the long run.
-- Editors require authors to respond to everything reviewers say.

Now what? Encouraging individual authors to be more succinct is totally insufficient. We'd need an active Shorter and Clearer Paper Movement. Here are some simple suggestions for its party program:
-- EDITORS and REVIEWERS must be convinced that each review should consist of two parts. The first evaluates the soundness of the argument and the clarity of the core proposal of the paper at hand. The second may contain helpful further references, additional data, thoughts, etc., which are not supposed to play a crucial role in the overall assessment.
-- AUTHORS try to clearly highlight the core argument structure of the paper. If there is more to discuss, the first part of the paper deals with the essentials, and then there can be sections ("more concerning x", "additional thoughts about y", ...) which can be skipped by readers. Relevant additional data can go in appendices, etc. (So the total page count can still be high if necessary, but the core of the paper is shorter.)
-- THE FIELD should reach a new consensus concerning citation. Do we think it is fine for an author to tell his/her own story, simply refer to an overview article for further information, and only cite/discuss other work where it is really important?
(Mind you, the linguistic citation index will go down over time.)
-- JOURNALS accept that there can be follow-up articles that do not elaborately summarize all the foundations.
-- READERS appreciate that papers are not all-encompassing.
-- EVERYONE bears in mind that "...the most worthwhile scientific books [papers] are those in which the author clearly indicates what he does not know; for an author most hurts his readers by concealing difficulties." (Évariste Galois, 1811-1832)

One final remark. I don't understand Norbert's claim that the point of departure for every paper must be a clearly defined problem. Yes, we tell this to undergrad students for didactic reasons, and it's generally not a bad rule of thumb. But what happened to actual discoveries, new ideas, original deductions, explorations, ...? Of course one can always artificially construct a 'problem' with hindsight, but that may be entirely beside the point.

Avery Andrews (2016-10-31 14:10):

This seems good to me. Another, unrelated thing that might make papers more useful, if a few sentences longer, is a 'minimum prerequisites' statement (bundled with the acknowledgements, perhaps) that would name some works that you really have to have a reasonable grasp of in order to understand the paper. So for my latest lingbuzz upload, that would be the chapters on coordinate structure and glue semantics in Dalrymple's 2001 book.
This is a proposed adaptation to the fact that papers are much more likely to be stumbled upon by people with insufficient background who nevertheless might find the contents interesting and useful, so just telling them what to read first might be helpful.

Tim Hunter (2016-10-31 13:50):

I think there is one (very minor) improvement we could make upon the standard presentation of starred and unstarred examples: when an unstarred example is unstarred *because* it's been reported as acceptable, it should be marked with a check mark (or something) instead of being left unmarked. The current method makes no distinction between reporting that a certain experiment resulted in a judgement of acceptability, and providing the stimulus of that experiment. (Usual caveats about how what we care about is contrasts, not binary status.) In the past perhaps this wasn't really a problem, but in modern "experimental syntax" papers, for example, there are times when a sentence needs to be presented without yet making any claim about its acceptability (e.g. in a methods section where you're showing the materials), and the obvious thing to do is to write down the sentence without a star (or any comparable mark), but this looks the same as a reported finding of acceptability. (Also, occasionally in older papers, authors would say things like "It would also be interesting now to test sentence (30), but unfortunately it's basically impossible to give judgements on", and then sentence (30) has to be presented with some weird annotation ...
perhaps a question mark, but then you have to clarify that it's not the usual kind of question mark, etc.)

Perhaps I'm being optimistic, but I think changing this practice would (subtly) clarify/reinforce the actual nature of what we're doing, and the sense in which an acceptability judgement is a little psychological experiment. It seems plausible (I don't actually know the history) that the starred/unstarred notation caught on, and made a lot of sense, in pre-Chomskyan linguistics, when the absence of a star meant something more like "yes, this is in the language" or "yes, I found this in a corpus" or simply "yes, this sentence exists"; in that mindset, there's something relatively natural about only marking (what we would now describe as) the unacceptable things, and it makes less sense to distinguish between (the existence of) the sentence and the finding that it is judged acceptable. But nowadays a report that something is acceptable is just as much a report as a report that something is unacceptable.

So while I agree that it would be silly to make the data in linguistics papers look "more sciencey" by forcing everything to take the form of graphs and tables, perhaps this one minor modification could clarify the sciencey nature of it slightly without significantly changing the way we do our work.

Avery Andrews (2016-10-30 14:36):

I don't think there's any way in which the presentation of starred vs unstarred examples can be significantly improved, because the significant contrasts just do not fit into any kind of pre-established patterns such as those that support graphs and tables.
In fact, I think the organization of linguistics papers is somewhat similar to that of math papers, and for a similar reason: in math, the true, interesting and nonobvious statements are thinly scattered amongst the false, boring or obvious ones, according to no intelligible form of overall pattern.

What we can do is relegate to appendices or online repositories additional information about the data that it is now necessary, or at least highly desirable, to have. For example, each grammatical vs ungrammatical contrast is hopefully a well-chosen representative of a very large number of similar examples, so you put the other ones you looked at, all of the relevant examples you found in your corpus, etc., somewhere other than in the main text. Ditto details of surveys, etc. This doesn't affect the basic structure of the papers; rather, it is a way of handling information that was typically not provided at all 40-50 years ago, but definitely should be now.

Peter Svenonius (2016-10-30 13:24):

For anybody who does want to permanently archive a data set that doesn't fit into a paper, the University of Tromsø hosts an open linguistic data repository called "TROLLing" (Tromsø Repository of Language and Linguistics): https://opendata.uit.no/dataverse/trolling

See here for a notice on LINGUIST List about it:
http://linguistlist.org/issues/25/25-4338.html

JAL (2016-10-30 13:01):

The idea of archiving material related to papers is important.
(I've been part of a working group on data citation in linguistics: https://sites.google.com/a/hawaii.edu/data-citation/ There'll be a panel session at the LSA.) It's unclear how that will help shorten papers, though. Right now, when an author makes a claim, they provide representative data supporting that claim. Reviewers and readers can look at that data and evaluate it (e.g., there's a confound because of animacy/focus particles/verb class/etc.). In a review situation, it's then up to the authors to find data that doesn't have the confound. In a reading situation, later readers can also notice factors whose relevance was discovered long after publication (e.g., d-linking, etc.). If we relegate all data to an appendix, it'll be like reviewing abstracts: having to flip back and forth to see the data, which is often annoying. (Also think endnotes vs footnotes.) If we don't provide representative data, but just put all data in an archive, then reviewers and readers will be more likely to just "trust" the authors, rather than sifting through much more data to see if there are any confounds. That's likely to reduce the quality of papers in that more confounds will go unnoticed. So improving the structure of papers is a worthwhile goal, but it strikes me as not so easy.

Anonymous (2016-10-29 21:09):

It's interesting to hear that papers are getting longer in linguistics when in other fields, e.g. CS, psychology, and many sciences, the trend has been the opposite. Those fields have made just as much progress as linguistics, so long papers aren't inevitable.
Those other fields must have better ways of summarizing previous results, or a better distribution of workload between survey and research papers, or different priorities for papers.

Regarding the justification of assumptions, I'm with Avery: if your assumptions get you interesting new results, that's the only support they need. I think syntax is fairly unique even within linguistics in that there is a strong expectation to situate technical assumptions within an overarching view of language. Some syntax papers start out with long introductions about Minimalism, third factors, the role of Merge, and how this picture can be interpreted in a specific way to yield a set of assumptions from which they derive a specific phenomenon. I think these concerns are best left for a separate paper, which would be the linguistics analogue of your typical Science/Nature paper. First you show that your assumptions get the job done, then we talk about their conceptual implications.

As for data, there I agree with Norbert. *Nobody* should like the current way we present data. Imagine a psycholinguistics paper where the descriptions of the experiments and the stimuli are scattered over 50 pages; no journal would publish that. But that's exactly what many linguistics papers are like.

At the very least, each paper should come with a digital supplement that contains all the example sentences with glosses and acceptability judgments, information about how these judgments were obtained, whether there were noteworthy disagreements among informants, and so on. In contrast to a paper, the supplement can contain as much data as you want and include an excruciating amount of additional information. With the right technical tools, supplements could even include toy grammars of your analysis that can be fed into an MG parser to ensure that your analysis works for all the data. And of course phrase structure trees and derivation trees for all the examples.
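One entry in a supplement like this could be as simple as a structured record. Here is a minimal sketch in Python; the field names, the example sentence, and all values are hypothetical, purely to illustrate the kind of information such an entry might carry, not a proposed standard:

```python
import json

# One hypothetical entry in a machine-readable judgment supplement.
# Every field name and value here is invented for illustration.
entry = {
    "example": 47,
    "sentence": "Was glaubst du, wen sie getroffen hat?",
    "gloss": "what believe you whom she met has",
    "translation": "Who do you believe she met?",
    "judgment": "acceptable",
    "elicitation": "informal judgments from five native speakers",
    "disagreements": "one speaker found the example marginal",
}

# Stored as JSON, a full set of such entries can be archived,
# searched, and reused, e.g. for meta-studies of replicability.
print(json.dumps(entry, indent=2, ensure_ascii=False))
```

Nothing hinges on JSON specifically; the point is only that judgment data in this form can be consumed by tools (say, a parser test harness) as well as by human readers.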
The whole thing could be like an iPython notebook, with raw data, documentation and computer-aided visualization interleaved in a dynamic fashion. And having this information available digitally would make it much easier to test replicability, e.g. via meta-studies.

The only question, then, is what the linguistic analogue of diagrams and tables would be. Those are great ways of compressing quantitative data in a paper, and we have no comparable way of succinctly presenting acceptability judgments.

Avery Andrews (2016-10-29 16:51):

Perhaps the justification of the choice of assumptions could be eliminated or greatly abbreviated, on the basis that the real justification is the results that are obtained from them. A stronger division between survey papers and research papers would also help, with less emphasis on trying to present yourself as a person of good judgement.

Norbert (2016-10-28 07:53):

I am not sure that the suggestion was to eliminate it so much as to condense it and make the raw data available in another format/venue. We should be able to do what they do in Nature/Science, where the main idea is outlined in one place and the supporting data/details are available in another.
That, at least, is how I understood the suggestion.

JAL (2016-10-28 07:17):

Impressionistically at least, at NLLT we are indeed receiving longer and longer papers, despite maximum length limitations. It's an increasing burden on reviewers and editors. Part of the problem is that the existing literature is large and diverse. Authors and reviewers want previous work to be given its due, and want the particular choice of assumptions from the previous literature to be justified (since a different choice could have been made). Mark's point on replicability is quite important, hence I disagree strongly with the suggestion to eliminate data from the papers.

ewan (2016-10-26 09:19):

Yes, I agree that documentation like that is poor, and that prose that's mutatis mutandis similar can be found in many papers. My intuition is that essential information being scattered throughout the document is particularly widespread.

It seems we all agree that papers are at a substantial disadvantage because of their format. Even an explanation of the analysis that's pithy and on target for a wide range of audiences still has the strong potential to be useless on a large proportion of reads. I've always wondered what the best format for a more dynamic paper like that could be.
That would be even better than submitting two separate versions of a paper (though I doubt anyone would submit to a journal that required that).

mark (2016-10-26 02:11):

Here's a call for abolishing word limits in the field of ecology & evolution, noting that longer papers tend to be more widely cited: http://retractionwatch.com/2016/10/25/should-journals-abolish-word-limits-for-papers/#more-45425

The dynamics are likely different in linguistics (let alone generative syntax), but I found these observations insightful:

"Longer papers are probably better cited because they contain both more and a greater diversity of data and ideas (Leimu & Koricheva, 2005b). We argue that the positive relationship between citations and both author number and references cited support this hypothesis. Studies that have more authors tend to draw on a greater diversity of expertise, whether practical or intellectual (Katz & Martin, 1997), and thus present a greater diversity of ideas and/or data types, especially when collaborations are interdisciplinary. Likewise, papers likely cite more references because they have a greater diversity of arguments to support or ideas to place into context."

In other words, the kind of super-compact format Norbert and others argue for may be useful within a narrow discipline, but it may also make it harder for a broader audience to engage with theory & data and build on results.

Alex Drummond (2016-10-25 03:33):

Although squibs have been getting fatter, LI will still publish squibs considerably shorter than 8 pages (some co-authors and I had a 4-page squib a few years ago). I suspect the problem has less to do with length requirements and more to do with the intense competition amongst syntacticians to publish LI squibs. If you have two squibs to choose from, and one merely outlines a problem in 2 pages while the other takes the next step and presents an analysis in 8 pages, then it must be difficult not to choose the latter for publication. It is likely to get more "enthusiastic" reviews, and from an editorial point of view, it is likely to give the impression of being a more important paper.

That is of course why Snippets was a good idea, since the stringent length requirements effectively precluded any attempt at giving a detailed analysis. It's a shame it's non-operational.

Anonymous (2016-10-24 19:44):

Minimum page count is definitely an issue. The smallest publishable unit in linguistics is the squib, and those are expected to be around 8 pages. That's already the length of a full ACL paper.
It's longer than ACL short papers, and it's much longer than your average Science or Nature paper. Snippets (http://www.ledonline.it/snippets/) was an attempt to rehabilitate the short paper format, but it seems to be on indefinite hiatus now.

Anonymous (2016-10-24 19:33):

Code documentation is actually a very strong contrast to how most linguistics papers approach things: in source files, code and documentation are clearly separated through notational devices. In a linguistics paper, it's all part of the same prose text, and that makes it a pain to read if you want to quickly extract the essential information. I'm perfectly fine with explanations if they're clearly separated from definitions and analysis, so that I can immediately go to the parts I want to read and skip the rest.

And I'm pretty sure you'd agree that the following is not good coding style (sorry for the formatting, apparently blogspot doesn't support code tags):

    # I now define a function, which I'll use to
    # save myself some typing later on;
    # functions are defined with the def command
    def merge(a, b):
        """Merge a and b

        Arguments:
        a - some thing
        b - some thing

        Output:
        combination of a and b

        Example:
        >>> merge(a, b)
        {a,b}
        """
        # our arguments are a and b;
        # my assumptions about the licit types for
        # a and b are stated in the documentation for
        # the function syntax()
        argument1, argument2 = a, b
        # we combine the arguments for Merge
        output = mystery_function(argument1, argument2)
        # mystery_function is defined in one of the
        # other files, as you might remember;

        # I now use Python's return statement to
        # return a value
        return output

The point being: too much explanation reduces readability rather than improving it, and scattering essential information all across the document is a bad idea.

It would be cool if scientific papers could be more like source code, including nice usability features like folds, ctags, and automatic syntax highlighting. Then you could easily throw in 100 examples without affecting reading flow, but unfortunately papers are still very static, linear affairs.

William Matchin (2016-10-24 08:14):

I rather like the idea of submitting two documents: the essentials, and the essentials plus thorough documentation.

ewan (2016-10-24 07:56):

Lit review sections are often bloated and meandering. I admit I may have a bias: I have had two papers come across my desk recently with bloated and meandering lit review sections, and only one was a linguistics paper, and only in that case did I blame the customs of the field for the bloat, rather than simply the author (the other was a psychology paper). That having been said, I have the genuine suspicion that linguistics authors feel they have a minimum page count.

ewan (2016-10-24 07:53):

I rather disagree about the principle of handholding.
A good talk (in which the author actually tries to give the listeners the right intuitions to be able to think about the mechanics of the thing) is worth a thousand re-reads of the same paper. Much like good code should be easy to read on its own, but should also come with decent documentation. And, well, it can't *always* be easy to read the code, even for the initiated. Sometimes stuff is hard. If there were a way to somehow submit two documents - an analysis and some halfway-decent documentation for the analysis - then it might be easier to read in the appropriate way. That's not to say that I don't think papers are too long, just that I am not a fan of not documenting things in a way that makes them accessible, and thus subject to more scrutiny.

Avery Andrews (2016-10-23 15:38):

I think the ideal length for a linguistics paper is 30 pages, but there are many reasons why it sometimes has to be exceeded, although very rarely beyond 50. One reason that linguists might need more formal handholding than computational people is simply that they are on the whole not as good at math ... possibly fundamentally as able in many cases, but practice makes a big difference.
Linguistics also, I suspect, has a considerable population of people who do have significant mathematical ability but were traumatized in various ways by bad or inappropriate teaching in K-12, so the need for handholding might be psychologically a bit deeper than just having to go at a slower pace due to less practice.

mark (2016-10-20 04:30):

It seems to me that Peter's request for good models is impossible to answer without specifying the subfield, and even then the metric by which one might decide what's a good model is highly debatable.

For the record, the most cited article ever in Language is Sacks, Schegloff & Jefferson (1974) (surpassing even Chomsky's review of Skinner). It has a highly condensed style in which an elegant system of ordered rules for turn-taking is deduced on the basis of conversational data (i.e., competence inferred from performance). Samples of actual records of conversation are supplied along with the specification of the system, which makes the argument essentially replicable for any reader. It's an impressive paper in many ways and it has been massively influential in many fields, though it has too many footnotes for my taste and it is not particularly easy to read.

Anonymous (2016-10-19 16:22):

I'm not so sure that the narrative is indispensable. Suppose you start your paper by proposing formalism F. You then present a battery of acceptability judgments and show that F can account for them. And that's basically it.
There are of course other explanations, but it's not your job to debunk them, because there are infinitely many.

If I'm not mistaken, many syntax papers in CCG, LFG, and HPSG have this flavor: here are the inference rules/feature system we agreed on and the modifications I posit, and here are the structures/derivations for the data. The data itself illustrates why F can't be some imaginary F' (because any deviation from F will make the wrong prediction for, say, example 47).

To give a more concrete example, suppose you're the first person to propose sidewards movement. You specify how it works (feature triggers, locality constraints, etc.), and then you show data points and their derivation trees under your analysis. No prose, no elaborate discussion of the derivations, just sentences and their trees, like the tables and figures in a psych paper. You cluster the trees into expository classes like "simple examples", "illustrate importance of locality constraints", "why feature F can't be feature G". The reader may study these derivations in great detail if they want to understand every tiny aspect of your proposal, or they may believe you that this handles the data very elegantly and focus their full attention on a paper that's closer to their research instead.

Both the dedicated and the casual reader will get something out of your paper, whereas the latter will often have a hard time quickly extracting the core insight from a conventionally written syntax paper. But would the dedicated reader have to put in more effort because many details aren't discussed at length in the paper? I'm not so sure. Presumably a dedicated reader would be an expert in the area for whom these details should already be familiar from their own research.

To give an example from math: the more specialized the paper, the more steps are left implicit in a proof.
And mathematicians like it that way, because spelling out those details may help beginners but distracts experts from the essential aspects of the proof. To some extent you also have that in linguistic writing --- we often assume that the reader can fill in details like subject movement and so on --- but I guess the smaller size of the field means that a large part of your audience consists of researchers who aren't experts on the topic. But they also want more than the CliffsNotes, so you end up with this middle ground which simply doesn't work too well.

Anonymous (2016-10-19 15:38):

That question gets asked a lot in various circumstances, e.g. what papers to assign in a student writing class. But given my personal preferences I actually find it paradoxical: a paper is good if I remember the ideas and results but not the paper itself.

I can't think of anything like that in syntax, but math papers often have that flavor. A brief intro with the main result and the structure of the paper; some preliminaries; and then we're right in the battlefield of proofs and theorems. If there's an example, it doesn't come with any special prose; it's simply a new paragraph with a boldface "Example." at the beginning. Bullet points are used where convenient (apparently a big no-no for some people). The authors don't have distinctive voices; the writing is succinct and functional with zero frills. Many phrases and sentences are formulaic and repeat quite a bit. You don't even get a lot of connective tissue between the results. It's less like writing and more like clicking together Lego blocks.
At the end, I know exactly what the authors showed, but I don't remember the written form in which this knowledge was conveyed because it was so barebones and basic.<br /><br />That's pretty much the antithesis of linguistic writing, and it might well be the case that this writing style simply wouldn't work for linguistics. The parallels to philosophy that David points out might play a role here: a distinctive voice is essential for a good philosopher, and discourse-focused writing with little style is very tedious.<br /><br />Then again, some psycholinguistics papers are fairly close to this ideal considering their very rigid structure and formulaic descriptions of experiments and results. The only real writing is often limited to the introduction and the conclusion.<br /><br />Another advantage of both math and psych papers is that it's very clear what information is conveyed where and what I can skip depending on my level of interest. In a math paper, I might ignore the proofs or only study the theorems that are relevant to my research problem. In a psych paper, I never read the experimental design part because I lack the expertise to critically evaluate the design anyway. I will also skip the statistical analysis as long as I can reasonably assume that the authors did the number crunching correctly --- I won't check their statistics anyway unless it matters to a research project of mine. Basically, I can cut down math and psych papers to a few pages that contain the information I actually want. Syntax papers make it very hard to do that because the information is scattered across 30+ pages that are heavily interlinked. 
It's an all-or-nothing deal that makes it very hard to read only those parts you care about.<br /><br />Anonymous (https://www.blogger.com/profile/07629445838597321588)<br /><br />2016-10-19T02:43:45.955-07:00<br />Does anybody feel like nominating some papers as models of how a good linguistics paper should be written?<br /><br />Peter Svenonius (https://www.blogger.com/profile/09436844670309091617)<br /><br />2016-10-19T00:36:20.042-07:00<br />I agree with much of this, especially prose specifications of derivations, which also set my teeth on edge, but which I've had requested by reviewers. But I think I understand why there's a need for narrative presentation of data: the narrative presentation is actually the underpinning of the analysis. You can't just present the data in a compressed tabular format, as the various steps in the argument are the support for why you take the data to have the interpretation and import that it does. Then that analysis is what is generally connected to the theory. In fact, some of the most beautiful and impressive syntax is of exactly this sort: all the heavy work (and the long-lasting insights) lies in the marshalling of arguments for a particular view of the phenomenon, which is then shown to inform, follow from, or destroy some theoretical position.<br /><br />This narrative presentation might also be behind Thomas's concern about `derivational writing'. Quite often such writing is not actually a reconstruction of the process of research (it certainly shouldn't be); it's a rational reconstruction of how the process of research should have gone. 
I actually agree that it'd be better not to use this (very venerable) technique of presentation (with Principle X', X'', X''', X(final form), etc.), but in my own work, at least, I often get complaints about not doing this, and about taking too technical an approach, i.e. simply specifying the system as a whole, without handholding.<br /><br />Because of the data narrative underpinning the analysis, linguistics papers are most like philosophy papers, I think, rather than papers in the other humanities. Papers in literature, history, etc. tend to actually be very short (15pp or so), since the real work is saved for (and generally repeated in) monographs.<br /><br />davidadger (https://www.blogger.com/profile/00821774928618824698)<br /><br />2016-10-18T15:58:47.092-07:00<br />There are several aspects of linguistic writing that lead to bloat:<br /><br /><i>1) The insufficiency of existing notation</i><br />I cringe every time I see a prose explanation of how a given structure is built. You know, the usual "XP moves to Spec,YP after which ZP extracts from XP, which triggers bla bla bla, yadda yadda yadda". These kinds of explanations often take up half a page and are much harder to follow than simply drawing up a derivation tree or writing down the sequence of rules. It's very similar to early 19th-century math, which is painful to read nowadays. Efficient notation is essential for a field, and syntax doesn't use much beyond labeled bracketing and traces.<br /><br /><i>2) Data Presentation</i><br />Linguistic writing has no techniques for presenting acceptability judgments in compressed form. That's why the discussion of the data alone can already take up half a paper. Rather than dumping all data into an online supplement, I'd like to see a system to present and talk about data more succinctly. 
No idea how to do it, though.<br /><br /><i>3) Derivational writing</i><br />Instead of presenting a solution, linguistic papers tend to retell the history of how the researcher came up with the solution. It's particularly egregious with monographs, where in some cases everything proposed before the last chapter is thrown away. That's unheard of in math or CS: you define your analysis or model (again, good notation helps), then you apply it to data. The application to data is enough to show why you defined XYZ in a particular way; there's no reason to walk us through fifty dead ends.<br /><br /><i>4) Completionist papers</i><br />This was already hinted at in your post, but the field's expectations for what constitutes the smallest publishable unit are very high. You can't just present a formalism, show that it solves one particular set of data, and then address problems in follow-up papers. Maybe the field doesn't sufficiently appreciate that an analysis of a given phenomenon, even if it cannot readily be extended to any other piece of data, still reveals interesting aspects of that phenomenon? Trying to make your account perfect before publishing is also detrimental to progress; you wanna follow the open-source motto "release early, release often".<br /><br /><i>5) Reader handholding</i><br />In comparison to the papers I see in math, CS, computational linguistics, and molecular biology, it feels downright patronizing how much handholding there is in linguistics papers. If there's a definition, it's followed by three examples of how it works (okay, that's actually a good thing, because often the definitions aren't explicit enough to work on their own; so we have a case of two minuses yielding a plus).<br /><br />For any relevant piece of data, you don't just get a specification of the analysis (e.g. 
via a list of the lexical items with their features), you get a two-page explanation that walks you through it from the very first Merge step to the very end.<br /><br />The discussion of previous literature is a special case of this type of handholding --- instead of presupposing that the reader knows the relevant works or is smart enough to read a survey first if they don't, you do a mini-review for everybody. I think this is part of the humanities heritage of linguistics.<br /><br />Besides length, there are also several other aspects of linguistic writing that could be improved imho. Definitions should be clearly indicated as such rather than being lumped into the main text or being put into the example numbering scheme. Similarly, crucial assumptions should be highlighted typographically. And most importantly, papers should have an identifiable macro structure so that you can quickly find the specification of the analysis and the relevant data, rather than having to piece them together from various paragraphs that are distributed over the whole 50 pages.<br /><br />Anonymous (https://www.blogger.com/profile/07629445838597321588)