Comments on Faculty of Language: "POS POST PAST"

Alex Clark (2014-11-27, 08:57):

I look forward to reading your paper with Bob and William when it is ready; it sounds like the sort of work there should be more of. I think our methods and assumptions are quite similar, though we come to different conclusions -- but these are very different from those of Paul and Jeff and Norbert. So I wonder what your take is on the strategy of making things like PISH innate without going all the way to a finite class of grammars.

I think a lot of the work in inductive inference looks at exactly how to "characterize formally learnable classes of languages". That isn't quite what I do -- I am trying to characterize learnable languages with respect to certain algorithmic approaches. I don't think there is any problem then with relating these languages to human languages, since (most of the time) they are fairly close variants of the standard grammars that mathematically precise linguists use. Other people in the grammatical inference community look at other types of grammars -- pattern grammars, external contextual grammars -- that are rather further away from the standard view.

Anyway, I am sure your turkey needs either to be basted or taken out of its brine, depending on what time zone you are in, so happy Thanksgiving to all, and especially to Norbert for stimulating some interesting discussion.

Charles Yang (2014-11-27, 06:33):

Alex: I don't think our goals, or even the methods, are all that different.

Linguists, especially those of the P&P persuasion, try to characterize the properties of human languages, ideally ones with good learnability properties. Some of us work on this more directly: for instance, William Sakas, Bob Berwick and I will argue, at next year's GALANA conference to be held at Maryland, that a linguistically motivated parameter space is feasible to learn.

Folks like you, and many in the grammar induction tradition, try to characterize formally learnable classes of languages. The burden of proof is to show that these classes of languages correspond to something like human language.

None of us has really gotten there yet, but all of us, including the Bayesians, are reacting to the consequences of formal learning theories. As Gold noted, to achieve positive learnability, we can restrict the space of grammars or provide the learner with information about the statistical distributions of the languages. Of course, these two tracks are not mutually exclusive.
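Gold's first escape route -- restricting the space of grammars -- can be made concrete with a minimal sketch. The hypothesis class and "grammars" below are toy stand-ins, purely for illustration:

    # A minimal sketch of Gold-style identification in the limit from
    # positive data, assuming a finite hypothesis class (the restriction
    # route). The toy "grammars" are just small string sets.

    HYPOTHESES = {
        "G1": {"a", "ab"},
        "G2": {"a", "ab", "abb"},
        "G3": {"b", "ba"},
    }

    def identify_in_the_limit(text):
        """After each datum, guess a smallest-language hypothesis
        consistent with everything seen so far. Over a finite class,
        the guesses stabilize on the target (Gold 1967)."""
        seen, guesses = set(), []
        for datum in text:
            seen.add(datum)
            consistent = [name for name, lang in
                          sorted(HYPOTHESES.items(), key=lambda kv: len(kv[1]))
                          if seen <= lang]
            guesses.append(consistent[0] if consistent else None)
        return guesses

    print(identify_in_the_limit(["a", "ab", "abb"]))  # ['G1', 'G1', 'G2']

Nothing like this is guaranteed once the class contains, say, all the finite languages plus one infinite one; that is the force of Gold's negative result, and the pressure behind both tracks above.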
Alex Clark (2014-11-27, 05:06):

Just lost another comment, so here is a truncated version:

You write:

"(4) much of what we know is due to how we encode experience/hypotheses, as opposed to specific experiences that "confirm" specific hypotheses.
...
(6) much of what we know is due to our having experiences that confirm specific hypotheses."

I am not sure the contrast between (4) and (6) is drawn in quite the right place. If we have a Bayesian PCFG learner, then it doesn't rely on specific experiences that confirm specific hypotheses, but rather on a subtle computation where potentially every experience probabilistically confirms almost every hypothesis to a greater or lesser extent, whereas a triggering algorithm in a P&P model does have a specific experience that confirms a specific hypothesis (a parameter value setting). Yet I think most people would have this the other way round.
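For concreteness, here is a toy triggering learner in the spirit of Gibson and Wexler's (1994) TLA. The parameter space, the "sentences", and the parser are all illustrative stand-ins, not a real P&P model:

    import random

    # A toy triggering learner. Grammars are binary parameter vectors,
    # and each "sentence" is idealized as an unambiguous trigger:
    # (parameter index, required value).

    TARGET = [1, 0, 1]  # hypothetical target parameter setting

    def parses(grammar, trigger):
        i, v = trigger               # the datum demands parameter i be set to v
        return grammar[i] == v

    def triggering_learner(stream, n_params=3):
        """Greedy, single-value learning: on a parse failure, flip one
        randomly chosen parameter and keep the flip only if the current
        sentence now parses. One specific datum sets one specific value."""
        grammar = [random.randint(0, 1) for _ in range(n_params)]
        for trigger in stream:
            if parses(grammar, trigger):
                continue
            j = random.randrange(n_params)
            grammar[j] = 1 - grammar[j]
            if not parses(grammar, trigger):
                grammar[j] = 1 - grammar[j]  # revert a flip that didn't help
        return grammar

    stream = [(i, v) for _ in range(20) for i, v in enumerate(TARGET)]
    print(triggering_learner(stream))  # almost always converges to [1, 0, 1]

The point of the toy is only that each kept update is licensed by one specific datum, in contrast to the Bayesian learner's global reweighting of the whole hypothesis space.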
Alex Clark (2014-11-27, 01:04):

I think there are real differences between language and geometry. One is that the truths of geometry don't depend on contingent cultural facts or vary from place to place.

For me it is not a philosophical debate but entirely an empirical one. There is, as Chomsky never tires of pointing out, some part of the human genetic endowment, absent in rocks and kittens, that allows us to acquire language. This acquisition ability is some sort of LAD, which takes the input and outputs some grammar that defines an infinite set of sound-meaning pairings. Both the LAD and the input are *essential* to the process.

You write: "I think knowledge of language is an interesting example of knowledge acquired under the pressure of experience, but not acquired by generalizing from experience." For me the term "generalizing" just means that the grammar output is not just a memorisation of the input, i.e. that the child can understand and produce sentences that are not in the input. And "from experience" just means that the output of the LAD depends on the input. I don't see how one can deny either of these. And "learning" is the term we use in the learning theory community for that process. So I think there is a terminological difference here, unless you are referring to Fodorian brute causal processes, which I guess you aren't?

Now, the POS used to be an argument for domain-specific knowledge in the LAD. I never liked that argument. This has changed in the MP era, and now the POS is just, as it should be, a question: what the hell is going on in the LAD such that such rich grammars can come out of such poor inputs? (to copy the title of a recent book).

So my view is that we can move forward by making specific proposals about the structure of the LAD as a computationally efficient learning algorithm that can output grammars of the right type, and that we can make progress by developing a general theory of learning grammars, as opposed to using a general theory of probabilistic learning, as Charles Yang does, or as the Bayesians do.

But that's just me; others have different approaches, which is of course good. One approach, however, does not try to come up with models of the LAD. Rather, it comes up merely with restrictions on the class of grammars output by the LAD. So rather than saying "the LAD is this function ...", it says "the LAD is some function whose outputs lie in the class of grammars G", i.e. just defining the range of the function. Or, worse, it just says "the LAD outputs grammars that have property P", where the class of grammars and the property P are not defined. My scepticism is about these types of solution, which in the examples I have looked at don't explain anything, i.e. "solutions" to the POS of the form "X is innate, so the learner doesn't have to learn X". And my second objection is Darwin's problem in the case when X is clearly specific to language, where the proposed solution is just to kick the can down the road and hope that at some point in the future there will be a reduction to something non-language-specific.

(PS: One of the reasons I dislike the term UG is that, as it is often used, it blurs the distinction between the LAD and properties of the outputs of the LAD. So if anyone feels the need to use the term UG, please specify which you mean.)
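The function-versus-range distinction can be put in (toy) type terms. Everything named below is a throwaway illustration, not anyone's actual proposal:

    from typing import Callable, Iterable, Set

    # Throwaway types: a sketch of the difference between specifying
    # the LAD itself and merely constraining its range.
    Sentence = str
    Grammar = str   # placeholder for whatever object the LAD outputs

    # First kind of proposal: "the LAD is this function ..." -- a
    # concrete (ideally efficient) algorithm from data to a grammar.
    LAD = Callable[[Iterable[Sentence]], Grammar]

    def dummy_lad(data: Iterable[Sentence]) -> Grammar:
        return "G" + str(len(set(data)))   # stands in for a real algorithm

    # Second kind of proposal: only the range is given -- "the LAD is
    # some function whose outputs lie in G" -- which rules grammars out
    # but says nothing about how a grammar is computed from the data.
    def output_lies_in(lad: LAD, data: Iterable[Sentence],
                       G: Set[Grammar]) -> bool:
        return lad(data) in G

Nothing in output_lies_in constrains how the grammar is computed from the input, which is exactly the gap the comment is pointing at.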