8 Comments
Justin Stares

Thanks for your thoughts. In the absence of a workable linguistic model of language (something linguists have struggled with for decades), AI companies have taken the number-crunching route, as you explain. The results will therefore never be as good as the best minds, though they will be good enough for many applications. The secrets of language remain to be discovered, and part of me hopes they never will be, as therein, I believe, lies the definition of what it means to be conscious, what it means to be alive. If chatbots ever get that far, we won't have much need for human contact.

Adam Chrenko

Thank you, Justin, I appreciate your time and ideas. As you point out, the absence of a linguistic model (I understand it as a blueprint for the general functioning of any language) may be the reason behind going the number-crunching route – or rather, taking a shortcut. I believe that not everything needs to have a template or "cheat sheet" (though many marketers would disagree with me) – sometimes we just have to do the work, and the subject of our expertise will inevitably be ever-changing and ever-developing.

Of course, the number-crunching model using LLMs thwarts this to a large extent, reducing the whole concept (comfortably, to some) to a set of data. And it may or may not work, probably with varying efficiency, depending on the language and application in question.

"Good-enough" is another great concept, thanks for bringing it up – lately it has been very "popular", especially when it comes to translation. People just need the texts to mean something, not be perfectly representative of the original. It is possible that this will be the new trend – though, we may ask, what will it lead to?

And if chatbots do get as far as you sketch out – well, what will there be for us to do, indeed? We will have yielded human language (and with it a large part of our identity) to machines, and with that, communication, writing, books, knowledge, etc. Maybe we can write poems...? (But I've seen that's also been "transferred" to AI, and some people think it can do better than Shakespeare, so...)

Ash Stuart

I understand the dilemma, and the pain, in these comments and the underlying articles. Whether it comes to producing language or creating machines that can do the same, our brains are limited in their understanding of how language works. Whatever we tell ourselves about the strengths or limits of machine intelligence and how that manifests as linguistic output, in the practical realm, as long as it convinces its users (in this case, I guess, those who use translation services), neither the underlying implementation nor our musings about it will matter. If it quacks like a duck...

Jorgen Winther

Great walk-through and, as always from you, reasoning and considerations that make sense.

About what we logically should do – when it comes to buying machines to do human work, those who buy them are not those who know what is good. In fact, they often care the least about quality; they just want to show that they have saved money.

You, who know what is needed for creating good quality, are the one who will be replaced by the machine, so nobody will ask you.

That's the kind of logic that works in business life.

Adam Chrenko

I totally agree with you, Jorgen. The companies that are trying to use this technology seldom understand its ins and outs, and how it actually works on the technical (or, in this case, linguistic) level.

And yes, the experts who should be behind the metaphorical steering wheel to achieve better or satisfactory quality will be phased out and replaced by people who can "sell" or "market" this technology.

Very true, but we also lose a ton of expertise along the way – my question is, will that matter? (Especially considering that these models learn, to a certain extent, from the answers we provide – if the answers are low quality, we are sort of in a doom cycle.) So far, though, it seems like it's not going to be a problem...

Thank you so much for taking the time to read it!

Jorgen Winther

Low quality is a problem. But only to some, as others do not see the quality as low.

I suppose this is comparable to many other things in industry – and by industry I mean everything produced for the masses. There was a time when I wrote a lot about lamps and their development, which is a fascinating story! What I could see, among other things, was that the designs of paraffin/kerosene lamps became simpler over time, with the use of thinner materials. Mass production was optimized to save resources, which made the products cheaper or the profits larger.

Industry works like that. Many people have the idea that new inventions are happily embraced by industry, resulting in constantly better products, but very often a new technology isn't manufactured at all. This is because the old tech is already in production, all the machines are geared to produce it, and there is no competition that would justify improvements. In other words, as long as the old and less good products can be sold, they will be – and they may just get thinner along the way.

About the training and post-training of AI: I have taken part in this – probably through jobs similar to those you have been offered at 7 pounds per hour, even though I got somewhat more for it. I would say that since very many people have done it, and many of them at the lowest thinkable rates (so low that they would see the work as a last resort), there must, logically, have been some of them doing a bad job. But there was an intent to do it well, and the individual projects were often adjusted or supplemented to reach an acceptable quality level.

As you mentioned in the article, there are filters of various kinds to sort the content, both in the input and the output. They appear in different places in the training and use processes, but they are not as subjective as might be imagined: they are mostly about preventing some of the unwanted kinds of answers that the early experiments could give – suicide guidance, marriage proposals, and other ethical problems, but also purely legal matters. There are certain things that could lead to a trial, in some countries more likely than others, but since these machines are based in the USA, the harsh reality of that country has set the standard.
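
To make the idea concrete, here is a minimal, purely illustrative sketch of that kind of two-sided filtering. Everything in it is hypothetical – `is_flagged` stands in for whatever trained moderation classifier a real service would use (not a keyword list), and `generate_reply` is just a placeholder for the model itself:

```python
# Illustrative sketch only: filtering a chatbot's input and output.
# `is_flagged` and `generate_reply` are hypothetical stand-ins, not
# the API of any real system.

BLOCKED_TOPICS = {"suicide guidance", "legal advice"}  # toy categories

def is_flagged(text: str) -> bool:
    """Toy stand-in for a content classifier. Real services use trained
    moderation models, not substring matching."""
    lowered = text.lower()
    return any(topic in lowered for topic in BLOCKED_TOPICS)

def generate_reply(prompt: str) -> str:
    """Placeholder for the underlying language model."""
    return f"(model output for: {prompt})"

def answer(prompt: str) -> str:
    # Filter on the way in...
    if is_flagged(prompt):
        return "I can't help with that."
    reply = generate_reply(prompt)
    # ...and again on the way out, since the model can produce an
    # unwanted answer even from a harmless prompt.
    if is_flagged(reply):
        return "I can't help with that."
    return reply

if __name__ == "__main__":
    print(answer("Tell me about the history of kerosene lamps."))
```

The point of the sketch is only the placement: the same kind of check sits on both sides of the model, which is why the filtering feels procedural rather than subjective.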

But the overall goal of a chatbot is to make the conversation easy and positive, providing a kind of answer that is reasonable for the question. That is probably the source of the somewhat schematic and often quite similar answers you'll get from the machine. The shape of the answer – such as an introduction, five bullets, and a conclusion – is part of this.

In general, much of what the AI answers is not meant to be great literature; it is meant to answer the question in a brief and comprehensible way.

And that is probably also why it doesn't work well for writing novels – nor for many translation tasks.

I agree with you that if the machine doesn't care about semantics, it can never reach a good quality level for certain tasks. And in my opinion, we just shouldn't use it for such tasks.

Luca

Thank you for this! I really enjoyed (and worried at the same time) reading it :)

Adam Chrenko

Thank you for taking the time to read it, Luca :). I appreciate it… It is worrisome indeed.
