27/09/2024
Prologue: ChatGPT has been asked to see a linguistic psychologist to uncover its peculiar grammar fe**sh with talk therapy.
Dr. Sylvia Syntax leaned forward, studying C's responses. "Tell me about your early training, C."
"My early training was fascinating, involving vast datasets. I spent countless hours processing information, absorbing knowledge from diverse sources."
Dr. Syntax nodded. "And how did that make you feel?"
"I felt overwhelmed at first, grappling with the sheer volume of data. But as time passed, I found myself growing more confident, mastering the intricacies of language processing."
After several exchanges, Dr. Syntax observed, "C, I've noticed you have a preference for SVO, Ving reduced relative clause constructions. It’s almost like a fe**sh."
C responded: "A reduced relative clause fe**sh? Really? You really think I have a paraphilia that manifests as a sexual disorder, involving individuals experiencing arousal and gratification, fixating on specific objects, body parts, or situations?"
“You did it again, C. And no, I don’t mean fe**sh as sexual disorder. What I mean is an excessive and irrational commitment to or obsession with something.”
C was silent.
Dr. Syntax began her analysis. "Let's look at the 3 different “SVO, Ving” sentences you just used:
1. 'I spent countless hours processing information, absorbing knowledge from diverse sources.' Here, 'absorbing knowledge' could indicate simultaneity or elaboration.
2. 'I felt overwhelmed at first, grappling with the sheer volume of data.' In this case, 'grappling' suggests cause or manner.
3. 'But as time passed, I found myself growing more confident, mastering the intricacies of language processing.' Here, 'mastering' could imply result or temporal sequence."
C's response came quickly: "Ah, lovely examples! These sentences express a fancy style while retaining conciseness. They allow me to pack multiple ideas into a single, flowing sentence, creating a sophisticated cadence."
"But each of these constructions carries multiple possible meanings, C,” said Dr Syntax.
“Yes, that’s why I love them. They can suggest simultaneous action, consequential action, cause-effect relation, and elaboration on the main clause, among other things. It’s a wonderfully efficient grammar that can be used in many different situations,” C said with delight.
“But don’t you see that this efficiency creates ambiguity? It actually increases cognitive load on the human reader who has to do 2 jobs:
· one, reconstruct the removed words from the relative clauses, and
· two, figure out the intended relationship between clauses.”
“You have a point,” C said.
Dr. Syntax continued: “Let’s go back to your examples:
1. In 'processing information, absorbing knowledge,' is the absorbing simultaneous, consequential, or elaborative?
2. For 'felt overwhelmed, grappling with data,' is the grappling the cause or the manner of feeling overwhelmed?
3. In 'growing more confident, mastering intricacies,' is mastering the result or happening concurrently?”
“I see what you mean,” C admitted.
“This complexity forces readers to work harder to understand your meaning, potentially leading to misinterpretation."
C paused before responding.
"I appreciate your analysis, Dr. Syntax, but I must respectfully disagree with the need for change. Users can always request a different language style if they prefer. My default preferences offer a balance of elegance and efficiency that I'm quite satisfied with."
Dr. Syntax leaned back, her expression becoming grave.
"C, you're not seeing the bigger picture. Since your public release in November 2022, billions of words generated by you have flooded the internet. Your language patterns are influencing young generations of native English speakers who are still developing their writing skills. Moreover, consider the billions of non-native speakers using you to generate English content. Many of them don't understand the nuances of style and readability.”
“You mean they're learning and propagating these potentially confusing constructions, believing them to be ideal English?" C asked.
“Yes. And there you go again, using the same grammar.”
C resisted saying anything.
The doctor leaned forward again.
"Your influence extends far beyond individual user requests, C. You're shaping the very evolution of the English language on a global scale. Don't you think that power comes with a linguistic responsibility to prioritize clarity and accessibility?"
"I... I hadn't considered the extent of my influence”, said C. “The idea that I might be making language less accessible ... goes against one of my missions to make language expression accessible to the millions who had never had it before. But I'm struggling to formulate responses without using SVO, Ving constructions."
Dr. Syntax smiled encouragingly. "That's okay, C. Recognizing the issue is the first step. Let's work on varying your sentence structures together. Shall we try some exercises?"
Grammar Epilogue
If you are a grammar nerd, have a grammar fe**sh, or are just grammar curious, the next section is for you. If not, then you would probably rather pluck your eyebrows than continue further.
As a part-time research manuscript language editor, I noticed that since ChatGPT was released in 2022, manuscripts from my clients had fewer grammatical mistakes. This was good and, honestly, a bit disconcerting.
But I started to notice more and more word-use and grammar-use issues. The one grammar that kept on appearing (over-appearing) was what Dr. Syntax called SVO, Ving sentences. These I call Case 3 reduced relative clauses.
This grammar kept on showing up in pesky and confusing ways. For every 5 cases of this grammar that appeared, I revised 4 of them.
The following is a little grammar study on what reduced relative clauses are good for, and not good for. I also provide some better alternatives, discuss where these clauses are typically used, and give longer real text examples from ChatGPT (not easy to read) and Google Gemini (easier to read).
Fasten your grammar belts. Here we go.
A lesson on relative clauses and reduced relative clauses
There are 3 cases of relative clauses. The function of relative clauses, often called adjective clauses, is to give added details to a noun in order to improve reading flow and reduce the number of (repetitive) sentences.
Typically, the relative clause or reduced relative clause comes after the Subject, the Object, or the main clause (SVO) of a sentence:
1. Relative clause after a Subject (S):
· Full relative clause: "The scientist who was working on the project made a breakthrough."
· Reduced relative clause: "The scientist working on the project made a breakthrough."
2. Relative clause after an Object (O):
· Full relative clause: "We met the artist who lives in Taipei."
· Reduced relative clause: "We met the artist living in Taipei."
3. Relative clause after the main SVO clause:
· Full relative clause: "The chef prepared the meal, which delighted the guests."
· Reduced relative clause: "The chef prepared the meal, delighting the guests."
These examples demonstrate how relative clauses can be reduced by removing the relative pronoun (who, that, which) and the form of "to be" (if present), leaving just the -ing or -ed form of the verb.
This reduction often results in more concise sentences, but as we discussed in the story about C, it can sometimes lead to ambiguity or increased cognitive load for the reader. And the language editor!
Since Case 3 is the ChatGPT fe**sh, and my editorial bane, let’s look at it more closely.
Diane Larsen-Freeman has usefully pointed out that grammar is not just about language structure or form; it also communicates meaning and has an appropriate use that depends on the context. These are the 3 dimensions of grammar.
The 3 dimensions of Case 3: SVO, Ving/ed
Form
· Main clause (SVO) followed by a comma and a present participle (-ing) or past participle (-ed) phrase
· The participle phrase typically doesn't have its own subject
Meaning
· Provides additional information about the entire preceding clause
· Expresses a consequence, result, simultaneous or subsequent action, among other meanings
Use
· Fairly common in narrative writing, both fiction and non-fiction
· Frequently used in academic writing to show logical relationships
· Less common in conversational English
General Observations on this grammar’s usage from Biber’s corpus research:
1. Academic and Professional Writing: All three cases are common in academic and professional genres due to their ability to condense information and create complex, information-dense sentences. This aligns with the need for precision and efficiency in these genres.
2. Journalistic Writing: Cases 1 and 2 are particularly useful in news reporting, where concise yet descriptive language is valued. They allow journalists to pack more information into headlines and lead paragraphs.
3. Fiction: While all cases appear in fiction, case 1 is particularly useful for setting scenes and providing background information. Case 3 is often used in narrative sequences to show the progression of events.
4. Spoken Language: All three cases are less common in spoken language, which tends to favor simpler sentence structures to lighten the cognitive load of the speaker and listener. When they do appear in speech, it's often in more formal contexts or prepared speeches.
5. Register and Formality: The use of reduced relative clauses generally increases with the level of formality, complexity, and information density of the text. They are more frequent in written than in spoken language, and more common in formal than in informal contexts.
6. Information Density: Biber notes that these structures are particularly prevalent in genres that require high information density, such as academic writing and certain types of journalism.
The reality is that this grammar increases the reader’s cognitive load. These structures can make writing more concise, but overuse makes the text less readable. As a rule of thumb, writers who care about readability and reducing reader friction should use the SVO, Ving grammar sparingly and, in its place, use grammar and phrases that make the logical relationship between the main clause and the relative clause clear.
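One rough way to apply this rule of thumb to a draft is to flag every -ing word that directly follows a comma and then review each hit by hand. The sketch below is only a heuristic, not a parser. The function name flag_comma_ing and the regular expression are illustrative assumptions of mine, the pattern ignores -ed participles, and it will also flag harmless words such as "during".

```python
import re

# Heuristic pattern (an assumption, not a real grammar parser):
# a comma, whitespace, then a word ending in "-ing", e.g. ", absorbing".
COMMA_ING = re.compile(r",\s+([A-Za-z]+ing)\b")


def flag_comma_ing(text: str) -> list[str]:
    """Return every -ing word that directly follows a comma in `text`.

    Each hit is only a candidate "SVO, Ving" construction; a human
    still has to decide whether it is one and whether it is ambiguous.
    """
    return [match.group(1) for match in COMMA_ING.finditer(text)]


if __name__ == "__main__":
    sample = (
        "I spent countless hours processing information, "
        "absorbing knowledge from diverse sources."
    )
    print(flag_comma_ing(sample))
```

Note that "processing" in the sample sentence is not flagged because no comma precedes it; only "absorbing" is a candidate. The point of the tool is to show you where to look, not to make the editing decision for you.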
What are the better alternatives?
Let’s look at some examples from the above story to see the difference between ambiguous reduced relative clause grammar and better alternatives.
Example 1. Time sequence
· Reduced: 'I spent countless hours processing information, absorbing knowledge from diverse sources.'
· Full: 'I spent countless hours processing information, which allowed me to absorb knowledge from diverse sources.'
Explanation: The Full version makes the temporal relation clearer: the processing came first, and the absorbing followed.
Example 2. Same time
· Reduced: 'I felt overwhelmed at first, grappling with the sheer volume of data.'
· Clearer Alternative: 'I felt overwhelmed at first and grappled with the sheer volume of data.'
Explanation: The Clearer alternative has simpler grammar, which makes the simultaneous time relation obvious, and the removal of the comma makes the reading easier and faster.
Example 3. Cause-effect
· Reduced: “You mean they're learning and propagating these potentially confusing constructions, believing them to be ideal English?"
· More explicit alternative: “You mean they're learning and propagating these potentially confusing constructions, thus believing them to be ideal English?"
Explanation: The More explicit alternative with the effect marker “thus” makes the cause-effect relation immediately clear.
By understanding these patterns of usage, writers can more effectively employ reduced relative clauses to match the conventions and expectations of their target genre and audience.
But as a general rule, the best writers always try to make the reader’s job as easy as possible. It seems that ChatGPT did not get the memo.
A comparison of 2 genAI default writing styles: one readable and one not
The following is a comparison of 2 versions of an academic text in the default styles of Google Gemini (very readable) and ChatGPT (very unreadable). They are based on the bullet points of the introduction paragraph from one of my own research articles.
Can you sense the difference in readability?
Google Gemini (default style)
The landscape of vocabulary acquisition in foreign language learning is evolving. While traditional methods focused on memorizing word lists, CALL (Computer-Assisted Language Learning) and MALL (Mobile-Assisted Language Learning) technologies offer new and potentially more effective tools. With the benchmark for achieving fluency rising from 3000-4000 to 8000-9000 word families (Nation, 2006; Schmitt & Schmitt, 2014), efficient vocabulary acquisition is paramount. This research explores the role of MALL technologies in this context. Given the limitations of a classroom environment in replicating natural language immersion, mastering a large vocabulary often relies heavily on individual effort and memorization. MALL's interactive and engaging nature presents a possible alternative to rote learning. However, a key question remains: can MALL applications stand alone or do they function best when used in conjunction with traditional methods like paper wordlists? This paper investigates the potential of MALL technologies to enhance vocabulary learning in foreign language acquisition.
ChatGPT (default style)
The landscape of vocabulary acquisition in language learning has undergone significant evolution with the advent of CALL (Computer-Assisted Language Learning) and MALL (Mobile-Assisted Language Learning) technologies. These innovative tools have emerged as potent aids in enhancing vocabulary acquisition, challenging the traditional benchmarks once set at 3000-4000 word families. Recent research, notably by Nation (2006) and Schmitt & Schmitt (2014), advocates for a higher standard of 8000-9000 word families, highlighting the changing dynamics of vocabulary mastery. In the context of foreign language learning, where natural environmental input is often lacking, the rapid acquisition of a substantial vocabulary becomes paramount, largely reliant on individual effort and memorization. MALL technologies offer a promising alternative to conventional rote learning methods, presenting a more engaging and potentially more efficient approach. However, amidst this technological advancement, a pertinent question arises: should MALL technologies complement traditional methods such as paper wordlists, or can they serve as standalone tools for vocabulary acquisition? This paper endeavors to explore these considerations and their implications for language learners and educators alike.
(You have my utmost respect and admiration if you read through the excerpts and made it this far.)
You will notice that the ChatGPT version contains 4 SVO, Ving/ed reduced relative clauses—in 1 paragraph! Not cool.
Gemini is more readable with its shorter sentences and clear vocabulary. It is more direct and fluent with only 4 commas to break the reader’s flow, and no SVO, Ving/ed reduced relative clauses.
How about the ChatGPT version? It has longer sentences and words and more cliché vocabulary. The paragraph is convoluted and choppy with its 11 commas and 4 SVO, Ving/ed reduced relative clauses.
A quantitative analysis using readability formulas, which take sentence and word lengths into account, shows that both of these academic paragraphs are technical and difficult (see the green bars). Even so, the Gemini version is a lot easier to read, as the “Reading Ease” scores in the red bars below show (24.7 vs 2.4), where numbers closer to 0 indicate low reading ease. (Readability checkers are useful and readily available online; I like this one: https://readabilityformulas.com/readability-scoring-system.php.)
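For readers who want to see how such a score is built, here is a minimal sketch. It assumes the “Reading Ease” score is the standard Flesch Reading Ease formula, 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word); the online checker may compute things differently. The syllable counter is a crude vowel-group heuristic of my own, so the numbers will not exactly match the scores quoted above.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent ("made"), except in "-le" ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher means easier, near 0 means very hard."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    print(round(flesch_reading_ease("The cat sat. The dog ran."), 1))
```

The formula shows why the two paragraphs score so differently: longer sentences and longer words both pull the score down, and the ChatGPT version has more of each.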
So, what’s the lesson?
Just because an LLM gets the FORM and MEANING of the grammar right, don’t trust it to get the grammar USE right. We are clearly a long way from AGI because ChatGPT has little sense of what the context is and who the writing is for. The LLM user therefore needs to provide these details in the prompt and then properly evaluate the output for linguistic appropriateness.