Wednesday, September 19, 2007

A Few Thoughts

Here is something I thought about this afternoon:

Throughout the course of my day, it is often necessary, or simply entertaining, to compare English and Japanese. In my thoughts on the subject, I have decided that a central concept in the development of languages, if not the central concept, is what I have come to call “cultural parsing.” As I am not a linguist, I have no knowledge of another term for such a concept, which no doubt already exists.
In order to define cultural parsing, we must first delve into a theoretical construct of what language is. Language is assumed to be a tool designed, perhaps not necessarily consciously, by societies, for the purpose of communicating ideas. “Societies” will not be properly defined here, as that is beyond the scope of this paper. “Ideas” will similarly go underdefined, but can be understood as points in the mathematical sense. The space in which ideas are proposed to exist is a space of possibly infinite dimension, with each axis possibly finite, countably infinite, or even uncountably infinite. Successfully communicating an idea can be seen as simply using language, which in this case can be seen as a formula in the most abstract sense, in which all operators and values given are taken to be understood by all parties (generally thought of as speaker and listener). Though this last assumption is patently and demonstrably false in the real world, as is evidenced by miscommunication even among native speakers of the same language, it is a useful assumption for simplification of the theory.
Before delving any deeper into the theory, an example may be useful for clarification. Take for example, the relatively simple English sentence:

Bob is tall.

It is useful to first examine the sentence from a purely grammatical perspective. We see clearly that “Bob” is our subject, “is” a linking verb, and “tall” a predicate adjective.
Now we can examine this sentence, taken to be an idea (that is, a fully defined mathematical point), from a mathematical perspective. One can take the subject to be a variable, for example. By using “Bob,” we have fixed the subject variable on Bob, here taken to be a known value for both speaker and listener. Our only other variable in this case is the one addressed by the predicate adjective, “tall.” In other words, height is the variable. “Tall” can be seen as a value, also assumed to be known by both speaker and listener. While height could be expressed easily as a real number, such as in inches, so that the value of the variable could take any positive value along the real line (or furthermore any range of numbers, as long as a bijective mapping exists between the two systems of measurement, but this is a needless digression), it could also be described by a finite set, such as {tall, average, short}, as it is in our sentence. The choice of sets from which variables' values can be selected is the central idea behind cultural parsing.
On a slight tangent, but an important one to keep our model of language consistent, we address non-addressed variables. In our example sentence, only two variables were addressed, the subject and the height. According to our model, the number of variables required to fully describe an idea is possibly infinite. Then what happened to our other variables? The other variables can be seen as being set by omission to a special value which must be included in all component sets of our Cartesian-product idea space, pointing to the values' irrelevance. The speaker can be seen as setting all the variables to this “irrelevant” value by not mentioning the variables, indicating he/she is, quite obviously, not currently addressing said variables.
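For the programmers in the audience, here is a rough sketch of that "set by omission" idea in Python. The variable list, the sentinel name IRRELEVANT, and the function express are all made up for illustration, and a four-item tuple is obviously a poor stand-in for a possibly infinite set of variables.

```python
# Hypothetical sentinel standing in for the special "irrelevant" value
# that every component set of the idea space is assumed to contain.
IRRELEVANT = "irrelevant"

# A small, finite stand-in for the (possibly infinite) list of variables.
VARIABLES = ("subject", "height", "color", "mood")


def express(**addressed):
    """Build a full point in idea space from the variables a speaker mentions."""
    unknown = set(addressed) - set(VARIABLES)
    if unknown:
        raise KeyError(f"unknown variables: {sorted(unknown)}")
    # Every variable the speaker did not mention is set to IRRELEVANT.
    return {name: addressed.get(name, IRRELEVANT) for name in VARIABLES}


# "Bob is tall." addresses only two variables; the rest default to IRRELEVANT.
bob_is_tall = express(subject="Bob", height="tall")
print(bob_is_tall)
```

The point of the sketch is only that the speaker supplies two values while the listener still receives a value for every variable.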
Now, we examine the idea of cultural parsing. Our example sentence works fine here. We have seen that the variable of height can be described using various sets. One possible set was the set of positive real numbers, while another contained only three elements. The “irrelevant” value is omitted from here on for convenience. The idea behind cultural parsing is that two different societies may choose different sets with which to describe certain variables. To continue using our example sentence, we may consider two theoretical societies, conveniently named Society A and Society B, who are assumed to use the same words for “Bob” and “is” out of convenience. In reality, a simple difference in words like this is even fairly easy to handle, as long as a bijective map exists between the sets describing each of the variables “subject” and “linking verb.”

Society A – Society A is taken to be an extremely hierarchical society, in which the hierarchy is based on, of all things, height. The taller members of this society enjoy all the benefits of wealth and fame, while the shorter members are left with only low-paying, demeaning jobs. Furthermore, even a slight difference in height results in a huge difference in authority.

Society B – Society B is taken to be a society in which amusement parks are the central entertainment, religion, and source of wisdom. Those who may ride the rides are blessed, while those who can't are, simply, not.

Cultural parsing is the idea that societies choose descriptor sets based on which ranges of values of variables need to be distinguished for practical use. Let's see how this works with our two societies.
In Society A, everyone must know their height down to the millimeter, if not the picometer. Status is taken to be important to members of all societies, so it makes sense that people would want to know where they rank. As a result, Society A chooses to measure height using a continuous, non-negative scale (though non-negativity is, again, not a technically necessary condition).
In Society B, people below a certain height cannot ride any rides, and are, thus, all treated terribly. Those somewhat taller, but not topping the charts, can ride some rides, but not the best ones. Those of outstanding height have access to all rides, and are thus celebrities. Of course, all people in the lowest height bracket are treated the same as each other, as are those in the highest height bracket. Therefore, Society B naturally chooses to describe height using the set proposed earlier, {short, average, tall}, as further distinctions are superfluous.
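The contrast between the two societies can be sketched in a few lines of Python. This is only an illustration: the function names and the two ride thresholds (120 cm and 150 cm) are invented, since nothing above says where Society B draws its lines.

```python
def describe_height_a(height_cm: float) -> float:
    """Society A: keep the full measurement, since every difference matters."""
    return height_cm


def describe_height_b(
    height_cm: float,
    min_ride_cm: float = 120.0,   # assumed cutoff for riding anything at all
    best_ride_cm: float = 150.0,  # assumed cutoff for the best rides
) -> str:
    """Society B: collapse height into the three brackets that matter for rides."""
    if height_cm < min_ride_cm:
        return "short"
    if height_cm < best_ride_cm:
        return "average"
    return "tall"


for height in (110.0, 149.9, 150.0, 185.0):
    print(height, describe_height_a(height), describe_height_b(height))
```

Two people who differ by a millimeter get different descriptions in Society A, but the same one in Society B unless that millimeter happens to straddle a ride cutoff.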

This may all seem like a nice (or not so nice) theoretical model, but the astute reader is no doubt searching for a more realistic example of cultural parsing, as well as poking holes in it. One such hole should be obvious from our example. Even in English, height may be described using either real numbers or our other proposed descriptor set, among many others. A few hypotheses on the existence of redundant descriptor sets:

1. Members of a society tend to prefer a single descriptor set despite the existence of redundancy. Our societies were, indeed, hypothetical, and, moreover, both situations exist in one form or another in our own societies. A refined version of the cultural parsing theory may need to factor in situational preference and relative situational frequency in different societies.
2. Societies and situations are not homogeneous, though theoretically it is useful to assume so. Some members may prefer certain descriptor sets to others, and the overall effects of cultural parsing can be seen by an analysis of the society as a whole, possibly even statistically.
3. Societies are not static. Especially in this day and age, societies even interact with and change each other. Accordingly, descriptor sets can be altered as societies' need to describe various ideas changes. For example, ancient societies had no need to distinguish between an abacus and an electronic calculator, as neither existed. Eastern societies today must make that distinction, whereas Western societies, with little or no experience with abaci, have no need for that distinction.

As for a more concrete example, we turn back to the impetus for this paper, comparing English and Japanese. Here we will examine two subcases: color and seaweed.

In terms of colors, hopefully all readers are familiar with at least basic colors in English. Here we are concerned only with words for colors (this is the descriptor set for the “color” variable) which are commonly used, which can be understood to mean colors that even preschoolers are familiar with. For example, red and green are basic color-words, whereas compound color-words, such as blue-green, and more obscure color-words, such as beige, are not basic color-words. The descriptor set for color is therefore taken to be {red, orange, yellow, green, blue, purple, black, brown, white, gray, pink}. Of course, there could be some disagreement over the members of this set, but a more detailed discussion is beyond the scope of this paper.
A less familiar set is probably the descriptor set for color in Japanese. First, a slight digression on the nature of adjectives in Japanese. Adjectives in Japanese can be seen as falling into three categories: i-adjectives, na-adjectives, and no-adjectives (no-adjectives are more precisely called adjectival nouns). The types of adjectives are so named for the characters that follow the root of each. I-adjectives are constructed as root + i; for example, aka + i gives akai (red). I-adjectives are all native to the Japanese language, so color-words of this type can be seen as automatic members of the basic color-word set.
Na-adjectives are constructed prenominally as root + na. When used as a predicate, the na is dropped. It is worth noting that na-adjectives are sometimes referred to as nouns, as are no-adjectives. All foreign adjectives are na-adjectives. Na-adjectives are fortunately irrelevant to our discussion of colors.
No-adjectives are constructed as such: noun + no. This is why no-adjectives are more precisely called adjectival nouns. They are generally not regarded as adjectives, but are particularly relevant to our discussion of colors for the reason that certain colors (or, more precisely, ranges of variables which we in English call colors) can only be described as no-adjectives, specifically of the form noun + iro + no, where iro is the Japanese word for color (the mapping between iro and color is seemingly bijective). At least one fairly basic exception to this construction exists, that is midori-no, or green. The use of midori as an adjective is a relatively recent addition to Japanese, and will thus be specifically left out of the basic set.
Since no-adjective color-words are more awkwardly constructed than i-adjective color-words, they are postulated to be later additions to Japanese, and thus, less fundamental to the Japanese perception of colors. If we took only i-adjectives to make up the basic color descriptor set, we would be restricted to {akai, aoi, shiroi, kuroi} (red, blue, white, black). If we allow iro + no-adjectives, we also get (orange, yellow, brown, and gray).
Of course, these sets were fairly arbitrary, but based on personal experience in both America and Japan. For those who prefer violet to purple, no significant difference is incurred by a replacement. The main point here is that traditionally, colors (which could theoretically be described as numerical values [more precisely as coordinate pairs, if we include intensity or brightness] by the frequency of the colored light perceived by the speaker), are seen, or at least described, differently between Japan and America. In the language of cultural parsing, there is no bijective mapping between the descriptor sets [In actuality, a more precise mathematical description is called for here, but that will follow with more precise thought, as this paper is merely laying out a theoretical groundwork]. Specifically, colors (or ranges of frequency) described by English speakers as green, have traditionally been described as aoi, or blue, by Japanese speakers.
Here we see an example of cultural parsing, though one as yet unexplained (by this author). It is postulated that cultural parsing is the result of environments (in a very wide sense of the word) calling for different ideas to be expressed, or, rather, for different descriptor sets, but that some differences may be chalked up to randomness. This particular example may be one of randomness, but more thought may be necessary.
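The "no bijective mapping" claim can be checked mechanically on a small piece of the two color sets. The Python sketch below is only an illustration: the helper is_bijection and the five-word English subset are my own choices, and the mapping encodes nothing more than the traditional usage described above (green and blue both landing on aoi).

```python
def is_bijection(mapping, domain, codomain):
    """Return True if mapping sends domain one-to-one onto codomain."""
    if set(mapping) != set(domain):
        return False
    images = list(mapping.values())
    injective = len(set(images)) == len(images)   # no two words share an image
    surjective = set(images) == set(codomain)     # every target word is hit
    return injective and surjective


# A subset of the English basic color-words and the i-adjective set.
english = {"red", "green", "blue", "white", "black"}
japanese = {"akai", "aoi", "shiroi", "kuroi"}

# Traditional usage as described in the text: green and blue both map to aoi.
traditional = {
    "red": "akai",
    "green": "aoi",
    "blue": "aoi",
    "white": "shiroi",
    "black": "kuroi",
}

# Group the English words by the Japanese word they land on.
collapsed = {}
for english_word, japanese_word in traditional.items():
    collapsed.setdefault(japanese_word, []).append(english_word)

print(is_bijection(traditional, english, japanese))
print({word: sorted(sources) for word, sources in collapsed.items() if len(sources) > 1})
```

The mapping covers every Japanese word but is not one-to-one, which is the failure the text describes; with five words on one side and four on the other, no choice of mapping could be a bijection.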

A second, and more easily explainable, example comes from seaweed. For those unfamiliar, the Japanese diet has traditionally contained several varieties of seaweed. Nori, wakame, and kombu are different foods which have been consumed in Japan for hundreds of years. No English words exist for them, aside from the recent addition of the loan words from Japanese, which can be ignored for the purposes of this paper, as they are an obvious example of non-static societies. All three words can be most easily explained to native English speakers unfamiliar with the foods as seaweed (then modified to specify the kind of seaweed).
In the language of cultural parsing, we see the sets (which are no doubt incomplete on the Japanese side, but that point is irrelevant to the current discussion) cannot be mapped bijectively onto each other. That is, the Japanese descriptor set is {kombu, wakame, nori}, whereas the English descriptor set is {seaweed}.
The reasons for differences in cultural parsing should here be obvious. Native speakers of English likely never had experience eating seaweed, so making distinctions among different varieties would be fruitless and unnatural. Thus, the small descriptor set. Japanese speakers, most of them presumably Japanese, would have need to specify a variety of seaweed, as one would not like to try wrapping makizushi in wakame! Thus, the larger descriptor set.

This paper is, as noted, only a theoretical groundwork in the area of cultural parsing. Clearly, individual cases of cultural parsing still need to be explained. Furthermore, the example of color shows us that variables (dimensions) can sometimes be seen as examples of cultural parsing. Typically, what we English speakers characterize as the variable “color” can be seen as multiple variables, say, “color (frequency)” and “intensity.” So, some variables may even be seen as functions of other variables. Given that the set of possible variables may itself be infinite, we have quite a lot of work in front of us.

Apparently, Blogger ignores the formatting from OpenOffice, but you aren't missing much there. Sorry!

2 comments:

Potomac Rubella said...

This should have been your admissions essay.

Hot Topologic said...

I'm sure they would have been pleased. Really, I have more pondering to do on the theory.