Analysis. Vera is the result of research into changes in communication in today’s society,
the storage of huge amounts of information online, language, digital reading and literary analysis.


Background

Social change and technological progress in recent years have favoured the storage of enormous quantities of information on the Internet. Because of this sheer volume of information, the digital environment is dominated by brevity and simplicity.

Language use has also been changing over these years; it has adapted to new media for creating and distributing information: paper (newspapers, books, magazines…), screens (TVs, computers, mobile phones, tablets…), etc.

Nowadays, a new model of communication based on digital relationships is becoming the norm. Language is increasingly shaped by the dynamics that networks and the Internet demand: speed, concision, brevity and simplicity. These factors do not contribute to the development of language, and they compromise the quality of the information transmitted. In addition, this tendency, together with the spread of short messages, makes content inexact and superficial. The form of language affects its meaning, and both are under threat on the Internet.

Many groups of people have been genuinely concerned with preserving language: from the Dadaists (Tristan Tzara, Apollinaire…) and the members of the OuLipo Group, who worked with metalanguage, to more recent writers such as George Orwell, William S. Burroughs or Ray Bradbury, who regarded language as a form of control, a virus.

All this led me to devise a project that aims to counter the impoverishment of language. Literary texts were taken as the starting point, since they are less popular on the Internet and are distinguished by their literary quality. I propose to integrate these texts into the Internet. To accomplish this, I present the data as a means of explaining and distributing the project. The ultimate purpose of my study is to stimulate analysis of the text and its language, to create new experiences of interaction and to promote reading.


1. Measuring writing style

“The battle of Waterloo was certainly fought on a certain day; but is Hamlet a better play than Lear? Nobody can say. Each must decide that question for himself.”

— Virginia Woolf

There is no unanimous view of what makes language good. That is why I focused on the analysis of each author’s style; complexity, rhythm, action, ornamentation and difficulty of reading are the variables I paid attention to. To pin down the style, I took the sentence as the minimum unit of measurement.

· Complexity (length of the sentence).
· Rhythm (variations in the length of the sentence).
· Action (number of verbs in a sentence).
· Ornamentation (percentage of adjectives in a sentence).
· Difficulty of reading (number of long sentences in the novel).

Each of these factors is based on counting and classifying words, their lengths and their combinations in the text, because WordNet, the software used for the analysis, deals only with the morphological and quantitative aspects of language.
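The five variables listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for Vera: the tiny `TOY_POS` lexicon stands in for real part-of-speech information, rhythm is assumed to be the standard deviation of sentence lengths, and the 25-word threshold for a "long" sentence is an arbitrary choice, since the text gives none.

```python
import re
from statistics import pstdev

# Hypothetical toy lexicon: a real analysis would take part-of-speech
# information from a lexical resource or tagger instead.
TOY_POS = {
    "was": "verb", "fought": "verb", "say": "verb", "decide": "verb",
    "certain": "adj", "better": "adj",
}

LONG_SENTENCE_WORDS = 25  # assumed threshold; the source does not state one


def split_sentences(text):
    """Split after '.', '!' or '?' followed by whitespace (a naive rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lower-case the sentence and keep alphabetic words only."""
    return re.findall(r"[a-z']+", sentence.lower())


def sentence_metrics(sentence, pos=TOY_POS):
    """Per-sentence variables: complexity, action and ornamentation."""
    tokens = tokenize(sentence)
    n = len(tokens)
    verbs = sum(1 for t in tokens if pos.get(t) == "verb")
    adjectives = sum(1 for t in tokens if pos.get(t) == "adj")
    return {
        "complexity": n,                                    # sentence length in words
        "action": verbs,                                    # number of verbs
        "ornamentation": 100.0 * adjectives / n if n else 0.0,  # % adjectives
    }


def text_metrics(text, pos=TOY_POS, long_threshold=LONG_SENTENCE_WORDS):
    """Whole-text variables: rhythm and difficulty, plus per-sentence data."""
    per_sentence = [sentence_metrics(s, pos) for s in split_sentences(text)]
    lengths = [m["complexity"] for m in per_sentence]
    return {
        "sentences": per_sentence,
        # Rhythm: variation in sentence length (assumed: population std. dev.)
        "rhythm": pstdev(lengths) if lengths else 0.0,
        # Difficulty: how many sentences reach the "long" threshold
        "difficulty": sum(1 for n in lengths if n >= long_threshold),
    }
```

Calling `text_metrics` on a whole novel returns one record per sentence plus the two text-level values, which is the kind of purely quantitative data the analysis relies on.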

An author’s style is shaped by their world, their education, their life experience… It is the way an author writes that determines the reader’s experience of a book.


2. Visualization design

To create the visualization, I started from conceptual premises that should in some way be present in the design. A literary work is a unit that should be perceived as a whole. Reading and writing are processes that take place over time and consist of different phases. A presentation of a text cannot be fully understood unless it is supported by, or refers to, the text it deals with. Finally, there must be a poetic component that reveals the creative personality of the author; after all, literature is a form of art.

My aim from the beginning was to present the style of the text in a simple form, without obscuring the results, since the subject itself is rather complicated. To do this, I took elements from the graphic world (shape, colour, size, position…) and assigned to them the variables previously defined for analyzing the texts. In this way, once the code is learned, the data transmitted by the visualization can be read easily.
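As a rough illustration of assigning variables to graphic elements, the Python sketch below maps one sentence to a position, a size and a grey level. The pairing shown here (length to size, verbs to darkness) and the numeric ranges are hypothetical choices made for the example; the text does not say which variable drives which graphic element in Vera.

```python
def scale(value, lo, hi, out_lo, out_hi):
    """Linearly map value from [lo, hi] to [out_lo, out_hi], clamping at the ends."""
    if hi == lo:
        return out_lo
    t = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    return out_lo + t * (out_hi - out_lo)


def encode_sentence(index, length, verbs, max_length, max_verbs):
    """Return hypothetical visual attributes for one sentence."""
    return {
        "position": index,                                   # reading order along one axis
        "size": scale(length, 0, max_length, 2.0, 20.0),     # longer sentence, larger mark
        "grey": round(scale(verbs, 0, max_verbs, 230, 40)),  # more verbs, darker mark
    }
```

Because every variable goes through the same linear scale, a reader who has learned the mapping once can compare sentences, and whole novels, at a glance.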

Three levels of representation were established. The first is general and shows the whole novel together with its poetic component. The second, more analytical, shows the sequence of ordered sentences and the text of each one. The third shows a single sentence and its components. This division is very useful because it helps to understand the process at work in each case: see, sort and read.
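The three levels can be thought of as three views over the same list of sentences. The following Python sketch is an assumption-laden illustration: the function names are invented for the example, and ordering by word count is only one possible criterion, since the text does not specify how the second level is ordered.

```python
def overview(sentences):
    """Level 1 (see): the whole text at once, as one word count per sentence."""
    return [len(s.split()) for s in sentences]


def ordered(sentences):
    """Level 2 (sort): sentences ranked by word count, each keeping its text."""
    ranked = sorted(
        enumerate(sentences),
        key=lambda pair: len(pair[1].split()),
        reverse=True,
    )
    return [{"index": i, "words": len(s.split()), "text": s} for i, s in ranked]


def detail(sentences, index):
    """Level 3 (read): a single sentence broken into its words."""
    return sentences[index].split()
```

Each level keeps the original sentence index, so a mark selected in the overview can always be traced back to the text it stands for.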


3. Future research lines

The analysis of meaning remains an open field of investigation. Figures of speech may be an important indicator of an author’s complexity and creative genius, because they reveal a capacity to associate words beyond the norm. Metaphor and similar devices cannot be detected automatically, because they obtain their meaning only through their relationship with other words, which acquire new meanings as a group.

Semantic fields belong to this line of research too: can references to physical or abstract objects, feelings, moral values or colours reflect historical changes in the language? Can we get to know a society better through the way it uses language? The Stanford Literary Lab has started to work with these hypotheses, which can be developed in depth.

Generally speaking, one of the greatest problems in literary studies is whether quantification is a valid system of analysis for humanistic disciplines. Can a program summarize the experience of literature or music using quantitative data? And, if not, what is the function of procedures of this type? Computational linguistics still has limitations that need to be resolved in the future. This opens an interesting line of investigation.