Distant reading of Robert Louis Stevenson’s works using CLiC and Voyant Tools

Robert Louis Stevenson

We have been reading novels and poems closely and reflecting on them ever since we first started analyzing literature. Recently, however, we learned a different method of analyzing texts called Distant Reading, which relies heavily on computer programs. The term was coined by Franco Moretti, who wanted to analyze thousands of works of literature by feeding them into a computer. We familiarized ourselves with two text analysis tools, CLiC and Voyant Tools, and chose three novels by Robert Louis Stevenson in order to discover how these tools work and to dig deeper into Distant Reading.

There’s something satisfying about murder mystery books. The crime of murder is inherently terrifying, which makes these stories part horror fiction and part puzzle. You can feel the danger of murder looming in the background while also having the satisfaction of solving the crime in the foreground of the story. Having read “The Strange Case of Dr Jekyll and Mr Hyde” by Robert Louis Stevenson, we decided as a team to also focus on two of his adventure novels written in the same era: “Kidnapped” and “Treasure Island”. All three texts were available on CLiC. For Voyant Tools, however, we had to build our own corpus, using Project Gutenberg, a library of over 60,000 free eBooks, as our source.
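
For anyone who wants to build a similar corpus, here is a minimal Python sketch of one cleaning step. It assumes the plain-text files from Project Gutenberg, which usually wrap the book between marker lines beginning with "*** START OF" and "*** END OF"; the function name `strip_gutenberg` and the sample string are our own illustrations, and the exact marker wording can vary between files.

```python
import re

# Marker lines that usually surround the book text in Project Gutenberg
# plain-text files. The wording varies slightly, so both "THE" and "THIS"
# are accepted here (an assumption, check your own files).
START = re.compile(r"^\*\*\* ?START OF (THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE)
END = re.compile(r"^\*\*\* ?END OF (THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE)

def strip_gutenberg(raw):
    """Return only the text between the start and end markers.

    If a marker is missing, fall back to the beginning or end of the file.
    """
    start = START.search(raw)
    end = END.search(raw)
    begin = start.end() if start else 0
    finish = end.start() if end else len(raw)
    return raw[begin:finish].strip()

# Hypothetical miniature file, written only for this example.
sample = (
    "Header and license notes\n"
    "*** START OF THE PROJECT GUTENBERG EBOOK KIDNAPPED ***\n"
    "Body text.\n"
    "*** END OF THE PROJECT GUTENBERG EBOOK KIDNAPPED ***\n"
    "More license text\n"
)
print(strip_gutenberg(sample))
```

Without this step, the license text at the top and bottom of each file would be counted along with the novel and could distort the word frequencies.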

Our Corpus

Many questions arose during our corpus analysis as a group, but we managed to narrow them down to two main topics:

Throughout the novels, there is a notable lack of female characters compared to male ones. When women do appear in the story, they are victims or servants rather than characters in more prominent roles. How is this a reflection of life in Victorian England?

During the Victorian era, when Stevenson published his books, women were considered to belong to the domestic sphere. This stereotype required them only to provide their husbands with a clean home, put food on the table, and raise their children. Even though the era is symbolized by the reign of Queen Victoria, women did not have the right to vote, sue, or own property.
Having previously read Stevenson’s work (Close Reading), we noted critical moments of feminine quality in these male-centered stories. They can be identified through the roles that female characters play as (1) humble counterparts to males in the socio-cultural context and (2) feminine energy in sex and sexuality in the gender context.

Based on this idea, we wanted to see whether Distant Reading tools would give us the same results. We started by analyzing “The Strange Case of Dr Jekyll and Mr Hyde” on CLiC. In the “Concordance” tab we searched for the term “her” and saw that it appeared alongside the words “scream”, “lamentation” and “master”, which emphasizes the role of women as victims and servants.

Snapshot of concordance tool in CLiC
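
For readers curious about what a concordance does under the hood, here is a minimal keyword-in-context sketch in Python. It is not how CLiC is implemented; the function name `concordance`, the `width` parameter, and the sample sentence are our own assumptions for illustration, and the sample is not a quotation from the novel.

```python
import re

def concordance(text, term, width=5):
    """Return (left, node, right) context windows for each occurrence of term.

    The text is lowercased and split into simple word tokens, so punctuation
    is ignored. `width` is the number of words kept on each side.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = []
    for i, token in enumerate(tokens):
        if token == term.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

# Placeholder text written for this example, not taken from the book.
sample = "The maid gave a scream when she saw her master fall. Her lamentation filled the house."

for left, node, right in concordance(sample, "her", width=3):
    print(f"{left:>30} [{node}] {right}")
```

Lining up every occurrence of a word with its neighbours like this is what lets patterns such as “her” next to “scream” or “master” stand out at a glance.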

Digging deeper, we also searched the “Concordance” feature for the word “woman” and found the sentence highlighted in yellow, in which the author’s view of women is expressed through one of the characters. Stevenson used the term “woman” to describe weakness, comparing it to a “lost soul” and associating it with the demeaning word “weeping”. When we looked up the definition of “weeping”, one of the example phrases given was related to Victorian times: “a Victorian tombstone that depicted a weeping woman”, which reflects how women were portrayed in Victorian England.

Snapshot of “Concordance” tool in CLiC
Snapshot of the definition of the word “weeping”

Switching to Voyant Tools, we uploaded our Stevenson corpus and started with “Cirrus”, a word cloud that displays the most frequently used words in a colorful manner that is very pleasing to the eye. After adding common words such as “like”, “come” and “said” to the stoplist so that they were excluded, we saw that “man” is the most frequent term. Additionally, the “Trends” tool helped us visualize how much more frequent the term “he” is than “she”.

Snapshots of Cirrus and trends using Voyant tool
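
The counting behind a word cloud can be sketched in a few lines of Python. This is only an approximation of the idea, not Voyant’s actual code: the `STOPLIST` below is a tiny list we made up (Voyant’s built-in English stoplist is much longer), and the sample sentence is a placeholder rather than text from the corpus.

```python
import re
from collections import Counter

# A tiny illustrative stoplist, including the three words we added ourselves.
STOPLIST = {"the", "a", "and", "of", "to", "was", "is", "in", "like", "come", "said"}

def top_terms(text, stoplist=STOPLIST, n=5):
    """Count word tokens that are not in the stoplist and return the n most common."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in stoplist)
    return counts.most_common(n)

# Placeholder sentence written for this example.
sample = "The man said the man was like a man of the sea, and the sea was cold."
print(top_terms(sample, n=2))
```

A word cloud then simply draws each remaining word at a size proportional to its count, which is why the choice of stoplist changes the picture so much.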

This information sparked our curiosity. With the help of Google Ngram Viewer, we learnt that in the 19th century “man” was used almost 80% more than “woman” in books, which could point to an era of gender bias. This fits the Victorian stereotype of women belonging to the domestic sphere, expected only to provide their husbands with a clean home, put food on the table, and raise their children.

Snapshot of Google Ngram Viewer

Good and evil are so close as to be chained together in the soul

Robert Louis Stevenson

The novels revolve around the contest between good and evil. Who wins in the end?

Stevenson always believed that humans have a good and an evil side. According to him, both versions live inside us, but the evil one is always repressed by society. These thoughts led him to write the famous book Strange Case of Dr. Jekyll and Mr. Hyde, a story about a character with a complex personality disorder.

Stevenson uses Dr. Jekyll and Mr. Hyde as metaphors for good and evil. Throughout the story, this duality and contrast take centre stage. He also wanted to point out that evil can express itself in anyone, no matter how rich they are or how well we think we know them. Even Jekyll had a bad side, which led him to create Mr. Hyde, who commits many horrible, unforgivable acts. On the one hand, Mr. Hyde is compared to Satan in the book, which emphasizes how evil he is.

Snapshot of the text “the Strange Case of Dr Jekyll and Mr Hyde” where Mr Hyde is compared to “Satan”

We searched for the term “evil” in the “Concordance” tool and found an eye-catching sentence saying that Edward Hyde was pure evil.

Snapshot of “Concordance” tool in CliC

On the other hand, Jekyll represented the complete opposite of Hyde. He was well liked by the people around him and was a respected figure, as shown in the sentence highlighted in yellow below, which we found by searching for the word “friends” in the “Concordance” feature on CLiC.

Snapshot of “Concordance” tool in CliC

The most surprising thing we found in our analysis was that the words we expected to be associated with the stories, such as “evil”, “fear” and “bad”, appear far less often than “good” and “great”. This is illustrated in a graph produced by the “Trends” tool in Voyant Tools.

Snapshot of Trends in Voyant tool
Evil vs. Good
Source: https://annablogcode.files.wordpress.com/2020/03/19ac0-1448851093574-87ozjxfdd7qlasgn0hvp.jpg
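
A comparison like “good” versus “evil” can also be reproduced roughly by hand. The Python sketch below is written in the spirit of the Trends graph and is not Voyant’s own algorithm: it cuts a text into equal chunks and reports the share of words in each chunk that match a term. The function name `relative_trend`, the `segments` parameter, and the sample string are our own assumptions.

```python
import re

def relative_trend(text, term, segments=10):
    """Relative frequency of term in consecutive, equally sized chunks of text.

    The chunk size is rounded up, so the result has at most `segments` values,
    and the last chunk may be shorter than the others.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    trend = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        trend.append(chunk.count(term.lower()) / len(chunk))
    return trend

# Placeholder word sequence written for this example.
sample = "good good evil good great evil good good"
print("good:", relative_trend(sample, "good", segments=2))
print("evil:", relative_trend(sample, "evil", segments=2))
```

Using relative rather than raw counts matters when the texts differ in length, as our three novels do, because a longer book would otherwise look more “good” or more “evil” just by having more words.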

Judging from the distant reading results alone, we would expect “good” to win at the end of each story. In “Kidnapped”, for example, Stevenson acknowledges that there will always be some bad apples, but rather than letting them get away with it, as Long John Silver does in “Treasure Island”, he shows that with some extra effort the good emerge victorious. However, when we switched to Close Reading for “The Strange Case of Dr. Jekyll and Mr. Hyde”, the final statement of Henry Jekyll (explaining everything before his death/suicide) shows that Mr Hyde took over Dr Jekyll’s thoughts, which led the doctor to end his life. This is a case where evil conquered good. Several sentences in Jekyll’s final statement prove this point:

“The powers of Hyde seemed to have grown with the sickliness of Jekyll.”

“The hatred of Hyde for Jekyll, was of a different order.”

“Here then, as I lay down the pen and proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll to an end.”

We can finally say that in spite of our great results with distant reading tools, a better analysis can be done with close reading.

Although most text analysis tools are not designed for beginners in the digital humanities like us, we found Voyant Tools and CLiC very clear. Their websites provide straightforward explanations of their different features, with text and screenshots, plus screencast tutorials in the case of Voyant Tools. Their best feature is their user-friendliness, evident in their extensive documentation, simple user interface, and ability to export data. What we found most attractive is the simplicity. This quality shows in the single data-entry field on the homepage as well as in the panels on the data analysis dashboard. Each panel contains a different tool useful for text analysis, such as Cirrus, a word cloud generator on the Voyant Tools website.

The only limitations we found in Voyant Tools were the occasionally long text-loading time and the difficulty of gathering information with some of the visualization tools configured in the Voyant Skin Builder. As for CLiC, its limited selection of texts means you may sometimes not find the specific text or corpus you would like to analyze.

To sum up, an obvious difference between Distant and Close Reading is that a computer can analyze thousands of books at the click of a button, whereas it would take us humans years to go through the same number of books. Another problem with Close Reading is its subjectivity: every person analyzes a book with an opinion. A computer’s work is more objective and analyzes the texts as they are, with no emotions involved. A computer can also find hidden aspects in plots by transforming them into networks. However, with Distant Reading, readers don’t get the full experience of reading a book and can’t fully understand the messages that the authors are trying to deliver.

Hope you enjoyed reading our blog!

Anna, Tara, and Marc.
