Alice Thomas

Text Mining Sartre’s “Being and Nothingness”

Sartre_Being_and_Nothingness.jpg

One of the greatest Existentialist thinkers, Jean-Paul Sartre transforms metaphysics and ethics in Being and Nothingness. The work maps the fabric of Being in its various shapes and experiences of consciousness, developing the idea of an internal negation connecting the for-itself (consciousness, defined as the privation of a particular Being) and the in-itself (non-conscious Being) (800). Sartre traces the origin of consciousness: “[t]he For-itself, in fact, is nothing but the pure nihilation of the In-itself; it is like a hole in being at the heart of Being” (785-6).

Being and Nothingness is a challenging work to interpret, so I thought its readers might benefit from a big-picture quantitative analysis of the text and visual representations of its central themes and relationships. Text analytics can yield surprising insights, or it can confirm hunches readers might already have about interpretation.


In order to mine the text for additional insights beyond my own intuitive understanding, I worked from Julia Silge and David Robinson’s excellent introductory book, Text Mining with R: A Tidy Approach, and the syuzhet vignette found here.


I was able to locate a digital (pdf) copy of the physical text I own here, which required a bit of pre-processing to remove unwanted sections (images, extraneous text).


There was naturally a learning curve in trying out text mining for the first time. However, Text Mining with R lays out the foundations in a helpful, simple manner and suggests packages to start with. In this particular case, I used the following packages: tidyverse, tidytext, pdftools, tokenizers, textdata, syuzhet, wordcloud, ggplot2, and reshape2.
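For reference, loading them looks something like this (a minimal setup sketch; nothing here is specific to my code beyond the package names listed above):

```r
# Packages used throughout this analysis
library(tidyverse)   # dplyr, tibble, and general data wrangling
library(tidytext)    # tidy tokenization, stop_words, sentiment lexicons
library(pdftools)    # pdf_text() for extracting text from the PDF
library(tokenizers)  # tokenization helpers
library(textdata)    # lexicon downloads used by tidytext
library(syuzhet)     # get_sentences(), get_sentiment(), simple_plot()
library(wordcloud)   # wordcloud() and comparison.cloud()
library(ggplot2)     # plotting (also loaded by tidyverse)
library(reshape2)    # acast() for the comparison cloud matrix
```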


The function pdf_text() was used to pull the full text from the pre-processed file. syuzhet conveniently provides a way to break the text down into sentences using get_sentences(), and I then scored each sentence with get_sentiment(). Running sum() over the scores gave 2730.75, indicating an overall positive tone, while mean() yielded 0.1863, suggesting an average per-sentence tone that is roughly neutral to mildly positive. Next, summary() was used to glean the overall distribution of sentiment in the text: the minimum was -4.5000, the first quartile -0.2500, the median 0.0000, the mean 0.1863, the third quartile 0.6500, and the maximum 6.5500.
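A rough sketch of that pipeline (the file name below is a placeholder for my pre-processed PDF, and the object names are illustrative):

```r
# Pull the full text from the pre-processed PDF (one element per page)
raw_pages <- pdf_text("being_and_nothingness_clean.pdf")  # placeholder file name
full_text <- paste(raw_pages, collapse = " ")

# Break the text into sentences and score each one with the default syuzhet method
sentences  <- get_sentences(full_text)
sentiments <- get_sentiment(sentences)

sum(sentiments)      # 2730.75 in my run: overall positive tone
mean(sentiments)     # ~0.1863: mildly positive on average
summary(sentiments)  # min, quartiles, median, mean, max
```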

sartre_summary.png

Importantly, syuzhet provides great ways to visualize text data. To start, I made a plot of the sentiment data over the span of the text:

plot_sartre.jpeg
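A plot like this can be made with base R’s plot() on the sentiment vector; a sketch, assuming the sentiments object from above:

```r
# Sentence-level sentiment in narrative order
plot(
  sentiments,
  type = "l",
  main = "Sentiment in Being and Nothingness",
  xlab = "Narrative time (sentences)",
  ylab = "Sentiment"
)
```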

Then, I made what’s called a “simple plot”, showing scaled sentiment over the full narrative time:

sartre_simple_plot.jpeg
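simple_plot() comes from syuzhet and takes the raw sentiment vector directly; a minimal sketch, again assuming the sentiments object from earlier:

```r
# Scaled sentiment over normalized narrative time, plus a smoothed macro shape
simple_plot(sentiments)
```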

There is a distinct dip in the arc at the beginning, followed by a wave that gradually builds to a high point towards the last third of the text. Note that simple_plot() also gives a simplified macro view of the arc, which shows a build-up from negative to positive sentiment over the course of the text. This makes sense: the first part of the text serves as a proving ground for the conceptual conflicts to be resolved, those concerning the questions arising from Being, while the subsequent sections build on the architecture established there and flow towards conclusions that posit a resolution of these conflicts.


Next, let’s look at the overall distribution of these sentiments, to understand the initial summary statistics a bit better. ggplot() was used to generate a plot showing the distribution of the syuzhet sentiment scores:

sartre_sentiment_distribution.jpeg
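One way to produce a histogram like this with ggplot() (the bin width here is just an illustrative choice):

```r
# Histogram of sentence-level sentiment scores
tibble(sentiment = sentiments) %>%
  ggplot(aes(x = sentiment)) +
  geom_histogram(binwidth = 0.25) +
  labs(
    title = "Distribution of syuzhet sentiment scores",
    x = "Sentiment per sentence",
    y = "Number of sentences"
  )
```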

Note the spike at 0.0, where the many sentences with neutral sentiment sit, and the slightly heavier tail on the right, in the positive sentiment zone. Our initial statistics, particularly the sum() and mean(), confirm this. Based on this analysis, we can conclude that the overall tone of the book is positive.

The next procedures, in a separate block of code, rely more on tidytext and the other packages. The text was read in and formed into a data frame to start. Stop words (very common words such as articles and prepositions, which carry little meaning or sentiment) were removed, and the number of occurrences of each remaining word was counted. I created a wordcloud from these counts to show the relative prominence of each word (limited to the top 50 words):

sartre_wordcloud.jpeg
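Roughly, the tidytext side of this looks like the sketch below (for brevity it reuses the sentences object from earlier rather than re-reading the file, and the object names are illustrative):

```r
# Tidy word counts: one token per row, stop words removed
word_counts <- tibble(text = sentences) %>%
  unnest_tokens(word, text) %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE)

# Wordcloud of the 50 most frequent words
wordcloud(
  words = word_counts$word,
  freq  = word_counts$n,
  max.words = 50,
  random.order = FALSE
)
```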

Of note, “world” is the most frequent word, at 1438 occurrences, followed by “consciousness” at 1282, “object” at 1118, “relation” at 840, and “freedom” at 814. This reflects how central these concepts are, at least in terms of how often they appear in the text. One of the main relationships discussed is that of consciousness to the world, and the visualization highlights exactly that.


Let’s look at a plot showing the frequency of each word with more than 500 occurrences:

sartre_frequency_plot.jpeg
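A sketch of how such a frequency plot can be built from the same counts:

```r
# Bar chart of words appearing more than 500 times
word_counts %>%
  filter(n > 500) %>%
  mutate(word = reorder(word, n)) %>%
  ggplot(aes(x = word, y = n)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL, y = "Occurrences")
```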

The most frequent words in the text are, in order: “world”, “consciousness”, “object”, “relation”, “freedom”, “body”, “nothingness”, “past”, “meaning”, “pure”, and “existence”.


The comparison cloud, on the other hand, offers more of an insight into the ethical depths of the text. Comparison clouds are built out of sentiment data; in this case I used the “bing” sentiment lexicon. The words “object”, at 1118 occurrences, and “negation”, at 447 occurrences, carry more weight on the negative side, while “freedom”, at 814 occurrences, and “pure”, at 513 occurrences, rest firmly on the positive side.

sartre_comparison_cloud.jpeg
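A comparison cloud like this can be generated by joining the word counts to the “bing” lexicon and reshaping them into a term matrix; a sketch, with colors and word limit chosen only for illustration:

```r
# Join counts to the bing lexicon, reshape to a term matrix, draw the cloud
word_counts %>%
  inner_join(get_sentiments("bing"), by = "word") %>%
  acast(word ~ sentiment, value.var = "n", fill = 0) %>%
  comparison.cloud(colors = c("gray20", "gray70"), max.words = 100)
```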

As a first text mining project, this was admittedly ambitious. I chose a dense and deeply technical text, with a lot of terminology that may have been caught in the wide dragnets of the stop-word lexicons. On a next pass, terms such as “Being for-itself”, “Being in-itself”, and so on would require certain exceptions. I have much more to learn, but I think I still got a good rough snapshot of the text, one that underlines its central concepts and overarching trajectory. Check out my GitHub repository here for further details, or to try this out yourself.
