To introduce this post, I would like to mention the inspiration for this experiment, which departs sharply from what I have previously done. Prof. Eric Raimy spoke with me about my work on this blog and, while remaining supportive of what I was doing, challenged me to look at my methods anew. His objection to where my work was heading was that it didn’t seem logical or realistic that I could get similar or identical results even after eliminating a character or two from the play. For what is it that we perceive as “Shakespeare” if the lines of a leading character are but a trifle? Prof. Raimy told me that I needed to try the opposite of my subtraction experiments; that is, a multiplication experiment.
My setup remained with the three Romances I had looked at previously and their respective main characters. I decided now to isolate the characters and their respective plays one by one against the corpus, thereby attempting to truly pinpoint the effects of these characters and the effects acting upon them. In the fourfold pictures below, that initial isolation against the corpus is in the top left corner. I then amplified any potential effects of the main character in their play by multiplying their lines by two, five, and then ten. My results are below, with Leontes and the Winter’s Tale set leading.
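The amplification step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used to produce the diagrams: it assumes a play is available as a list of (speaker, speech) pairs, and the names `amplify_speaker` and `copies` are hypothetical. Whether “multiplying by five” means five copies in total or five extra copies is a bookkeeping choice; the sketch takes the first reading.

```python
from typing import List, Tuple

Speech = Tuple[str, str]  # (speaker, text of the speech)


def amplify_speaker(play: List[Speech], speaker: str, copies: int) -> List[Speech]:
    """Return a new play in which every speech by `speaker` appears `copies` times.

    Assumption: `copies` includes the original speech, so copies=1
    returns the play unchanged.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    amplified: List[Speech] = []
    for who, text in play:
        repeat = copies if who == speaker else 1
        amplified.extend([(who, text)] * repeat)
    return amplified


# Toy stand-in for a play; not the real text of the Winter's Tale.
toy_play = [
    ("LEONTES", "first speech"),
    ("HERMIONE", "second speech"),
    ("LEONTES", "third speech"),
]

for factor in (2, 5, 10):
    version = amplify_speaker(toy_play, "LEONTES", factor)
    share = sum(1 for who, _ in version if who == "LEONTES") / len(version)
    print(factor, len(version), round(share, 2))
```

Repeating whole speeches in place keeps the rest of the play untouched, so any change in the diagram can be attributed to the amplified character alone.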
Initially in the Winter’s Tale isolation, the movement of most of the highest tragedies is fairly obvious. With the introduction of Leontes, the orange section from Antony and Cleopatra to Timon of Athens is moved out of what was previously the green middle of the original corpus diagram. However, it is not that the orange section is moving apart, but rather the blue section moving onto the other half of the tree to gather near the red cluster. This is strikingly different from the original corpus and would suggest that the isolation of Leontes prompted Shakespeare’s corpus to divide itself and move apart.
When I spoke with Prof. Michael Witmore last Friday, he asked me how I would describe what is going on in these diagrams, particularly when the introduction of new elements, or the division of standing elements, results in movement across the diagram. I ended up settling on the thought of a lump of texts represented two-dimensionally on the diagram. When a new element is introduced, it changes the features of the lump, so that it acts similarly to the convolutions of the brain. A similar “brain” or corpus of texts lies underneath each of these “Shakespearean” diagrams, but their distinguishing features, i.e. convolutions, shift and change. This can be transferred onto different corpora in a set as well. Instead of the convolutions and ridges of the data set being the limitations and distinguishing features of the corpus, the idea of a brain or central locus becomes an outer periphery past which nothing in the data set can be represented. This idea of an expansive universe approach is shown in diagrams like Prof. Witmore’s diagram of one hundred and fifty years of drama or even across authors like Dekker and Shakespeare. While this larger space allows for more movement of groups of texts, assuming movement to the outer reaches is just as illogical as assuming less movement in a space with fewer restrictions. However, movement can be observed much as it is in the analysis of a galaxy, with more movement appearing when viewed within a larger scope. Following this idea, different sets of Shakespeare’s plays may have more room to distinguish themselves and their features, but they still remain within the ultimate system of “Shakespeare”. This expansive explanation may cause more problems than it solves, but hopefully it facilitates grappling with the results and texts I show here.
Returning to Leontes, if the presence of his lines truly causes the corpus to rupture, a closer analysis and more control of the variables are needed to prove this. Shown below are two diagrams, one without Leontes’s isolated lines present and one without the text of the Winter’s Tale without Leontes’s lines. As you may see, the movement still occurs whether or not Leontes’s lines are isolated. This kind of element manipulation may be useful for later applications; however, I am currently working off of the assumption that two versions of the play (or three if you count Leontes’s lines) will not be detrimental to the structure of the corpus, but rather illuminating.
What is also interesting while analyzing the other three diagrams in this set is the fact that Leontes’s lines and the play don’t move. Indeed, the only movement relative to the play that can be seen is that of the play without Leontes and the Merchant of Venice. They cluster together in the third diagram and move as a pair to the extremity of the diagram in the fourth. But when the Merchant moves in the third diagram, other elements move as well. The large movement between the second and third diagrams redoes the division seen initially. However, now the division isn’t as clean as it was. A Midsummer Night’s Dream and Romeo and Juliet are both left with the Histories, which is a grouping we have not seen before. In the fourth picture, it is not that the corpus has rethought its earlier division, but that a Leontes-centric corpus has been created. This overwhelming presence of Leontes has so dominated the corpus that it is arranging itself around that text as a locus rather than around a collection of shared attributes. However, that does signify a strong association of All’s Well that Ends Well and Cymbeline with this artificially constructed world, as these two plays in particular have stuck by throughout the amplification of Leontes’s presence.
When comparing Leontes’s diagrams to Prospero’s, some similarities are present, mainly that the play, with and without Prospero, creates the focal point of the diagram. Interestingly, Prospero is absent from this middle section until the second diagram, when his lines are multiplied by two in the text of the Tempest. This magnetic attraction pulling Prospero towards his original play implies a certain autonomy that Prospero’s lines possess. He does claim nearly 30% of the play’s lines. However, it could also reveal the tragic nature of his lines, since they cluster with a group of mostly Tragedies from All’s Well to Coriolanus, immediately followed by some of the most tragic plays in the group, from Hamlet to Timon of Athens. I fear that the second diagram is only duplicating the locus effect seen in the greater degrees of Leontes’s multiplication. And it appears that after multiplying Prospero’s lines by five we have reached a saturation point in Shakespeare’s corpus where any additional Prospero is superfluous for the rest of the diagram. As you might notice, there is almost no perceptible difference between the fourth and the third pictures except that the twig with Tempest and Prospero connects closer to their names on the left. Since this diagram is working on a distance scale, all that means is that the statistical difference between the Tempest and Prospero has lessened, i.e. they are becoming more similar to each other. The rest of the result when inundating the corpus with Prospero is negligible, as it appears more like a smear of previous genre distinctions and groupings.
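One possible reading of this saturation point can be sketched numerically. The sketch assumes the diagrams are built from relative frequencies (counts divided by the length of the text), which the post does not state. The feature names and counts are invented, with Prospero given 30 of 100 tokens to echo his roughly 30% share. Under that assumption, the amplified play is a weighted mix of Prospero’s profile and everyone else’s, and it drifts toward Prospero’s own profile by smaller and smaller steps.

```python
from typing import Dict

Counts = Dict[str, int]


def amplified_rates(character: Counts, others: Counts, copies: int) -> Dict[str, float]:
    """Relative frequencies for a play holding `copies` copies of one character's lines."""
    features = sorted(set(character) | set(others))
    totals = {f: copies * character.get(f, 0) + others.get(f, 0) for f in features}
    n = sum(totals.values())
    return {f: totals[f] / n for f in features}


# Invented counts for two hypothetical feature categories.
prospero = {"first_person": 21, "description": 9}       # 30 tokens, rate 0.70
rest_of_play = {"first_person": 21, "description": 49}  # 70 tokens, rate 0.30

own_rate = prospero["first_person"] / sum(prospero.values())
for copies in (1, 2, 5, 10):
    rate = amplified_rates(prospero, rest_of_play, copies)["first_person"]
    # The gap to the character's own rate shrinks as copies grow.
    print(copies, round(rate, 3), round(own_rate - rate, 3))
```

If the features behave like this, the fivefold and tenfold versions of the play are already close to “pure Prospero”, which would be consistent with the third and fourth pictures looking nearly the same.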
Switch your attention now over to Imogen and her situation. To say that we are seeing similar effects transferring across the sets of diagrams would be reasonable. However, the section that moves has neatly inserted itself within the red group normally inhabited by the Histories. Similarly to the diagrams of Leontes, there is again no difference between the first two diagrams, but stop your gaze at the third. I would dare to call this single diagram the crème de la crème of what we have seen so far. What makes this diagram so important is its similarity to another diagram that we have already seen, which is the one in my previous post here. The only differences between the two are the placement of Winter’s Tale and Merchant of Venice and the Tempest and Julius Caesar. But what does this nearly identical nature between the pictures mean? Can this signify that amplifying a character’s lines can apparently substitute for another character’s lines entirely? For it looks like Imogen is either performing the same changes as when Prospero and Leontes are present or absent, or she is acting in a way that is independent of them. Regardless, it appears that Imogen has a powerfully “Shakespearean” attitude, albeit one only present when amplified by a factor of five. She does a great job of dividing the plays by genre, with the Histories in red, the Comedies in green, and the Tragedies in blue. I believe this is the cleanest division of Shakespearean genre that I have seen yet, despite it getting rid of the Romance category. The only quibbles I might have would be the placement of All’s Well in the Tragedies, Romeo and Juliet in the Comedies, and Love’s Labour in the Histories. However, each of these placements may be explained: All’s Well has a near-tragic plot, Romeo and Juliet has a comic plot ending in tragedy, and Love’s Labour has been in the Histories since the original corpus diagram.
For all I know, this could be Shakespeare’s true genre divisions based on authorial style and, to an extent, content. Unfortunately this still leaves us with the question of how amplifying Imogen can create this artifice of genre. Would this suggest the deepest of divisions between how Shakespeare wrote female characters and that Imogen’s “Shakespearean”-ness was merely hidden by the multitude of males? Or is it that Imogen’s comic nature was throwing off the diagram until her isolation could reveal her true affiliation apart from the overall tragedy of Cymbeline? Truly, I have no concrete answer to this.
Continuing on to when Imogen is multiplied by ten, the delicate balance present in the third picture vanishes and is replaced by a kind of Shakespearean swirl between a dominant and sub-dominant locus, where Imogen and Cymbeline x 10 drag part of the diagram away from the other half.
If the results from Leontes and Prospero are treated as dead ends, the findings above with Imogen may indeed signify the importance of a character’s lines for genre, as well as have deeper implications for the very presence of a genre called Romance in Shakespeare’s plays.
Very interesting post, Mike.
Four comments:
1) Character amplification effectively re-weights the corpus and allows for new pairings: it might be doing so by changing the mean score for particular types of words, so I would want to know if these results are being taken from scaled data. Presumably, however, the multiplication of particular scores for a text (like one containing Imogen) effectively pushes that text “further out” in the multidimensional space that is being surveyed for proximal items. Could you say more, then, about how this “pushing out” of one particular item in a particular way (“Imogen-ness”) might be affecting particular pairings, perhaps with reference to the order of pairings from Ward’s procedure?
2) In order to understand how this procedure works, could you think about other forms of intervention into the corpus that might have similar “re-weighting” effects? What if a version of Tempest was added to the mix where there were five or ten times as much Act 1? Since we could imagine any number of such manipulations, what would be the significance of all those manipulations (a Love’s Labour’s Lost with 10 act fives; an Othello that was all act threes, etc.) as a group if they all produced groupings that matched up with, say, the First Folio’s genre distinctions? Would these be producing the “correct” historical groupings by chance, or would there be some underlying significance to the fact that these manipulations “work” and others don’t?
3) You make an interesting statement, as follows: “Would this suggest the deepest of divisions between how Shakespeare wrote female characters and that Imogen’s “Shakespearean”-ness was merely hidden by the multitude of males?” I’m not sure what you mean here. Are you saying that Shakespearean genre becomes more orthodox (i.e., more susceptible to characterizations of the sort applied by Heminges and Condell) if his female characters are given more stage time?
4) These experiments are important because they put “knobs” on the analysis — they allow for tests of scenarios in which certain variables are changed incrementally. Another way of making incremental changes is to vary the corpus itself by adding more items (a larger field of dramatic writers, say). What, then, is the difference between this kind of “internal” augmentation using resources from within a closed set of items (all Shakespeare’s plays) and an external augmentation (adding more playwrights), since both will produce a re-balancing?
Looking forward to seeing more!
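The first comment’s question about scaled data and the order of pairings can be made concrete with a toy example. The sketch below is a naive, pure-Python version of Ward’s procedure that records which clusters merge and in what order, run once on raw scores and once on z-scored columns. The four texts A to D and their two feature scores are invented, and the function names are hypothetical; nothing here reproduces the software behind the diagrams.

```python
from itertools import combinations
from statistics import mean, pstdev
from typing import Dict, List, Sequence, Tuple


def zscore_columns(rows: Dict[str, Sequence[float]]) -> Dict[str, List[float]]:
    """Scale every feature (column) to mean 0 and standard deviation 1."""
    names = list(rows)
    columns = list(zip(*(rows[n] for n in names)))
    stats = [(mean(c), pstdev(c) or 1.0) for c in columns]
    return {n: [(x - m) / s for x, (m, s) in zip(rows[n], stats)] for n in names}


def ward_merge_order(rows: Dict[str, Sequence[float]]) -> List[Tuple[str, str]]:
    """Naive Ward clustering; returns the cluster pairs in the order they merge."""
    # Each cluster is stored as label -> (centroid, number of texts).
    clusters = {name: (list(vec), 1) for name, vec in rows.items()}

    def cost(pair: Tuple[str, str]) -> float:
        # Ward's criterion: increase in within-cluster sum of squares on merging.
        (ca, na), (cb, nb) = clusters[pair[0]], clusters[pair[1]]
        squared = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return na * nb / (na + nb) * squared

    merges: List[Tuple[str, str]] = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=cost)
        (ca, na), (cb, nb) = clusters.pop(a), clusters.pop(b)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters[f"({a}+{b})"] = (centroid, na + nb)
        merges.append((a, b))
    return merges


# Invented scores: the first feature has a much larger numeric range than the second.
raw = {"A": (100, 1.0), "B": (103, 3.0), "C": (110, 1.1), "D": (114, 3.2)}

print(ward_merge_order(raw))
print(ward_merge_order(zscore_columns(raw)))
```

With these invented numbers the raw scores pair A with B first, while the scaled scores pair A with C first, so the same four texts produce different pairings depending on whether the data are scaled. An amplified text changes the column means and spreads that scaling relies on, which is one way a single “pushed out” item could reorder pairings elsewhere in the tree.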
Hello Mike,
I apologize for the tardiness of my reply; it has been a busy week. With regard to your comments, my replies are below:
1. I suppose this is one of the main questions that I still have, mainly as it is somewhat statistical and somewhat rhetorical. If Imogen is being amplified, we are indeed redistributing the pivotal locus around which the rest of the unmodified plays will still revolve. That may adjust some of the pairings, as they must now re-orient themselves and their stratification to the new axis. However, what if this new locus is the focal point of what is truly “Shakespearean” and therefore allows the rest of the corpus to naturally arrange itself in order of genre division? It is interesting to note that the First Folio, which is the data set on which we are basing many of our assumptions, only has Tragedy, Comedy, and History as genre divisions. In theory, then, wouldn’t we want to assume the same results in our dendrograms? Applying a fourth genre like Romances or Tragi-comedy may not be applicable to the problem set we are working with. In that case, if Imogen (who is usually considered a weak woman and watery character in general) is a very Shakespearean character, or the epitome of Shakespeare’s style, amplifying her lines would redistribute the other plays in the corpus around the truly correct Shakespearean locus. This would then imply that the only reason this distribution didn’t happen earlier was that her miniaturization within Cymbeline dragged the rest of the play away from the corpus, which then led to us viewing the corpus distributed around an initially skewed focal point. The counter-argument against this is that it was “luck”, but I personally don’t believe in luck within the realm of statistics; everything happens in reaction to everything else.
2. My thoughts on the textual intervention in other ways are similar to my thoughts on Imogen’s multiplication. There is probably a good chance that by performing any of the experiments above, or any combination of them, we could get results similar to what I got with Imogen. But this then brings into question whether what we are looking at is truly Shakespeare. If something completely foreign is entered into the data set, like Tempest with five or ten first Acts or five extra Imogens in Cymbeline, it would be natural for Shakespeare’s corpus to then bunch up in defense against this intruder and cluster tighter together. This kind of response makes it questionable as to how far our methods and experiments can go, and it also brings up the question of subjective evaluation as to whether or not our results are Shakespearean. It may be significant, however, to look at whether or not these foreign texts in the corpus are then relegated to the outer extremities of the two-dimensional space in the dendrogram or if they re-adjust the center around themselves, as when five Imogens are present.
3. Unfortunately, I believe that my sentence was only interesting because of the poor nature of my grammar. I should have split it in twain, questioning first whether what was being seen was a division between the way that female characters and male characters were written in Shakespeare (or whether or not Shakespeare’s females or males were more Shakespearean) and whether or not there was a use of stock characters in either males or females in Shakespeare’s leading roles. The second question was whether or not Imogen’s watery lines and insignificance in the play were due to the multitude of males in Cymbeline or if there was some sort of problem with the censors at the time. If Imogen was truly written with the Shakespearean-ness that five of her seem to introduce, would she have made it past the censors as such? Or were her lines formed as they are today before submitting for pre-production review? I suppose that is a question that strays well away from Digital Inquiry though. As to Heminges and Condell, I am not immediately familiar with their work and I am unsure of the implications you mentioned. For one, if genre is assumed to have a direct relationship to characterization, then I could see how expanding some of the characters’ lines would have an effect on genre, but I am uncertain as to whether those characters would be the women in the plays. We see a dramatic shift in the dendrogram hierarchy when Prospero is removed and amplified as well, in theory readjusting the genres of Shakespeare. So for this, I am unsure, as I would have to investigate more thoroughly first.
4. I suppose my main preference by working within Shakespeare’s corpus, in lieu of introducing other authors, is that I feel like I have more control. I am indeed turning knobs, but I am trying to see how minutely these changes can be made. For example, what if we found a single word that, if introduced into a certain play, would completely readjust genre specifications? To me that would seem like a major find. My thought as to remaining small is that once we find these elements that can be turned at the smallest of levels, such as the theoretical word I mentioned, then we can move onto bigger and more varied corpora without feeling like we are at a loss to explain our findings. In that case, left spluttering to describe the perfect diagram and to what is occurring within it, the perfect diagram no longer becomes the perfect diagram as its usefulness is overshadowed by our statistical “luck” (that is to say our unexplainable methodology and knob turning). While most of this is personal preference, as I would love to expand my studies to Shakespeare’s contemporaries as well, I feel like there are valid arguments either way as to analyzing a smaller or larger data set first.
Have a great weekend and I will see you soon,
Mike
Interesting work. It seems like what you’re getting at in the end is an analysis of work parts smaller than an individual play. It occurs to me that you needn’t leave this analysis convoluted with the play as a whole by multiplying the weight of individual characters or acts; you could instead pull the characters or acts out entirely and cluster them against the whole corpus. Wouldn’t that give you a cleaner sense of what you’re after?
Repeating the procedure across many or all of the plays, I know I’d be interested to see how all of Shakespeare’s characters cluster, or if the various first acts cluster together, etc.
Anyway, just a thought. Nice work – I look forward to seeing more.
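The suggestion in this comment, pulling characters out entirely so each can be clustered as its own item, could look something like the sketch below. It assumes a play is stored as a list of (speaker, speech) pairs; the function names and the toy speeches are hypothetical and stand in for whatever text preparation is actually used.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Speech = Tuple[str, str]  # (speaker, text of the speech)


def split_by_speaker(play: List[Speech]) -> Dict[str, str]:
    """Build one document per character from that character's speeches."""
    parts: Dict[str, List[str]] = defaultdict(list)
    for speaker, text in play:
        parts[speaker].append(text)
    return {speaker: "\n".join(texts) for speaker, texts in parts.items()}


def without_speaker(play: List[Speech], speaker: str) -> List[Speech]:
    """Return the play with every speech by `speaker` removed."""
    return [(who, text) for who, text in play if who != speaker]


# Toy stand-in for a play; not the real text of Cymbeline.
toy_play = [
    ("IMOGEN", "first speech"),
    ("POSTHUMUS", "second speech"),
    ("IMOGEN", "third speech"),
]

documents = split_by_speaker(toy_play)
remainder = without_speaker(toy_play, "IMOGEN")
print(sorted(documents), len(remainder))
```

Each per-character document, and each play with a character removed, can then be added to the corpus as a separate item without changing any play’s internal proportions.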
Hello Matthew,
Apologies for the even tardier reply as I had to go to class between comment responses.
What was interesting, although unintended on my part, was that I had five extra Imogens in Cymbeline, which made it six Imogens total after counting her lines already in the whole play. In addition, I extracted Imogen’s lines and isolated them as I had done in previous experiments. What I ended up with, although I didn’t recognize it at first, was a mixture of the textual interference and character isolation, which resulted in a total of seven Imogens running around the data set. I think you are right that my multiplication should have been in Imogen’s lines that were isolated, rather than her lines in the play, as it would revamp the corpus accordingly. But after a point we run into the problem that we are really creating a locus around Imogen instead of attempting to create a locus around her lines in a play. If Imogen were universally taken to be “the” Shakespearean character, we might run into less dissent with that route. In the end, I would say it’s a toss-up between convolution within the play and the risk of focusing the rest of the corpus around a single character. I appreciate your thoughts about what to do next though, as separating all of the major characters from their plays is a long-term goal of mine, as well as looking more and more into the categorization of genre. Your idea about the first acts in each of the plays is intriguing as well, as I usually read the first acts in a completely different way than the rest of the play. As Prof. Karen Britland has said, you can find everything that is going to happen already present in the first act of Shakespeare’s plays, which makes them both fascinating and possibly the quintessential “Shakespearean” mode that we have been trying to analyze.
Thank you for your comments and I will look forward to hearing from you again!
Hi Mike,
Great conversation here. You write in your response:
“For example, what if we found a single word that, if introduced into a certain play, would completely readjust genre specifications?”
Excellent point. You have found yet another knob. Perhaps we should begin turning this one.
All the best.