May the Fourth Be With You 2018


In The Force Awakens, when Poe Dameron, the self-assured pilot, said, “So who talks first, you talk first, I talk first? ...” I had a feeling he’d end up being a character I’d like.

When Poe had this interaction with Armitage Hux in The Last Jedi, I was completely sold on him:

Poe Dameron: Hi, I'm holding for General Hugs.
General Hux: This is Hux. You and your friends are doomed. We will wipe your filth from the galaxy.
Poe Dameron: Okay. I'll hold.
General Hux: Hello?
Poe Dameron: Hello? Yup. I'm still here.
General Hux: Can you...? Can he hear me?
Poe Dameron: Hugs?
General Hux: He can.
Poe Dameron: With an 'H.' Skinny guy. Kinda pasty.
General Hux: I can hear you. Can you hear me?
Poe Dameron: Look, I can't hold forever. If you reach him, tell him Leia has an urgent message for him...
Officer (to Hux): I believe he's tooling with you, sir.
Poe Dameron: ...about his mother.

I look forward to May, not for the spring weather, but because it gives me an excuse to play with Star Wars data.  I try to explore scripts from Star Wars movies every May the 4th, and this year is no different.  Finding the script for The Last Jedi proved much more challenging than it was for any of the other movies I’ve worked with.

The Internet Movie Script Database is my typical go-to for all things script-related, but they didn’t have one for The Last Jedi.  My husband, Adam Maness, fellow SAS employee and resident data guy, took a jumbled partial script that I found and spent hours watching, rewinding, and pausing the movie to make sure the data was accurate.  In some cases he actually transcribed parts of the movie.

(Take note: if you haven’t watched The Last Jedi or previous films in the Star Wars series, there are some small spoilers in this post.)

Who has the most lines in The Last Jedi?

The first thing I did was run a frequency to see who did the most talking.  Not surprisingly, given his bit about talking in The Force Awakens, Poe led the charge.  Yes, the bars are the same color as Luke's blue lightsaber.  Visual Analytics provides the option to use custom colors in graphs.  I used the color picker tool in MS Paint to find the RGB values for the blue lightsaber from a screen capture, and here you have it!
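If you want to try a similar frequency count outside Visual Analytics, here is a minimal Python sketch.  It is not how the chart above was built.  It assumes the script is stored as one "Speaker: line" row per line of dialogue (that layout and the function name `count_lines_by_speaker` are assumptions for illustration), and it uses the Poe and Hux exchange quoted earlier in this post as sample input.

```python
from collections import Counter

# Sample input: the exchange quoted earlier in this post.
# The "Speaker: line" row layout is an assumption for illustration.
SCRIPT = """\
Poe Dameron: Hi, I'm holding for General Hugs.
General Hux: This is Hux. You and your friends are doomed. We will wipe your filth from the galaxy.
Poe Dameron: Okay. I'll hold.
General Hux: Hello?
Poe Dameron: Hello? Yup. I'm still here.
General Hux: Can you...? Can he hear me?
Poe Dameron: Hugs?
General Hux: He can.
Poe Dameron: With an 'H.' Skinny guy. Kinda pasty.
General Hux: I can hear you. Can you hear me?
Poe Dameron: Look, I can't hold forever. If you reach him, tell him Leia has an urgent message for him...
Officer (to Hux): I believe he's tooling with you, sir.
Poe Dameron: ...about his mother.
"""


def count_lines_by_speaker(script: str) -> Counter:
    """Count how many dialogue rows each speaker has."""
    counts = Counter()
    for row in script.splitlines():
        if ":" not in row:
            continue  # skip blank rows or rows without a speaker label
        speaker = row.partition(":")[0]
        # Drop stage directions such as "(to Hux)" from the speaker label.
        speaker = speaker.split("(")[0].strip()
        counts[speaker] += 1
    return counts


counts = count_lines_by_speaker(SCRIPT)
for speaker, n in counts.most_common():
    print(f"{speaker}: {n}")
```

Sorting with `most_common()` puts the chattiest character first, which is the same ordering a descending bar chart would show.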


Comparing The Last Jedi and The Empire Strikes Back with text topics

While I was watching The Last Jedi, my mind kept pulling together similarities with The Empire Strikes Back.  I know I’m not alone there.  Just like the parallels between The Force Awakens and A New Hope, it seems that The Last Jedi became a new take on Empire (with a little Return of the Jedi thrown in for good measure).  Poe was at the forefront of the battles while Rey was spending time with Luke honing her skills, much like Han charging into battle in Empire while Luke spent time with Yoda.  The originals were so good that I’m a fan of the nods to them in the newer movies.

Going beyond simple dialogue and frequency of lines, I wanted to run some quick text analysis in SAS Visual Analytics.  Visual Analytics offers an easy-to-use text analytics capability called text topics, which identifies terms that occur together frequently and groups them into topics.  All it requires is a unique identifier and a text field.  In my data the unique identifier is the line number in the script.  I ran text topics on the data from The Last Jedi and The Empire Strikes Back.
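To make the "unique identifier plus text field" input concrete, here is a rough Python sketch.  It is a plain co-occurrence count, not the algorithm behind text topics in Visual Analytics, and the function names, the stopword list, and the three sample rows are all made up for illustration (they are not lines from either script).

```python
import re
from collections import defaultdict
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"the", "a", "an", "to", "of", "and", "is", "you", "i", "it", "in"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOPWORDS]


def cooccurring_terms(rows):
    """Map each pair of terms to the set of line ids where both terms appear.

    rows: iterable of (line_id, text) tuples, i.e. identifier plus text field.
    """
    lines_by_pair = defaultdict(set)
    for line_id, text in rows:
        terms = sorted(set(tokenize(text)))
        for pair in combinations(terms, 2):
            lines_by_pair[pair].add(line_id)
    return lines_by_pair


# Made-up placeholder rows, NOT dialogue from the movies.
rows = [
    (1, "the force and the jedi"),
    (2, "the jedi use the force"),
    (3, "the ship is fast"),
]

lines_by_pair = cooccurring_terms(rows)

# Rank pairs by how many lines contain both terms, then alphabetically.
ranked = sorted(lines_by_pair.items(), key=lambda kv: (-len(kv[1]), kv[0]))
for pair, line_ids in ranked:
    print(pair, sorted(line_ids))
```

Keeping the line ids alongside each pair is the reason the identifier matters: it lets you jump from a group of related terms back to the exact lines of dialogue that produced it.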

Just like my analysis of A New Hope versus The Force Awakens, there are some text clusters that carry very similar themes, like the ones highlighted dealing with the protagonist/antagonist relationship and the epic lightsaber battles between them.

Mapping Luke’s lines in all Star Wars movies

Next, I explored the data from The Last Jedi and all of the movies in the original trilogy. I wanted to zero in specifically on Luke Skywalker.  Luke had such a cloud of disillusionment around him in The Last Jedi. I wanted to see if there were themes that were consistent or vastly different from the original trilogy to The Last Jedi.

I used SAS Visual Text Analytics to explore the clusters and the relationships in more detail. I looked at a simple term map for Luke in the original trilogy and The Last Jedi and saw terms that are mentioned in the path to the dark side—fear, pain, anger, hate, along with words pointing to the dark side, Jedi, and family.  I built a small taxonomy so I could compare how those terms and their derivatives and synonyms might be used in both sets of data.
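For a feel of what a small taxonomy does, here is a simplified Python sketch.  It is not the taxonomy or the rule syntax used in Visual Text Analytics.  The category names come from the terms listed above, while the specific derivatives and synonyms, the function names, and the test sentences are assumptions made up for illustration.

```python
import re

# Category -> terms.  The derivatives and synonyms listed here are
# illustrative guesses, not the taxonomy built for this post.
TAXONOMY = {
    "Fear": ["fear", "fears", "afraid"],
    "Pain": ["pain", "suffering"],
    "Anger": ["anger", "angry"],
    "Hate": ["hate", "hatred"],
    "Dark Side": ["dark side"],
    "Jedi": ["jedi"],
    "Family": ["family", "father", "sister"],
}


def build_patterns(taxonomy):
    """Compile one case-insensitive whole-word pattern per category."""
    patterns = {}
    for category, terms in taxonomy.items():
        alternation = "|".join(re.escape(term) for term in terms)
        patterns[category] = re.compile(rf"\b(?:{alternation})\b", re.IGNORECASE)
    return patterns


PATTERNS = build_patterns(TAXONOMY)


def score_line(text):
    """Return the categories whose terms appear in the text, in taxonomy order."""
    return [category for category, pattern in PATTERNS.items() if pattern.search(text)]


# Made-up test sentences, NOT dialogue from the movies.
print(score_line("Fear of the dark side runs in the family"))
print(score_line("The ship is fast"))
```

Matching on whole words keeps a short term like "pain" from firing on unrelated words such as "paint", at the cost of having to list each derivative you care about.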

I scored the data from both the original trilogy and The Last Jedi, and brought the scored data back into Visual Analytics.  I filtered the data down to just Luke.  The category that appears most often in both the original trilogy and The Last Jedi for Luke is Jedi.  The details show that in the original trilogy, Luke was happy to be special.  The Jedi were worthy of awe and he desperately wanted to become a Jedi Master.
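The filter-and-count step can be sketched in a few lines of Python.  This is only an illustration of the idea, not the Visual Analytics report.  The field names, the function name `category_counts_for`, and every row below are invented placeholders, so the numbers it prints are not results from the actual scripts.

```python
from collections import Counter, defaultdict

# Invented placeholder rows showing the assumed shape of scored data.
# These are NOT real results from the scripts.
scored_rows = [
    {"source": "Original trilogy", "speaker": "LUKE", "categories": ["Jedi", "Family"]},
    {"source": "Original trilogy", "speaker": "VADER", "categories": ["Dark Side"]},
    {"source": "The Last Jedi", "speaker": "LUKE", "categories": ["Jedi"]},
    {"source": "The Last Jedi", "speaker": "REY", "categories": ["Jedi", "Fear"]},
    {"source": "The Last Jedi", "speaker": "LUKE", "categories": ["Jedi", "Dark Side"]},
]


def category_counts_for(rows, speaker):
    """Count categories per source, keeping only rows spoken by one character."""
    counts = defaultdict(Counter)
    for row in rows:
        if row["speaker"] != speaker:
            continue  # the filter step: keep just this speaker
        counts[row["source"]].update(row["categories"])
    return counts


luke = category_counts_for(scored_rows, "LUKE")
for source, counter in luke.items():
    print(source, counter.most_common())
```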


By The Last Jedi, Luke had learned that there were downsides to being a master, and that his arrogance came at a significant cost.  His experiences caused him to become a jaded recluse.

In the original trilogy, Luke was very much a student of the Force.

In The Last Jedi, he reluctantly took on the role of teacher.

What should I analyze next?

What have been some of your favorite Star Wars moments?  Are you looking forward to the new Solo movie?  Leave me comments, let’s geek out, and May the Fourth be with you!


About Author

Mary Osborne

Visual Text Analytics Product Manager, SAS

Mary Osborne is the SAS product manager for text analytics and all things natural language processing. She is an analytics expert with over 20 years of experience at SAS with expertise spanning a variety of technologies and subject matters. She has a special interest in the application of analytics to provide aid during humanitarian crises and enjoys her work in the #data4good and #analytics for good movements. Mary is known for her dynamic and fun presentations and enjoys using technology to solve complex problems.


  1. Lonnie Miller

    Lucky me - my birthday is May 4th. I enjoy this blog series, Mary.

  2. I always look forward to your annual post Mary! I'd be intrigued to see some analysis on Rey's character and her determination and strength in The Last Jedi.

    Definitely looking forward to the new Solo movie... one of the benefits in having a "quick" year is that it is December again before we know it to see the next Star Wars movie 🙂

    Happy Birthday Lonnie! What a great day to celebrate... The first words our son said to us this morning were "May The Fourth Be With You"... certainly put a smile on our faces!

  3. Jarno Lindqvist

    Very timely topic and a captivating read, Mary. Also, can't wait to see the new Solo movie!

  4. Mary Osborne

    Awesome! Happy birthday, Lonnie, and thanks Michelle and Jarno! I may pull a few things about Rey for our SASChat this morning! 🙂 My kids are all decked out in their Star Wars shirts this morning!
