A while back I found a dataset on data.world that listed UFO sightings from the National UFO Reporting Center. It's a mesmerizing collection of over 150,000 sightings from around the world, though mostly in the United States. The other day I decided that it would be a fun project to look at UFOs that have been reported near me in Bucks County, PA.
In this article, I'm going to explore the dataset using statistical methods. The idea is to get a kind of bird's eye view of the dataset. Think of it as zooming out. There are about 260 cases over the past ten years, and that is what I am focusing on. I also plan on reading through each case individually and picking out a few that stand out from the rest. Those I will look at in detail. I also want to do some text mining, but those two topics will get their own articles later on.
If you like, you can follow along with the code I used to generate this content. I have a GitHub repository with all my R code that you can view or copy and play with yourself. But I'm not going to discuss the technical details here; instead, I'll focus on the UFO story itself. Also keep in mind that this is a work in progress, so the codebase will change and adapt over time.
I would expect each year to have about the same number of reports. Unless some kind of media event prompts sightings, there is really no reason to suspect the frequency would change from year to year. In fact, there was a big media story in 2017, and I expected to see a spike in reports around that time. But as you can see in the plot below, that was not the case.
Right off, this is not what I expected to see. In fact, it's the opposite: reports really dipped around 2017. However, it is very interesting to see that reports spiked in 2014. What is going on there? I will make a note to pay close attention to the 2014 reports as I read through them. A follow-up analysis may be in order. Could this be a UFO flap in Bucks County?
I've explored this question before while looking at the dataset below and what I found for Bucks County basically mirrors what I've found before. Most sightings happen when people are more likely to have free time. So it's mainly the weekends with Saturday being the busiest day and Friday being a little busier than Monday and other workdays.
In the past, I've also found that more UFOs are seen in the summer and most are seen on July 4th. To me, this shows our cultural behavior more than anything to do with the UFO phenomenon itself. But who knows, maybe UFOs are piloted by patriotic aliens who prefer warm weather!
When are UFOs usually observed? As you can see from the graph, there are way more sightings at night when it's dark. But there are still sightings during the day.
This is interesting because it could just be that most people are occupied at work during the day but it could also say something about the nature of UFOs. Maybe they are easier to see at night.
NOTE The sorting in this plot got wonky and so 12PM and 12AM really should be swapped to make sense of the data. But I ran out of time and the overall message is still the same so I let it go.
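For anyone who wants to fix that ordering, here is a minimal sketch of one way to sort hour labels chronologically. It is written in Python purely as an illustration (my actual code is in R), and it assumes the labels look like `12AM` or `3PM`; the helper name `hour_sort_key` and the sample labels are hypothetical.

```python
def hour_sort_key(label: str) -> int:
    """Map a label such as '12AM' or '3PM' to an hour from 0 to 23.

    Assumes the label is a 1-12 hour followed by 'AM' or 'PM'.
    """
    hour = int(label[:-2]) % 12      # 12AM -> 0, 12PM -> 0 (before the PM shift)
    if label[-2:].upper() == "PM":
        hour += 12                   # 12PM -> 12, 1PM -> 13, ..., 11PM -> 23
    return hour


# Hypothetical labels, sorted as plain text they would come out of order.
labels = ["1AM", "12PM", "11PM", "12AM", "1PM", "11AM"]
ordered = sorted(labels, key=hour_sort_key)
print(ordered)
```

Sorting the labels as text puts `12AM` and `12PM` next to each other, which is likely the kind of thing that went wrong in the plot. Sorting by the numeric hour puts midnight first and noon in the middle.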
It's very important to know how long a UFO sighting lasts, since there is only so much detail that can be gathered from an instantaneous sighting. Many UFO sightings last only a second or two, which is not much time. In the plot below I had to get fancy with a log scale to be able to see the whole picture when it comes to duration.
Probably the most important part of this plot is the area covered by the big box. That represents the most common durations reported outside of the extremes of a few seconds or a few hours. It's telling us that around 1 to 10 minutes is the typical length of a UFO sighting.
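To make the "big box" idea concrete, here is a rough Python sketch of how the edges of a box plot can be computed on a log scale. The durations are made-up illustrative values, not numbers from the dataset, and the function name `log_scale_box` is hypothetical. My real plot was made in R, so treat this only as an approximation of the idea.

```python
import math
import statistics


def log_scale_box(durations_seconds):
    """Return (q1, median, q3) in seconds, with quartiles taken on log10 values."""
    # Durations span seconds to hours, so work on the log10 of each value.
    logs = [math.log10(d) for d in durations_seconds if d > 0]
    q1, median, q3 = statistics.quantiles(logs, n=4, method="inclusive")
    # Convert the quartiles back to seconds for reading off the axis.
    return tuple(10 ** q for q in (q1, median, q3))


# Illustrative durations in seconds (NOT taken from the real reports).
sample = [2, 5, 30, 60, 120, 180, 300, 600, 900, 7200]
q1, median, q3 = log_scale_box(sample)
print(f"box spans roughly {q1:.0f}s to {q3:.0f}s, median about {median:.0f}s")
```

The box covers the middle half of the values, so a two-second flash or a two-hour sighting stretches the whiskers without moving the box much.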
Generally speaking, UFOs are observed as "Strange Lights in the Sky". But, lights are not the only shapes that people are seeing. We also have disks, cigars, triangles, and more. Here are the counts of each type of shape seen over Bucks.
Lights and disks are the most common. Unknowns also rank near the top, but that category can mean just about anything, including reports where this field was simply left blank. On an interesting note, a recent Pentagon report actually lists spheres as the most common shape reported by pilots at high altitudes. That doesn't quite line up with what people on the ground are seeing.
This is a new analysis I've been doing on these datasets. It's called Sentiment Analysis, and it essentially counts up the words in each description and then compares them to a database of emotions associated with words. So, "explosion" could be associated with "fear" while "authorities" might be associated with "trust". We can use these associations to calculate a relative percentage for eight emotional states for each report. These percentages can then be combined across reports to get an idea of how most people feel about what they saw.
| Emotion | Share (%) |
| ------------ | --------- |
| trust | 23.892544 |
| anticipation | 21.150439 |
| fear | 17.615789 |
| sadness | 11.005263 |
| joy | 10.371930 |
| surprise | 7.728070 |
| anger | 5.245175 |
| disgust | 2.990351 |
Trust, anticipation, and fear are the most reported feelings associated with these sightings. Fear and anticipation make sense and make me think of the adrenaline people likely feel at these times. I would have thought there would be more surprise, but that is far down on the list. Trust is less expected, but my guess is that many of these reports mention calling the authorities or something similar. It will be interesting to note the emotional tone as I read through each case.
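The word-counting approach described above can be sketched in a few lines. This is a Python illustration under stated assumptions, not my actual R code: the tiny lexicon is a toy (only "explosion" and "authorities" come from the examples above, the other entries are hypothetical), and averaging the per-report percentages is my assumed way of combining reports.

```python
import re
from collections import Counter

EMOTIONS = ["trust", "anticipation", "fear", "sadness",
            "joy", "surprise", "anger", "disgust"]

# Toy word-to-emotion lexicon. A real analysis would load a full lexicon.
TOY_LEXICON = {
    "explosion": {"fear"},
    "authorities": {"trust"},
    "waiting": {"anticipation"},   # hypothetical entry
    "beautiful": {"joy"},          # hypothetical entry
}


def emotion_percentages(text, lexicon=TOY_LEXICON):
    """Relative percentage of each emotion among the matched words in one report."""
    counts = Counter()
    for word in re.findall(r"[a-z']+", text.lower()):
        for emotion in lexicon.get(word, ()):
            counts[emotion] += 1
    total = sum(counts.values())
    if total == 0:
        return {e: 0.0 for e in EMOTIONS}
    return {e: 100.0 * counts[e] / total for e in EMOTIONS}


def average_percentages(reports, lexicon=TOY_LEXICON):
    """Average the per-report percentages, skipping reports with no matched words."""
    scored = [p for p in (emotion_percentages(r, lexicon) for r in reports)
              if sum(p.values()) > 0]
    if not scored:
        return {e: 0.0 for e in EMOTIONS}
    return {e: sum(p[e] for p in scored) / len(scored) for e in EMOTIONS}


one_report = emotion_percentages("We heard an explosion and called the authorities.")
print(one_report["fear"], one_report["trust"])
```

Because each report is converted to percentages before combining, a long, wordy report does not outweigh a short one.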
Here are words taken from the reports that are seen as positive and negative.
Red is negative and green is positive. Overall, people were split about 50/50 on whether the experience was positive or negative. That result held across the few methods I tried.
NOTE I removed the word "Object" from the plot because it appeared so often that it blocked out everything else. It was classed as a negative word and took up 25% of the area of the graph.
Overall, these findings mirror what I've seen before, even though the sample is much smaller than the national dataset. Bucks County is like a microcosm of the national UFO picture.
As I found before, strange lights observed at night for a few minutes at a time are the typical UFO profile in Bucks County, just as they are nationally. Text analysis of the witness reports is intriguing but leaves more questions than answers.
There are some unanswered questions, the most intriguing of which is what could be a UFO flap in Bucks. Did those 2014 cases all happen around the same time, or was 2014 simply a year when more people saw UFOs overall? Was there some reason to be outside at night that year?