Do you find spam annoying or have you found an effective way to deal with it? Or do you rise above and just ignore it?
The reality is that spam email exists, and at least some of it is malevolent. If you have an email account, chances are that you are getting something between “some” and a “crap ton” of spam. Personally, I am not a real fan of spam email. I am a Virgo. We tend to like things orderly and neat – like we all have some form of OCD. So when I open my email and see more spam than real email, it annoys me. But spam has a purpose, I guess. Some of it is to get click-throughs. Some of it actually offers goods for sale. Some of it is phishing, meant to distribute malware or to lure you to a hostile site.
I decided I wanted to take a look at my spam. So, for a month, I kept all of my spam, then looked at the subject matter and where it was from, and tried to analyze some additional characteristics of the spam. MaxMind helped me with geolocation. IPVoid and a couple of other sites helped me with reputation. Other than that, I viewed the headers in every spam from four different email accounts for a month, ending on Thursday, December 19, 2013.
That didn’t sound so bad until I pulled my spam folders and dumped them into a spreadsheet. 691 spam emails later, I actually found the information interesting. Tedious and time-consuming, but interesting.
I have two caveats.
1. I honestly have exactly no idea how many spam emails my providers filtered out before they even got to my inbox or junk email folders, but I ended up with 691 pieces of spam.
2. With the amount of spam sent in a normal one-month period I can’t say that 691 spam emails is truly statistically relevant to all spam sent. But, I can say that the data is relevant for the spam I actually received in my inboxes.
To start with, my Yahoo account, which is my oldest but least used account, received a grand total of 52 spam emails, while my busiest account received a total of 415. So, at least for my accounts, the more an email account gets used, the more spam it receives. That makes pretty obvious sense, right?
Is “Spam” actually “Scranton Ham?”
What I was surprised about was the variety and the sources of spam I received across those accounts.
Bottom line, I received 691 spam from 320 IP addresses in 33 different countries. 86% of that spam came from seven countries, with spam from the United States accounting for over 55% of everything I received. This matches other research I have seen that says that the U.S. hosts a large amount of phishing emails and malware as well.
A closer look told me that my U.S.-based spam was even more interesting. As far as I could tell, my U.S.-based spam came from 17 different “providers.” But, of my 382 U.S.-based spam emails, 42 came from the 207.188.xxx.xxx address space of a provider in Albuquerque, and 264 were generously donated courtesy of the 64.xxx.xxx.xxx address space of a single provider in Scranton, Pennsylvania.
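To put those counts in proportion, here is a small arithmetic sketch. The counts are the ones quoted above; the variable names and the `share` helper are just for illustration.

```python
# Counts taken from the text above
total_spam = 691
us_spam = 382
albuquerque_spam = 42
scranton_spam = 264

def share(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)

us_share = share(us_spam, total_spam)                                   # U.S. share of all spam
scranton_of_us = share(scranton_spam, us_spam)                          # Scranton share of U.S. spam
two_providers_of_us = share(albuquerque_spam + scranton_spam, us_spam)  # both providers combined
scranton_of_all = share(scranton_spam, total_spam)                      # Scranton share of all spam

print(us_share, scranton_of_us, two_providers_of_us, scranton_of_all)
```

In other words, roughly 69% of the U.S.-based spam, and about 38% of everything received, traces back to the one Scranton address space.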
This is not the Spam you are Looking For
Based on analyzing my spam, I think IP address filtering, especially for the spam from Scranton, would be problematic. Of my 691 spam emails, only two IP addresses sent me more than four. A Russian site sent 24 messages from the same IP address, and one day I got eight emails from the same address in Scranton. Other than that glitch, Scranton worked like clockwork – one email every 25-28 minutes. After one or two emails the IP address increments. If I look through my logs I can watch the addresses change with the received emails.
And so on…
I can see how this makes IP filtering unreliable, especially since the increment appeared irregular, and sometimes changed the class B or class C portion of the address, then changed back. Complicating this, no single IP address seems to generate enough reported spam to get blacklisted quickly.
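One way to see why per-address filtering struggles is to count sources by network prefix instead of by single address. The following is only a sketch using Python's standard `ipaddress` module. The `count_by_prefix` helper is hypothetical, and the sample addresses come from reserved documentation ranges, not from the actual spam.

```python
import ipaddress
from collections import Counter

def count_by_prefix(addresses, prefix_len=24):
    """Count how many source addresses fall into each network of the given prefix length."""
    counts = Counter()
    for addr in addresses:
        # strict=False accepts a host address and returns its enclosing network
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts

# Hypothetical sample: a sender that hops to a new address every message or two
sample = ["192.0.2.10", "192.0.2.11", "192.0.2.12", "192.0.3.1", "198.51.100.7"]

by_24 = count_by_prefix(sample, 24)
by_16 = count_by_prefix(sample, 16)
print(by_24.most_common())
print(by_16.most_common())
```

No single address in the sample repeats, but the /24 and /16 groupings show the cluster right away. The trade-off is that blocking a whole prefix also blocks any legitimate senders that share it.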
Just over 24% of my spam was sent from sites that were blacklisted as of December 19, 2013. It certainly appears that the spammers’ ability to change their IP addresses exceeds our ability to efficiently identify and report IP addresses. Consider that of the spam I received four weeks earlier, 36% of it was sent from sites that are now blacklisted, while only 13% of the spam I received within the past week was sent from sites that are currently blacklisted. It certainly appears that the longer the spam site has been around, the more likely its source IP is to show up blacklisted, even if that address has not been sending spam recently.
As I was finishing up my analysis, I noted some trends in domains, as well as IP addresses. I did not track firm statistics on this, but I was a little surprised with the registration information for the spam domains. Many of the domains had invalid contact information that included an invalid phone number and/or a nonexistent email address. Some domains changed registration regularly – I stumbled across one that showed it had changed registration 20 times. And many of the domains I checked had been registered for less than a month.
I can see how this can make domain tracking almost as hard as IP tracking when trying to block spam. I have never registered a domain, but somehow it just makes sense to me that valid contact information would be required. I guess not, which, I have to admit, puzzles me a little bit.
I expected to see more tinyurls. I did not track the exact number, but it did not strike me as significant. Of course, a tinyurl actually hides the true nature of the embedded link, so if you want to see where you might be sent before you click, try a tinyURL decoder like www.trueURL.net or unshort.me.
When I looked at the timing, the Scranton spam source became even more interesting. This particular weasel was quite industrious. About 90% of the spam I received from 64.xxx.xxx.xxx arrived during more or less “normal” work hours (essentially 9am-6pm eastern US time). That left relatively little overnight spam, and relatively little weekend spam. My best guess is that they target the spam to hit people in the office, while they are at their desk. Does that help the open rate, instead of having users just come in and sift through their inbox, filtering out all of the overnight spam? Maybe. Probably.
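A rough sketch of how that work-hours share could be computed from message `Date` headers follows. It assumes a fixed UTC-5 offset for eastern time, which holds for late November and December but not during daylight saving time. The helper names and the sample headers are made up for illustration.

```python
from datetime import timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: fixed UTC-5 offset (Eastern Standard Time); wrong during daylight saving
EST = timezone(timedelta(hours=-5))

def in_work_hours(date_header, start_hour=9, end_hour=18):
    """True if the message timestamp falls on a weekday between 9am and 6pm eastern."""
    local = parsedate_to_datetime(date_header).astimezone(EST)
    is_weekday = local.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and start_hour <= local.hour < end_hour

def work_hours_share(date_headers):
    """Fraction of messages that arrived during work hours."""
    flags = [in_work_hours(h) for h in date_headers]
    return sum(flags) / len(flags) if flags else 0.0

# Hypothetical Date headers, each with an explicit UTC offset
sample_dates = [
    "Tue, 17 Dec 2013 15:30:00 +0000",  # 10:30 eastern, weekday
    "Sat, 14 Dec 2013 16:00:00 +0000",  # 11:00 eastern, weekend
    "Wed, 18 Dec 2013 03:00:00 -0500",  # 03:00 eastern, overnight
    "Mon, 16 Dec 2013 22:59:00 +0000",  # 17:59 eastern, weekday
]
print(work_hours_share(sample_dates))
```

The `Date` header is written by the sender and can be forged, so the timestamp in the topmost `Received` header would probably be a more trustworthy input.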
I also expected to see more adult, dating and medical spam than I received. Maybe it was because of the holiday season, but the 198 spam messages that promised “up to 75% off” everything from designer jewelry to new cars dominated the subject matter. Requesting user registration for “free” gift cards was also common, and I received 22 emails that promised various sorts of “Holiday” packages from gift baskets to visits from Santa. I have read other spam studies that indicate an inordinate number of medical or pharmaceutical messages, but those numbers did not come through in my spam, so perhaps the holiday season really affected the subject matter.
Especially if I consider the time of the year, I could make an educated guess what the more common subject matter of the incoming spam would be. I would expect to see email promising a deal of some sort, which showed up. I would also expect to see emails about ObamaCare. The “Insurance” emails included a significant number of options to enroll in premium health plans under the Affordable Care Act. These emails were probably some of the best composed, and included plenty of images and official-looking sites. I only had to complete the required registration conveniently pointed to by the link provided in the email.
Yeah. That will happen.
It looked like my Nigerian spam was kind of a unique situation. I received 32 spam messages from 20 unique IP addresses in 12 different countries that included the Nigerian scam. But exactly 30 of those appeared in my yahoo email account. I think my favorite Nigerian scam email was allegedly from an American soldier stationed in Afghanistan. The email was sent from a yahoo.co.jp account, included an IP address from Albuquerque, and referenced a website registered and hosted in Taiwan. I hope that “American soldier” is not disappointed that I somehow found myself uninterested in helping him move his gold back to the United States.
I was also very surprised that exactly two of the 691 spam messages included attachments. Going in, I had expected that number to be much higher. Again, perhaps my email providers are filtering out attachments and I just don’t see it, but I do not consider two PDF attachments to be a serious issue. However, 88% of my received spam included a link back to a website. I found it interesting that of those links that I checked, about 30% of them actually included a link to a domain and/or IP address that was different from the source of the email – like the email came from a gammacard.ru account but included links back to ramrod.wxbereho.in. Now, if I were doing spam filtering, I might use that as a strong spam indicator – if the embedded link goes to a domain that is different from the sender’s domain, just mark it as spam.
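The sender-versus-link check could be sketched roughly like this. It is a naive illustration: `base_domain` simply keeps the last two labels of a host name, where a real filter would more likely consult the Public Suffix List, and both function names are hypothetical. The sample reuses the two domains mentioned above.

```python
import re
from email.utils import parseaddr
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s\"'<>]+")

def base_domain(host):
    """Naive base domain: the last two labels (assumption; ignores suffixes like .co.uk)."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])

def mismatched_link_domains(from_header, body):
    """Return the link domains in the body that differ from the sender's domain."""
    sender_host = parseaddr(from_header)[1].rpartition("@")[2]
    sender = base_domain(sender_host)
    link_domains = set()
    for url in URL_RE.findall(body):
        host = urlparse(url).hostname
        if host:
            link_domains.add(base_domain(host))
    return sorted(d for d in link_domains if d != sender)

body = "Click http://ramrod.wxbereho.in/win now, or http://www.gammacard.ru/unsub to leave"
print(mismatched_link_domains("Deals <offers@gammacard.ru>", body))
```

Legitimate newsletters often link to third-party domains as well, so a mismatch may work better as one weighted signal than as an automatic verdict.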
As a matter of fact, 16% of the links pointed back to sites that I found listed on “BAD” lists as malware providers or otherwise hostile sites. This included 108 sites in 18 different countries. By now, it was no surprise that U.S. sites led the BAD list as well (32 of the 108).
I See Dead People Spam
Looking at my spam I felt I saw some obvious triggers that would let me mark incoming email as spam and simply never let it get to a user:
1. If the email includes many links and relatively low “text” volume, mark it as spam. For instance, one spam email included a total of 2193 characters. Of those, 2086 characters were in links back to other sites, leaving 107 non-hyperlinked characters. And that included spaces.
2. If the email’s subject line is different from the contents, at least consider marking it as spam. Somehow, I would be a little surprised if an email with the subject line “President Obama exempts refi requirements” would include detailed descriptions of a stir-fry recipe and be valid (unless, perhaps, the recipe includes some form of blue food).
3. For that matter, if the email includes recipes, consider just marking it as spam. Plenty of my spam included padding to help make it appear as if it had valid message text. More than any other single material, that padding was at least some part of a recipe.
4. If the email includes poor English and improper grammar, don’t even think about it, just mark it as spam. I read message after message with errors like “you can loose the weight”, and “Do to a recent updated your scores make have been affected.” Actually, I think the single best indicator here may just be checking for use of the apostrophe. If I see “accept your reward’s from” or “manufacturer’s are offering” I vote for simply marking it as spam and throwing it away. Even if it isn’t spam, I am not sure that those people deserve to communicate with me.
5. If “Dr. Oz” appears anywhere in the email, mark it as spam.
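Two of the triggers above, the link-heavy body (item 1) and the “Dr. Oz” mention (item 5), are simple enough to sketch in a few lines. This is only an illustration: the 0.5 threshold and the function names are assumptions, and the sketch counts only plain `http(s)` URLs in the text, not HTML anchor markup.

```python
import re

URL_RE = re.compile(r"https?://[^\s\"'<>]+")
DR_OZ_RE = re.compile(r"\bdr\.?\s+oz\b", re.IGNORECASE)

def link_char_ratio(body):
    """Fraction of the body's characters that sit inside URLs."""
    if not body:
        return 0.0
    link_chars = sum(len(u) for u in URL_RE.findall(body))
    return link_chars / len(body)

def spam_triggers(body, max_link_ratio=0.5):
    """Return the names of the triggers that fire (threshold is a hypothetical default)."""
    triggers = []
    if link_char_ratio(body) > max_link_ratio:
        triggers.append("mostly links")
    if DR_OZ_RE.search(body):
        triggers.append("Dr. Oz")
    return triggers

link_heavy = "Hi http://example.com/" + "a" * 80
print(spam_triggers(link_heavy))
print(spam_triggers("Ask Dr. Oz about this one weird trick"))
```

For the email described in item 1, 2086 of 2193 characters were in links, a ratio of about 0.95, so it would trip nearly any sensible threshold.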
After all this, I have a better appreciation for how hard it is to filter spam, and how hard spam producers work to obfuscate the fact that they are actually spam sites. Changing IP addresses and dynamic domain registration help spam producers avoid filtering techniques. As far as I know, I had not been specifically targeted, but it still seems obvious that a significant number of those spam emails included links back to hostile sites. I certainly expect that “bad site” number would have been higher if I had been part of a targeted campaign. But, ultimately, the 691 emails present no real threat to me if I functionally ignore them, and I just bulk delete them from my inbox. Avoiding those links is probably a good idea, and not hard at all.
So, how do you effectively filter spam? Well, beyond the five actions in my above list, I might start by blocking everything from Scranton, Pennsylvania.
Jon-Louis Heimerl is Director of Strategic Security for Omaha-based Solutionary, Inc., a provider of managed security solutions, compliance and security measurement, and security consulting services. Mr. Heimerl has over 25 years of experience in security and security programs, and his background includes everything from writing device drivers in assembler to running a world-wide network operation center for the US Government. Mr. Heimerl has also performed commercial consulting for a variety of industries, including many Fortune 500 clients. Mr. Heimerl’s consulting experience includes security assessments, security awareness training, policy development, physical intrusion tests and social engineering exercises.