Some time ago I wrote a blog post explaining the visualization techniques I had developed to help non-technical HR personnel interpret the overall scope of a particular investigation. While the specific evidence was perfectly fine for determining whether there was a breach of policy, the depth of complicity of the end user's actions can sometimes be hard to determine from assorted evidence alone. An end user who only sends inappropriate content to one person a number of times may be considered differently from an end user who forwards one specific piece of inappropriate content to multiple people.
When the latest investigation of this nature appeared on my radar, I had a quick browse to see if there was a better way to automatically generate linking diagrams similar to those I had previously created manually in Visio. While looking at secviz.org I noticed a post by Ben from the Honeynet Project in Australia that used the graphics tool Circos.
The Circos tool was created by Martin Krzywinski for visualizing links in genomes. While Circos seemed to be very flexible in the amount of information that could be visualized, it was very industry specific, and its configuration terminology comes from genetics. It took a bit of time to translate that terminology in my head and really start to understand how Circos works, but I believe this tool could be of great value in the IT space for all sorts of visualizations of large data sets where you want to show relationships. Because of this, I am planning to follow up this post with a more detailed explanation of the Circos configuration files I produced, in the hope that it helps others make use of this tool.
For my first attempt at using Circos I ended up mapping 26 different internal users, and grouping all external users as one additional entity. This produced the following graphic.
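The grouping step can be sketched in a few lines of Python. This is only an illustration and not the script used for the graphic: the domain `example.com`, the function name `entity_for` and the label `external` are all assumptions made for the example.

```python
# Hypothetical internal mail domain; replace with the real one.
INTERNAL_DOMAIN = "example.com"


def entity_for(address, internal_domain=INTERNAL_DOMAIN):
    """Map an email address to the entity it is drawn as.

    Internal users keep their own identity (the local part of the
    address); every other address collapses into one shared
    "external" entity.
    """
    local, _, domain = address.strip().lower().rpartition("@")
    if domain == internal_domain:
        return local
    return "external"


if __name__ == "__main__":
    for addr in ["Alice@Example.com", "bob@other.org"]:
        print(addr, "->", entity_for(addr))
```

Collapsing every outside address into a single entity keeps the number of arcs manageable while still showing which internal users exchanged content with people outside the company.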
At first glance the graphic looks impressive and seems very complicated, but once you understand how to read it, it quickly becomes very useful for showing the overall relationships between who was sending and receiving inappropriate content, and how many “networks” of people were involved.
If we look at the inner band first, you will note that the circumference is broken up into 27 different colored parts. Each part in this visualization represents a different internal end user or, in the case of the light blue arc where “User 4” would be, all users external to the company. Each colored arc is then broken into smaller segments. Each of these segments represents a specific email (or emails of similar content, if the end user also forwarded it on after receipt) that was sent or received. The lines that link different users are colored the same color as the arc that represents the end user who sent the email.
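To make the structure described above more concrete, here is a minimal Python sketch that turns a list of (sender, recipient) pairs into the two text inputs Circos reads: karyotype lines, one per arc, and link lines, one per email. It is a sketch under assumptions and not the configuration used for the graphic. The function name `build_circos_inputs` and the `colors` mapping are hypothetical, the color names are assumed to be defined in the Circos color configuration, and for simplicity every email gets its own segment on both arcs rather than merging emails of similar content.

```python
from collections import defaultdict


def build_circos_inputs(emails, colors):
    """Build Circos karyotype and link lines from email records.

    emails: list of (sender, recipient) entity IDs, one per email.
    colors: hypothetical mapping of entity ID -> Circos color name.
    Returns (karyotype_lines, link_lines).
    """
    next_slot = defaultdict(int)  # next free segment on each entity's arc
    link_lines = []
    for sender, recipient in emails:
        s = next_slot[sender]
        next_slot[sender] += 1
        r = next_slot[recipient]
        next_slot[recipient] += 1
        # Link format: id1 start1 end1 id2 start2 end2 options.
        # The link takes the sender's color, as in the graphic.
        link_lines.append(
            f"{sender} {s} {s + 1} {recipient} {r} {r + 1} "
            f"color={colors[sender]}"
        )
    # Karyotype format: chr - ID LABEL START END COLOR.
    # Each arc is as long as the number of segments it holds.
    karyotype_lines = [
        f"chr - {entity} {entity} 0 {size} {colors[entity]}"
        for entity, size in next_slot.items()
    ]
    return karyotype_lines, link_lines


if __name__ == "__main__":
    emails = [("user1", "user2"), ("user1", "external")]
    colors = {"user1": "red", "user2": "blue", "external": "lblue"}
    karyotype, links = build_circos_inputs(emails, colors)
    print("\n".join(karyotype))
    print("\n".join(links))
```

In genetics terms each end user plays the role of a "chromosome" and each email occupies one unit along it, which is why the arc for a busy user ends up longer than the arc for a user involved in a single email.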
One of the benefits of the Circos tool is that you can add multiple bands of data to the visualization. Using this ability, I added a histogram around the outside of my final graphic that shows the age of the email for each segment. As explained in my original post on this topic, adding a time period can be important for HR in determining the appropriate discipline. In my graphic, the Y axis of the histogram is broken into six-month segments. For each email represented, a bar is drawn to show how long before the date of analysis the original email was received. To make it easier to see trends over time, I also used a red bar for any emails within 3 months, orange for 3-9 months and green for any emails 9-18 months old.
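The color banding can be expressed as a small Python sketch. Again this is an illustration rather than the method used for the graphic. The function names are hypothetical, the boundaries are assumed to be half-open (exactly 3 months counts as orange, exactly 9 months as green), and anything older than 18 months is assumed to stay green because the ranges above do not cover it. The `fill_color` option on the histogram data line is how I understand Circos accepts per-bar colors, so treat that as an assumption to check against the Circos documentation.

```python
from datetime import date


def age_in_months(received, analysis_date):
    """Whole months between the received date and the analysis date."""
    months = (analysis_date.year - received.year) * 12 + (
        analysis_date.month - received.month
    )
    if analysis_date.day < received.day:
        months -= 1  # the final month is not yet complete
    return max(months, 0)


def age_color(months):
    """Red under 3 months, orange from 3 up to 9, green from 9 onward."""
    if months < 3:
        return "red"
    if months < 9:
        return "orange"
    return "green"  # assumption: emails older than 18 months stay green


def histogram_line(entity, slot, received, analysis_date):
    """One histogram data line (id start end value options) for a segment."""
    months = age_in_months(received, analysis_date)
    return f"{entity} {slot} {slot + 1} {months} fill_color={age_color(months)}"


if __name__ == "__main__":
    print(histogram_line("user1", 0, date(2010, 11, 20), date(2011, 3, 1)))
```

The bar height carries the exact age while the color carries the band, so a cluster of recent activity shows up as a run of short red bars even when the individual bars are hard to compare by height.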
Circos is a wonderful tool, and I definitely plan to expand my use of it in the future. One of the next projects I want to use it for is to visualize internal WAN traffic (probably netflow data) to better understand the internal traffic inter-relationships.