Information Theory and Elastic: Pt.1 Elastic Common Schema

ivan ninichuck
5 min read · Oct 23, 2020


Introducing Information:

It is commonly stated that we are living in the information age. What many don’t realize is that our information age was defined in 1948 by Claude Shannon, who published “A Mathematical Theory of Communication” while working at Bell Laboratories. The paper completely revolutionized how digital communication was approached, and none of the technology we enjoy today would exist without that foundation.

So why am I bringing this up in an article about the Elastic Stack? The obvious technological connection is merely the surface; what lies deeper is the opportunity for Elastic to be the ultimate culmination of our implementation of Shannon’s work. At the heart of his writing is a simple model that all communication follows: each message has a sender, a medium of exchange, a receiver, and some source of noise that distorts the message. Using this model he developed the mathematics to prove the maximum rate at which information can be exchanged over a channel, the Shannon limit, and found that the only true key to overcoming noise is to properly encode the message. We have since designed codes that transfer data at rates approaching the Shannon limit, but what has always bothered me is the single dimension of that data. In my opinion, sending a single message is only the start of truly utilizing Shannon’s theory. I dare say the real achievement is to send information with multiple dimensions of relations and hundreds of meanings in a single message. That is the potential of the Elastic Stack. Over this series of articles, I will build out the meaning and implementation of this idea.
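For reference, the Shannon limit of a noisy channel is its capacity C, where B is the channel bandwidth in hertz and S/N is the signal-to-noise ratio:

```latex
C = B \log_2\!\left(1 + \frac{S}{N}\right)
```

No encoding scheme can reliably push more than C bits per second through the channel, and modern codes such as LDPC come remarkably close to that bound.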

A Common Language: ECS

Shannon’s starting point was the need to optimize the symbols used to send a message. A successful encoding stays compact enough to be transmitted quickly, yet carries enough structure that the meaning is obvious even when only a small part of the message arrives intact. Before the Elastic Common Schema, this was not the case: too many data sources required custom and often bulky fields. The first thing ECS gets right is its tree structure. Instead of flat data, we can describe entire objects within a field. This allows a simple hierarchy of top-level fields that can be reused in multiple places depending on their context. It also means that when new data sources are added, they can either use the same fields directly or reuse their core meanings under a new root. This flexibility means analysts can quickly ascertain the meaning of the data even if they have no experience with the source that produced it. Thanks to this familiarity, the entire message from a particular source does not require a complete understanding of all its fields; the analyst only needs enough to find the base meaning from previous sources. It is often pointed out that ECS provides instant reusability, but I believe this robust ability to be understood without full knowledge of the entire fieldset is one of its hidden attributes.
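To make the tree structure concrete, here is a minimal sketch of an ECS-shaped process event (all values are invented for illustration). Note how `process.parent` reuses the core process fields under a new root:

```python
# A minimal ECS-shaped process event; values are invented for illustration.
event = {
    "@timestamp": "2020-10-23T12:00:00.000Z",
    "event": {"category": ["process"], "type": ["start"]},
    "host": {"name": "workstation-01"},
    "user": {"name": "analyst"},
    "process": {
        "name": "powershell.exe",
        "pid": 4242,
        "args": ["powershell.exe", "-NoProfile", "-EncodedCommand", "..."],
        # The parent object reuses the same core meanings under a new root.
        "parent": {"name": "cmd.exe", "pid": 4100},
    },
}

# An analyst who knows process.args from one source can read it from any
# source mapped to ECS, without first learning a vendor-specific schema.
print(event["process"]["args"])
```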

Maximizing ECS Durability:

One area I believe we could improve on is recognizing which fields are used most during analysis, and doing a better job of bringing them to the forefront. Any good symbolic system has certain symbols that appear more often than others. In English, for instance, the letter ‘e’ occurs with very high probability and is therefore given the simplest representation in Morse code: a single dot. I believe certain fields are more likely to carry significant information and be the key to finding the meaning of a message quicker than others. When looking over a created process, for example, one of the first fields to check is the process.args array. It reveals the purpose of the process and the intent of whatever started it. For an analyst, having such a field front and center in the default view can mean the difference between seconds and several minutes in understanding the context of a process event. This should be an important area of research: identifying the weighted importance of fields and ensuring that default views favor those with more weight. Sentences can be written without an ‘e’ or other highly probable letters, and the same is true of fields; but the less data we need to read before deciding what a message means, the faster our information will flow.
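As a rough sketch of that weighting idea, suppose we had documented triage sessions recording which ECS fields an analyst actually consulted (the data below is invented). Tallying them yields a default column order that puts the heaviest fields up front:

```python
from collections import Counter

# Hypothetical triage sessions: the ECS fields an analyst actually consulted.
triage_sessions = [
    ["process.args", "process.name", "process.parent.name", "user.name"],
    ["process.args", "process.parent.name", "host.name"],
    ["process.args", "process.name", "user.name"],
]

# Weight each field by how often it was consulted across all sessions.
weights = Counter(field for session in triage_sessions for field in session)

# Order the default view by weight, the way Morse code gives 'e' the
# shortest symbol: the most-consulted fields come first.
default_columns = [field for field, _ in weights.most_common()]
print(default_columns)
# ['process.args', 'process.name', 'process.parent.name', ...]
```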

Relations of Fields

The key to optimizing field use is to explore the relations between fields and how much information each one gives us. There are two methods to achieve this. The first is what I did earlier when triaging a created process: documenting the workflow and identifying which fields we actually check and which we skip over will prove invaluable. There are several promising initiatives to create such documented datasets. MITRE ATT&CK just announced a new setup showing the events and data sources that are directly observed per sub-technique; we will cover how that project applies to this series later. The second way to explore the relationship between fields and the final message is by utilizing Kibana graphs. The ability to create graphs based on relevance and/or count could be a great tool for finding out exactly how important the fields in your data are. For instance, you could match your preferred detection rules against the fields they use, as in the sketch below; this alone would show direct correlations with the fields that provide the most relevant information when analyzing data for underlying messages. We will also make use of the Graph application in Kibana, and I hope that over time it is integrated more directly with out-of-the-box functionality that helps beginners.
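Here is a minimal sketch of that rule-matching idea, assuming detection rules are stored as simple KQL-style query strings (the rules below are invented):

```python
import re
from collections import Counter

# Hypothetical detection rules as KQL-style query strings.
rules = [
    'process.name : "powershell.exe" and process.args : "*-EncodedCommand*"',
    'event.category : "process" and process.parent.name : "winword.exe"',
    'process.args : "*mimikatz*"',
]

# Pull out the dotted ECS field names that appear before a ':' operator.
field_pattern = re.compile(r"([a-z@][\w.]*)\s*:")

field_counts = Counter(
    field for rule in rules for field in field_pattern.findall(rule)
)

# The fields your rules lean on hardest are strong candidates for the
# highest information weight in your data.
print(field_counts.most_common())
# [('process.args', 2), ('process.name', 1), ...]
```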

Conclusion:

The Elastic Common Schema is a robust solution for packing a large amount of information into a compact object. Because key core fields are reused, users can understand the message being communicated even without full knowledge of the source or the intention of the sender. Following the work of Claude Shannon, these two features make it well suited to overcoming the noise created by the massive amounts of data in modern infrastructure. Beyond this, the structure of data in Elastic can carry several relations between entities, which the powerful search capabilities inherent in the stack can make apparent at amazing speeds. The key areas of research that can improve our abilities are using the Kibana Graph application to find key statistics about field usage, and exploring which fields most quickly provide meaning. The next articles in the series will expand on how this can be accomplished.

