Implement log clustering 
27
Views
0
Comments
New
Service Center

As a developer, you consider it a good practice to log messages from your application. Logging makes life easier for whoever troubleshoots an application issue. Good logging not only helps reduce triaging time but also helps one to identify repeating patterns and regressions. While logging is good, the number of logs can quickly add up, especially at the cloud scale. So it`s very easy to run into the problem of finding a needle in a haystack. A typical application runs a loop, producing similar messages repeatedly over a period. When an unusual event like an application shutdown happens, you see a set of messages that aren’t usually seen. Log clusters show cluster signatures that occurred only once. These occurrences usually indicate unusual events in your system. 


Also, a request into your application typically traverses multiple tiers. For example, the request originates at the UI and goes to multiple mid-tiers and the database or storage layers. The trend-clustering algorithm identifies messages that occur together at the same time and groups them together. So it`s another good feature to have. 


To sum it up, the idea is to implement a log clustering feature to reduce a large number of log records to a few signatures. The clustering algorithm should auto-analyze and categorize the signatures. The algorithm should also correlate the cluster signatures by their trends automatically. This process helps when you don’t have a specific string to search for and you want the system to automatically analyze and categorize a large volume of logs.