Different ways to troubleshoot and health-check TCP connections

Quite a few network and TCP-connection troubleshooters, as well as NPM (Network Performance Monitoring) tool vendors, consider it mandatory to have all the packets when doing their magic.

While there are certainly cases where you need the packets, I wouldn’t consider it mandatory to have them available all the time, let alone all packets for all users and all applications, especially when it comes to their respective TCP sessions!

Total cost of ownership

Storing packets for retrospective application performance and availability analysis requires dozens of TBytes of high-speed disk space when you analyze intermittent issues that happened several hours ago.

The requirements grow further if you want to apply machine learning to speed up the analysis of issues that happened several days (or weeks) ago, for different applications and different user sites, each with their own characteristics.

The Total Cost of Ownership (TCO) of such a packet cruncher puts it out of reach for most mid-sized ITOM organizations.

Effective and efficient troubleshooting

Even without the original packets, you can still keep track of the TCP connections between applications, clients, servers and the network interconnecting them by analyzing and storing the metadata of each packet of each application flow.
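To make the idea concrete, here is a minimal sketch of rolling per-packet metadata up into one record per flow per minute. It is an illustration only, not the number cruncher discussed in this post: the field names (ts, src, sport, dst, dport, length, retransmission), the FlowStats record and the aggregate function are all hypothetical, and a real tool would track far more (round-trip times, window sizes, resets and so on).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FlowStats:
    """Counters kept per flow and per time bucket (hypothetical record layout)."""
    packets: int = 0
    bytes: int = 0
    retransmissions: int = 0


def aggregate(packets, bucket_seconds=60):
    """Roll per-packet metadata up into one record per flow per time bucket.

    Each packet is a dict of header-derived fields; payloads are never stored.
    """
    table = defaultdict(FlowStats)
    for pkt in packets:
        # Start of the time bucket (in seconds) that this packet falls into.
        bucket = int(pkt["ts"] // bucket_seconds) * bucket_seconds
        key = (bucket, pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"])
        stats = table[key]
        stats.packets += 1
        stats.bytes += pkt["length"]
        if pkt.get("retransmission", False):
            stats.retransmissions += 1
    return dict(table)


# Three made-up packets of one client/server flow, spanning two minutes.
flow = {"src": "10.0.0.1", "sport": 50000, "dst": "192.0.2.10", "dport": 443}
sample = [
    {"ts": 0.5, "length": 1500, **flow},
    {"ts": 12.0, "length": 1500, "retransmission": True, **flow},
    {"ts": 61.0, "length": 400, **flow},
]

flows = aggregate(sample)
for key, stats in sorted(flows.items()):
    print(key, stats)
```

The stored record size depends on the number of active flows per minute rather than on the number of bytes on the wire, which is why this kind of data stays small compared with the packets themselves.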

In fact, as a TCP-relationship therapist for cloud services, applications and networks, I have learned that troubleshooting based on this metadata turns out to be far more effective and efficient, provided there is an easy way of slicing and dicing all this TCP session data.

The proof is in the numbers

We recently did a quick health check for a company in the Amsterdam area. During prime time this company has up to 35,000 users utilizing a redundant, active/standby type of internet connection with a daily utilization of approximately 6 Gbps (maximum capacity is 10 Gbps).

After 3 weeks of number crunching across all TCP sessions, all users and all applications running over this internet connection, the database usage was still far less than 1 TB. In fact, the number cruncher was able to store 15 days of 1-minute data in just 346 GBytes of disk space.
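A quick back-of-the-envelope calculation puts these numbers in perspective. The sketch below rests on two assumptions that are not part of the health check itself: decimal units (1 GByte = 10^9 bytes), and a purely hypothetical full packet capture of a link that is sustained at 6 Gbps around the clock. The latter is an upper bound, since real traffic varies over the day.

```python
# Figures taken from the health check described above.
DB_BYTES = 346e9          # 346 GBytes, assuming decimal units
DAYS_STORED = 15          # days of 1-minute data held in the database
LINK_BITS_PER_SEC = 6e9   # approximate utilization of the internet connection

MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = 24 * 60 * 60

# Storage actually used by the metadata.
metadata_gb_per_day = DB_BYTES / DAYS_STORED / 1e9
metadata_mb_per_minute = DB_BYTES / (DAYS_STORED * MINUTES_PER_DAY) / 1e6

# Hypothetical full capture: assumes 6 Gbps sustained all day (upper bound).
capture_tb_per_day = LINK_BITS_PER_SEC / 8 * SECONDS_PER_DAY / 1e12

# How many times larger the full capture would be than the metadata.
ratio = capture_tb_per_day * 1e12 / (DB_BYTES / DAYS_STORED)

print(f"metadata: {metadata_gb_per_day:.1f} GB/day, "
      f"{metadata_mb_per_minute:.1f} MB/minute")
print(f"full capture (assumed sustained load): {capture_tb_per_day:.1f} TB/day")
print(f"ratio: roughly {ratio:.0f}x")
```

Under these assumptions the metadata amounts to roughly 23 GBytes per day, whereas capturing every packet would approach 65 TBytes per day. Even at a fraction of the assumed load, the gap remains several orders of magnitude.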

The full story

Request the full story to learn how a number cruncher that processes the metadata of application flows helps you as a network-oriented troubleshooter, compared with semi-automated packet crunching based on dozens of TBytes of stored packets.

The full story includes guidelines for scoping and sizing a project and the associated number cruncher.
