According to a recent survey, metrics and reporting are the most common operational visibility tools across nearly all observability use cases. Your team’s visibility program can start with server metrics and application metrics, then extend to the metrics used in container monitoring, microservices monitoring, and serverless monitoring. Metrics help you see application issues before they become critical, give you a jumping-off point for troubleshooting performance problems, and can even be early indicators of bugs in your software code. This is especially true for teams that are deploying microservices, using containers, running a continuous software delivery pipeline, and pushing code frequently. These practices may reduce development complexity, but they can wreak havoc on your operational visibility program if you don’t plan for the changes you’re making.

Metrics are a measure of efficiency, performance, progress, or quality. Examples include the number of server requests, web server errors, or logged exceptions from your custom application per second. Logs are discrete, usually text-based records of every single event (assuming your application developers defined the log in the application or resource). Application metrics and server metrics, by contrast, capture some quantity of events that occur during a particular period and are reported in a structured data format. Application and server metrics are based on data sampling, can be stored and retrieved efficiently, and can serve as gauges that show you issues as they develop and prompt you on when and where to drill down.
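To make the difference between logs and metrics concrete, here is a minimal Python sketch that rolls discrete log events up into per-second counts. The sample events, the `per_second_counts` helper, and the "status of 500 or higher is an error" rule are all assumptions made for illustration; they are not part of any Scalyr API.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log events: one (ISO timestamp, HTTP status) pair per request.
LOG_EVENTS = [
    ("2024-01-01T12:00:00.120", 200),
    ("2024-01-01T12:00:00.480", 500),
    ("2024-01-01T12:00:00.910", 200),
    ("2024-01-01T12:00:01.050", 502),
    ("2024-01-01T12:00:01.700", 200),
]


def per_second_counts(events, predicate=lambda status: True):
    """Count events in one-second buckets, keeping only those matching predicate."""
    counts = Counter()
    for timestamp, status in events:
        if predicate(status):
            # Truncate the timestamp to whole seconds to get the bucket key.
            second = datetime.fromisoformat(timestamp).replace(microsecond=0)
            counts[second.isoformat()] += 1
    return dict(counts)


# Each log line is a single event; each metric is a quantity per period.
requests_per_second = per_second_counts(LOG_EVENTS)
errors_per_second = per_second_counts(LOG_EVENTS, lambda status: status >= 500)

print(requests_per_second)
print(errors_per_second)
```

Five log records collapse into two small per-second counts, which is why metrics are cheap to store and retrieve, while the underlying logs remain the place to drill down.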

When it comes to metrics, there are five primary categories: capacity, bandwidth, state, rate, and events. Capacity metrics monitor system components that have a fixed and known capacity, such as disk space and free memory. Bandwidth metrics measure flows within a system, such as network utilization, CPU usage, and disk throughput. State metrics have distinct values that are used to monitor changes in system state, such as “running” or “stopped.” Rate metrics indicate the rate at which certain events take place, such as login attempts per minute. Event parameter metrics show relevant events over a given time period.
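As a rough illustration of the five categories, the Python sketch below tags one sample of each kind. The `MetricCategory` enum, the `Sample` record, the metric names, and all of the values are hypothetical and chosen only to show how each category tends to be shaped.

```python
from dataclasses import dataclass
from enum import Enum


class MetricCategory(Enum):
    CAPACITY = "capacity"
    BANDWIDTH = "bandwidth"
    STATE = "state"
    RATE = "rate"
    EVENT = "event"


@dataclass
class Sample:
    name: str
    category: MetricCategory
    value: object
    unit: str


def capacity_used_percent(used, total):
    """Capacity: how much of a fixed, known total is in use."""
    return round(100.0 * used / total, 1)


def rate_per_minute(event_count, window_seconds):
    """Rate: events observed in a window, normalized to one minute."""
    return event_count * 60.0 / window_seconds


# Illustrative values only: 150 GB used of a 200 GB disk,
# and 45 login attempts seen in a 30-second window.
samples = [
    Sample("disk.used", MetricCategory.CAPACITY, capacity_used_percent(150, 200), "%"),
    Sample("net.throughput", MetricCategory.BANDWIDTH, 12.5, "MB/s"),
    Sample("service.status", MetricCategory.STATE, "running", ""),
    Sample("login.attempts", MetricCategory.RATE, rate_per_minute(45, 30), "per minute"),
    Sample("deploy.events", MetricCategory.EVENT, 3, "events in the last hour"),
]

for sample in samples:
    print(f"{sample.category.value:9} {sample.name:15} {sample.value} {sample.unit}")
```

Note that the state sample carries a distinct value rather than a number, which is why state metrics are usually alerted on by change rather than by threshold.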

Use Cases

Monitor app performance

  • Monitor overloaded resources
  • See unacceptable latencies
  • Alert on errors and failures

Correct user issues

  • Replicate service problems
  • Monitor access
  • Identify fraud

Identify software bugs

  • Alert on software errors
  • Monitor API interactions
  • See client problems early

The Scalyr Agent has a ton of out-of-the-box plugins that let you aggregate, store, visualize, search, and alert on metrics across all of your servers, containers, container orchestration environments, databases, load balancers, network equipment, application services, APIs, and more. These plugins cover server metrics, system and process metrics, and metrics for child processes. The Scalyr Platform makes this straightforward, and we provide guidance and resources for managing groups of resources at once. You can also build custom plugins to collect metrics from any source. For more modern architectures, such as production container environments orchestrated in Kubernetes, we offer the Scalyr Agent prepackaged as a DaemonSet for simple deployment and maintenance.

In addition to collecting system and process metrics, Scalyr makes it easy for you to log custom metrics from your application code. You can log simple metrics or complex, multi-field events in any format, including key-value pairs or JSON, and add new metrics to your application code at any time without needing to update the Scalyr Agent.
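For example, a custom metric can simply be a structured line that your application writes to its log. The Python sketch below formats the same event as key-value pairs and as JSON. The field names (`metric`, `value`, `region`, `status`) and the helper functions are hypothetical, the code calls no Scalyr API, and it assumes the output lands in a log that is already being collected.

```python
import json


def format_key_value(fields):
    """Render fields as space-separated key=value pairs."""
    return " ".join(f"{key}={value}" for key, value in fields.items())


def format_json(fields):
    """Render fields as a single-line JSON object with stable key order."""
    return json.dumps(fields, sort_keys=True)


def emit_metric(fields, formatter=format_json):
    """Write one metric event to stdout and return the formatted line."""
    line = formatter(fields)
    print(line)
    return line


# A hypothetical multi-field event describing one checkout request.
checkout_event = {
    "metric": "checkout_latency_ms",
    "value": 182,
    "region": "us-east",
    "status": "ok",
}

kv_line = emit_metric(checkout_event, format_key_value)
json_line = emit_metric(checkout_event, format_json)
```

Adding another metric later only means adding another field or another `emit_metric` call in the application, which is consistent with not having to touch the agent.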

Many of our customers are standardizing on containers, microservices, and serverless environments, and these architectural shifts have implications for their metrics regimens. For container environments like Docker and orchestration tools like Kubernetes, see Monitoring Containers and Logging Microservices. If you are pursuing a serverless architecture, such as Amazon’s AWS Lambda, you can aggregate metrics from the infrastructure provider (for example, Amazon CloudWatch) with custom application metrics and see them together in Scalyr. This lets you troubleshoot issues across your entire stack, starting from high-level metrics and drilling down into the log data, all from a single platform.

Why Scalyr


Go fast. Blazing-fast. Ingest logs and alert on them in real time. Perform split-second searches and visualizations across your entire environment.

Skip the learning curve. Point and click to search, pivot, or visualize your data. No query language expertise required.

With our time-series database and massive compute capacity, Scalyr will easily scale with your systems. You also won’t break the bank as you grow with us.

“One of the things I really value in Scalyr is the responsiveness. I can give it a really terrible-looking query with a bunch of regular expressions, and somehow it still comes back in under a second.”

Elena Tatarchenko, Senior Engineering Manager, Oracle

“We push many hundreds of gigabytes of logs to Scalyr, and I need to search back a few days sometimes, a month even. The fact that I can still do that is just great.”

Jeff Watts, Senior Engineering Manager, Periscope Data

“Asking how Scalyr helps is like asking how breathing helps with your life.”

Tim Kröger, Head of Engineering, Zalando

Live Demo

Explore Scalyr with sample data and zero setup in our Live Demo.

Free Trial

Jump right in with your data in our 30-day Free Trial. No credit card required.