Some Important Notes About VictoriaMetrics

VictoriaMetrics can serve as a full-featured alternative to Prometheus, although it is not 100% compatible. Its official selling points are scalability, reliability, ease of use, and cost efficiency. Its author had previously released several Go libraries known for their performance. As a Prometheus alternative, it tries to solve many of Prometheus’s existing problems and implements some features that Prometheus might consider “Out of Scope”. For a complete list of features, refer to the official documentation.

The emergence of open-source systems like this to solve specific problems is a great thing: it gives everyone more choices, and we should be grateful for that. However, we should also exercise our own judgment rather than being easily swayed by others. More and more companies in China are now using VictoriaMetrics. This article shares some lesser-known (or perhaps less cared-about) aspects of VictoriaMetrics’ technical implementation, in the hope of giving a more complete picture of it. As for technology decisions, you can make your own judgment after reading. What matters is not which features a system has, but whether those features are needed and important in your business scenario.

At the time of writing, the current version of VictoriaMetrics is v1.80.0.

WAL

VictoriaMetrics has no WAL (Write-Ahead Log), so there is a significant risk of data loss on abnormal shutdown. Interestingly, the author wrote an article titled “WAL usage looks broken in modern Time Series Databases?”. In my opinion, some of the points made in that article are not very convincing. For instance, flushing to disk is not triggered solely by the upper-level application. The article also mentions ClickHouse, which, as far as I know, should have a WAL enabled by default as well.

Of course, many scenarios do not need a WAL, and that is acceptable. Running multiple replicas can also improve reliability to some extent. However, a WAL improves reliability to a much greater extent, and compared to other products, lacking one might mean one fewer “9” of reliability. Personally, I think a WAL should at least be available as an option.
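To make the trade-off concrete, here is a minimal sketch of what a WAL does. It is not VictoriaMetrics code. It assumes a simple length-prefixed record format and an fsync on every append; the names appendRecord, replay, and writeAndRecover, as well as the maxRecordSize limit, are made up for illustration. A production WAL would typically also checksum each record and batch its fsync calls.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// maxRecordSize guards replay against a garbage length prefix (hypothetical limit).
const maxRecordSize = 1 << 20

// appendRecord writes one length-prefixed record and fsyncs before returning,
// so a write is acknowledged only after it has been handed to stable storage.
func appendRecord(f *os.File, payload []byte) error {
	var hdr [4]byte
	binary.LittleEndian.PutUint32(hdr[:], uint32(len(payload)))
	if _, err := f.Write(append(hdr[:], payload...)); err != nil {
		return err
	}
	return f.Sync()
}

// replay returns every complete record in the log. A truncated tail, as left
// behind by a crash in the middle of a write, simply ends the replay.
func replay(path string) ([][]byte, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var records [][]byte
	for {
		var hdr [4]byte
		if _, err := io.ReadFull(f, hdr[:]); err != nil {
			if err == io.EOF || err == io.ErrUnexpectedEOF {
				return records, nil
			}
			return records, err
		}
		n := binary.LittleEndian.Uint32(hdr[:])
		if n > maxRecordSize {
			return records, fmt.Errorf("record length %d exceeds limit", n)
		}
		payload := make([]byte, n)
		if _, err := io.ReadFull(f, payload); err != nil {
			if err == io.EOF || err == io.ErrUnexpectedEOF {
				return records, nil
			}
			return records, err
		}
		records = append(records, payload)
	}
}

// writeAndRecover appends samples to a fresh log in a temporary directory and
// then replays the log the way a restarted process would.
func writeAndRecover(samples []string) ([]string, error) {
	dir, err := os.MkdirTemp("", "wal-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "wal.log")
	f, err := os.OpenFile(path, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0o600)
	if err != nil {
		return nil, err
	}
	for _, s := range samples {
		if err := appendRecord(f, []byte(s)); err != nil {
			f.Close()
			return nil, err
		}
	}
	if err := f.Close(); err != nil {
		return nil, err
	}
	records, err := replay(path)
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(records))
	for _, r := range records {
		out = append(out, string(r))
	}
	return out, nil
}

func main() {
	got, err := writeAndRecover([]string{"cpu_usage 0.42", "mem_free 1024"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

The cost is visible in appendRecord: every acknowledged write pays for an f.Sync() call, which is presumably the kind of overhead VictoriaMetrics chose to avoid.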

Data Integrity Check

No information is stored for data integrity checking, and the compression algorithms in use, such as zstd, do not have the CRC option enabled. Judging from the relevant line of code, this seems to be intentional, for performance reasons. I consider this a flaw in the design of the storage system; there should at least be an option to enable it. Data corruption is not far-fetched, and delivering wrong data to users is even less acceptable than delivering no data at all. The currently available workarounds and the related discussion can be found in the issue I raised, “Checksum functionality is missing”.
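As a rough illustration of what an optional integrity check could look like, the sketch below appends a CRC-32C to each block and verifies it on read, using only the Go standard library. It is not how VictoriaMetrics or zstd lays out data; the seal and open functions and the trailing 4-byte layout are assumptions made for this example.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

var errChecksum = errors.New("block checksum mismatch")

// castagnoli is the CRC-32C polynomial table, which is commonly hardware-accelerated.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// seal returns a copy of block with a 4-byte CRC-32C appended.
func seal(block []byte) []byte {
	out := make([]byte, len(block)+4)
	copy(out, block)
	binary.LittleEndian.PutUint32(out[len(block):], crc32.Checksum(block, castagnoli))
	return out
}

// open verifies the trailing checksum and returns the payload without it.
func open(sealed []byte) ([]byte, error) {
	if len(sealed) < 4 {
		return nil, errChecksum
	}
	body, sum := sealed[:len(sealed)-4], sealed[len(sealed)-4:]
	if crc32.Checksum(body, castagnoli) != binary.LittleEndian.Uint32(sum) {
		return nil, errChecksum
	}
	return body, nil
}

func main() {
	sealed := seal([]byte("compressed block bytes"))
	sealed[3] ^= 0x01 // simulate a single flipped bit on disk
	_, err := open(sealed)
	fmt.Println(err) // prints: block checksum mismatch
}
```

With a check like this, a corrupted block surfaces as an error instead of being decoded and returned to the user as if it were valid data.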

Asynchronous Writes + Temporary Batch Buffering in Memory

The storage layer component, vmstorage, holds newly written data in memory and periodically flushes it to disk for persistence. Some of this buffered data may not be queryable, so data that was written successfully may not be visible to queries immediately. The proxy layer component, vminsert, also buffers data in batches before asynchronously writing it to the storage layer. In other words, data may still be sitting in an application-level memory buffer when a successful write is reported to the caller.

VictoriaMetrics uses this asynchronous, batch-buffered mode throughout its write path. For a batch made up of multiple requests, only the last request gets real feedback on whether the batch write succeeded. The requests in a batch are unaware of each other and share no context, and the different components at different layers tear this relationship apart even further. In my limited experience, this kind of interface is extremely unfriendly to users. There is no highly available queue in between to hold the data, and failed writes cannot be detected immediately for a timely fallback or retry, which leads to data loss. Losing data can be acceptable, but silently dropping it is not. Of course, there is internal retry logic, which is useful under normal circumstances.
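To show what friendlier feedback could look like, here is a small sketch of a batcher in which every write receives the result of the flush that carried it, instead of only the last request in the batch learning the outcome. It is purely illustrative and not based on vminsert; the batcher type, its send callback, and runDemo are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// pending is one buffered sample plus the channel its writer is waiting on.
type pending struct {
	sample string
	done   chan error
}

// batcher groups samples and reports each flush result to every writer in the batch.
type batcher struct {
	mu   sync.Mutex
	buf  []pending
	size int
	send func(batch []string) error // hypothetical transport to the storage layer
}

// Write buffers a sample and returns a channel that will receive the result
// of the flush that eventually carries this sample.
func (b *batcher) Write(sample string) <-chan error {
	done := make(chan error, 1)
	b.mu.Lock()
	b.buf = append(b.buf, pending{sample: sample, done: done})
	full := len(b.buf) >= b.size
	b.mu.Unlock()
	if full {
		b.Flush()
	}
	return done
}

// Flush sends whatever is buffered and delivers the outcome to every writer.
func (b *batcher) Flush() {
	b.mu.Lock()
	batch := b.buf
	b.buf = nil
	b.mu.Unlock()
	if len(batch) == 0 {
		return
	}
	samples := make([]string, len(batch))
	for i, p := range batch {
		samples[i] = p.sample
	}
	err := b.send(samples)
	for _, p := range batch {
		p.done <- err
	}
}

// runDemo writes three samples with a batch size of two; the second batch fails.
func runDemo() []error {
	b := &batcher{size: 2, send: func(batch []string) error {
		for _, s := range batch {
			if s == "bad" {
				return errors.New("storage node unreachable")
			}
		}
		return nil
	}}
	acks := []<-chan error{b.Write("a"), b.Write("b"), b.Write("bad")}
	b.Flush()
	results := make([]error, len(acks))
	for i, ack := range acks {
		results[i] = <-ack
	}
	return results
}

func main() {
	for i, err := range runDemo() {
		fmt.Println(i, err)
	}
}
```

The point of the design is that a failed flush is never silent: each caller can decide for itself whether to retry, fall back, or drop the sample.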

There is an article describing how the storage layer draws on ClickHouse’s implementation. Possibly for performance reasons, VictoriaMetrics chose to leave out the parts that strictly guarantee data reliability.

Scaling/Outages

Scaling the storage layer component, vmstorage, requires restarting every write-path proxy (vminsert) and every query-path compute component (vmselect), because automatic service discovery is not supported. Someone has raised an issue about this. Solutions certainly exist, but they would add complexity. I still think it is worth it, because the time the components take to detect one another during scaling can be long. In a large cluster or in an emergency, basic read and write availability may not be effectively guaranteed.

VictoriaMetrics retries writes across all nodes until they succeed. Combined with the asynchronous, batch-buffered mode described above, a single node going down can in theory easily trigger an avalanche. (The storage layer now has maxHourlySeries, a startup setting that can act as a self-protection measure for a single node.)
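For contrast, a bounded retry with exponential backoff is one common way to keep retries from amplifying an outage. The sketch below is a generic illustration and does not describe VictoriaMetrics’ actual retry logic; retryWithBackoff and flakyOp are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryWithBackoff calls op up to maxAttempts times, doubling the delay after
// each failure. It returns the number of attempts made and the final error.
func retryWithBackoff(op func() error, maxAttempts int, base time.Duration) (int, error) {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	delay := base
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = op(); err == nil {
			return attempt, nil
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return maxAttempts, fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

// flakyOp returns an operation that fails the first `failures` times it is called.
func flakyOp(failures int) func() error {
	calls := 0
	return func() error {
		calls++
		if calls <= failures {
			return errors.New("node unavailable")
		}
		return nil
	}
}

func main() {
	attempts, err := retryWithBackoff(flakyOp(2), 5, 10*time.Millisecond)
	fmt.Println(attempts, err)

	_, err = retryWithBackoff(flakyOp(10), 3, 10*time.Millisecond)
	fmt.Println(err)
}
```

Giving up after a fixed number of attempts turns an unbounded pile-up of buffered retries into an explicit error that the caller can handle.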

Of course, adding enough nodes and resources can also mitigate the problem of avalanches to some extent. Refer to the official recommendation:

50% of free RAM across all the node types for reducing the probability of OOM (out of memory) crashes and slowdowns during temporary spikes in workload.
50% of spare CPU across all the node types for reducing the probability of slowdowns during temporary spikes in workload.
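A quick back-of-the-envelope calculation shows how far that headroom goes. The sketch assumes identical nodes and that the load of failed nodes is spread evenly over the survivors, which is a simplification of my own; utilizationAfterLoss is a hypothetical helper, and retry traffic would push the real numbers higher.

```go
package main

import "fmt"

// utilizationAfterLoss returns the per-node utilization once `lost` of `nodes`
// identical nodes have failed and their load is spread evenly over the rest.
func utilizationAfterLoss(nodes, lost int, utilization float64) float64 {
	return utilization * float64(nodes) / float64(nodes-lost)
}

func main() {
	// Six nodes, each running at 50% before any failure.
	for lost := 0; lost <= 3; lost++ {
		fmt.Printf("lost=%d utilization=%.2f\n", lost, utilizationAfterLoss(6, lost, 0.5))
	}
}
```

Under these assumptions, 50% of spare capacity is fully consumed once half of the nodes are gone, and the margin shrinks quickly well before that point.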

Maintainability/Compatibility

The current architecture of VictoriaMetrics is relatively simple, so theoretically, the operational complexity should be manageable.

Having read some of the VictoriaMetrics code, my offhand impression is that it is a bit rough. Writing it must have been fun, but reading it is a mediocre experience. For example, there is a lot of global state and many hardcoded parameters. Many optimizations have been made for performance, which must have been quite a mental burden during development. Of course, such optimization cannot be avoided entirely; performance work often involves dirty, tiring jobs. So in terms of code maintainability, I would rate it as just average.

Parts of the design are not well thought out: the custom internal protocol format lacks extensibility and has no concept of versioning. This can lead to issues like the v1.78.0 update, which is not backward-compatible because of changes to the internal communication protocol. Queries cannot work normally until the upgrade is complete, which increases the complexity and mental burden of making changes.
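To illustrate what a concept of versioning buys, the sketch below puts a version field in front of every frame so that a receiver can accept a range of versions and reject unknown ones explicitly. This is not the VictoriaMetrics wire format; the 2-byte version, 4-byte length layout, the supported range, and the function names are all assumptions.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// The range of protocol versions this (hypothetical) receiver understands.
const (
	minVersion = 1
	maxVersion = 2
)

var errUnsupported = errors.New("unsupported protocol version")

// encodeFrame prefixes payload with a 2-byte version and a 4-byte length.
func encodeFrame(version uint16, payload []byte) []byte {
	buf := make([]byte, 6+len(payload))
	binary.BigEndian.PutUint16(buf[0:2], version)
	binary.BigEndian.PutUint32(buf[2:6], uint32(len(payload)))
	copy(buf[6:], payload)
	return buf
}

// decodeFrame checks the version first, so an incompatible peer gets a clear
// error instead of a misparsed payload.
func decodeFrame(frame []byte) (uint16, []byte, error) {
	if len(frame) < 6 {
		return 0, nil, errors.New("short frame")
	}
	version := binary.BigEndian.Uint16(frame[0:2])
	if version < minVersion || version > maxVersion {
		return version, nil, errUnsupported
	}
	n := binary.BigEndian.Uint32(frame[2:6])
	if uint64(len(frame)-6) != uint64(n) {
		return version, nil, errors.New("length mismatch")
	}
	return version, frame[6:], nil
}

func main() {
	v, payload, err := decodeFrame(encodeFrame(1, []byte("query")))
	fmt.Println(v, string(payload), err)

	_, _, err = decodeFrame(encodeFrame(3, []byte("query")))
	fmt.Println(err)
}
```

With a version range like this, old and new components can keep talking to each other during a rolling upgrade, as long as both sides support at least one common version.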

Of course, the project homepage also mentions that the release iteration is very fast, so please read the documentation carefully.

Conclusion

VictoriaMetrics values simplicity, and its pursuit of high performance at low cost is commendable. However, it is like a youthful, energetic child who moves too fast. For example, some reliability has been sacrificed without reservation for the sake of performance. Sacrifices like this, along with the design shortcomings mentioned above, can only be compensated for later by adding complexity or giving up compatibility.

In real-world use, many users do not have stringent requirements for monitoring. It is a secondary system, and as long as it works and is cost-effective, users are content. Users often do not want to get their hands dirty either, and are unwilling to sort out their data according to the business scenario. In this regard, VictoriaMetrics has its merits. Prometheus and some of its derivative projects also have many issues, including some common problems, resource overhead, and problems that are design-related or “Out of Scope”. However, because Prometheus and its related projects have been around longer, they appear more mature and stable in handling the fundamental details.

There is no best system, only the most suitable one. Use the most suitable system to do the most suitable things.