The story of 1€, or the RabbitMQ Tracing Plugin issue
Finally, the PR is merged, and I can tell the story with a feeling of success.
RabbitMQ traces are used as debugging breadcrumbs - they record all data movements between services, sagas and handlers. So it was really surprising to discover that the code worked fine, but no data appeared in the traces.
Handlers were handling… but there were no incoming or outgoing messages. Even stranger - there was no clear pattern among the missing messages. No relation to the service, source, destination address, timing or timeouts.
Total mess. A void of missing data. A nightmare that made bug fixing and testing extremely complicated. And the problem could not be reproduced in the development environment; it showed up only in staging/testing and production.
I was new to the team and to microservice development in general. But I’m naturally curious and have some experience catching weird issues. I don’t know how, but I can often feel when something is wrong - something very simple, yet with a big impact on the system.
After a few days of deep-dive debugging, I found the problem: the € sign.
When it appeared in the payload, the entire trace record was not written to the file,
because the Erlang io_lib:format/2 function cannot convert non-Latin1 characters into bytes and silently fails.
The sign itself has no influence on the (de)serialization process, since JSON in our stack works with UTF-16 and RabbitMQ Server internally uses UTF-8 encoded strings.
The euro sign was used to encode the currency in transactions, and that data differed between the staging/testing and development environments.
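The mismatch is easy to see outside Erlang. Below is a minimal sketch in Python, chosen only for illustration: it is not the plugin's code and does not reproduce io_lib:format/2 itself. The euro sign is code point U+20AC, which has no one-byte Latin-1 representation, while UTF-8 encodes it as three bytes. The sample payload is made up.

```python
# Hypothetical sample payload; only the euro sign matters here.
payload = '{"amount": 1, "currency": "€"}'


def fits_latin1(text: str) -> bool:
    """Return True if every character has a one-byte Latin-1 encoding."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


euro_code_point = ord("€")        # 0x20AC, above the Latin-1 limit of 0xFF
euro_utf8 = "€".encode("utf-8")   # three bytes: E2 82 AC

print(hex(euro_code_point), euro_utf8.hex(), fits_latin1(payload))
```

A payload that spells the currency as "EUR" passes the same check, which would explain why only some messages went missing.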
I prepared 2 fixes:
1️⃣ in our code: use strict JSON escaping,
applied to all non-ASCII characters in MassTransit payload data:
using MassTransit.Serialization;
using Newtonsoft.Json; // provides StringEscapeHandling
// set this after the UseNewtonsoftJsonSerializer() call
NewtonsoftJsonMessageSerializer.SerializerSettings
    .StringEscapeHandling = StringEscapeHandling.EscapeNonAscii;
2️⃣ in the Tracing Plugin: encode all characters into UTF-8 when writing the trace file:
%% was
print_log(io_lib:format(Fmt, Args), State);
%% and became
print_log(unicode:characters_to_binary(io_lib:format(Fmt, Args)), State);
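The two fixes attack the same mismatch from opposite sides. A short Python sketch can illustrate the idea. It is an analogy, not the MassTransit or RabbitMQ code: `ensure_ascii=True` stands in for `StringEscapeHandling.EscapeNonAscii`, and the message dictionary is hypothetical. Fix 1 keeps non-ASCII characters out of the serialized payload by escaping them. Fix 2 turns the formatted text into UTF-8 bytes before it is written.

```python
import json

message = {"amount": 1, "currency": "€"}  # hypothetical message body

# Fix 1 analogue: escape every non-ASCII character in the JSON text.
escaped = json.dumps(message, ensure_ascii=True)
# The escaped text is pure ASCII, so a Latin-1-only writer can handle it,
# and a JSON parser still recovers the original character.
escaped_bytes = escaped.encode("latin-1")
round_trip = json.loads(escaped)

# Fix 2 analogue: leave the text as is, but encode it to UTF-8 bytes
# before writing.
raw = json.dumps(message, ensure_ascii=False)
utf8_bytes = raw.encode("utf-8")

print(escaped)
print(utf8_bytes)
```

Either fix alone is enough for the trace to be written. Doing both protects our own payloads right away and also covers any other publisher once the plugin fix is deployed.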
This was not my first experience fixing public code, but it was my first time working with the modern GitHub flow in a modern open-source project.
Damn… I set up the entire RabbitMQ Server development environment on my laptop - Erlang and other stuff - and was not prepared to discover GitHub Actions.
My first PR was rejected because I had made a few weird mistakes. But I had enough courage to continue,
and I used GitHub Copilot to understand the project structure and learn how to fully test my changes.
I’m used to Gentoo-style workflows - compiling and testing locally with make, without all the bells and whistles of modern CI/CD pipelines.
Conclusion: trust your feelings and suspicions. Bugs can (and will) hide in the smallest unit of software engineering - a single byte 😊