This page is about instrumenting you R package or project for OpenTelemetry. If you want to start collecting OpenTelemetry data for instrumented packages, see Collecting Telemetry Data in the otelsdk package.
About OpenTelemetry
OpenTelemetry is an observability framework. OpenTelemetry is a collection of tools, APIs, and SDKs used to instrument, generate, collect, and export telemetry data such as metrics, logs, and traces, for analysis in order to understand your software’s performance and behavior.
For an introduction to OpenTelemetry, see the OpenTelemetry website docs.
The otel and otelsdk R packages
Use the otel package as a dependency if you want to instrument your R package or project for OpenTelemetry.
Use the otelsdk package to produce OpenTelemetry output from an R package or project that was instrumented with the otel package.
Complete Example
To instrument your package with otel, you need to do a couple of steps. In this section we show how to instrument the callr package.
Add the otel package as a dependency
The first step is to add the otel package as a dependency. otel is a very
lightweight package, so may want to add it as a hard dependency. This has
the advantage that you don't need to check if otel is installed every time
you call an otel function. Add otel to the Imports
section in
DESCRIPTION
:
Alternatively, you may add otel as a soft dependency. Add otel to the
Suggests
section in DESCRIPTION
:
If you add otel in Suggests
, then it makes sense to create a helper
function that checks if otel is installed and also that tracing is enabled
for the caller. You can put this function in any R file, e.g. R/utils.R
is a nice place for it:
is_otel_tracing <- function() {
requireNamespace("otel", quietly = TRUE) && otel::is_tracing_enabled()
}
Choose a tracer name
Every package should have its own tracer with a name that is unique for the
package. See default_tracer_name()
for tips on choosing a good tracer
name. Set the otel_tracer_name
variable to the tracer name. No need to
export this symbol. In callr, we'll add
otel_tracer_name <- "org.r-lib.callr"
to the R/callr-package.R
file.
Create spans for selected functions
Select the functions you want to add tracing to. It is overkill to add tracing to small functions that are called lots of times. It makes sense to add spans to the main functions of the package.
The callr package has various ways of starting another R process and then running R code in it. We'll add tracing to the
functions first.
We add to callr::r()
in eval.R
:
if (is_otel_tracing()) {
otel::start_local_active_span(
"callr::r",
attributes = otel::as_attributes(options)
)
}
We use the
is_otel_tracing()
helper function, defined above.start_local_active_span()
starts a span and also activates it. It also sets up an exit handler that ends the span when the caller function (callr::r()
) exits.options
contain a long list of user-provided and other option, we add these to the span as attributes.
We add essentially the same code to callr::rcmd()
:
if (is_otel_tracing()) {
otel::start_local_active_span(
"callr::rcmd",
attributes = otel::as_attributes(options)
)
}
And to callr::rscript()
:
if (is_otel_tracing()) {
otel::start_local_active_span(
"callr::rscript",
attributes = otel::as_attributes(options)
)
}
Concurrency
An instance of the callr::r_session
R6 class represents persistent
R background processes. We want to collect all spans from an R process into
the same trace. Since the R processes are running concurrently, their
(sub)spans will not form the correct hierarchy if we use the default,
timing-based otel mechanism to organize spans into trees. We need to
manage the lifetime and activation of the spans that represent the R
processes manually.
A generic strategy for handling concurrency in otel is:
Create a new long lasting span with
start_span()
. (I.e. notstart_local_active_span()
!)Assign the returned span into the corresponding object of the concurrent and/or asynchronous computation. Every span has a finalizer that closes the span.
When running code that belongs to the concurrent computation represented by the span, activate it for a specific R scope by calling
with_active_span()
orlocal_active_span()
.When the concurrent computation ends, close the span manually with its
$end()
method orend_span()
. (Otherwise it would be only closed at the next garbage collection, assuming there are no references to it.)
This code goes into the constructor of the r_session
object:
if (is_otel_tracing()) {
private$options$otel_session <- otel::start_span(
"callr::r_session",
attributes = otel::as_attributes(options)
)
}
The finalize()
method (the finalizer) gets a call to close the span:
if (is_otel_tracing()) {
private$options$otel_session$end()
}
We also add (sub)spans to other operations, e.g. the read()
method gets
if (is_otel_tracing()) {
otel::local_session(private$options$otel_session)
spn <- otel::start_local_active_span("callr::r_session$read")
}
Testing
To test your instrumentation, you need to install the otelsdk package and you also need a local or remote OpenTelemetry collector.
I suggest you use otel-tui
,
a terminal OpenTelemetry viewer. To configure it, use the http
exporter,
see Environment Variables:
Development mode
By default otel functions never error, to avoid taking down a production app. For development this is not ideal, we want to catch errors early. I suggest you always turn on development mode when instrumenting a package:
Context propagation
OpenTelemetry supports distributed tracing. A span (context) can be serialized, copied to another process, and there it can be used to create child spans.
For applications communicating via HTTP the serialized span context is transmitted in HTTP headers. For our callr example we can copy the context to the R subprocess in environment variables.
For example in the callr:r()
code we may write:
if (is_otel_tracing()) {
otel::start_local_active_span(
"callr::r",
attributes = otel::as_attributes(options)
)
hdrs <- otel::pack_http_context()
names(hdrs) <- toupper(names(hdrs))
options$env[names(hdrs)] <- hdrs
}
options$env
contains the environment variables callr will set in the
newly started R process. This is where we need to add the output of
pack_http_context()
, which contains the serialized representation
of the active span, if there is any.
Additionally, the subprocess needs to pick up the span context from the
environment variables. The callr:::common_hook()
internal function
contains the code that the subprocess runs at startup. Here we need to
add:
has_otel <- nzchar(Sys.getenv("TRACEPARENT")) &&
requireNamespace("otel", quietly = TRUE)
assign(envir = env$`__callr_data__`, "has_otel", has_otel)
if (has_otel) {
hdrs <- as.list(c(
traceparent = Sys.getenv("TRACEPARENT"),
tracestate = Sys.getenv("TRACESTATE"),
baggage = Sys.getenv("BAGGAGE")
))
prtctx <- otel::extract_http_context(hdrs)
reg.finalizer(
env$`__callr_data__`,
function(e) e$otel_span$end(),
onexit = TRUE
)
assign(
envir = env$`__callr_data__`,
"otel_span",
otel::start_span(
"callr subprocess",
options = list(parent = prtctx)
)
)
}
First we check if the TRACEPARENT
environment variable is set. This
contains the serialization of the parent span. If it exists and the otel
package is also available, then we extract the span context from the
environment variables, and start a new span that is a child span or the
remote span obtained from the environment variables. We also set up a
finalizer that closes this span when the R process terminates.