Getting Started — Getting Started • otel

This page is about instrumenting you R package or project for OpenTelemetry. If you want to start collecting OpenTelemetry data for instrumented packages, see Collecting Telemetry Data in the otelsdk package.

About OpenTelemetry

OpenTelemetry is an observability framework. OpenTelemetry is a collection of tools, APIs, and SDKs used to instrument, generate, collect, and export telemetry data such as metrics, logs, and traces, for analysis in order to understand your software’s performance and behavior.

For an introduction to OpenTelemetry, see the OpenTelemetry website docs.

The otel and otelsdk R packages

Use the otel package as a dependency if you want to instrument your R package or project for OpenTelemetry.

Use the otelsdk package to produce OpenTelemetry output from an R package or project that was instrumented with the otel package.

Complete Example

To instrument your package with otel, you need to do a couple of steps. In this section we show how to instrument the callr package.

Add the otel package as a dependency

The first step is to add the otel package as a dependency. otel is a very lightweight package, so may want to add it as a hard dependency. This has the advantage that you don't need to check if otel is installed every time you call an otel function. Add otel to the Imports section in DESCRIPTION:

Imports:
    otel

Alternatively, you may add otel as a soft dependency. Add otel to the Suggests section in DESCRIPTION:

Suggests:
    otel

If you add otel in Suggests, then it makes sense to create a helper function that checks if otel is installed and also that tracing is enabled for the caller. You can put this function in any R file, e.g. R/utils.R is a nice place for it:

is_otel_tracing <- function() {
  requireNamespace("otel", quietly = TRUE) && otel::is_tracing_enabled()
}

Choose a tracer name

Every package should have its own tracer with a name that is unique for the package. See default_tracer_name() for tips on choosing a good tracer name. Set the otel_tracer_name variable to the tracer name. No need to export this symbol. In callr, we'll add

otel_tracer_name <- "org.r-lib.callr"

to the R/callr-package.R file.

Create spans for selected functions

Select the functions you want to add tracing to. It is overkill to add tracing to small functions that are called lots of times. It makes sense to add spans to the main functions of the package.

The callr package has various ways of starting another R process and then running R code in it. We'll add tracing to the

functions first.

We add to callr::r() in eval.R:

  if (is_otel_tracing()) {
    otel::start_local_active_span(
      "callr::r",
      attributes = otel::as_attributes(options)
    )
  }

We use the is_otel_tracing() helper function, defined above.
start_local_active_span() starts a span and also activates it. It also sets up an exit handler that ends the span when the caller function (callr::r()) exits.
options contain a long list of user-provided and other option, we add these to the span as attributes.

We add essentially the same code to callr::rcmd():

  if (is_otel_tracing()) {
    otel::start_local_active_span(
      "callr::rcmd",
      attributes = otel::as_attributes(options)
    )
  }

And to callr::rscript():

  if (is_otel_tracing()) {
    otel::start_local_active_span(
      "callr::rscript",
      attributes = otel::as_attributes(options)
    )
  }

Concurrency

An instance of the callr::r_session R6 class represents persistent R background processes. We want to collect all spans from an R process into the same trace. Since the R processes are running concurrently, their (sub)spans will not form the correct hierarchy if we use the default, timing-based otel mechanism to organize spans into trees. We need to manage the lifetime and activation of the spans that represent the R processes manually.

A generic strategy for handling concurrency in otel is:

Create a new long lasting span with start_span(). (I.e. not start_local_active_span()!)
Assign the returned span into the corresponding object of the concurrent and/or asynchronous computation. Every span has a finalizer that closes the span.
When running code that belongs to the concurrent computation represented by the span, activate it for a specific R scope by calling with_active_span() or local_active_span().
When the concurrent computation ends, close the span manually with its $end() method or end_span(). (Otherwise it would be only closed at the next garbage collection, assuming there are no references to it.)

This code goes into the constructor of the r_session object:

  if (is_otel_tracing()) {
    private$options$otel_session <- otel::start_span(
      "callr::r_session",
      attributes = otel::as_attributes(options)
    )
  }

The finalize() method (the finalizer) gets a call to close the span:

  if (is_otel_tracing()) {
    private$options$otel_session$end()
  }

We also add (sub)spans to other operations, e.g. the read() method gets

  if (is_otel_tracing()) {
    otel::local_session(private$options$otel_session)
    spn <- otel::start_local_active_span("callr::r_session$read")
  }

Testing

To test your instrumentation, you need to install the otelsdk package and you also need a local or remote OpenTelemetry collector.

I suggest you use otel-tui, a terminal OpenTelemetry viewer. To configure it, use the http exporter, see Environment Variables:

OTEL_TRACES_EXPORTER=http R -q

Development mode

By default otel functions never error, to avoid taking down a production app. For development this is not ideal, we want to catch errors early. I suggest you always turn on development mode when instrumenting a package:

OTEL_ENV=dev

Context propagation

OpenTelemetry supports distributed tracing. A span (context) can be serialized, copied to another process, and there it can be used to create child spans.

For applications communicating via HTTP the serialized span context is transmitted in HTTP headers. For our callr example we can copy the context to the R subprocess in environment variables.

For example in the callr:r() code we may write:

  if (is_otel_tracing()) {
    otel::start_local_active_span(
      "callr::r",
      attributes = otel::as_attributes(options)
    )
    hdrs <- otel::pack_http_context()
    names(hdrs) <- toupper(names(hdrs))
    options$env[names(hdrs)] <- hdrs
  }

options$env contains the environment variables callr will set in the newly started R process. This is where we need to add the output of pack_http_context(), which contains the serialized representation of the active span, if there is any.

Additionally, the subprocess needs to pick up the span context from the environment variables. The callr:::common_hook() internal function contains the code that the subprocess runs at startup. Here we need to add:

      has_otel <- nzchar(Sys.getenv("TRACEPARENT")) &&
        requireNamespace("otel", quietly = TRUE)
      assign(envir = env$`__callr_data__`, "has_otel", has_otel)
      if (has_otel) {
        hdrs <- as.list(c(
          traceparent = Sys.getenv("TRACEPARENT"),
          tracestate = Sys.getenv("TRACESTATE"),
          baggage = Sys.getenv("BAGGAGE")
        ))
        prtctx <- otel::extract_http_context(hdrs)
        reg.finalizer(
          env$`__callr_data__`,
          function(e) e$otel_span$end(),
          onexit = TRUE
        )
        assign(
          envir = env$`__callr_data__`,
          "otel_span",
          otel::start_span(
            "callr subprocess",
            options = list(parent = prtctx)
          )
        )
      }

First we check if the TRACEPARENT environment variable is set. This contains the serialization of the parent span. If it exists and the otel package is also available, then we extract the span context from the environment variables, and start a new span that is a child span or the remote span obtained from the environment variables. We also set up a finalizer that closes this span when the R process terminates.

Examples

# See above