
    • #4293 Reply
      finderpress
      Keymaster

      How to write a plugin engine for the JVM

      Co-authored by: Luca Telolli and Dario Amorosi

      Sometimes it’s impossible to keep up with users’ demands. In these cases you may want to explore plugins, a mechanism to offload parts of your system’s logic to the user: the user writes the plugin, your system runs it. In this article, we explore how to support plugins on the JVM with isolating class loaders. Isolating class loaders prevent the problem known as dependency hell, where it becomes hard or even impossible to find a set of mutually compatible dependencies. They do this by letting each plugin run in its own class loader, with its own set of dependencies. After reading this article you should be able to design and write your own plugin system on the JVM.

      In the Data and Analytics organisation of Adevinta we experimented by adding a framework to run message transformations on streaming data using plugins. We took this direction in the aftermath of an outage. The outage was caused by libraries with mismatching dependencies that managed to reach production.

      We quickly realised that information on how to properly write a plugin-engine is not easy to find on the internet. This article hopefully fills the gap.

      Note: all code examples are written in Scala, but they should all be simple to translate to Java or Kotlin.

      The plugin interface is the Java interface that will be implemented by the plugins. It is through this interface that the plugin engine will invoke the plugin.

      This might be surprising in the world of agility, but a good definition, right from the start, is quite important. There are several reasons for that:

      • Changing the interface at a later time requires rebuilding all existing plugins.
      • The plugins run in an isolated environment. Unless you endorse reflection trickery, they can only do things through the interface.
      • Any type used in the interface must be shared by the plugin and the plugin-engine; the plugin can’t load a different version of that type.
      • Careless design easily leads to nasty memory leaks. More about this later.

      Here are some guidelines that will help in preventing these problems:

      • Avoid types from external libraries. So no ByteString from Akka, Chunk from ZIO and no JsonNode from Jackson.
      • Any type provided by Java or the JVM library is safe. This includes simple types such as int, byte and boolean, and types such as java.util.List, java.io.File, and odd ducks such as java.awt.Rectangle. Arrays of these types are also fine.
        As an example, for our plugin-engine it was very tempting to use JsonNode from Jackson. However, this would have severely limited what libraries could be used inside the plugin. So instead we went with byte arrays containing raw serialised JSON.
      • Types from language runtime libraries (e.g. Scala/Kotlin) are possible if you are okay with fixing the language version for the plugin. If you want the plugin to use any version, you should only use JVM-provided types and interfaces defined in pure Java.
        For our implementation, we decided that forcing the plugins to use Scala 2.13 is acceptable. This allowed us to use Scala types and constructs like case classes.
      • Define custom types (interfaces, records, enums etc.) in a separate package that contains no other types.
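
      As a rough illustration of these guidelines, a plugin interface that exchanges raw serialised JSON as byte arrays could look like the sketch below. The names RawJsonTransformer and EnvelopeTransformer are hypothetical and only serve the example; the point is that every type in the signature is provided by the JVM.

```scala
import java.nio.charset.StandardCharsets.UTF_8

// Hypothetical plugin interface: only JVM-provided types cross the boundary.
trait RawJsonTransformer {
  /** Takes raw serialised JSON and returns raw serialised JSON. */
  def transform(json: Array[Byte]): Array[Byte]
}

// Hypothetical plugin-side implementation. Inside the plugin any JSON library
// could be used; here we only concatenate bytes to keep the sketch dependency-free.
final class EnvelopeTransformer extends RawJsonTransformer {
  private val prefix = """{"payload":""".getBytes(UTF_8)
  private val suffix = "}".getBytes(UTF_8)

  def transform(json: Array[Byte]): Array[Byte] = prefix ++ json ++ suffix
}
```

      Because the signature mentions no library types, the plugin is free to parse the bytes with whichever JSON library and version it prefers.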

      Once we have defined the plugin interface, the plugin has to implement it. As we’ll show later in this article, the plugin engine gets instances of these implementations. Holding on to these instances for a long time can lead to memory leaks.

      To understand why, we have to consider that the plugin is going to run in its own class loader. Any code defined by the plugin will be loaded by the plugin’s class loader. In Java any instance refers to its type, which in turn, references the class loader that loaded it. Class loaders typically require a lot of memory because they track lots of code, including the compiled version of that code.

      Figure: How class loaders are referenced in memory

      The consequence is that class loaders are only collected by the garbage collector when all instances from that class loader can be garbage collected.

      As long as your program is actively using a plugin this is not an issue, but as soon as the plugin is no longer needed, you need to get rid of all references to any instance ever created by the plugin.
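
      One way to make this manageable is to keep the only strong references to the plugin in a single place. The sketch below assumes a hypothetical PluginHandle class (not part of the example engine) that owns both the plugin instance and its class loader and can drop them together.

```scala
import java.net.URLClassLoader

/** Hypothetical holder that keeps the references to a plugin instance and
  * its class loader in one place, so both can be dropped together.
  */
final class PluginHandle[A <: AnyRef](loader: URLClassLoader, instance: A) {
  // The constructor arguments are copied into one mutable slot that we can clear.
  private var slot: Option[(URLClassLoader, A)] = Some((loader, instance))

  /** Runs `f` against the plugin, or returns None once it has been released. */
  def use[B](f: A => B): Option[B] =
    synchronized(slot.map { case (_, plugin) => f(plugin) })

  /** Closes the class loader's jar files and drops our references. */
  def release(): Unit = synchronized {
    slot.foreach { case (cl, _) => cl.close() }
    slot = None
  }
}
```

      This only helps if callers do not leak plugin-created objects out of `use`, for example by returning the plugin instance itself or storing it in a long-lived collection.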

      Would you like to use a plugin for multiple use cases? Single-use-case plugins are easier to handle but are less flexible. Multi-use-case plugins need configuration, which needs to be taken into account when designing the plugin interface. The plugin engine will also need a mechanism to pass parameters from the user to the plugin.

      The example plugin engine we are building here is written in Scala; however, all the principles are the same for Java and Kotlin. The plugins must implement the following interface (Scala interfaces are implemented as traits):

      package com.adevinta.blog.pluginengine.pluginapi

      trait ConverterPlugin {
        def init(): Unit
        def convert(in: String): String
      }

      Note how we’re adding a lifecycle method, init, to facilitate initialisation; a matching shutdown method can be added in the same way. Since this trait translates to a pure Java interface, the plugin can be written in Java, Kotlin or any Scala version, independently of the plugin engine.

      On the JVM, class loaders are responsible for two things: loading classes, typically from bytecode that is stored in jar files, and loading ‘resources’ (e.g. config files stored in jar files). Any plain Java program already has multiple class loaders. Let’s simplify and only look at the bootstrap class loader (loads JVM types) and the system class loader (loads from jars on the application’s classpath).

      Figure: The default Java class loader hierarchy (simplified)

      Class loaders form a hierarchy: every class loader (except the bootstrap class loader) has a parent class loader. The JVM’s default class loaders always delegate to their parent class loader first. Only if the parent class loader cannot find the class does a class loader look for the class itself.

      // Behavior of default class loaders (simplified).
      override def loadClass(name: String): Class[_] =
        try {
          // Delegate to the parent class loader first.
          parent.loadClass(name)
        } catch {
          // The parent signals "not found" with an exception, not with null.
          case _: ClassNotFoundException =>
            // Only then load the class here, in this class loader.
            this.findClass(name)
        }
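
      To see the parent chain of a running program, a small helper like the one below can walk it. ClassLoaderChain is a hypothetical name; the sketch only uses ClassLoader.getParent from the standard library.

```scala
object ClassLoaderChain {
  /** Lists `start` and all its ancestors. The bootstrap class loader is
    * represented by `null` on the JVM, so it ends the chain without appearing in it.
    */
  def ancestors(start: ClassLoader): List[ClassLoader] =
    Iterator.iterate(start)(_.getParent).takeWhile(_ != null).toList
}
```

      Calling `ClassLoaderChain.ancestors(getClass.getClassLoader).foreach(println)` prints the loaders from the application’s own loader upwards; the exact names depend on the JVM version and on how the application was started.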

      However, this is not what we want from an isolating class loader. For isolation, we should only load classes from the jars given to the class loader. Unfortunately, it is not that simple; we need some common types to communicate with the plugin engine. These common types include at least all the standard Java types and the plugin interface that we defined earlier. To implement this we keep a list of packages that we want to load from the parent class loader; everything else is loaded locally. Fortunately, we can do this fairly efficiently by package name.

      In the following example, we only delegate to the parent class loader for standard classes from the JVM, for the Scala runtime library and for the types defined in the package of the plugin interface.

      private val PackagePrefixes = Array(
        "java.",
        "jdk.",
        "scala.",
        "com.adevinta.blog.pluginengine.pluginapi."
      )

      // Behavior of isolating class loader (simplified).
      override def loadClass(name: String): Class[_] = {
        // Check if the class name starts with one of the prefixes.
        if (PackagePrefixes.exists(name.startsWith)) {
          // Delegate to the parent class loader.
          parent.loadClass(name)
        } else {
          // Load here in this class loader.
          this.findClass(name)
        }
      }

      We will do something very similar for loading resources:

      private val ResourcePrefixes = PackagePrefixes.map(_.replace('.', '/'))

      // Behavior of isolating class loader.
      override def getResource(name: String): URL = {
        if (ResourcePrefixes.exists(name.startsWith)) parent.getResource(name)
        else this.findResource(name)
      }

      The Java standard library provides excellent base classes for writing a custom class loader. All we need to do is provide the jars, extend URLClassLoader and override the methods mentioned above (and all their variations).

      class IsolatingClassLoader(jarUrls: Array[URL], parent: ClassLoader)
          extends URLClassLoader(jarUrls, parent) {
        // Signatures only; the bodies are omitted here and follow the prefix checks shown earlier.
        override def loadClass(name: String, resolve: Boolean): Class[_]
        override def getResource(name: String): URL
        override def getResources(name: String): java.util.Enumeration[URL]
        override def getPackages: Array[Package]
      }
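
      These signatures could be filled in roughly as follows. This is a minimal sketch under a few assumptions, not the full code example: the sharedPrefixes parameter and its default are added for illustration, getPackages is left out, and the parent class loader is assumed to be non-null.

```scala
import java.net.{URL, URLClassLoader}
import java.util.{Enumeration => JEnumeration}

object IsolatingClassLoader {
  // Packages that are always loaded from the parent (assumed defaults).
  val DefaultSharedPrefixes: Seq[String] =
    Seq("java.", "jdk.", "scala.", "com.adevinta.blog.pluginengine.pluginapi.")
}

class IsolatingClassLoader(
    jarUrls: Array[URL],
    parent: ClassLoader, // assumed to be non-null
    sharedPrefixes: Seq[String] = IsolatingClassLoader.DefaultSharedPrefixes
) extends URLClassLoader(jarUrls, parent) {

  private val sharedResourcePrefixes = sharedPrefixes.map(_.replace('.', '/'))

  override protected def loadClass(name: String, resolve: Boolean): Class[_] =
    getClassLoadingLock(name).synchronized {
      val loaded: Class[_] =
        if (sharedPrefixes.exists(p => name.startsWith(p))) getParent.loadClass(name)
        else {
          // Reuse a class this loader defined earlier, else look in our own jars only.
          val cached = findLoadedClass(name)
          if (cached != null) cached else findClass(name)
        }
      if (resolve) resolveClass(loaded)
      loaded
    }

  override def getResource(name: String): URL =
    if (sharedResourcePrefixes.exists(p => name.startsWith(p))) getParent.getResource(name)
    else findResource(name)

  override def getResources(name: String): JEnumeration[URL] =
    if (sharedResourcePrefixes.exists(p => name.startsWith(p))) getParent.getResources(name)
    else findResources(name)
}
```

      Classes outside the shared prefixes that are missing from the plugin jars fail with a ClassNotFoundException, even when the application itself has them on its classpath. That failure is the isolation at work.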

      Suppose we have a jar file at /path/to/plugin.jar. We can then create the class loader as follows:

      import java.nio.file.Paths

      val classLoader = new IsolatingClassLoader(
        Array(Paths.get("/path/to/plugin.jar").toURI.toURL),
        this.getClass.getClassLoader
      )

      To see everything together, see the full code example.

      Now that we have a class loader that can load the code and files of a plugin, we need to figure out how to get an actual instance of the plugin. For that, we need to discover the implementing class. Luckily, the JVM already has excellent support for this in the form of the Service Provider Interfaces (SPI). Let’s see how it works.

      The plugin needs to provide an SPI file on the classpath (e.g. in one of its jars). The SPI file is named after the interface and placed in the META-INF/services directory. In our example, the file would be named META-INF/services/com.adevinta.blog.pluginengine.pluginapi.ConverterPlugin. The file contains plain text; each line contains the fully qualified name of a class that implements the plugin interface.

      Let’s look in more detail. Note that SPI requires the class to have a public constructor without parameters.

      package com.adevinta.blog.plugin1
      import com.adevinta.blog.pluginengine.pluginapi._

      class MangleConverterPlugin extends ConverterPlugin {
        def init(): Unit = {}
        def convert(in: String): String =
          new Mangler().mangle(in)
      }

      class Mangler {
        def mangle(in: String): String =
          scala.util.Random.shuffle(in).toString
      }

      The SPI file contains a single line with the text:

      // META-INF/services/com.adevinta.blog.pluginengine.pluginapi.ConverterPlugin
      com.adevinta.blog.plugin1.MangleConverterPlugin

      After putting the compiled code and the SPI file in a jar file, we can create an IsolatingClassLoader and ask for all plugin implementations:

      import java.util.ServiceLoader
      import scala.jdk.CollectionConverters._ // provides asScala (Scala 2.13)

      // Using SPI (naively, see next section)
      val pluginClassLoader = new IsolatingClassLoader(...)
      val pluginImplementations: Seq[ConverterPlugin] =
        ServiceLoader
          .load[ConverterPlugin](classOf[ConverterPlugin], pluginClassLoader)
          .iterator().asScala.toSeq
      val plugin: ConverterPlugin = pluginImplementations.head

      As you can see from the code example, the SPI service loader may return multiple implementations (one for each line in the SPI file). In our actual implementation, we added a name and version method to the plugin interface. With this addition, a single plugin jar can contain multiple plugins and/or plugin versions. For now, we’ll simply take the first.
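
      Selecting a plugin by name and version could then look like the sketch below. NamedPlugin and PluginSelector are hypothetical names for illustration; the method names used in our actual interface may differ.

```scala
// Hypothetical extension of the plugin interface with identification methods.
trait NamedPlugin {
  def name: String
  def version: String
}

object PluginSelector {
  /** Picks the first implementation that matches both name and version. */
  def select[P <: NamedPlugin](found: Iterable[P], name: String, version: String): Option[P] =
    found.find(p => p.name == name && p.version == version)
}
```

      The `found` collection would be the sequence of implementations obtained from the service loader. Returning an Option makes the "no such plugin" case explicit instead of failing on `head`.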

      After getting an instance of a ConverterPlugin, we immediately attempted to use it. Unfortunately, we encountered a ClassNotFoundException.

      plugin.init()
      plugin.convert("Hello world!")
      // java.lang.ClassNotFoundException: com.adevinta.blog.plugin1.Mangler

      The JVM is not loading the classes from our custom class loader! Turns out that we haven’t set the thread’s context class loader yet. Setting the context class loader enables loading classes and resources from a non-standard class loader.

      To simplify this process, we define a utility method that sets the thread’s context class loader, executes a task and then restores the previous context class loader:

      object WithClassLoader {

        /** Runs `task` with the given `classLoader` as the
          * thread's context class loader, and restores the current
          * context class loader afterwards.
          */
        def withClassLoader[A](classLoader: ClassLoader)(task: => A): A = {
          val currentThread = Thread.currentThread()
          val previous = currentThread.getContextClassLoader
          currentThread.setContextClassLoader(classLoader)
          try {
            task
          } finally {
            currentThread.setContextClassLoader(previous)
          }
        }
      }

      We don’t want to bother the plugin users with this class loader stuff, so we wrap the obtained plugin:

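      import WithClassLoader.withClassLoader

      // withClassLoader takes two parameter lists: the class loader, then the task.
      val wrapped: ConverterPlugin = new ConverterPlugin {
        def init(): Unit =
          withClassLoader(pluginClassLoader)(plugin.init())

        def convert(in: String): String =
          withClassLoader(pluginClassLoader)(plugin.convert(in))
      }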

      And with that, we can use the wrapped plugin:

      wrapped.convert("Hello world!")
      // rldl!ooWelH

      But creating converters is parameterised, so our actual plugin interface looks more like this:

      package com.adevinta.blog.pluginengine.pluginapi

      trait ConverterPlugin {
        def init(): Unit
        def makeConverter(specification: String): Converter
      }

      trait Converter {
        def convert(in: String): String
      }

      We reimplement the plugin:

      package com.adevinta.blog.plugin
      import com.adevinta.blog.pluginengine.pluginapi._

      class MangleConverterPlugin extends ConverterPlugin {
        def init(): Unit = {}
        def makeConverter(specification: String): Converter =
          new ManglerConverter
      }

      class ManglerConverter extends Converter {
        def convert(in: String): String =
          new Mangler().mangle(in)
      }

      class Mangler {
        def mangle(in: String): String =
          scala.util.Random.shuffle(in).toString
      }

      We wrap the plugin again:

      val plugin: ConverterPlugin = ...
      val wrapped: ConverterPlugin = new ConverterPlugin {
        def init(): Unit =
          withClassLoader(pluginClassLoader)(plugin.init())

        def makeConverter(specification: String): Converter =
          withClassLoader(pluginClassLoader)(plugin.makeConverter(specification))
      }

      Filled with hope, we try to use the new interface, only to be greeted once again by:

      wrapped.init()
      val converter = wrapped.makeConverter("")
      converter.convert("Hello world!")
      // java.lang.ClassNotFoundException: com.adevinta.blog.plugin.Mangler

      Why did this happen? When we call convert, we again execute code that should run with the thread’s context class loader set to our isolating class loader. Let’s fix the wrapper:

      val plugin: ConverterPlugin = ...
      val wrapped: ConverterPlugin = new ConverterPlugin {
        def init(): Unit =
          withClassLoader(pluginClassLoader)(plugin.init())

        def makeConverter(specification: String): Converter = {
          val c = withClassLoader(pluginClassLoader)(plugin.makeConverter(specification))
          // Calls on the returned Converter must run with the plugin's class loader too.
          new Converter {
            def convert(in: String): String =
              withClassLoader(pluginClassLoader)(c.convert(in))
          }
        }
      }

      Notice how we wrap both the call to makeConverter and further calls to the obtained Converter. Does it work now?

      val converter = wrapped.makeConverter("")
      converter.convert("Hello world!")
      // ll !reodWloH

      Success!
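
      Writing such wrappers by hand gets repetitive as the plugin interface grows. A possible alternative, sketched here as an assumption rather than what our engine does, is a dynamic proxy from java.lang.reflect that sets the context class loader around every call. ContextClassLoaderProxy is a hypothetical name.

```scala
import java.lang.reflect.{InvocationHandler, InvocationTargetException, Method, Proxy}

object ContextClassLoaderProxy {
  /** Returns a proxy for `iface` that forwards every call to `target` while
    * `loader` is the thread's context class loader. `iface` must be an interface.
    */
  def wrap[A <: AnyRef](iface: Class[A], target: A, loader: ClassLoader): A = {
    val handler = new InvocationHandler {
      override def invoke(proxy: AnyRef, method: Method, args: Array[AnyRef]): AnyRef = {
        val thread = Thread.currentThread()
        val previous = thread.getContextClassLoader
        thread.setContextClassLoader(loader)
        try {
          // The JVM passes null instead of an empty array for no-arg methods.
          val actualArgs = if (args == null) Array.empty[AnyRef] else args
          method.invoke(target, actualArgs: _*)
        } catch {
          // Unwrap so callers see the plugin's own exception.
          case e: InvocationTargetException if e.getCause != null => throw e.getCause
        } finally {
          thread.setContextClassLoader(previous)
        }
      }
    }
    // Define the proxy class next to the interface (engine side), not in the plugin.
    val definingLoader =
      Option(iface.getClassLoader).getOrElse(ClassLoader.getSystemClassLoader)
    iface.cast(Proxy.newProxyInstance(definingLoader, Array[Class[_]](iface), handler))
  }
}
```

      With this helper the wrapper becomes `ContextClassLoaderProxy.wrap(classOf[ConverterPlugin], plugin, pluginClassLoader)`. Objects returned by the plugin, such as a Converter, are not wrapped automatically and still need their own call to `wrap`.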

      So far, we have not talked about where to get the plugin’s jar(s) from. We also need to consider how to package the plugin with all its dependencies.

      Let’s consider the design forces for each actor:

      • Plugin makers: Building and uploading the packaged plugin should be straightforward, preferably using tools like Maven, Gradle or Sbt. What update frequency is expected? Is versioning important or can we always use the latest version?
      • Plugin engine operators: Operators need to download the packaged plugin. How is authorisation arranged? Do we need usage metrics?
      • Plugin users: Users should be able to choose a plugin, for example, based on name and version.

      For packaging, we considered two solutions: uber-jars and jars packaged in a zip file. Since building uber-jars is not straightforward (it often needs custom merge rules), we decided to use zip files. To support the default behaviour of both the Gradle and Sbt packagers, our plugin jars may be placed in the root of the zip file, or in the enclosed lib directory.
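
      Collecting the jars from such a zip file could be sketched as follows. PluginBundle and its methods are hypothetical names; the sketch assumes the zip is already on local disk and that jar file names are unique across the root and the lib directory.

```scala
import java.net.URL
import java.nio.file.{Files, Path, Paths, StandardCopyOption}
import java.util.zip.ZipFile
import scala.jdk.CollectionConverters._

object PluginBundle {
  /** True for jars in the zip root or directly inside `lib/`. */
  def isPluginJar(entryName: String): Boolean = {
    val parts = entryName.split('/')
    entryName.endsWith(".jar") &&
      (parts.length == 1 || (parts.length == 2 && parts(0) == "lib"))
  }

  /** Copies the plugin jars from `zip` into `targetDir` and returns their URLs. */
  def extractJars(zip: Path, targetDir: Path): List[URL] = {
    Files.createDirectories(targetDir)
    val zipFile = new ZipFile(zip.toFile)
    try {
      zipFile.entries().asScala
        .filter(entry => !entry.isDirectory && isPluginJar(entry.getName))
        .map { entry =>
          // Use only the file name, so entries cannot escape `targetDir`.
          val target = targetDir.resolve(Paths.get(entry.getName).getFileName)
          val in = zipFile.getInputStream(entry)
          try Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING)
          finally in.close()
          target.toUri.toURL
        }
        .toList // force evaluation before the zip file is closed
    } finally zipFile.close()
  }
}
```

      The returned URLs can be passed as an array to the isolating class loader.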

      For distribution of the zip files, we considered Adevinta’s artefact manager (Artifactory) and S3. Because we foresee limited activity in plugin creation and update, we decided that the upload should be handled by engine operators instead of plugin makers. This naturally led us to S3; S3 has a simpler interface and a very stable infrastructure. However, support for other distribution channels can be added in the future.

      Consider how your plugins are used. If you don’t know beforehand which plugin is needed, you may want to construct the plugins dynamically. However, this should be done carefully.

      Re-instantiating a class loader from scratch every time is not recommended, for the following reasons:

      • Creation overhead: Instantiating a new class loader involves downloading the plugin jars, allocating memory for the class loader itself and initialising various data structures that it uses to manage classes, such as caches and internal tables.
      • Class loading time: Each new class loader needs to load the classes from scratch, even if the same classes have already been loaded by another class loader.
      • Increased memory usage: Having multiple class loaders can lead to duplicated classes in memory if the same classes are loaded by different class loaders.
      • Compiler cold start: Newly loaded classes are executed by the interpreter at first; it takes time before the JIT compiler starts to optimise them.

      These problems can be addressed by adding caching. Our system uses caches at several levels:

      • An on-disk cache for the downloaded plugin bundles (e.g. zip-files). Cached items that are no longer in use are kept for 24 hours.
      • An in-memory cache for the isolating class loaders. Cached class loaders that are no longer used are kept in memory for 10 minutes.
      • In addition, our plugins create a Converter. These are also cached by their specification.

      Be careful with choosing a cache implementation in multi-threaded applications. For example, it should not download the same plugin twice when it’s simultaneously requested from multiple threads.
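
      One way to get this "load at most once" behaviour is ConcurrentHashMap.computeIfAbsent from the Java standard library. The sketch below uses a hypothetical LoadOnceCache class and leaves out time-based expiry; it assumes that `load` never returns null and never calls back into the same cache.

```scala
import java.util.concurrent.ConcurrentHashMap

/** Hypothetical cache that runs `load` at most once per key, even when
  * several threads ask for the same key at the same time.
  */
final class LoadOnceCache[K, V](load: K => V) {
  private val entries = new ConcurrentHashMap[K, V]()

  // computeIfAbsent is atomic: concurrent callers for the same key wait for
  // the first load instead of starting their own.
  def get(key: K): V = entries.computeIfAbsent(key, (k: K) => load(k))

  /** Removes an entry, for example when it has not been used for a while. */
  def evict(key: K): Option[V] = Option(entries.remove(key))
}
```

      While a slow load such as a download is running, other updates that hash to the same bin of the map can be blocked as well, so a dedicated caching library may be a better fit for production use.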

      Before you consider building a Java-based plugin system, you should be aware of its limitations.

      • Plugins must be trusted; class loader isolation is not a security sandbox, so they still have full access to the JVM.
      • Badly written plugins can use more than their fair share of RAM and CPU.
      • Each class loader loads all the code it needs, even if the same code is already loaded by another class loader. This could lead to higher memory usage than expected.
      • It’s pretty hard to change the plugin interface when many plugins depend on it. A good upgrade path is very useful. For example, you could support multiple versions of the plugin interface at the same time.

      Despite little information on the internet, it’s possible to build powerful plugin systems on the Java platform that allow you to extend your application’s functionality through user-provided code. By carefully defining the plugin interface and using Java’s built-in service provider interface (SPI), you can dynamically discover and instantiate plugin implementations at runtime.

      Plugin dependency isolation, allowing each plugin to have its own dependency versions, is possible by creating an isolating class loader that only delegates to the parent class loader for selected packages. These selected packages include those from Java, possibly a language runtime like Scala, and from the plugin API.

      The plugin engine developed at Adevinta allows us to run multiple message transformers without ending up in dependency hell. It has also proven very useful for gradual deployments of new message transformer versions without any downtime.
