com.io7m.jxe

The jxe package implements a set of classes intended both to provide more secure defaults and to eliminate much of the boilerplate required to set up the standard JDK SAX parsers.

It provides a sane API for setting up secure-by-default validating SAX parsers that dynamically locate schemas for incoming documents from a whitelisted set of locations without those documents knowing or caring where those schemas are actually located.

The package is capable of setting up extremely strict validating parsers. For example, many applications that receive XML have the following requirements on incoming data:

  • XML documents must be validated against one of a small set of schemas. Data that has not been validated must be rejected.
  • Documents must declare the namespace to which their data belongs, but must not be required to actually state the physical location of the schema. This is security sensitive: A document should not be able to tell a parser where to find a schema, because hostile documents could cause the parser to read a schema that trivially accepts all data. This would allow the document to essentially pass through without having to conform to the structure that an application expects. Documents that do not declare a namespace must be rejected.
  • The XML parser must not access the network except to explicitly permitted locations. This is security sensitive: A hostile document could declare a dependency on a schema that could cause the parser to contact attacker-controlled servers.
  • The XML parser must be robust in the face of attacks such as entity expansion attacks.
  • The XML parser must prevent path traversal attacks: Documents must not be able to cause files outside of a particular directory to be accessed.
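For comparison, the following is a rough sketch of the kind of boilerplate involved in hardening a plain JDK SAX parser by hand. It covers only the entity expansion requirement and is not the jxe implementation; the class name `ManualHardening` and the helper `accepts` are hypothetical, and the feature URIs are the ones commonly recognized by the JDK's built-in parser and Xerces.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration: hardening a JDK SAX parser without jxe.
public final class ManualHardening
{
  private ManualHardening() { }

  /** Create a reader with DOCTYPEs and external entities disabled. */
  public static XMLReader createReader()
  {
    try {
      final SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      factory.setXIncludeAware(false);
      factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
      // Rejecting DOCTYPE declarations outright rules out entity expansion.
      factory.setFeature(
        "http://apache.org/xml/features/disallow-doctype-decl", true);
      factory.setFeature(
        "http://xml.org/sax/features/external-general-entities", false);
      factory.setFeature(
        "http://xml.org/sax/features/external-parameter-entities", false);
      factory.setFeature(
        "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
      return factory.newSAXParser().getXMLReader();
    } catch (final Exception e) {
      throw new IllegalStateException("Parser configuration failed", e);
    }
  }

  /** Return true if the text parses, false if the parser rejects it. */
  public static boolean accepts(final String text)
  {
    final XMLReader reader = createReader();
    final DefaultHandler handler = new DefaultHandler();
    reader.setContentHandler(handler);
    // DefaultHandler.fatalError rethrows, so rejected input raises SAXException.
    reader.setErrorHandler(handler);
    try {
      reader.parse(new InputSource(new StringReader(text)));
      return true;
    } catch (final SAXException e) {
      return false;
    } catch (final IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Note that this sketch still does nothing about schema resolution, network access, or path traversal; each of those would need further configuration.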

The jxe package allows applications to enforce all of the above requirements via a very simple API:

// Incoming documents *must* be in the "urn:com.io7m.example:simple:1:0" namespace

URI schema_namespace =
  URI.create("urn:com.io7m.example:simple:1:0");

// When a document states that it is in the "urn:com.io7m.example:simple:1:0" namespace,
// the parser will open the schema at the URL returned by getResource("simple_1_0.xsd").
// All other namespaces will be rejected.

JXESchemaDefinition schema =
  JXESchemaDefinition.of(
    schema_namespace, "simple_1_0.xsd", Example.class.getResource("simple_1_0.xsd"));

// Declare an immutable map of schemas. In this example, there is only the
// one schema declared above.

final JXESchemaResolutionMappings schemas =
  JXESchemaResolutionMappings.builder()
    .putMappings(schema_namespace, schema)
    .build();

// Create a provider of hardened SAX parsers.

JXEHardenedSAXParsers parsers =
  new JXEHardenedSAXParsers();

// Specify a directory containing documents. The parser will not be allowed
// to access paths outside of the given directory. This prevents path
// traversal attacks such as trying to xinclude "../../../../etc/passwd".

Path document_directory = ...;

// Create an XInclude aware SAX parser.

XMLReader reader =
  parsers.createXMLReader(
    document_directory, JXEXInclude.XINCLUDE_ENABLED, schemas);

// Use the SAX parser.

reader.setContentHandler(...);
reader.parse(...);
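The last two calls are left elided above. As a minimal sketch of what a handler might look like, assuming the returned reader behaves like any standard `org.xml.sax.XMLReader`: standard SAX readers report validation errors to `ErrorHandler.error`, which `DefaultHandler` ignores by default, so a handler that must reject invalid data typically rethrows there. Whether jxe installs its own error handling is not stated here, so the example uses a plain JDK validating parser; the class `StrictHandlerExample`, the `urn:example:1` namespace, and the inline schema are all hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records element names and rejects invalid documents.
public final class StrictHandlerExample extends DefaultHandler
{
  // A hypothetical one-element schema, inlined to keep the sketch self-contained.
  private static final String XSD =
    "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " targetNamespace='urn:example:1' elementFormDefault='qualified'>"
      + "<xs:element name='a' type='xs:int'/>"
      + "</xs:schema>";

  private final List<String> names = new ArrayList<>();

  public List<String> names()
  {
    return List.copyOf(this.names);
  }

  @Override
  public void startElement(
    final String uri, final String localName,
    final String qName, final Attributes attributes)
  {
    this.names.add(localName);
  }

  @Override
  public void error(final SAXParseException e) throws SAXException
  {
    // Validation errors arrive here; rethrowing aborts the parse.
    throw e;
  }

  /** Return true if the text is valid against the inline schema. */
  public static boolean validates(final String text)
  {
    final XMLReader reader;
    try {
      final Schema schema =
        SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
          .newSchema(new StreamSource(new StringReader(XSD)));
      final SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setNamespaceAware(true);
      factory.setSchema(schema);
      reader = factory.newSAXParser().getXMLReader();
    } catch (final Exception e) {
      throw new IllegalStateException("Parser configuration failed", e);
    }

    final StrictHandlerExample handler = new StrictHandlerExample();
    reader.setContentHandler(handler);
    reader.setErrorHandler(handler);
    try {
      reader.parse(new InputSource(new StringReader(text)));
      return true;
    } catch (final SAXException e) {
      return false;
    } catch (final IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

With a handler like this, a document in the wrong namespace, or one that declares no namespace at all, fails in the same way as one with invalid content.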

Features

  • Hardened SAX parsers: Prevent path traversal attacks, prevent entity expansion attacks, prevent network access!
  • Dispatching XSD schema resolvers: XML documents specify namespaces and the resolvers find their respective XSD schemas from a provided whitelist of locations. Reject non-validated XML!
  • OSGi-ready
  • JPMS-ready
  • High coverage automated test suite
  • ISC license
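To make the path traversal protection concrete, the idea can be sketched as a containment check on normalized paths. This is an illustration of the general technique only, not the check that jxe performs; the class `PathContainment` and the method `isInside` are hypothetical names.

```java
import java.nio.file.Path;

// Hypothetical illustration of a directory containment check.
public final class PathContainment
{
  private PathContainment() { }

  /**
   * Return true if the requested path, resolved against the base
   * directory, still lies inside that directory.
   */
  public static boolean isInside(final Path base, final Path requested)
  {
    final Path baseNorm = base.toAbsolutePath().normalize();
    // normalize() collapses ".." segments before the comparison is made.
    // Resolving an absolute path yields that path itself, so absolute
    // paths outside the base are rejected as well.
    final Path resolved = baseNorm.resolve(requested).normalize();
    return resolved.startsWith(baseNorm);
  }
}
```

This purely lexical check does not follow symbolic links; a real implementation would likely also need `Path.toRealPath` or an equivalent.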

Releases

The most recently published version of the software is 2.0.0.

Source code and binaries are available from the repository.

Documentation

Documentation for the 2.0.0 release is available for reading online.

Documentation for current and older releases is archived in the repository.

User documentation

See the README.

Maven

The following is a complete list of the project's modules expressed as Maven dependencies:

<dependency>
  <groupId>com.io7m.jxe</groupId>
  <artifactId>com.io7m.jxe</artifactId>
  <version>2.0.0</version>
</dependency>

<dependency>
  <groupId>com.io7m.jxe</groupId>
  <artifactId>com.io7m.jxe.core</artifactId>
  <version>2.0.0</version>
</dependency>

<dependency>
  <groupId>com.io7m.jxe</groupId>
  <artifactId>com.io7m.jxe.tests</artifactId>
  <version>2.0.0</version>
</dependency>

<dependency>
  <groupId>com.io7m.jxe</groupId>
  <artifactId>com.io7m.jxe.tests.xerces</artifactId>
  <version>2.0.0</version>
</dependency>

Each release of the project is made available on Maven Central within ten minutes of the release announcement.

Changes

Subscribe to the releases atom feed.

2024-09-06 Release: com.io7m.jxe 2.0.0
2024-09-01 Change: (Backwards incompatible) Require passing in a supplier of SAX parser factories instead of a single factory.
2024-09-01 Change: Simplify and improve schema handling in the case of multiple schemas.
2024-09-01 Change: Work correctly when running on Xerces-J. (tickets: 21)
2024-09-01 Release: com.io7m.jxe 1.1.0
2024-06-28 Change: Update junit.version:5.10.2 → 5.10.3.
2024-08-07 Change: Update org.slf4j:slf4j-api:2.0.13 → 2.0.14.
2024-08-09 Change: Update org.slf4j:slf4j-api:2.0.14 → 2.0.15.
2024-08-12 Change: Update org.slf4j:slf4j-api:2.0.15 → 2.0.16.
2024-08-15 Change: Update junit.version:5.10.3 → 5.11.0.
2024-08-16 Change: Update ch.qos.logback:logback-classic:1.5.6 → 1.5.7.
2024-08-23 Change: Update nl.jqno.equalsverifier:equalsverifier:3.16.1 → 3.16.2.
2024-09-01 Change: Allow for passing in a SAX parser factory explicitly.
2024-05-07 Release: com.io7m.jxe 1.0.3
2024-04-19 Change: Update org.immutables:value:2.10.0 → 2.10.1.
2024-04-19 Change: Update org.slf4j:slf4j-api:2.0.10 → 2.0.13.
2024-04-19 Change: Update nl.jqno.equalsverifier:equalsverifier:3.15.5 → 3.16.1.
2024-04-19 Change: Update ch.qos.logback:logback-classic:1.4.14 → 1.5.6.
2024-04-22 Change: Update junit.version:5.10.1 → 5.10.2.
2024-05-07 Change: Move to new organization.
2023-08-09 Release: com.io7m.jxe 1.0.2
2023-08-09 Change: 1.0.1 had a broken deployment; no changes.
2023-08-09 Release: com.io7m.jxe 1.0.1
2023-08-09 Change: Update test dependencies.
2023-08-09 Change: Use latest build plugins.
2020-10-15 Release: com.io7m.jxe 1.0.0

Sources

This project uses Git to manage source code.

Repository: https://www.github.com/io7m-com/jxe

$ git clone https://www.github.com/io7m-com/jxe

License

Copyright © 2024 Mark Raynsford <code@io7m.com> https://www.io7m.com

Permission to use, copy, modify, and/or distribute this software for
any purpose with or without fee is hereby granted, provided that the
above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL
WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR
BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES
OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS,
WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS
SOFTWARE.

Bug Tracker

The project uses GitHub Issues to track issues.