Application Security — Documents Are Untrusted Input

29 May 2026

A .docx is a ZIP file

Open a .docx file in a hex editor and the first two bytes are PK. Every modern Office format (.docx, .xlsx, .pptx) is a ZIP archive of XML parts and embedded resources. PDF is not a ZIP file, but like the Office formats it is a container. When your code loads any of these file types, it does substantial work on the contents of that container: compressed streams are inflated, references are resolved, and an object graph is materialized.
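The container claim is easy to check for yourself. The following Python sketch tests the signature bytes and lists the parts of an archive from its central directory, without inflating any stream. The helper names are illustrative, and the in-memory sample is only a stand-in for a real .docx, not a valid Word document.

```python
import io
import zipfile
from typing import List

ZIP_LOCAL_HEADER = b"PK\x03\x04"  # signature at the start of a non-empty ZIP archive


def looks_like_zip_container(data: bytes) -> bool:
    """Return True if the data starts with the ZIP local file header signature."""
    return data[:4] == ZIP_LOCAL_HEADER


def list_parts(data: bytes) -> List[str]:
    """List entry names from the central directory; no stream is decompressed."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [info.filename for info in archive.infolist()]


def build_sample_container() -> bytes:
    """Build a tiny stand-in archive that mimics the layout of a .docx."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("[Content_Types].xml", "<Types/>")
        archive.writestr("word/document.xml", "<document/>")
    return buffer.getvalue()


if __name__ == "__main__":
    sample = build_sample_container()
    print(looks_like_zip_container(sample))
    print(list_parts(sample))
```

Listing names is cheap because it reads metadata only. The expensive and risky work starts when a loader inflates and interprets the parts.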

Most of us understand a “load document” feature to be a passive read operation: bytes in, document model out, nothing happens that we didn’t ask for. But loading is not passive, and the document itself dictates the shape and size of the process. A document can instruct the loader to allocate gigabytes from a few kilobytes on disk, to follow a path that climbs out of the extraction directory, or to decrypt with cryptographic primitives that the format specification pulls in implicitly. The document is input, and it must be treated as untrusted input.

Loading a document implies a security boundary.

Here are a few examples of known attack vectors used “in the wild” against document loaders:

  • “Decompression bombs” are small archives that inflate to a size which exhausts memory.
  • Path traversal (“Zip Slip”) uses an entry named ../../etc/something or ..\..\Windows\System32\something, so that careless extraction writes outside the intended directory.
  • Symlink and device-name tricks use entries that resolve to absolute paths or to reserved Windows device names such as CON, NUL, or PRN, and cause damage when extracted.
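The name-based attacks in this list can be detected from archive metadata alone, before anything is written to disk. The Python sketch below is a minimal illustration under stated assumptions: the function name and the problem labels are hypothetical, the reserved-name set covers only the common Windows device names, and a production check would also need to handle cases such as trailing dots and Unicode look-alikes.

```python
import re
import stat
import zipfile
from typing import List

# Common reserved Windows device names (illustrative, not exhaustive).
RESERVED_DEVICE_NAMES = (
    {"CON", "PRN", "AUX", "NUL"}
    | {f"COM{i}" for i in range(1, 10)}
    | {f"LPT{i}" for i in range(1, 10)}
)


def entry_name_problems(info: zipfile.ZipInfo) -> List[str]:
    """Return a list of structural problems found in one archive entry."""
    problems = []
    # Treat both separator styles alike so Windows-style names are covered.
    name = info.filename.replace("\\", "/")
    parts = [part for part in name.split("/") if part]

    if name.startswith("/") or re.match(r"^[A-Za-z]:", name):
        problems.append("absolute path")
    if ".." in parts:
        problems.append("path traversal")
    if any(ord(ch) < 32 for ch in name):
        problems.append("control character")
    # A device name is reserved with or without an extension (NUL, NUL.txt).
    if any(part.split(".")[0].strip().upper() in RESERVED_DEVICE_NAMES for part in parts):
        problems.append("reserved device name")
    # The upper 16 bits of external_attr carry the Unix mode when present.
    if stat.S_ISLNK(info.external_attr >> 16):
        problems.append("symlink")
    return problems
```

An extractor would call this for every entry in `ZipFile.infolist()` and refuse the whole archive if any entry reports a problem, rather than skipping individual entries.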

“Compatible” is not “secure”

Most applications which load documents can’t easily decide to only support the latest formats. Of course we know not to assume that old formats are as secure as new ones, but since support for legacy formats is required, we may believe that if our loader algorithms are compliant with the latest standards, then we are secure. Unfortunately, that is not always the case.

For example, a PDF may be encrypted with AES-128. By name that is a modern choice, with nothing obviously legacy about it. But the PDF standard (ISO 32000-1) requires AES-128-encrypted PDFs to derive the encryption key with MD5 and to validate permissions with RC4. Both have long been recognized as cryptographically weak, and neither is permitted under FIPS 140-2, the US government’s cryptographic standard. PDF readers that support AES-128 encryption must therefore support MD5 and RC4, which means that the code path that loads such documents is not FIPS-compliant.
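The hidden dependency becomes easier to see when the standard security handler revisions are laid out side by side. The Python sketch below encodes a common reading of the /R (revision) and /CFM (crypt filter method) values from the PDF encryption dictionary. The class and function names are hypothetical, and the mapping is a simplification that should be verified against ISO 32000-1 and ISO 32000-2 before you rely on it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PdfCryptoProfile:
    cipher: str
    key_derivation_hash: str
    uses_legacy_primitives: bool


def classify_security_handler(revision: int, crypt_filter_method: Optional[str] = None) -> PdfCryptoProfile:
    """Map a standard security handler revision to the primitives it relies on."""
    if revision in (2, 3):
        # Original RC4-based handlers; the key is derived with MD5.
        return PdfCryptoProfile("RC4", "MD5", True)
    if revision == 4:
        # Crypt filters select the cipher, but key derivation still uses MD5.
        cipher = "AES-128" if crypt_filter_method == "AESV2" else "RC4"
        return PdfCryptoProfile(cipher, "MD5", True)
    if revision in (5, 6):
        # AES-256 handlers derive keys with SHA-2 family hashes.
        return PdfCryptoProfile("AES-256", "SHA-2 family", False)
    raise ValueError(f"unknown security handler revision: {revision}")


if __name__ == "__main__":
    print(classify_security_handler(4, "AESV2"))
    print(classify_security_handler(6))
```

The revision 4 row is the one that matters here: the cipher name says AES-128, while the key derivation still depends on MD5.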

This is what “compatible isn’t secure” actually means: you didn’t knowingly pick something old, but your choice to support a seemingly current format carries a hidden legacy dependency. The same is true of Office document protection: the latest OpenXML formats support SHA-512 hashing, but they also support SHA-1 and MD5 for legacy reasons. If you support the format, you support the legacy algorithms too.

A document format is a specification, and it can mandate approaches that your application code can’t avoid. Your exposure is inherited from the standards you choose to support, not introduced by any mistakes you made.

In the .NET space in particular, the Windows FIPS policy used to be the main line of defence against this problem. If you tried to load a document that triggered a non-compliant code path, the runtime would throw an exception. However, this safety mechanism was never perfect:

  • On .NET Framework with the Windows FIPS policy enabled, instantiating a non-validated algorithm threw an exception. However, that exception was a TargetInvocationException with a message pointing at a generic non-validated implementation, and lacking details like the document type or the offending algorithm.
  • On .NET 5 and later, managed FIPS enforcement was largely removed. An MD5/RC4 code path now runs with no error at all. The trap is the transition: code that threw reliably on .NET Framework can fall silent after a routine upgrade to .NET 5 or later, with no source change. It is easy to read that silence as the problem having been fixed. In fact only the diagnostic has gone: the code remains exactly as non-compliant as before, just quieter about it.

This means that any loader code is responsible for auditing the document types it supports. We can’t rely on the runtime to enforce compliance.

Office & PDF File API: compliance and safety by design

The latest Office & PDF File API answers both halves of the problem with a single idea: compliance and structural safety should be enforced automatically and early in the pipeline.

At the cryptographic layer, FIPS enforcement is now explicit. On a FIPS-enforced Windows system, opening or saving a document that depends on cryptography prohibited by FIPS (this includes encrypted XLS or DOC files, AES-128 or ARC4 PDFs, OpenXML protection using SHA-1 or MD5 hashes) throws a DevExpress.Utils.OperatingSystemLevelFipsMode.ComplianceViolationException before processing. We made two design choices to improve upon the .NET Framework-style handling. First, the exception derives from System.Security.SecurityException, so existing catch (SecurityException) blocks keep working unchanged. Second, the message includes actionable details: instead of the runtime’s generic string, our message tells you which detail of the document you attempted to load was non-compliant, and we include suggestions for compliance.
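The two design choices are not specific to .NET. The following Python sketch is a language-neutral analogy, not the DevExpress API: every name in it is hypothetical, and the prohibited set is illustrative only. It shows a specific violation type that derives from a general security exception, so that existing handlers keep catching it, and a message that names the document kind and the offending algorithm.

```python
# Hypothetical stand-ins; none of these names belong to a real library.
PROHIBITED_UNDER_POLICY = {"MD5", "RC4"}  # illustrative set only


class SecurityError(Exception):
    """Stand-in for a general security exception that callers already catch."""


class ComplianceViolationError(SecurityError):
    """Specific violation that carries actionable details."""

    def __init__(self, document_kind: str, algorithm: str, suggestion: str):
        self.document_kind = document_kind
        self.algorithm = algorithm
        super().__init__(
            f"{document_kind} depends on {algorithm}, which the active policy prohibits. {suggestion}"
        )


def open_document(document_kind: str, algorithm: str, fips_enforced: bool) -> str:
    """Fail before any processing if the document needs a prohibited algorithm."""
    if fips_enforced and algorithm in PROHIBITED_UNDER_POLICY:
        raise ComplianceViolationError(
            document_kind, algorithm, "Consider re-saving with AES-256."
        )
    return f"loaded {document_kind}"


if __name__ == "__main__":
    try:
        open_document("AES-128 PDF", "MD5", fips_enforced=True)
    except SecurityError as error:  # an existing, general handler still works
        print(error)
```

Because the check runs before processing starts, a violation never leaves a half-loaded document behind.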

Note that on machines that are not configured with the Windows FIPS policy, and in non-Windows environments, no cryptography validation is performed by default. You can force the same validation on such systems by setting DevExpress.Utils.OperatingSystemLevelFipsMode.ForcedFipsMode to true, and you can use IsEnabled on the same type to detect whether the policy is active. Setting ForcedFipsMode does not change the operating system level policy.

At the structural layer, the new SecureZipPolicy applies to both the low-level data engine (DevExpress.Utils.Zip) and the high-level API (DevExpress.Compression.ZipArchive). It enforces resource limits with sensible defaults, such as maximum entry count, per-entry and total uncompressed size, per-entry and total compression ratio to guard against “decompression bombs”, as well as path-nesting depth. It blocks the structural attacks mentioned earlier: path traversal, absolute paths, control characters, reserved device names, symlinks. The write-time encryption default also changes from the old EncryptionType.PkZip to AES-256.
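To make the resource limits concrete, here is a minimal Python sketch of the same kind of pre-extraction check. It is not the SecureZipPolicy implementation: the class, the function, and every default value are assumptions chosen for illustration. It also trusts the sizes declared in the archive headers, which an attacker controls, so a real implementation must additionally count bytes while it decompresses.

```python
import io
import zipfile
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ZipLimits:
    """Hypothetical limits; the values are placeholders, not vendor defaults."""
    max_entries: int = 10_000
    max_entry_bytes: int = 256 * 1024 * 1024
    max_total_bytes: int = 1024 * 1024 * 1024
    max_ratio: float = 100.0
    max_depth: int = 16


def limit_violations(archive: zipfile.ZipFile, limits: ZipLimits = ZipLimits()) -> List[str]:
    """Check declared metadata against the limits without inflating anything."""
    violations = []
    infos = archive.infolist()
    if len(infos) > limits.max_entries:
        violations.append("entry count")
    total = 0
    for info in infos:
        total += info.file_size
        if info.file_size > limits.max_entry_bytes:
            violations.append(f"entry size: {info.filename}")
        # Guard against division by zero for empty or stored entries.
        if info.file_size / max(info.compress_size, 1) > limits.max_ratio:
            violations.append(f"compression ratio: {info.filename}")
        if info.filename.replace("\\", "/").count("/") > limits.max_depth:
            violations.append(f"nesting depth: {info.filename}")
    if total > limits.max_total_bytes:
        violations.append("total uncompressed size")
    return violations


def make_archive(name: str, payload: bytes) -> zipfile.ZipFile:
    """Build a one-entry in-memory archive for demonstration purposes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as writer:
        writer.writestr(name, payload)
    return zipfile.ZipFile(io.BytesIO(buffer.getvalue()))


if __name__ == "__main__":
    # One mebibyte of zeros deflates to a tiny stream, a miniature "bomb".
    print(limit_violations(make_archive("big.bin", b"\x00" * (1024 * 1024))))
```

A ratio limit catches the highly compressible payloads that a pure size limit would let through until it is too late.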

The structural enforcement through SecureZipPolicy applies to all ZIP processing and is not tied to the FIPS policy. On systems that do not have FIPS enabled, you can still enforce the encryption policies: a call to SecureZipPolicy.SetEncryptionPolicy(...) with either AesRequired or FipsStrict (the latter disallows any unknown encryption types on read) enables them regardless of any OS-level configuration.

Some of these changes may be “breaking”

If you process documents on FIPS-enforced systems, code that previously ran without error on .NET 5 or later may now throw an exception. We have published detailed guidance for existing code: Breaking Change T1327031 for the Office and PDF File API, and Breaking Change T1325920 for the new Zip Security Policy.

It is important to point out that these changes are only “breaking” in the sense that they change behavior for existing implementations. They make your code safer (very directly so in the case of the new Zip security policy), and offer improved discoverability and auditability of violations for FIPS compliance.

Without repeating the details of the guides, it is possible to adjust some of the defaults to restore old behavior, but also to accommodate your requirements. For example, the Zip policy has tunable parameters for resource protection.

We recommend that you take the opportunity to review your document processing code and consider whether you can migrate to more secure formats. You can use the new observable violations to identify documents that are currently being processed but would not be compliant with the new policies, and then make informed decisions.

See also: Application Security — Stronger Hashes and Safer Passwords.

Compatibility and security

A common perception is that security and compatibility are always in tension. We should state up front that this is not generally true. The new Office & PDF File API is an example of a case where the compliant choice and the convenient choice are the same, and the new enforcement simply makes that alignment visible. For many applications, there is no meaningful trade-off between security and compatibility.

The real conflict between security and compatibility is mostly at the legacy surface area. Sticking to documents as the main topic of this article, that legacy surface area can be large if you need to support old formats and old storage standards, but it can be small if you can migrate to current formats.

We have two recommendations for navigating this tension.

First, if you can use current formats, do. Migrating encrypted XLS to XLSX, DOC to DOCX, and AES-128/ARC4 PDFs to AES-256 (Revision 6) is an easy and cheap path from a purely technical perspective. Organizational and regulatory constraints may be more complex, and only you can judge the practical complexity of migration in your environment.

Second, if legacy formats are genuinely unavoidable, treat those documents explicitly as untrusted input and wrap them accordingly. You now get resource limits by default, and you can consider moving your loading or conversion logic into a standalone process, which makes it possible to apply OS limits on memory use or to prevent network access. Bear in mind that any loading method still parses the document, so the point of a separate process is to contain that parse, not to avoid it.
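As a rough illustration of the separate-process idea, the Python sketch below runs a worker command under an address-space limit and a timeout. It assumes a POSIX system where RLIMIT_AS is enforced (Linux, for example), the function name is hypothetical, and it covers memory only. Blocking network access needs other OS facilities, such as containers or sandboxing profiles, which are out of scope here.

```python
import resource
import subprocess
import sys
from typing import List


def run_contained(argv: List[str], max_address_space_bytes: int, timeout_seconds: float):
    """Run a worker process with a hard cap on its virtual address space."""

    def apply_limits():
        # Runs in the child just before exec; the limit does not affect the parent.
        resource.setrlimit(
            resource.RLIMIT_AS,
            (max_address_space_bytes, max_address_space_bytes),
        )

    return subprocess.run(
        argv,
        preexec_fn=apply_limits,
        timeout=timeout_seconds,
        capture_output=True,
        check=False,
    )


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # Stand-in for a converter that a hostile document tricks into a huge allocation.
    hog = [sys.executable, "-c", "data = bytearray(2 * 1024 ** 3)"]
    result = run_contained(hog, one_gib, timeout_seconds=30)
    print("worker exit code:", result.returncode)
```

The parent process only sees a failed worker and can log the document as rejected, while its own memory and uptime are unaffected.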

Depending on the exposure your project has to unverified input, you will find your own balance of “defense in depth” measures, but it is important to make active decisions about these assessments. Monitoring for violations is easy with the new policies, and the ResourceLimitViolation and TrustBoundaryViolation events exist precisely so that you gain auditability that matters particularly in regulated, enterprise, and government environments.
