XML to JSON: A Practical Guide to the Mapping That Trips People Up
By the Super Simple Digital Tools Team · Updated June 2026
XML and JSON both describe structured data, but they were designed for different worlds. XML grew up around documents: it has namespaces, schemas, mixed content (text and tags woven together), comments, and a strict notion of element order. JSON grew up around data interchange for the web: objects, arrays, strings, numbers, booleans, and null, with no attributes, comments, or namespaces. Converting from one to the other is less a translation and more a projection. You are taking a richer document model and flattening it into a leaner data model, and understanding what gets lost in that projection is the key to using the result confidently.
The first thing every converter must solve is attributes. In XML, <product sku="A1">Widget</product> carries both an attribute and text. JSON has only properties, so the attribute has to become a property too. The common solution is to prefix attribute keys so they cannot clash with child element names, most often with @, and to park the element's own text under a separate key. The BadgerFish convention formalises this with @ for attributes and $ for text content, which is verbose but loses very little. Simpler conventions read better but may discard the distinction between an attribute and a child element.
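As a rough illustration of the BadgerFish-style mapping described above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The helper name element_to_badgerfish is hypothetical, and it handles only a single element with attributes and text (no child elements), so treat it as a picture of the convention rather than a full converter.

```python
import json
import xml.etree.ElementTree as ET


def element_to_badgerfish(elem):
    """Hypothetical helper: map one leaf element to a dict.

    Attributes get an '@' prefix and the element's own text goes
    under '$'. Child elements are deliberately ignored here.
    """
    node = {}
    for name, value in elem.attrib.items():
        node["@" + name] = value
    text = (elem.text or "").strip()
    if text:
        node["$"] = text
    return {elem.tag: node}


root = ET.fromstring('<product sku="A1">Widget</product>')
converted = element_to_badgerfish(root)
print(json.dumps(converted))
# prints {"product": {"@sku": "A1", "$": "Widget"}}
```

Because the attribute key carries its prefix, a child element that was also named sku would land under a different key and could not overwrite it.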
The second classic problem is arrays. XML does not have an array type; it just lets you repeat a tag. So <tags><tag>a</tag><tag>b</tag></tags> clearly should become a list, but <tags><tag>a</tag></tags> is ambiguous: is tag a single value or a list of one? Converters guess, and different tools guess differently. This is the single most common cause of bugs after conversion, because code that expects an array gets an object, or vice versa. The defensive fix is to coerce the field to an array in your own code before iterating, regardless of how the converter rendered it.
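The defensive coercion can be sketched in a few lines of Python. The helper name as_list and the two sample payloads are assumptions made for illustration; the exact JSON your converter emits may differ.

```python
import json


def as_list(value):
    """Hypothetical helper: normalise a converted field to a list.

    A missing field becomes an empty list, a single value becomes a
    one-item list, and an existing list is returned unchanged.
    """
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


# The same field as a converter might render it for one item and for two.
one = json.loads('{"tags": {"tag": "a"}}')
many = json.loads('{"tags": {"tag": ["a", "b"]}}')

for doc in (one, many):
    for tag in as_list(doc["tags"].get("tag")):
        print(tag)
```

With the coercion in place, the same loop handles the one-item and many-item cases, and a missing field simply yields no iterations.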
Then there is everything JSON simply cannot hold. Comments and processing instructions vanish. The XML declaration is dropped. Namespaces are usually reduced to local names, so the prefix information is gone. Mixed content, where text and child elements interleave inside one element, is awkward to model and is often serialised as a single string. None of this means the conversion is wrong; it means the conversion is lossy by design. If you need a perfect round trip back to the original XML byte-for-byte, JSON is the wrong intermediate format and you should keep the XML.
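To make the loss concrete, the sketch below parses a small namespaced document with Python's standard xml.etree.ElementTree and reduces each tag to its local name. This illustrates the general effect rather than how any particular converter works; the helper local_name and the sample document are assumptions.

```python
import xml.etree.ElementTree as ET

xml_text = (
    '<?xml version="1.0"?>'
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<!-- routing note -->"
    "<soap:Body>ok</soap:Body>"
    "</soap:Envelope>"
)


def local_name(tag):
    """Hypothetical helper: drop the '{namespace-uri}' part of a tag."""
    return tag.rsplit("}", 1)[-1]


root = ET.fromstring(xml_text)

# ElementTree stores the tag as '{uri}Envelope'; the 'soap' prefix is gone.
print(root.tag)
print(local_name(root.tag))

# The default parser skips comments, so only the Body element remains.
child_names = [local_name(child.tag) for child in root]
print(child_names)
```

Once the tree has been reduced this way, nothing in it records that the original used the soap prefix or contained a comment, which is why a byte-for-byte round trip is not possible.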
In practice the workflow is straightforward: get your XML from the SOAP response, RSS feed, or config file, paste it into the converter, and read the JSON output. Then inspect the structure before wiring it into code, paying special attention to which fields became arrays and how attributes were prefixed. Because the conversion runs in your browser, you can do this with private payloads without sending them anywhere. Once you know the shape, parsing the JSON in JavaScript, Python, or any other language is trivial compared with walking an XML tree by hand, which is the whole point of converting in the first place.
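If the output is large, the inspection step can be partly automated. The following sketch walks parsed JSON and reports which paths are arrays and which keys carry the @ attribute prefix. The function describe_shape, the path notation, and the sample payload are all hypothetical, and the @ prefix is assumed to be the one your converter uses.

```python
import json


def describe_shape(value, path="root"):
    """Hypothetical helper: collect (path, kind) pairs for arrays and '@' keys."""
    findings = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}"
            if key.startswith("@"):
                findings.append((child_path, "attribute"))
            findings.extend(describe_shape(child, child_path))
    elif isinstance(value, list):
        findings.append((path, "array"))
        for index, item in enumerate(value):
            findings.extend(describe_shape(item, f"{path}[{index}]"))
    return findings


# Assumed converter output for an order with two repeated item elements.
sample = json.loads(
    '{"order": {"@id": "7", "item": [{"@sku": "A1"}, {"@sku": "B2"}]}}'
)
report = describe_shape(sample)
for entry in report:
    print(entry)
```

Running this against a representative payload gives a quick checklist of the fields to normalise to arrays and the prefixed keys to reference in code.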
- Before coding against the output, scan for fields that became arrays versus single objects, and normalise repeated elements to arrays yourself so a one-item case does not break your loop.
- Expect attributes to appear with a prefix such as @ in the JSON; reference them by that exact key rather than assuming they merged into the element.
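- If you rely on comments, the XML declaration, or strict element order, keep the original XML: conversion to JSON discards comments and the declaration, and the order of sibling elements is not reliably preserved, because JSON object keys carry no guaranteed order.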
- For SOAP or namespaced documents, remember keys are usually shortened to local names (soap:Body becomes Body), so target the simplified key in your code.