I receive XML strings from an external source that can contain unsanitized user-contributed content.
The following xml string gave a ParseError in cElementTree:
>>> print repr(s)
'<Comment>dddddddd\x08\x08\x08\x08\x08\x08_____</Comment>'
>>> import xml.etree.cElementTree as ET
>>> ET.XML(s)
Traceback (most recent call last):
File "<pyshell#4>", line 1, in <module>
ET.XML(s)
File "<string>", line 106, in XML
ParseError: not well-formed (invalid token): line 1, column 17
Is there a way to make cElementTree not complain?
asked Oct 24, 2012 at 9:18
It seems to complain about \x08; you will need to escape that.
Edit:
Or you can have the lxml parser ignore the errors using recover=True:
from lxml import etree
parser = etree.XMLParser(recover=True)
etree.fromstring(xmlstring, parser=parser)
answered Oct 24, 2012 at 9:25
iabdalkader
I was having the same error (with ElementTree). In my case it was because of encodings, and I was able to solve it without having to use an external library. Hope this helps other people finding this question based on the title. (reference)
import xml.etree.ElementTree as ET
parser = ET.XMLParser(encoding="utf-8")
tree = ET.fromstring(xmlstring, parser=parser)
EDIT: Based on comments, this answer might be outdated. But this did work back when it was answered…
answered Nov 25, 2013 at 22:24
juan
This code snippet worked for me. I had an issue parsing a batch of XML files and had to tell the parser that they were encoded as 'iso-8859-5':
import xml.etree.ElementTree as ET
tree = ET.parse(filename, parser = ET.XMLParser(encoding = 'iso-8859-5'))
answered Feb 25, 2020 at 19:24
See this answer to another question and the corresponding part of the XML spec.
The backspace U+0008 is an invalid character in XML documents. XML 1.0 forbids it outright, even as a character reference; XML 1.1 allows it only in escaped form, as &#8;, never as a literal character.
If you need to process this XML snippet, you must remove or replace \x08 in s before feeding it into an XML parser.
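As a rough illustration of that advice, the sketch below locates and strips every character outside the XML 1.0 Char production before parsing. The helper names first_invalid_offset and strip_invalid_xml_chars are made up for this example, and silently dropping characters is an assumption; replacing them with a placeholder may suit your data better.

```python
import re
import xml.etree.ElementTree as ET

# Anything outside the XML 1.0 "Char" production is illegal in a document.
_INVALID_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def first_invalid_offset(text):
    """Return the index of the first illegal character, or -1 if there is none."""
    match = _INVALID_XML_CHARS.search(text)
    return match.start() if match else -1


def strip_invalid_xml_chars(text):
    """Drop illegal characters so the remaining text can be parsed."""
    return _INVALID_XML_CHARS.sub("", text)


s = "<Comment>dddddddd\x08\x08\x08\x08\x08\x08_____</Comment>"
offset = first_invalid_offset(s)           # position of the first backspace
element = ET.XML(strip_invalid_xml_chars(s))
print(offset, repr(element.text))
```

The reported offset matches the column named in the question's ParseError, which is a quick way to confirm that the backspace characters are what the parser rejects.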
answered Oct 24, 2012 at 9:35
Boldewyn
None of the above fixes worked for me. The only thing that worked was to use BeautifulSoup instead of ElementTree as follows:
from bs4 import BeautifulSoup
with open("data/myfile.xml") as fp:
    soup = BeautifulSoup(fp, 'xml')
Then you can search the tree as:
soup.find_all('mytag')
answered May 8, 2018 at 10:56
tsando
This is most probably an encoding error. For example, I had an XML file encoded in UTF-8-BOM (checked from the Notepad++ Encoding menu) and got a similar error message.
The workaround (Python 3.6)
import io
from xml.etree import ElementTree as ET
with io.open(file, 'r', encoding='utf-8-sig') as f:
    contents = f.read()
tree = ET.fromstring(contents)
Check the encoding of your XML file. If it uses a different encoding, change 'utf-8-sig' accordingly.
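To see what 'utf-8-sig' changes, here is a small sketch that decodes the same BOM-prefixed bytes both ways. The sample document is invented for the example; the point is only that plain 'utf-8' leaves U+FEFF at the start of the string, while 'utf-8-sig' removes it (and also accepts input without a BOM).

```python
import codecs
import xml.etree.ElementTree as ET

# Bytes as an editor would save them with "UTF-8 with BOM" (made-up content).
raw = codecs.BOM_UTF8 + "<root><item>caf\u00e9</item></root>".encode("utf-8")

with_bom = raw.decode("utf-8")         # the BOM survives as U+FEFF
without_bom = raw.decode("utf-8-sig")  # the BOM is stripped if present

root = ET.fromstring(without_bom)
item_text = root.find("item").text
print(repr(with_bom[0]), item_text)
```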
answered Feb 13, 2018 at 14:29
np8
After lots of searching through the entire WWW, I found out that you have to get rid of certain characters if you want your XML parser to work. Here's how I did it, and it worked for me:
import re
escape_illegal_xml_characters = lambda x: re.sub(u'[\x00-\x08\x0b\x0c\x0e-\x1F\uD800-\uDFFF\uFFFE\uFFFF]', '', x)
And use it like you’d normally do:
ET.XML(escape_illegal_xml_characters(my_xml_string)) #instead of ET.XML(my_xml_string)
answered Dec 13, 2019 at 9:57
A gotcha for me, using Python's ElementTree: this has the invalid token error:
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as ET
xml = u"""<?xml version='1.0' encoding='utf8'?>
<osm generator="pycrocosm server" version="0.6"><changeset created_at="2017-09-06T19:26:50.302136+00:00" id="273" max_lat="0.0" max_lon="0.0" min_lat="0.0" min_lon="0.0" open="true" uid="345" user="john"><tag k="test" v="Съешь же ещё этих мягких французских булок да выпей чаю" /><tag k="foo" v="bar" /><discussion><comment data="2015-01-01T18:56:48Z" uid="1841" user="metaodi"><text>Did you verify those street names?</text></comment></discussion></changeset></osm>"""
xmltest = ET.fromstring(xml.encode("utf-8"))
However, it works with the addition of a hyphen in the encoding type:
<?xml version='1.0' encoding='utf-8'?>
Most odd. Someone found this footnote in the python docs:
The encoding string included in XML output should conform to the
appropriate standards. For example, “UTF-8” is valid, but “UTF8” is
not.
answered Sep 6, 2017 at 19:35
TimSC
I was stuck on a similar problem and finally figured out the root cause in my particular case. If you read the data from multiple XML files that lie in the same folder, you will also parse the .DS_Store file.
Before parsing, add this condition:
for file in files:
    if file.endswith('.xml'):
        run_your_code...
This trick helped me as well
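A minimal sketch of the same filter using pathlib, assuming all files of interest sit directly in one folder. The function name parse_xml_files and the temporary folder contents are invented for the demonstration.

```python
from pathlib import Path
import tempfile
import xml.etree.ElementTree as ET


def parse_xml_files(folder):
    """Parse only the *.xml files in folder; return {file name: root element}."""
    roots = {}
    for path in sorted(Path(folder).glob("*.xml")):
        roots[path.name] = ET.parse(path).getroot()
    return roots


# Demo folder holding one real XML file and a fake .DS_Store (binary junk).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.xml").write_text("<a/>", encoding="utf-8")
    Path(tmp, ".DS_Store").write_bytes(b"\x00\x00\x00\x01Bud1")
    parsed = parse_xml_files(tmp)

print(list(parsed))
```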
Delimitry
answered Jun 23, 2017 at 19:38
lxml solved the issue, in my case
from lxml import etree
for _, ele in etree.iterparse(xml_file, tag='tag_i_wanted', encoding='utf-8'):
    print(ele.tag, ele.text)
in another case,
parser = etree.XMLParser(recover=True)
tree = etree.parse(xml_file, parser=parser)
tags_needed = tree.iter('TAG NAME')
Thanks to theeastcoastwest
Python 2.7
answered Oct 24, 2019 at 5:49
John Prawyn
In my case I got the same error. (using Element Tree)
I had to add these lines:
import xml.etree.ElementTree as ET
from lxml import etree
parser = etree.XMLParser(recover=True,encoding='utf-8')
xml_file = ET.parse(path_xml,parser=parser)
Works in Python 3.10.2
answered Aug 9, 2022 at 17:43
What helped me with that error was Juan's answer: https://stackoverflow.com/a/20204635/4433222
But it wasn't enough. After struggling, I found out that the XML file needs to be saved as UTF-8 without BOM.
The solution wasn't working for "normal" UTF-8 (with BOM).
answered Feb 5, 2016 at 10:20
Konrad
The only thing that worked for me is I had to add mode and encoding while opening the file like below:
with open(filenames[0], mode='r', encoding='utf-8') as f:
    readFile()
Otherwise it was failing every time with invalid token error if I simply do this:
f = open(filenames[0], 'r')
readFile()
answered Aug 29, 2019 at 18:28
Vkoder
This error occurs when you pass the link itself to the parser. You first have to fetch the content behind the link and parse that:
response = requests.get(Link)
root = cElementTree.fromstring(response.content)
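The difference can be reproduced without any network access. In this sketch the URL and the body bytes are placeholders; body stands in for what response.content would hold after a successful request.

```python
import xml.etree.ElementTree as ET

url = "https://example.com/feed.xml"  # hypothetical address
# Stand-in for response.content (made-up document).
body = b"<?xml version='1.0' encoding='utf-8'?><feed><title>demo</title></feed>"

# The URL is just text that does not start with '<', so it is not XML.
try:
    ET.fromstring(url)
    url_parsed = True
except ET.ParseError:
    url_parsed = False

# The downloaded bytes are the actual document.
root = ET.fromstring(body)
print(url_parsed, root.find("title").text)
```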
answered Jul 26, 2022 at 9:57
I tried the other solutions in the answers here but had no luck. Since I only needed to extract the value from a single XML node, I gave in and wrote my own function to do so:
import re

def ParseXmlTagContents(source, tag, tagContentsRegex):
    openTagString = "<" + tag + ">"
    closeTagString = "</" + tag + ">"
    found = re.search(openTagString + tagContentsRegex + closeTagString, source)
    if found:
        # Cut the opening and closing tags off the matched text.
        start, end = found.span()
        return source[start + len(openTagString):end - len(closeTagString)]
    return ""
Example usage would be:
<?xml version="1.0" encoding="utf-16"?>
<parentNode>
<childNode>123</childNode>
</parentNode>
ParseXmlTagContents(xmlString, "childNode", "[0-9]+")
answered Sep 6, 2018 at 13:36
t_warsop
#1
Hello ‘ello!
Now I have FM12, but when I want to play I get this XML parsing error
"not well-formed (invalid token) at line 1 of default.xml"
I’m still able to play but I don’t actually know if I can play matches…
What does it mean and how can I fix it?
Thanks
-
#2
Have the same problem, but no solution.
Anybody?
-
#4
I’ve deleted the folder already, but I still get it……
Last edited: Nov 16, 2011
-
#6
check my help thread (link in signature). also a sticky in this section about XML
-
#7
Guys, relax. The error will stay, but what you can do is just load the game. The error will eventually appear again, so quit back to the start screen and reload the game. The error will still pop up, but at least your cursor won't go haywire and the 3D match won't suck.
-
#8
I have deleted the file too and I can't even reload the game. It keeps popping up, so I have to turn off my computer.
-
#9
I also have the same problem: when I open FM it comes up with that message, but it stopped once I deleted the FM 12 file in AppData; I then ran it again and a new file was created. Now FM opens fine and loads my other saved game, but when I try to load the one I have been playing, it starts to load for a quick flash and then comes up with a little box saying the saved game could not be loaded. I last saved it today and haven't had any previous problems with it. Can anyone tell me what I need to do so I can access it again? Any help would be great.
Last edited: May 8, 2012
-
#10
Try deleting the Settings folder from the following program path as detailed below:
- Windows XP:
My Computer -> Local Disk (C:) -> Documents and Settings -> <username> -> Application Data -> Sports Interactive -> Football Manager 2012 -> Settings
- Windows Vista / Windows 7:
Computer -> Local Disk (C:) -> Users -> <username> -> AppData -> Roaming -> Sports Interactive -> Football Manager 2012 -> Settings
Please note saved progress may be lost.
Try launching the game again or restart your computer and try launching the game again.
If you are unable to locate the folders please ensure that hidden folders are enabled as detailed below:
- Click on Start
- Select Control Panel
- Select Folder Options*
- Select View
- Select Show Hidden files and folders
*If Folder Options is not shown:
Windows XP/Vista : Select Classic View from the Control Panel.
Windows 7 : Select Large Icons from View By
-
#11
Anything work ?
I have finally found my FM12 data folder but it doesn’t have a settings folder. Now what should I do ?
Feeling like uninstalling and restarting from scratch, but someone has said that does not help.
Any advice, in plain English please, would be very welcome.
Thanks
-
#12
Just had an e-mail from Sega USA and it’s now sorted.
Easy to follow and effective.
-
#13
Just had an e-mail from Sega USA and it’s now sorted.
Easy to follow and effective.
What did Sega tell you, so we can sort this out too?
thanks
-
#14
I’d like to know this too
When loading and saving XML data using FromXml() and ToXml() in C++, the data should be in the local C++ code page. If this is not the case, you may get the error 'Error - not well-formed (invalid token) at line x'.
This may be resolved by building the application as Unicode, or by using FromXmlStream() and ToXmlStream(), which deal with binary data, and converting the data yourself.
XML Encoding
XML documents can be encoded using a number of different encodings. The encoding is indicated by the encoding attribute in the document header (i.e. <?xml version="1.0" encoding="UTF-8"?>).
Writing an XML document to file
When an XML document is persisted as a file, it is safer to think of it as a stream of bytes rather than a stream of characters. When an XML document is serialized to a file, an encoding is applied to it. The resulting file will then be correctly encoded given the encoding applied.
- If a Unicode (UTF-16) encoding is applied, the resulting file is prefixed with the Unicode header (byte order mark) 0xFF 0xFE and is encoded with 2 bytes per character (4 for characters outside the Basic Multilingual Plane).
- If a UTF-8 encoding is applied, the resulting file will contain a variable number of bytes per character. If this file is then viewed using a tool incapable of decoding UTF-8, you may see a number of strange characters. If the file is viewed using a UTF-8 compliant application (e.g. Internet Explorer, Notepad on Win2000 onwards, Visual Studio .NET), the XML document will appear with the correct characters (if characters still appear corrupted or misrepresented, note that some fonts do not contain the full Unicode set).
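The two cases above can be checked on a tiny document. This is only an illustration in Python, not the library the article describes; the sample text is made up, and the little-endian byte order is chosen explicitly so that the header is the 0xFF 0xFE pair mentioned in the list.

```python
import codecs

text = "<a>\u00e9</a>"  # 8 characters, one of them non-ASCII

# "Unicode" in the article's sense: UTF-16, little-endian, with its header.
utf16_bytes = codecs.BOM_UTF16_LE + text.encode("utf-16-le")

# UTF-8: one byte per ASCII character, two bytes for U+00E9.
utf8_bytes = text.encode("utf-8")

print(utf16_bytes[:2], len(utf16_bytes), len(utf8_bytes))
```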
Turning an XML document into a string
When an XML document is created from a generated class using ToXml, a string is returned. That string is encoded as Unicode (except in C++ non-debug builds); however, the XML document header does not show any encoding (<?xml version="1.0"?>).
The string returned is Unicode, which is the internal character representation for VB6, .NET and Java, so if it is written to a file or passed to another application it should be passed as Unicode. If it has to be converted to a 1-byte-per-character representation first, data will likely be corrupted if complex characters have been used within the document.
If you need to persist an XML document to a file, use ToXmlFile; if you need to pass an XML document to another (non-Unicode) application, you should use ToXmlStream.
There is also a problem that commonly occurs in C++ UNICODE applications when dealing with UTF-8 encoded data. If you load a UTF-8 encoded file into a UNICODE application, the temptation is to store it in a UNICODE string (WCHAR*), and the conversion to Unicode is often implicit (part of some string/bstr class). However, these conversions typically assume the source string is in the local code page, which is rarely UTF-8 and more frequently ANSI. So when the data is converted to UNICODE, the conversion function does not treat the data as UTF-8 and does not decode it correctly. This results in a UNICODE string which no longer represents the source.
In these circumstances, it is better either to treat the data as binary or to use the appropriate conversion method (UTF-8 to Unicode).
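The effect of the wrong implicit conversion is easy to reproduce. The sketch below uses Python rather than C++, and assumes Windows-1252 ("cp1252") as the local ANSI code page; the sample word is invented.

```python
original = "caf\u00e9"

utf8_bytes = original.encode("utf-8")   # what the UTF-8 file contains

# Wrong: decode as if the bytes were in the local ANSI code page.
wrong = utf8_bytes.decode("cp1252")

# Right: decode with the encoding the bytes were actually written in.
right = utf8_bytes.decode("utf-8")

print(repr(wrong), repr(right))
```

The two UTF-8 bytes of the accented letter become two unrelated characters, which is the kind of corruption that later surfaces as garbled text or a parse error.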
Passing an XML document to an ASCII or ANSI application
It is common to want to pass the XML document you have created to a non-Unicode application. If you need to do this, you may look first at ToXml. This will provide you with a UNICODE string; however, converting this to an ASCII or ANSI string may corrupt complex characters (you lose information going from 2 bytes to 1 byte per character). You could take the string returned from ToXml and apply your own UTF-8 encoding; however, the encoding attribute in the header (<?xml version="1.0" encoding="UTF-8"?>) would not be present, and the XML parser decoding the document may misinterpret it.
The better solution is to use the ToXmlStream method. This allows you to specify an encoding and returns a stream of bytes (an array of bytes in VB). This byte stream is a representation of the XML document in the given encoding, containing the correct encoding attribute in the header (<?xml version="1.0" encoding="UTF-8"?>).
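The same principle, bytes whose header names the encoding they are actually written in, can be shown with Python's standard ElementTree. This is an analogy only and does not use ToXmlStream; the element name and text are made up, and ISO-8859-1 is picked simply because it is a 1-byte-per-character encoding.

```python
import xml.etree.ElementTree as ET

root = ET.Element("note")
root.text = "caf\u00e9"

# Serialise to bytes; the header produced by ElementTree names the same
# encoding that was used to produce the bytes.
data = ET.tostring(root, encoding="iso-8859-1")

# A parser reading these bytes uses the header to decode them correctly.
round_tripped = ET.fromstring(data)
print(data)
```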
Article ID: 87, Created: 3/20/2012 at 11:24 AM, Modified: 3/20/2012 at 11:24 AM
I’ve come across another APK that’s suffering from this issue:
Information
Apktool Version: 2.2.2
Operating System: Both Linux and Mac
APK From: https://forum.xda-developers.com/android/apps-games/ps4-remote-play-android-thread-t3068225
Steps to Reproduce
$ apktool d RemotePlayPortV5.1_ITB.apk
...
$ apktool b RemotePlayPortV5.1_ITB
I: Using Apktool 2.2.2 on RemotePlayPortV5.1_ITB.apk
I: Loading resource table...
I: Decoding AndroidManifest.xml with resources...
I: Loading resource table from file: /root/.local/share/apktool/framework/1.apk
I: Regular manifest package...
I: Decoding file-resources...
I: Decoding values */* XMLs...
I: Baksmaling classes.dex...
I: Copying assets and libs...
I: Copying unknown files...
I: Copying original files...
root@99033f046f3d:/usr/src/apk# apktool b RemotePlayPortV5.1_ITB
I: Using Apktool 2.2.2
I: Checking whether sources has changed...
I: Smaling smali folder into classes.dex...
I: Checking whether resources has changed...
I: Building resources...
W: /usr/src/apk/RemotePlayPortV5.1_ITB/res/layout/companionutil_layout_alert_dialog.xml:2: error: Error parsing XML: not well-formed (invalid token)
W:
W: /usr/src/apk/RemotePlayPortV5.1_ITB/res/layout/companionutil_layout_alert_dialog_game2_confirm.xml:2: error: Error parsing XML: not well-formed (invalid token)
W:
W: /usr/src/apk/RemotePlayPortV5.1_ITB/res/layout/companionutil_layout_alert_dialog_game_confirm.xml:2: error: Error parsing XML: not well-formed (invalid token)
W:
Exception in thread "main" brut.androlib.AndrolibException: brut.androlib.AndrolibException: brut.common.BrutException: could not exec (exit code = 1): [/tmp/brut_util_Jar_4284272564605293496.tmp, p, --forced-package-id, 127, --min-sdk-version, 17, --target-sdk-version, 19, --version-code, 10500, --version-name, 1.5.0, --no-version-vectors, -F, /tmp/APKTOOL6062192940819296925.tmp, -0, arsc, -0, arsc, -I, /root/.local/share/apktool/framework/1.apk, -S, /usr/src/apk/RemotePlayPortV5.1_ITB/res, -M, /usr/src/apk/RemotePlayPortV5.1_ITB/AndroidManifest.xml]
at brut.androlib.Androlib.buildResourcesFull(Androlib.java:477)
at brut.androlib.Androlib.buildResources(Androlib.java:411)
at brut.androlib.Androlib.build(Androlib.java:310)
at brut.androlib.Androlib.build(Androlib.java:263)
at brut.apktool.Main.cmdBuild(Main.java:227)
at brut.apktool.Main.main(Main.java:84)
Caused by: brut.androlib.AndrolibException: brut.common.BrutException: could not exec (exit code = 1): [/tmp/brut_util_Jar_4284272564605293496.tmp, p, --forced-package-id, 127, --min-sdk-version, 17, --target-sdk-version, 19, --version-code, 10500, --version-name, 1.5.0, --no-version-vectors, -F, /tmp/APKTOOL6062192940819296925.tmp, -0, arsc, -0, arsc, -I, /root/.local/share/apktool/framework/1.apk, -S, /usr/src/apk/RemotePlayPortV5.1_ITB/res, -M, /usr/src/apk/RemotePlayPortV5.1_ITB/AndroidManifest.xml]
at brut.androlib.res.AndrolibResources.aaptPackage(AndrolibResources.java:440)
at brut.androlib.Androlib.buildResourcesFull(Androlib.java:463)
... 5 more
Caused by: brut.common.BrutException: could not exec (exit code = 1): [/tmp/brut_util_Jar_4284272564605293496.tmp, p, --forced-package-id, 127, --min-sdk-version, 17, --target-sdk-version, 19, --version-code, 10500, --version-name, 1.5.0, --no-version-vectors, -F, /tmp/APKTOOL6062192940819296925.tmp, -0, arsc, -0, arsc, -I, /root/.local/share/apktool/framework/1.apk, -S, /usr/src/apk/RemotePlayPortV5.1_ITB/res, -M, /usr/src/apk/RemotePlayPortV5.1_ITB/AndroidManifest.xml]
at brut.util.OS.exec(OS.java:95)
at brut.androlib.res.AndrolibResources.aaptPackage(AndrolibResources.java:434)
... 6 more
Example invalid .xml:
<?xml version="1.0" encoding="utf-8"?> <o.ﺗ android:layout_gravity="center" android:orientation="vertical" android:background="@drawable/companionutil_drawable_alert_dialog" android:layout_width="fill_parent" android:layout_height="wrap_content" landscape_marginLeft="33dp" landscape_marginRight="33dp" portrait_marginLeft="11dp" portrait_marginRight="11dp" xmlns:android="http://schemas.android.com/apk/res/android"> <com.playstation.companionutil.CompanionUtilAdjustTextView android:textSize="16.0dip" android:textColor="#ffffffff" android:id="@id/com_playstation_companionutil_id_alert_text" android:layout_width="fill_parent" android:layout_height="wrap_content" android:layout_marginLeft="11.0dip" android:layout_marginTop="17.0dip" android:layout_marginRight="11.0dip" android:text="" android:lineSpacingExtra="1.0dip" /> <com.playstation.companionutil.CompanionUtilAdjustButton android:textSize="16.0dip" android:textColor="#ffffffff" android:id="@id/com_playstation_companionutil_id_alert_positive_button" android:background="@drawable/companionutil_drawable_alert_dialog_button" android:layout_width="fill_parent" android:layout_height="28.0dip" android:layout_marginLeft="11.0dip" android:layout_marginTop="15.0dip" android:layout_marginRight="11.0dip" android:layout_marginBottom="15.0dip" android:text="@string/com_playstation_companionutil_msg_ok" /> </o.ﺗ>
APK
PS4 Remote Play Port
(https://forum.xda-developers.com/android/apps-games/ps4-remote-play-android-thread-t3068225)
Questions to ask before submission
Have you tried apktool d, apktool b without changing anything? Yes
If you are trying to install a modified apk, did you resign it? No
Are you using the latest apktool version? Yes
Question
User386247 posted
I'm having this issue with 2 of my layouts whenever I try to run my app. This is the code that I have for the first layout:
<include layout="@layout/content_main" /> -------------------->Error showing in this blank line <android.support.design.widget.FloatingActionButton android:id="@+id/fab" android:layout_width="wrap_content" android:layout_height="wrap_content" android:layout_gravity="bottom|end" android:layout_margin="@dimen/fab_margin" app:srcCompat="@android:drawable/ic_dialog_email" /> <Button android:text="Resgistrar Estudiantes" android:layout_width="400px" android:layout_height="110px" android:layout_marginTop="250px" android:layout_marginLeft="180px" android:id="@+id/BtnRegEstudiante" />/>
And this is the one that I have for my second layout:
width="match_parent"
android:layout_height="match_parent"
android:columnCount="2"
android:rowCount="6"
android:id="@+id/gridLayout1" android:layout_width="120px" --------- Error showing in this line
android:layout_height="100px"
android:minWidth="30px"
android:minHeight="30px"
android:paddingTop="60px"
android:paddingLeft="20px"
android:text="Nombre"
android:textColor="@android:color/black"
android:id="@+id/LblNombre"/> <EditText android:id="@+id/TxtNombre" android:layout_width="285px" android:layout_height="100px" android:minWidth="30px" android:minHeight="30px" android:paddingTop="40px" android:paddingLeft="20px" android:layout_marginLeft="150px" />/>
Answers
-
User386247 posted
I just found the error haha, I was missing a "="
<Button android:id="@+id/BtnRegEstudiante" android:text="Registro de Estudiantes" android:layout_width="400px" android:layout_height="150px" android:layout_marginTop="250px" android:layout_marginLeft="180px"/>
Marked as answer by
Thursday, June 3, 2021 12:00 AM