I’m using Java and I’m trying to get an XML document from an HTTP link. The code I’m using is:
URL url = new URL(link);
HttpURLConnection connection = (HttpURLConnection)url.openConnection();
connection.setRequestMethod("GET");
connection.connect();
Document doc = null;
CountInputStream in = new CountInputStream(url.openStream());
doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
Don’t pay attention to CountInputStream; it’s a special class that acts like a regular input stream.
Using the code above, I sometimes get the error Fatal Error :1:1: Content is not allowed in prolog. I assume that it has something to do with a badly formatted XML document, but I have no idea how to fix it.
posdef
asked Jul 20, 2012 at 10:20
I’m turning my comment to an answer, so it can be accepted and this question no longer remains unanswered.
The most likely cause of this is a malformed response, which includes characters before the initial <?xml …>. So please have a look at the document as transferred over HTTP, and fix this on the server side.
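One way to look at the document as transferred is to dump the first few bytes of the response before any parser touches them. The following is only a minimal sketch: the class name `ResponsePeek` and the helper `toHex` are made up for illustration, and `InputStream.readNBytes(int)` assumes Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

final class ResponsePeek {
    // Formats bytes as space-separated, two-digit uppercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java ResponsePeek <url>");
            return;
        }
        HttpURLConnection connection =
                (HttpURLConnection) new URL(args[0]).openConnection();
        try (InputStream in = connection.getInputStream()) {
            // Read at most the first 16 bytes of the response body (Java 11+).
            byte[] head = in.readNBytes(16);
            System.out.println("Content-Type: " + connection.getContentType());
            System.out.println(toHex(head));
        } finally {
            connection.disconnect();
        }
    }
}
```

A well-formed response normally starts with `3C 3F 78 6D 6C` (`<?xml`), possibly preceded by a byte-order mark. Anything else at the front, such as an HTML error page or stray text, is a likely source of the prolog error.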
answered Jul 22, 2012 at 21:02
MvG
There are probably some stray characters (e.g. a BOM) or some whitespace before the XML declaration (<?xml ...?>).
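If a UTF-8 BOM does turn out to be the culprit and the server cannot be fixed, the client can skip it before parsing. This is a sketch under assumptions: `BomSkipper` and `skipUtf8Bom` are hypothetical names, only the UTF-8 BOM (EF BB BF) is handled, and `readNBytes(byte[], int, int)` needs Java 9 or later. Whether a BOM actually trips the parser depends on how the input reaches it, so check the raw bytes first.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class BomSkipper {
    // Returns a stream positioned just after a UTF-8 BOM, if one is present.
    // Streams without a BOM are returned with all of their bytes intact.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = pushback.readNBytes(head, 0, 3);
        boolean hasBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && n > 0) {
            // Not a BOM: push the bytes back so the parser sees them.
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<?xml version='1.0' encoding='UTF-8'?><x/>"
                .getBytes(StandardCharsets.UTF_8);
        byte[] withBom = new byte[xml.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(xml, 0, withBom, 3, xml.length);

        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(skipUtf8Bom(new ByteArrayInputStream(withBom)));
        System.out.println(doc.getDocumentElement().getNodeName()); // prints "x"
    }
}
```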
answered Jul 20, 2012 at 11:06
Johannes Weiss
I wanted YAML for the log4j2 configuration file because it doffs XML’s visual clutter, but had the same error as Guest96. I scoured the web for a solution, investigating a UTF-8 BOM or other content in the YAML header area; no joy. Of course, the answer is usually simple.
I had completely missed that using YAML with log4j2 requires the Jackson libraries, per https://www.sentinelone.com/blog/log4j2-configuration-detailed-guide/. Adding the Jackson dependencies to my (Gradle) configuration fixed the problem:
// Gain support for log4j2.
// https://mvnrepository.com/artifact/org.apache.logging.log4j/log4j
implementation 'org.apache.logging.log4j:log4j-api:2.14.1'
implementation 'org.apache.logging.log4j:log4j-core:2.14.1'
// Gain support for YAML with log4j2.
// https://www.sentinelone.com/blog/log4j2-configuration-detailed-guide/
implementation 'com.fasterxml.jackson.dataformat:jackson-dataformat-yaml:2.10.0'
implementation 'com.fasterxml.jackson.core:jackson-databind:2.10.0'
With that, the dreaded Fatal Error :1:1: Content is not allowed in prolog error went away.
answered Nov 12, 2021 at 17:48
The real solution that I found for this issue was to disable any XML Format post processors. I had added a post processor called "jp@gc - XML Format Post Processor" and started noticing the error "Fatal Error :1:1: Content is not allowed in prolog".
Disabling the post processor stopped those errors.
anoopknr
answered Jul 31, 2018 at 16:46
Someone should mark Johannes Weiß’s comment as the answer to this question. That is exactly why XML documents can’t just be loaded into a DOM Document class.
http://en.wikipedia.org/wiki/Byte_order_mark
answered Nov 13, 2013 at 9:04
smiron
It could be an unsupported file encoding. Change it to UTF-8, for example.
I did this using Sublime.
answered Sep 24, 2020 at 8:45
Mike
Looks like you forgot to add the correct headers to your GET request (ask the REST API developer or check your specific API description):
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestProperty("Accept", "application/xml");
connection.setRequestMethod("GET");
connection.connect();
or
connection.setRequestProperty("Accept", "application/xml;version=1");
answered May 29, 2018 at 6:01
Daniel Nelson
I’m trying to compare an XML file to an XSLT-generated file from that XML file, and when I run the class as a JUnit test, I get the following:
[Fatal Error] :1:1: Content is not allowed in prolog.
org.xml.sax.SAXParseException: Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at org.custommonkey.xmlunit.XMLUnit.buildDocument(XMLUnit.java:352)
at org.custommonkey.xmlunit.XMLUnit.buildDocument(XMLUnit.java:339)
at org.custommonkey.xmlunit.XMLUnit.buildControlDocument(XMLUnit.java:283)
at org.custommonkey.xmlunit.Diff.<init>(Diff.java:116)
at org.custommonkey.xmlunit.examples.MyXMLTestCase.testXSLTransformation(MyXMLTestCase.java:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at junit.framework.TestCase.runTest(TestCase.java:164)
at junit.framework.TestCase.runBare(TestCase.java:130)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:120)
at junit.framework.TestSuite.runTest(TestSuite.java:230)
at junit.framework.TestSuite.run(TestSuite.java:225)
at org.eclipse.jdt.internal.junit.runner.junit3.JUnit3TestReference.run(JUnit3TestReference.java:130)
at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)
Any ideas?
Content is not allowed in prolog is an error generally
emitted by the Java XML parsers when data is encountered before the <?xml...
declaration. You may inspect the document in a text editor and think
nothing is wrong, but you need to go down to the byte level to understand
the problem. You probably have a character encoding bug.
This code reproduces the problem:
import java.io.*;
import java.nio.charset.Charset;
import javax.xml.parsers.*;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class ContentNotAllowedInProlog {
    // Parses the stream with a no-op handler; throws if the XML is rejected.
    private static void parse(InputStream stream) throws SAXException,
            ParserConfigurationException, IOException {
        SAXParserFactory.newInstance().newSAXParser().parse(stream,
                new DefaultHandler());
    }

    public static void main(String[] args) {
        String[] encodings = { "UTF-8", "UTF-16", "ISO-8859-1" };
        // Try every pair where the declared encoding differs from the actual one.
        for (String actual : encodings) {
            for (String declared : encodings) {
                if (!actual.equals(declared)) {
                    String xml = "<?xml version='1.0' encoding='" + declared
                            + "'?><x/>";
                    byte[] encoded = xml.getBytes(Charset.forName(actual));
                    try {
                        parse(new ByteArrayInputStream(encoded));
                        System.out.println("HIDDEN ERROR! actual:" + actual + " " + xml);
                    } catch (Exception e) {
                        System.out.println(e.getMessage() + " actual:" + actual + " xml:"
                                + xml);
                    }
                }
            }
        }
    }
}
The output:
Content is not allowed in prolog. actual:UTF-8 xml:<?xml version='1.0' encoding='UTF-16'?><x/>
HIDDEN ERROR! actual:UTF-8 <?xml version='1.0' encoding='ISO-8859-1'?><x/>
Content is not allowed in prolog. actual:UTF-16 xml:<?xml version='1.0' encoding='UTF-8'?><x/>
Content is not allowed in prolog. actual:UTF-16 xml:<?xml version='1.0' encoding='ISO-8859-1'?><x/>
HIDDEN ERROR! actual:ISO-8859-1 <?xml version='1.0' encoding='UTF-8'?><x/>
Content is not allowed in prolog. actual:ISO-8859-1 xml:<?xml version='1.0' encoding='UTF-16'?><x/>
This code also highlights another, more insidious
character encoding issue — when we can accidentally encode with one
encoding thinking it is another and everything seems to work.
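To make the "hidden" case concrete, here is a small sketch of what happens to non-ASCII content when the bytes are UTF-8 but the declaration says ISO-8859-1. The class and method names (`HiddenEncodingError`, `parseText`) are made up for this illustration, and the result assumes the JDK's built-in parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class HiddenEncodingError {
    // Parses the bytes and returns the text content of the root element.
    static String parseText(byte[] xmlBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The document contains a single e-acute (U+00E9)...
        String xml = "<?xml version='1.0' encoding='ISO-8859-1'?><x>\u00E9</x>";
        // ...but is written out as UTF-8, so U+00E9 becomes the two bytes C3 A9.
        byte[] utf8 = xml.getBytes(StandardCharsets.UTF_8);

        String text = parseText(utf8);
        StringBuilder codePoints = new StringBuilder();
        for (char c : text.toCharArray()) {
            codePoints.append(String.format("U+%04X ", (int) c));
        }
        // No exception is thrown, but each byte was decoded as its own
        // ISO-8859-1 character.
        System.out.println(codePoints.toString().trim()); // prints "U+00C3 U+00A9"
    }
}
```

The parse succeeds, yet the one-character text has silently become two characters, which is why this kind of mismatch tends to surface much later than the prolog error does.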
When you inspect the data in a hex editor problems become more
apparent.
A valid UTF-16 form:
FF FE 3C 00 3F 00 78 00 6D 00 6C 00 20 00 76 00 __<_?_x_m_l_ _v_
65 00 72 00 73 00 69 00 6F 00 6E 00 3D 00 27 00 e_r_s_i_o_n_=_'_
31 00 2E 00 30 00 27 00 20 00 65 00 6E 00 63 00 1_._0_'_ _e_n_c_
6F 00 64 00 69 00 6E 00 67 00 3D 00 27 00 55 00 o_d_i_n_g_=_'_U_
54 00 46 00 2D 00 31 00 36 00 27 00 3F 00 3E 00 T_F_-_1_6_'_?_>_
3C 00 78 00 2F 00 3E 00                         <_x_/_>_
Note: exact UTF-16 byte forms vary — big-endian,
little-endian, with or without a byte-order-mark. This one is
little-endian with a BOM.
An XML document that declares itself as UTF-16 but is really
UTF-8:
EF BB BF 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E ___<?xml version
3D 27 31 2E 30 27 20 65 6E 63 6F 64 69 6E 67 3D ='1.0' encoding=
27 55 54 46 2D 31 36 27 3F 3E 3C 78 2F 3E       'UTF-16'?><x/>
Note: UTF-8 XML documents can come with or without a
byte-order-mark. This one includes a BOM.
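The leading bytes in the two dumps above are enough to tell the encodings apart programmatically. The following sketch checks only for the UTF-8 and UTF-16 byte-order marks; `BomSniffer` and `encodingFromBom` are hypothetical names, and UTF-32 and BOM-less input are deliberately not handled.

```java
final class BomSniffer {
    // Returns the encoding implied by a leading byte-order mark,
    // or null when the bytes do not start with a UTF-8 or UTF-16 BOM.
    // Note: a UTF-32LE BOM (FF FE 00 00) would be reported as UTF-16LE here.
    static String encodingFromBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] utf16Doc = {(byte) 0xFF, (byte) 0xFE, 0x3C, 0x00}; // first dump
        byte[] utf8Doc = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 0x3C}; // second dump
        byte[] noBom = {0x3C, 0x3F, 0x78, 0x6D};
        System.out.println(encodingFromBom(utf16Doc)); // prints "UTF-16LE"
        System.out.println(encodingFromBom(utf8Doc));  // prints "UTF-8"
        System.out.println(encodingFromBom(noBom));    // prints "null"
    }
}
```

Comparing this result with the encoding named in the XML declaration is one way to catch a mismatch such as the second dump, where the BOM says UTF-8 but the declaration says UTF-16.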
XML, Java and Encodings
- XML 1.0: Autodetection of Character Encodings (Non-Normative)
- Java: a rough guide to character encoding
The code was written and tested against Sun’s win32 Java 1.6.0_17
which uses a version of the Apache Xerces parser internally.
