@@ -1219,20 +1219,16 @@ it. ::
12191219 >>> with urllib.request.urlopen('http://www.python.org/') as f:
12201220 ... print(f.read(300))
12211221 ...
1222- b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1223- "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n\n\n<html
1224- xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">\n\n<head>\n
1225- <meta http-equiv="content-type" content="text/html; charset=utf-8" />\n
1226- <title>Python Programming '
1222+ b'<!doctype html>\n<!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->\n<!--[if IE 7]> <html class="no-js ie7 lt-ie8 lt-ie9"> <![endif]-->\n<!--[if IE 8]> <html class="no-js ie8 lt-ie9">
12271223
12281224Note that reading from the object returned by urlopen yields a bytes object. This is because there is no way
12291225for urlopen to automatically determine the encoding of the byte stream
12301226it receives from the HTTP server. In general, a program will decode
12311227the returned bytes object to string once it determines or guesses
12321228the appropriate encoding.
12331229
1234- The following W3C document, https://www.w3.org/International/O-charset\ , lists
1235- the various ways in which an (X)HTML or an XML document could have specified its
1230+ The following W3C document, https://www.w3.org/International/questions/qa-html-encoding-declarations\ , lists
1231+ the various ways in which an HTML document could have specified its
12361232encoding information.
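
One place the encoding may be declared is the ``charset`` parameter of the
HTTP ``Content-Type`` header. The following is a minimal sketch, not part of
the original text, of decoding with that value when it is present. The helper
names ``decode_body`` and ``fetch_text`` and the UTF-8 fallback are
assumptions made for illustration.

```python
import urllib.request
from email.message import Message


def decode_body(body: bytes, headers: Message, fallback: str = "utf-8") -> str:
    """Decode body using the charset from Content-Type, if one is declared."""
    # get_content_charset() returns None when no charset parameter is present.
    charset = headers.get_content_charset() or fallback
    # errors="replace" keeps a wrong guess from raising UnicodeDecodeError.
    return body.decode(charset, errors="replace")


def fetch_text(url: str, fallback: str = "utf-8") -> str:
    """Hypothetical helper: fetch url and return its body as a str."""
    with urllib.request.urlopen(url) as f:
        # f.headers holds the response headers as a Message-like object.
        return decode_body(f.read(), f.headers, fallback)
```

If the server declares no charset, or declares the wrong one, the program
still has to inspect the document itself (for example its meta tag) or guess.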
12371233
12381234As the python.org website uses *utf-8* encoding as specified in its meta tag, we
@@ -1241,17 +1237,19 @@ will use the same for decoding the bytes object. ::
12411237 >>> with urllib.request.urlopen('http://www.python.org/') as f:
12421238 ... print(f.read(100).decode('utf-8'))
12431239 ...
1244- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1245- "http://www.w3.org/TR/xhtml1/DTD/xhtm
1240+ <!doctype html>
1241+ <!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
1242+ <!-
12461243
12471244It is also possible to achieve the same result without using the
12491245:term:`context manager` approach. ::
12491246
12501247 >>> import urllib.request
12511248 >>> f = urllib.request.urlopen('http://www.python.org/')
12521249 >>> print(f.read(100).decode('utf-8'))
1253- <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
1254- "http://www.w3.org/TR/xhtml1/DTD/xhtm
1250+ <!doctype html>
1251+ <!--[if lt IE 7]> <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9"> <![endif]-->
1252+ <!--
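
Without the context manager, nothing closes the response automatically. A
minimal sketch of one way to handle that, assuming a hypothetical helper named
``read_prefix``, is to call ``close()`` in a ``finally`` clause:

```python
import urllib.request


def read_prefix(url: str, size: int = 100, encoding: str = "utf-8") -> str:
    """Hypothetical helper: return the first size bytes of url, decoded."""
    f = urllib.request.urlopen(url)
    try:
        # A fixed-size read may end in the middle of a multi-byte character,
        # so errors="replace" is used instead of the default strict mode.
        return f.read(size).decode(encoding, errors="replace")
    finally:
        # Release the underlying connection even if read() or decode() fails.
        f.close()
```

This mirrors what the ``with`` statement does on exit, which is why the
context manager form is usually the shorter choice.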
12551253
12561254In the following example, we are sending a data-stream to the stdin of a CGI
12571255and reading the data it returns to us. Note that this example will only work