<?xml version="1.0" encoding="utf-8"?>
<!--
h t t :: / / t /
h t t :: // // t //
h ttttt ttttt ppppp sssss // // y y sssss ttttt //
hhhh t t p p s // // y y s t //
h hh t t ppppp sssss // // yyyyy sssss t //
h h t t p s :: / / y .. s t .. /
h h t t p sssss :: / / yyyyy .. sssss t .. /
<https://y.st./>
Copyright © 2016 Alex Yst <mailto:copyright@y.st>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org./licenses/>.
-->
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<base href="https://y.st./en/weblog/2016/02-February/07.xhtml" />
<title>Importing functions and constants &lt;https://y.st./en/weblog/2016/02-February/07.xhtml&gt;</title>
<link rel="icon" type="image/png" href="/link/CC_BY-SA_4.0/y.st./icon.png" />
<link rel="stylesheet" type="text/css" href="/link/basic.css" />
<link rel="stylesheet" type="text/css" href="/link/site-specific.css" />
<script type="text/javascript" src="/script/javascript.js" />
<meta name="viewport" content="width=device-width" />
</head>
<body>
<nav>
<p>
<a href="/en/">Home</a> |
<a href="/en/a/about.xhtml">About</a> |
<a href="/en/a/contact.xhtml">Contact</a> |
<a href="/a/canary.txt">Canary</a> |
<a href="/en/URI_research/"><abbr title="Uniform Resource Identifier">URI</abbr> research</a> |
<a href="/en/opinion/">Opinions</a> |
<a href="/en/coursework/">Coursework</a> |
<a href="/en/law/">Law</a> |
<a href="/en/a/links.xhtml">Links</a> |
<a href="/en/weblog/2016/02-February/07.xhtml.asc">{this page}.asc</a>
</p>
<hr/>
<p>
Weblog index:
<a href="/en/weblog/"><abbr title="American Standard Code for Information Interchange">ASCII</abbr> calendars</a> |
<a href="/en/weblog/index_ol_ascending.xhtml">Ascending list</a> |
<a href="/en/weblog/index_ol_descending.xhtml">Descending list</a>
</p>
<hr/>
<p>
Jump to entry:
<a href="/en/weblog/2015/03-March/07.xhtml">&lt;&lt;First</a>
<a rel="prev" href="/en/weblog/2016/02-February/06.xhtml">&lt;Previous</a>
<a rel="next" href="/en/weblog/2016/02-February/08.xhtml">Next&gt;</a>
<a href="/en/weblog/latest.xhtml">Latest&gt;&gt;</a>
</p>
<hr/>
</nav>
<header>
<h1>Importing functions and constants</h1>
<p>Day 00337: Sunday, 2016 February 07</p>
</header>
<p>
I reworked my Gopher index handler to spit out objects of the <abbr title="Uniform Resource Identifier">URI</abbr> class instead of strings.
In addition to making the output more usable, this also breaks the Gopher index handler&apos;s dependence on <code>\parse_url()</code>, which it had been using as a lightweight validation method (it did not ensure that the <abbr title="Uniform Resource Identifier">URI</abbr> was completely valid, but it ensured that it was valid enough to be handled by other functions that relied on <code>\parse_url()</code>).
I then finished modifying the spider to use my new uri class instead of the old functions that I merged to create it.
I thought that it might not work out of the box, as I was unsure whether <a href="https://secure.php.net/manual/en/function.in-array.php"><code>\in_array()</code></a> performed strict or loose comparisons, but thankfully, it looks like it can do both.
It defaults to loose comparisons, which is exactly what I need for comparing two uri objects for equivalence.
</p>
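<p>
A minimal sketch of the difference between the two comparison modes, assuming a hypothetical <code>example_uri</code> class that stands in for the real uri class, which is not shown in this entry:
</p>
<pre><code>&lt;?php
// Hypothetical stand-in for the uri class; the real class is not shown here.
class example_uri {
    public $scheme;
    public $host;
    public $path;
    public function __construct($scheme, $host, $path) {
        $this-&gt;scheme = $scheme;
        $this-&gt;host = $host;
        $this-&gt;path = $path;
    }
}

// Two separate instances that hold the same property values.
$seen = array(new example_uri("gopher", "example.org", "/1/"));
$candidate = new example_uri("gopher", "example.org", "/1/");

// Loose (the default): same class and equal property values count as a match.
$loose = \in_array($candidate, $seen);
// Strict: only the very same object instance counts as a match.
$strict = \in_array($candidate, $seen, true);

\var_dump($loose, $strict);
</code></pre>
<p>
Here, <code>$loose</code> is true and <code>$strict</code> is false, because the two objects are equal in value but are not the same instance.
</p>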
<p>
To deal with broken markup that the spider encounters, I thought that I was going to have to build my own parser.
However, it seems that <abbr title="PHP: Hypertext Preprocessor">PHP</abbr> has me covered.
<a href="https://secure.php.net/manual/en/domdocument.loadhtml.php"><code>\DOMDocument::loadHTML()</code></a> can deal with broken markup.
Also of use will be <a href="https://secure.php.net/manual/en/domdocument.getelementsbytagname.php"><code>\DOMDocument::getElementsByTagName()</code></a>, which will make it easy to find all the <code>&lt;a/&gt;</code>, <code>&lt;base/&gt;</code>, and <code>&lt;loc/&gt;</code> tags.
For my website compilation scripts, <a href="https://secure.php.net/manual/en/domdocument.validate.php"><code>\DOMDocument::validate()</code></a> will be very nice, as I can validate my pages as they are generated instead of needing to validate them by hand later.
</p>
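<p>
A minimal sketch of collecting those elements from markup with missing closing tags.
The sample markup and the variable names are assumptions made for illustration, not the spider&apos;s actual code.
</p>
<pre><code>&lt;?php
// Illustrative input: the &lt;p&gt; elements are never closed, and neither are &lt;body&gt; or &lt;html&gt;.
$markup = &apos;&lt;html&gt;&lt;head&gt;&lt;base href="https://example.org/dir/"&gt;&lt;/head&gt;&apos;
    . &apos;&lt;body&gt;&lt;p&gt;&lt;a href="one.html"&gt;One&lt;/a&gt;&lt;p&gt;&lt;a href="/two.html"&gt;Two&lt;/a&gt;&apos;;

$document = new \DOMDocument();
// libxml repairs the markup as it parses; depending on the input, it may also report warnings.
$document-&gt;loadHTML($markup);

// Take the href of the first &lt;base/&gt; element, if there is one.
$base = null;
$bases = $document-&gt;getElementsByTagName("base");
if($bases-&gt;length &gt; 0) {
    $base = $bases-&gt;item(0)-&gt;getAttribute("href");
}

// Collect the raw href value of every &lt;a/&gt; element that has one.
$hrefs = array();
foreach($document-&gt;getElementsByTagName("a") as $anchor) {
    if($anchor-&gt;hasAttribute("href")) {
        $hrefs[] = $anchor-&gt;getAttribute("href");
    }
}

\var_dump($base, $hrefs);
</code></pre>
<p>
The collected values are still the raw attribute strings; resolving them against the base is a separate step.
</p>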
<p>
I started work on updating the spider to use the <code>\DOMDocument</code> class instead of the <code>wrapper\xml</code> class by starting to convert the base <abbr title="Uniform Resource Identifier">URI</abbr> handler, but it looks like the <code>\DOMDocument</code> class takes care of finding the base <abbr title="Uniform Resource Identifier">URI</abbr> and passes this information on to the <code>\DOMElement</code> class.
Finding the base <abbr title="Uniform Resource Identifier">URI</abbr> once at the beginning of the page seems like the most efficient way to handle the issue of <code>&lt;base/&gt;</code> tags, but making use of this feature in the <code>\DOMDocument</code> and <code>\DOMElement</code> classes seems like it might be the more correct option, as it ensures that my code is not handling the <code>&lt;base/&gt;</code> tags directly, so there is less chance of bugs in my code.
Until I see a way to set different bases for different elements of an <abbr title="Extensible Hypertext Markup Language">XHTML</abbr>/<abbr title="Hypertext Markup Language">HTML</abbr> page, I think that I will use the more efficient option of finding the base once myself instead of constantly instantiating new uri objects for each <code>&lt;a/&gt;</code> tag.
After finishing the conversion, I ran into an issue.
<code>\DOMDocument::loadHTML()</code> throws errors when it finds malformed markup, even when I set the options that supposedly disable them! I asked for advice on the issue, but I had to leave quickly.
</p>
<p>
While we were out today, my mother tried to backpedal on something that she had said to me before.
She now claims that her issue with me carrying a drink only applies when I am alone, not when I am with her.
I have several issues with this.
First of all, this is dishonest.
Before, she specifically mentioned a situation in which she did not want me having a drink, and it is one that I am only ever in when I am with her: working in her classroom.
Second, when I am out on my own is when I need a drink the most.
I am not giving up being hydrated when she is not even around to care.
</p>
<p>
Cyrus, our mother, and I went out to the hills to gather bullet shells, which she is collecting to sell for the copper, though she also finds collecting them fun.
At first, I was very anxious to get back home.
My spider was down.
When the spider is running, I can at least figure that it is making progress without me, but as long as that error was keeping the spider out of commission, no progress was being made unless I was making progress myself in fixing the error.
However, I came to the conclusion that perhaps what I was doing in the hills was more important.
The money made will likely not even cover the gasoline costs of the drive out there.
However, it was time spent doing something that she wanted to do.
She needs something pleasant in her life, seeing as her job is so taxing.
</p>
<p>
When we got back home, I found two responses to my question.
The first was unhelpful, saying that the only software that I would be able to find that repairs malformed markup like I need it to would be a Web browser.
This person did not understand the issue.
The <code>\DOMDocument</code> class was <strong>*already*</strong> repairing the document.
The issue was that it was also spewing minor errors as it did so.
The second person&apos;s response was much more practical and useful.
They said to set <a href="https://secure.php.net/manual/en/class.domdocument.php#domdocument.props.recover"><code>\DOMDocument::$recover</code></a> to true and call <a href="https://secure.php.net/manual/en/function.libxml-use-internal-errors.php"><code>\libxml_use_internal_errors()</code></a> with a boolean true before attempting to parse the document, then call <a href="https://secure.php.net/manual/en/function.libxml-clear-errors.php"><code>\libxml_clear_errors()</code></a> after.
<code>\DOMDocument::$recover</code> did not need to be changed to fix the problem.
The other two did the trick! However, by setting <code>\DOMDocument::$recover</code> to true anyway, I was able to use <a href="https://secure.php.net./manual/en/domdocument.loadxml.php"><code>\DOMDocument::loadXML()</code></a> instead of <code>\DOMDocument::loadHTML()</code>, which prevented some errors generated during the parsing of perfectly well-formed sitemaps, as <code>\DOMDocument::loadHTML()</code> was complaining about elements that are not present in <abbr title="Hypertext Markup Language">HTML</abbr>.
</p>
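<p>
The suggested sequence can be sketched roughly as follows.
The sample markup, the sample sitemap, and the variable names are assumptions made for illustration; restoring the previous error-handling setting at the end is an optional extra step, not part of the advice above.
</p>
<pre><code>&lt;?php
// Buffer libxml complaints internally instead of raising them as warnings.
$previous = \libxml_use_internal_errors(true);

// Malformed markup: unclosed elements, plus a tag that is not part of HTML.
$page = new \DOMDocument();
$page-&gt;loadHTML(&apos;&lt;p&gt;Unclosed &lt;a href="/one"&gt;link&lt;loc&gt;not HTML&lt;/loc&gt;&apos;);

// A well-formed sitemap, parsed as XML so that its elements are not treated as bad HTML.
$sitemap = new \DOMDocument();
// Ask libxml to recover from errors where it can.
$sitemap-&gt;recover = true;
$sitemap-&gt;loadXML(
    &apos;&lt;urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"&gt;&apos;
    . &apos;&lt;url&gt;&lt;loc&gt;https://example.org/&lt;/loc&gt;&lt;/url&gt;&apos;
    . &apos;&lt;url&gt;&lt;loc&gt;https://example.org/page&lt;/loc&gt;&lt;/url&gt;&apos;
    . &apos;&lt;/urlset&gt;&apos;
);

// Gather the text of every &lt;loc/&gt; element in the sitemap.
$locations = array();
foreach($sitemap-&gt;getElementsByTagName("loc") as $element) {
    $locations[] = $element-&gt;textContent;
}

// Discard whatever was buffered, then restore the earlier setting.
\libxml_clear_errors();
\libxml_use_internal_errors($previous);

\var_dump($locations);
</code></pre>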
<p>
I have been a bit frustrated because I have not been able to import functions and constants into the current namespace.
It is not the missing feature that has been bothering me either; it is the fact that it has not worked on my copy of <abbr title="PHP: Hypertext Preprocessor">PHP</abbr>, even though the feature was supposedly added in a version slightly older than mine.
So if the feature was made available prior to my version, where is it? Well, it turns out that I need to pay more attention when reading the manual.
I assumed that the syntax for importing functions and constants was the same as that for importing classes and namespaces; the reason for this assumption being that the syntax for importing a namespace is identical to the syntax for importing a class.
In fact, if a class and a namespace shared a name, I think that it would be impossible to import one without importing the other at the same time.
However, it seems that for functions and constants, it is <a href="http://docs.php.net/manual/en/migration56.new-features.php#migration56.new-features.use">a little different</a>.
Tomorrow, I will go through and clean up the spider code to use proper imports.
While I am at it, I should clean up the settings file and the code that pulls values from it.
</p>
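<p>
A minimal sketch of the three import forms, using the <code>use function</code> and <code>use const</code> syntax introduced in <abbr title="PHP: Hypertext Preprocessor">PHP</abbr> 5.6.
The <code>example\library</code> and <code>example\application</code> namespaces and everything inside them are hypothetical names made up for illustration.
</p>
<pre><code>&lt;?php
// Hypothetical library namespace that provides a constant, a function, and a class.
namespace example\library {
    const GREETING = "Hello";
    function greet($name) {
        return GREETING . ", " . $name . "!";
    }
    class greeter {}
}

// Hypothetical application namespace that imports all three.
namespace example\application {
    // Classes and namespaces are imported with a plain "use".
    use example\library\greeter;
    // Functions and constants each need their own keyword.
    use function example\library\greet;
    use const example\library\GREETING;

    $object = new greeter();
    $message = greet("world");
    $constant = GREETING;

    echo $message, "\n";
}
</code></pre>
<p>
Without the <code>function</code> and <code>const</code> keywords, the last two imports would be treated as class or namespace imports, and the unqualified calls would not resolve to the library&apos;s function and constant.
</p>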
<hr/>
<p>
Copyright © 2016 Alex Yst;
You may modify and/or redistribute this document under the terms of the <a rel="license" href="/license/gpl-3.0-standalone.xhtml"><abbr title="GNU&apos;s Not Unix">GNU</abbr> <abbr title="General Public License version Three or later">GPLv3+</abbr></a>.
If for some reason you would prefer to modify and/or distribute this document under other free copyleft terms, please ask me via email.
My address is in the source comments near the top of this document.
This license also applies to embedded content such as images.
For more information on that, see <a href="/en/a/licensing.xhtml">licensing</a>.
</p>
<p>
<abbr title="World Wide Web Consortium">W3C</abbr> standards are important.
This document conforms to the <a href="https://validator.w3.org./nu/?doc=https%3A%2F%2Fy.st.%2Fen%2Fweblog%2F2016%2F02-February%2F07.xhtml"><abbr title="Extensible Hypertext Markup Language">XHTML</abbr> 5.1</a> specification and uses style sheets that conform to the <a href="http://jigsaw.w3.org./css-validator/validator?uri=https%3A%2F%2Fy.st.%2Fen%2Fweblog%2F2016%2F02-February%2F07.xhtml"><abbr title="Cascading Style Sheets">CSS</abbr>3</a> specification.
</p>
</body>
</html>