123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151 |
- <?xml version="1.0" encoding="utf-8"?>
- <!--
-
- h t t :: / / t /
- h t t :: // // t //
- h ttttt ttttt ppppp sssss // // y y sssss ttttt //
- hhhh t t p p s // // y y s t //
- h hh t t ppppp sssss // // yyyyy sssss t //
- h h t t p s :: / / y .. s t .. /
- h h t t p sssss :: / / yyyyy .. sssss t .. /
-
- <https://y.st./>
- Copyright © 2016 Alex Yst <mailto:copyright@y.st>
- This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
- This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
- You should have received a copy of the GNU General Public License
- along with this program. If not, see <https://www.gnu.org./licenses/>.
- -->
- <!DOCTYPE html>
- <html xmlns="http://www.w3.org/1999/xhtml">
- <head>
- <base href="https://y.st./en/weblog/2016/02-February/07.xhtml" />
- <title>Importing functions and constants <https://y.st./en/weblog/2016/02-February/07.xhtml></title>
- <link rel="icon" type="image/png" href="/link/CC_BY-SA_4.0/y.st./icon.png" />
- <link rel="stylesheet" type="text/css" href="/link/basic.css" />
- <link rel="stylesheet" type="text/css" href="/link/site-specific.css" />
- <script type="text/javascript" src="/script/javascript.js" />
- <meta name="viewport" content="width=device-width" />
- </head>
- <body>
- <nav>
- <p>
- <a href="/en/">Home</a> |
- <a href="/en/a/about.xhtml">About</a> |
- <a href="/en/a/contact.xhtml">Contact</a> |
- <a href="/a/canary.txt">Canary</a> |
- <a href="/en/URI_research/"><abbr title="Uniform Resource Identifier">URI</abbr> research</a> |
- <a href="/en/opinion/">Opinions</a> |
- <a href="/en/coursework/">Coursework</a> |
- <a href="/en/law/">Law</a> |
- <a href="/en/a/links.xhtml">Links</a> |
- <a href="/en/weblog/2016/02-February/07.xhtml.asc">{this page}.asc</a>
- </p>
- <hr/>
- <p>
- Weblog index:
- <a href="/en/weblog/"><abbr title="American Standard Code for Information Interchange">ASCII</abbr> calendars</a> |
- <a href="/en/weblog/index_ol_ascending.xhtml">Ascending list</a> |
- <a href="/en/weblog/index_ol_descending.xhtml">Descending list</a>
- </p>
- <hr/>
- <p>
- Jump to entry:
- <a href="/en/weblog/2015/03-March/07.xhtml"><<First</a>
- <a rel="prev" href="/en/weblog/2016/02-February/06.xhtml"><Previous</a>
- <a rel="next" href="/en/weblog/2016/02-February/08.xhtml">Next></a>
- <a href="/en/weblog/latest.xhtml">Latest>></a>
- </p>
- <hr/>
- </nav>
- <header>
- <h1>Importing functions and constants</h1>
- <p>Day 00337: Sunday, 2016 February 07</p>
- </header>
- <p>
- I reworked my Gopher index handler to spit out objects of the <abbr title="Uniform Resource Identifier">URI</abbr> class instead of strings.
- In addition to making the output more usable, this also breaks the Gopher index handler's dependence on <code>\parse_url()</code>, which it had been using as a light-wight validation method (it did not insure that the <abbr title="Uniform Resource Identifier">URI</abbr> was completely valid, but it insured that it was valid enough to be handled by other functions that relied on <code>\parse_url()</code>).
- I then finished work modifying the spider to use my new uri class instead of the the old functions that I merged to create it.
- I thought that it might not work out of the box, as I was unsure if <a href="https://secure.php.net/manual/en/function.in-array.php"><code>\in_array()</code></a> performed strict or loose comparisons, but it looks like it thankfully can do both.
- It defaults to loose comparisons, which is exactly what I need for comparing two uri objects for equivalence.
- </p>
- <p>
- To deal with broken markup that the spider encounters, I though that I was going to have to build my own parser.
- However, it seems that <abbr title="PHP: Hypertext Preprocessor">PHP</abbr> has me covered.
- <a href="https://secure.php.net/manual/en/domdocument.loadhtml.php"><code>\DOMDocument::loadHTML()</code></a> can deal with broken markup.
- Also of use will be <a href="https://secure.php.net/manual/en/domdocument.getelementsbytagname.php"><code>\DOMDocument::getElementsByTagName()</code></a>, which will make it easy to find all the <code><a/></code>, <code><base/></code>, and <code><loc/></code> tags.
- For my website compilation scripts, <a href="https://secure.php.net/manual/en/domdocument.validate.php"><code>\DOMDocument::validate()</code></a> will be very nice, as I can validate my pages as they are generated instead of needing to validate them by hand later.
- </p>
- <p>
- I started work on the updating the spider to use the <code>\DOMDocument</code> class instead of the <code>wrapper\xml</code> class but starting to convert the base URI handler, but it looks like the <code>\DOMDocument</code> class takes care of finding the base <abbr title="Uniform Resource Identifier">URI</abbr> and passes this information on to the <code>\DOMElement</code> class.
- Finding the base <abbr title="Uniform Resource Identifier">URI</abbr> once at the beginning of the page seems like the most efficient way to handle the issue of <code><base/></code> tags, but making use of this feature in the <code>\DOMDocument</code> and <code>\DOMElement</code> classes seems like it might be the more correct option, as it insures that my code is not handling the <code><base/></code> tags directly, so there is less chance of bugs in my code.
- Until I see a way to set different bases for different elements of an <abbr title="Extensible Hypertext Markup Language">XHTML</abbr>/<abbr title="Hypertext Markup Language">HTML</abbr> page, I think that I will use the more efficient option of finding the base once myself instead of constantly instantiating new uri objects for each <code><a/></code> tag.
- After finishing the conversion, I ran into an issue.
- <code>\DOMDocument::loadHTML()</code> throws error when it finds malformed markup, even when I set the options that supposedly disable it! I asked for advice on the issue, but I had to leave quickly.
- </p>
- <p>
- While we were out today, my mother tried to backpedal on something that she had said to me before.
- She now claims that her issue with me carrying a drink only applies when I am alone, not when I am with her.
- I have several issues with this.
- First of all, this is dishonest.
- Before, she specifically mentioned a situation that she did not want me having a drink in which I am only in when I am with her: working in her classroom.
- Second, when I am out on my own is when I need a drink the most.
- I am not giving up being hydrated when she is not even around to care.
- </p>
- <p>
- Cyrus, our mother, and I went out to the hills to gather bullet shells, which she is collecting to sell for the copper, but also finds collecting them to be fun.
- At first, I was very anxious to get back home.
- My spider was down.
- When the spider is running, I can at least figure that it is making progress without me, but as long as that error was keeping the spider out of commission, no progress was being made unless I was making progress myself in fixing the error.
- However, I came to the conclusion that perhaps what I was doing in the hills was more important.
- The money made will likely not even cover the gasoline costs of the drive out there.
- However, it was time spent doing something that she wanted to do.
- She needs something pleasant in her life, seeing as her job is so taxing.
- </p>
- <p>
- When we got back home, I found two responses to my question.
- The first was unhelpful, saying that the only software that I would be able to find that repairs malformed markup like I need it to would be a Web browser.
- This person did not understand the issue.
- The <code>\DOMDocument</code> class was <strong>*already*</strong> repairing the document.
- The issue was that it was also spewing minor errors as it did so.
- The second person's response was much more practical and useful.
- They said to set <a href="https://secure.php.net/manual/en/class.domdocument.php#domdocument.props.recover"><code>\DOMDocument::$recover</code></a> to true and call <a href="https://secure.php.net/manual/en/function.libxml-use-internal-errors.php"><code>\libxml_use_internal_errors()</code></a> with a boolean true before attempting to parse the document, then call <a href="https://secure.php.net/manual/en/function.libxml-clear-errors.php"><code>\libxml_clear_errors()</code></a> after.
- <code>\DOMDocument::$recover</code> did not need to be changed to fix the problem.
- The other two did the trick! However, by setting <code>\DOMDocument::$recover</code> to true anyway, I was able use <a href="https://secure.php.net./manual/en/domdocument.loadxml.php"><code>\DOMDocument::loadXML()</code></a> instead of <code>\DOMDocument::loadHTML()</code>, which prevented some errors generated during the parsing of perfectly well-formed sitemaps, as <code>\DOMDocument::loadHTML()</code> was complaining about elements that are not present in <abbr title="Hypertext Markup Language">HTML</abbr>.
- </p>
- <p>
- I have been a bit frustrated because I have not been able to import functions and constants into the current name space.
- It is not the missing feature that has been bothering me either, it is the fact that it has not worked on my copy of <abbr title="PHP: Hypertext Preprocessor">PHP</abbr>, but the feature supposedly was added in a version slightly older than mine.
- So if the feature was made available prior to my version, where is it? Well, it turns out that I need to pay more attention when reading the manual.
- I assumed that the syntax for importing functions and constants was the same as that for importing classes and namespaces; the reason for this assumption being that the syntax for importing a name space is identical to the syntax for importing a class.
- In fact, if a class and a namespace shared a name, I think that it would be impossible to import one without importing the other at the same time.
- However, it seems that for functions and constants, it is <a href="http://docs.php.net/manual/en/migration56.new-features.php#migration56.new-features.use">a little different</a>.
- Tomorrow, I will go through and clean up the spider code to use proper imports.
- While I am at it, I should clean up the settings file and the code that pulls values from it.
- </p>
- <hr/>
- <p>
- Copyright © 2016 Alex Yst;
- You may modify and/or redistribute this document under the terms of the <a rel="license" href="/license/gpl-3.0-standalone.xhtml"><abbr title="GNU's Not Unix">GNU</abbr> <abbr title="General Public License version Three or later">GPLv3+</abbr></a>.
- If for some reason you would prefer to modify and/or distribute this document under other free copyleft terms, please ask me via email.
- My address is in the source comments near the top of this document.
- This license also applies to embedded content such as images.
- For more information on that, see <a href="/en/a/licensing.xhtml">licensing</a>.
- </p>
- <p>
- <abbr title="World Wide Web Consortium">W3C</abbr> standards are important.
- This document conforms to the <a href="https://validator.w3.org./nu/?doc=https%3A%2F%2Fy.st.%2Fen%2Fweblog%2F2016%2F02-February%2F07.xhtml"><abbr title="Extensible Hypertext Markup Language">XHTML</abbr> 5.1</a> specification and uses style sheets that conform to the <a href="http://jigsaw.w3.org./css-validator/validator?uri=https%3A%2F%2Fy.st.%2Fen%2Fweblog%2F2016%2F02-February%2F07.xhtml"><abbr title="Cascading Style Sheets">CSS</abbr>3</a> specification.
- </p>
- </body>
- </html>
|