{"id":1261,"date":"2025-10-20T19:29:44","date_gmt":"2025-10-20T11:29:44","guid":{"rendered":"https:\/\/www.zhaozhao123.cn\/php\/my1js\/1261.html"},"modified":"2025-10-20T19:29:45","modified_gmt":"2025-10-20T11:29:45","slug":"symfony-domcrawler-%e5%ae%89%e8%a3%85%e7%94%a8%e6%b3%95%e5%8f%8a%e6%a0%b8%e5%bf%83%e5%8a%9f%e8%83%bd%e8%af%a6%e8%a7%a3","status":"publish","type":"my1js","link":"https:\/\/www.zhaozhao123.cn\/php\/my1js\/1261.html","title":{"rendered":"Symfony DomCrawler: Installation, Usage, and Core Features"},"content":{"rendered":"<p>Symfony DomCrawler is a PHP component for parsing and manipulating HTML\/XML documents. It supports XPath and CSS selectors and can integrate an HTML5 parser. This tutorial covers installation, core features, and practical use cases.<\/p><h2 class=\"wp-block-heading\">Background<\/h2><p>This tutorial was finalized on April 27, 2025 and is compiled from the official Symfony DomCrawler documentation and online tutorials available at that date. Methods not mentioned here are not recommended (or should be checked and verified before use).<\/p><p>Several doubtful points have been verified for this tutorial and are annotated with notes.<\/p><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">I. Installation<\/h2><p>Install the component with Composer:<\/p><pre class=\"wp-block-code\"><code>composer require symfony\/dom-crawler<\/code><\/pre><p>To enable HTML5 parsing (recommended):<\/p><pre class=\"wp-block-code\"><code>composer require 
masterminds\/html5<\/code><\/pre><p>To use CSS selectors (recommended):<\/p><pre class=\"wp-block-code\"><code>composer require symfony\/css-selector<\/code><\/pre><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">II. Basic Usage<\/h2><h3 class=\"wp-block-heading\">1. Initializing the Crawler<\/h3><pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\Crawler;\n\n$html = &lt;&lt;&lt;HTML\n&lt;!DOCTYPE html&gt;\n&lt;html&gt;\n  &lt;body&gt;\n    &lt;h1 class=\"title\"&gt;Symfony DomCrawler&lt;\/h1&gt;\n  &lt;\/body&gt;\n&lt;\/html&gt;\nHTML;\n\n$crawler = new Crawler($html);<\/code><\/pre><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">III. Filtering Nodes<\/h2><h3 class=\"wp-block-heading\">1. CSS Selectors<\/h3><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Note: using CSS selectors with Symfony DomCrawler requires installing <code>symfony\/css-selector<\/code> separately.<\/p>\n<\/blockquote><pre class=\"wp-block-code\"><code>$titles = $crawler-&gt;filter('h1.title');<\/code><\/pre><h3 class=\"wp-block-heading\">2. 
XPath Expressions<\/h3><pre class=\"wp-block-code\"><code>$titles = $crawler-&gt;filterXPath('\/\/h1[contains(@class, \"title\")]');<\/code><\/pre><h3 class=\"wp-block-heading\">3. Filtering with More Complex Criteria<\/h3><p>An anonymous function can be used to filter with more complex criteria:<\/p><pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\Crawler;\n\/\/ ...\n\n$crawler = $crawler\n    -&gt;filter('body &gt; p')\n    -&gt;reduce(function (Crawler $node, $i): bool {\n        \/\/ filters every other node\n        return ($i % 2) === 0;\n    });<\/code><\/pre><p>To remove a node, the anonymous function must return <code>false<\/code>.<\/p><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>All filter methods return a new Crawler instance containing the filtered content. To check whether a filter actually found something, test <code>$crawler-&gt;count() &gt; 0<\/code>.<\/p>\n<\/blockquote><hr class=\"wp-block-separator has-alpha-channel-opacity\"><p>Both filterXPath() and filter() work with XML namespaces, which can be either discovered automatically or registered explicitly.<\/p><h2 class=\"wp-block-heading\">IV. Traversing Nodes<\/h2><p>Access a node by its position in the list:<\/p><pre class=\"wp-block-code\"><code>$crawler-&gt;filter('body &gt; p')-&gt;eq(0);<\/code><\/pre><p>Get the first or last node of the current selection:<\/p><pre 
class=\"wp-block-code\"><code>$crawler-&gt;filter('body &gt; p')-&gt;first(); \/\/ Note: call first() with empty parentheses; it does not accept a selector for further narrowing\n$crawler-&gt;filter('body &gt; p')-&gt;last(); \/\/ Note: call last() with empty parentheses; it does not accept a selector for further narrowing<\/code><\/pre><p>Get the nodes at the same level as the current selection:<\/p><pre class=\"wp-block-code\"><code>$crawler-&gt;filter('body &gt; p')-&gt;siblings(); \/\/ Note: call siblings() with empty parentheses; it does not accept a selector for further narrowing<\/code><\/pre><p>Get the same-level nodes after or before the current selection:<\/p><pre class=\"wp-block-code\"><code>$crawler-&gt;filter('body &gt; p')-&gt;nextAll(); \/\/ Note: call nextAll() with empty parentheses; it does not accept a selector for further narrowing\n$crawler-&gt;filter('body &gt; p')-&gt;previousAll(); \/\/ Note: call previousAll() with empty parentheses; it does not accept a selector for further narrowing<\/code><\/pre><p>Get all child or ancestor nodes:<\/p><pre class=\"wp-block-code\"><code>$crawler-&gt;filter('body')-&gt;children();\n$crawler-&gt;filter('body &gt; p')-&gt;ancestors();\n\/\/ Note: call ancestors() with empty parentheses; it does not accept a selector for further narrowing. Also, ancestors() 
returns all ancestor nodes (not only the direct parent)<\/code><\/pre><p>Get all direct child nodes that match a CSS selector:<\/p><pre class=\"wp-block-code\"><code>$crawler-&gt;filter('body')-&gt;children('p.lorem');<\/code><\/pre><p>Get the first ancestor (walking toward the document root) that matches the given selector:<\/p><pre class=\"wp-block-code\"><code>$crawler-&gt;closest('p.lorem');<\/code><\/pre><p>An example to help understand the closest() method:<\/p><pre class=\"wp-block-code\"><code>$html = &lt;&lt;&lt;'HTML'\n&lt;div class=\"parent\"&gt;\n    &lt;div class=\"b1\"&gt;\u7b2c\u4e00\u4e2a\u5757&lt;\/div&gt;\n    &lt;div class=\"b2\"&gt;\n        &lt;h2&gt;\u7b2c\u4e8c\u4e2a\u5757&lt;\/h2&gt;\n        &lt;div class=\"ddf\"&gt;\u540c\u7ea7\u8282\u70b91&lt;\/div&gt;\n        &lt;div class=\"other\"&gt;\u4e0d\u5e94\u8be5\u51fa\u73b0\u7684&lt;span&gt;\u8282\u70b9(\u4e0a)&lt;\/span&gt;&lt;\/div&gt;\n        &lt;div class=\"current-node\"&gt;\u5f53\u524d\u8282\u70b9&lt;\/div&gt;\n        &lt;div class=\"other\"&gt;\u4e0d\u5e94\u8be5\u51fa\u73b0\u7684&lt;span class=\"dd\"&gt;\u8282\u70b9\uff08\u4e0b\uff09&lt;\/span&gt;&lt;\/div&gt;\n        &lt;div class=\"ddf\"&gt;\u540c\u7ea7\u8282\u70b92&lt;\/div&gt;\n    &lt;\/div&gt;\n    &lt;div class=\"b2\"&gt;\n        &lt;h2&gt;\u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757&lt;\/h2&gt;\n        &lt;div class=\"ddf\"&gt;\u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u540c\u7ea7\u8282\u70b91&lt;\/div&gt;\n        &lt;div class=\"other\"&gt;\u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u4e0d\u5e94\u8be5\u51fa\u73b0\u7684&lt;span&gt;\u8282\u70b9(\u4e0a)&lt;\/span&gt;&lt;\/div&gt;\n        &lt;div 
class=\"current-node\"&gt;\u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u5f53\u524d\u8282\u70b9&lt;\/div&gt;\n        &lt;div class=\"other\"&gt;\u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u4e0d\u5e94\u8be5\u51fa\u73b0\u7684&lt;span class=\"dd\"&gt;\u8282\u70b9\uff08\u4e0b\uff09&lt;\/span&gt;&lt;\/div&gt;\n        &lt;div class=\"ddf\"&gt;\u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u540c\u7ea7\u8282\u70b92&lt;\/div&gt;\n    &lt;\/div&gt;\n&lt;\/div&gt;\nHTML;\n\n$crawler = new Crawler($html);\n\n\/\/ Example 1\n$currentNode = $crawler-&gt;filter('.b2');\n$siblingTexts = $currentNode-&gt;each(function (Crawler $node, $i) {\n    return $node-&gt;closest('.dd')-&gt;text();\n});\nprint_r($siblingTexts);\n\/\/ Output: fatal error (Uncaught Error: Call to a member function text() on null, raised at \"return $node-&gt;closest('.dd')\")\n\n\/\/ Example 2\n$currentNode = $crawler-&gt;filter('.current-node');\n$siblingTexts = $currentNode-&gt;each(function (Crawler $node, $i) {\n    return $node-&gt;closest('.b2')-&gt;text();\n});\nprint_r($siblingTexts);\n\/\/ Output: Array ( [0] =&gt; \u7b2c\u4e8c\u4e2a\u5757 \u540c\u7ea7\u8282\u70b91 \u4e0d\u5e94\u8be5\u51fa\u73b0\u7684\u8282\u70b9(\u4e0a) \u5f53\u524d\u8282\u70b9 \u4e0d\u5e94\u8be5\u51fa\u73b0\u7684\u8282\u70b9\uff08\u4e0b\uff09 \u540c\u7ea7\u8282\u70b92 [1] =&gt; \u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757 \u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u540c\u7ea7\u8282\u70b91 \u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u4e0d\u5e94\u8be5\u51fa\u73b0\u7684\u8282\u70b9(\u4e0a) \u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u5f53\u524d\u8282\u70b9 \u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u4e0d\u5e94\u8be5\u51fa\u73b0\u7684\u8282\u70b9\uff08\u4e0b\uff09 \u6a21\u4eff\u7b2c\u4e8c\u4e2a\u5757\uff1a\u540c\u7ea7\u8282\u70b92 )\n\n\n\/**\n * Summary:\n * $node-&gt;closest('.b2')\n * closest() 
requires a CSS selector that matches the current selection ($node) itself or one of its ancestors\n * *\/<\/code><\/pre><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>All traversal methods return a new Crawler instance.<\/p>\n\n\n\n<p>The notes above come from the site owner's own testing.<\/p>\n<\/blockquote><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">V. Accessing Node Values<\/h2><p class=\"has-vivid-red-color has-text-color has-link-color wp-elements-7405b6cbf1806764b7489e14ba5b304b\"><strong>Access the node name (HTML tag name) of the first node of the current selection (e.g. \"p\" or \"div\"):<\/strong><\/p><pre class=\"wp-block-code\"><code>\/\/ returns the node name (HTML tag name) of the first child element under body\n$tag = $crawler-&gt;filterXPath('\/\/body\/*')-&gt;nodeName();<\/code><\/pre><p class=\"has-vivid-red-color has-text-color has-link-color wp-elements-b32ba9e20c06ae770d6bff7fa62caba6\"><strong>Access the value of the first node of the current selection:<\/strong><\/p><pre class=\"wp-block-code\"><code>\/\/ if the node does not exist, calling text() will result in an exception\n$message = $crawler-&gt;filterXPath('\/\/body\/p')-&gt;text();\n\n\/\/ avoid the exception when the node does not exist: pass text() 
a default value as its argument\n$message = $crawler-&gt;filterXPath('\/\/body\/p')-&gt;text('Default text content');\n\n\/\/ by default, text() trims whitespace characters, including internal ones\n\/\/ (e.g. \"  foo\\n  bar    baz \\n \" is returned as \"foo bar baz\")\n\/\/ pass false as the second argument to return the original text unchanged\n$crawler-&gt;filterXPath('\/\/body\/p')-&gt;text('Default text content', false);\n\n\/\/ innerText() is similar to text() but returns only the text that is a direct descendant of the current node, excluding text in child nodes\n\/\/ if the content is &lt;p&gt;Foo &lt;span&gt;Bar&lt;\/span&gt;&lt;\/p&gt; or &lt;p&gt;&lt;span&gt;Bar&lt;\/span&gt; Foo&lt;\/p&gt;,\n\/\/ innerText() returns 'Foo' in both cases; text() returns 'Foo Bar' and 'Bar Foo' respectively\n$text = $crawler-&gt;filterXPath('\/\/body\/p')-&gt;innerText();\n\n\/\/ if there are multiple text nodes separated by other child nodes, e.g. &lt;p&gt;Foo &lt;span&gt;Bar&lt;\/span&gt; Baz&lt;\/p&gt;, innerText() returns only the first text node 'Foo'\n\/\/ like 
text(), innerText() also trims whitespace by default, but you can pass false as the argument to get the unchanged text\n$text = $crawler-&gt;filterXPath('\/\/body\/p')-&gt;innerText(false);<\/code><\/pre><p class=\"has-vivid-red-color has-text-color has-link-color wp-elements-c62d82046dbed7b0cf1d7e1847659dc3\"><strong>Access an attribute value of the first node of the current selection<\/strong>:<\/p><pre class=\"wp-block-code\"><code>$class = $crawler-&gt;filterXPath('\/\/body\/p')-&gt;attr('class');<\/code><\/pre><p class=\"has-vivid-red-color has-text-color has-link-color wp-elements-2cf26701d0d5a2cb1cf938856f4f67ae\"><strong>You can define a default value to use when the node or attribute is empty through the second argument of the <\/strong><code>attr()<\/code><strong> method:<\/strong><\/p><pre class=\"wp-block-code\"><code>$class = $crawler-&gt;filterXPath('\/\/body\/p')-&gt;attr('class', 'my-default-class');<\/code><\/pre><p class=\"has-vivid-red-color has-text-color has-link-color wp-elements-9c2fcce2331b32320ab8d8cdc131ee94\"><strong>Extract attribute and\/or node values from the list of nodes:<\/strong><\/p><pre class=\"wp-block-code\"><code>$attributes = $crawler\n    -&gt;filterXPath('\/\/body\/p')\n    
-&gt;extract(['_name', '_text', 'class'])\n;<\/code><\/pre><p>The special attribute <code>_text<\/code> represents the node value, while <code>_name<\/code> represents the element name (the HTML tag name).<\/p><p>Call an anonymous function on each node of the list:<\/p><pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\Crawler;\n\/\/ ...\n\n$nodeValues = $crawler-&gt;filter('p')-&gt;each(function (Crawler $node, $i): string {\n    return $node-&gt;text();\n});<\/code><\/pre><p>The anonymous function receives the node (as a Crawler) and its position as arguments. The result is an array of the values returned by the anonymous function calls.<\/p><p>When using nested crawlers, beware that <code>filterXPath()<\/code> is evaluated in the context of the crawler:<\/p><pre class=\"wp-block-code\"><code>$crawler-&gt;filterXPath('parent')-&gt;each(function (Crawler $parentCrawler, $i): void {\n    \/\/ DON'T DO THIS: direct child can not be found\n    $subCrawler = $parentCrawler-&gt;filterXPath('sub-tag\/sub-child-tag');\n\n    \/\/ DO THIS: specify the parent tag too\n    $subCrawler = $parentCrawler-&gt;filterXPath('parent\/sub-tag\/sub-child-tag');\n    $subCrawler = $parentCrawler-&gt;filterXPath('node()\/sub-tag\/sub-child-tag');\n});<\/code><\/pre><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">VI. Adding Content<\/h2><p>Adding dynamic content (in combination with DOMDocument):<\/p><pre class=\"wp-block-code\"><code>$dom = new DOMDocument();\n$dom-&gt;loadHTML($html);\n$newNode = $dom-&gt;createElement('p', 'New 
content');\n$dom-&gt;appendChild($newNode);\n\/\/ use a fresh Crawler: nodes from a different document cannot be added to an already populated one\n$crawler = new Crawler();\n$crawler-&gt;add($dom);<\/code><\/pre><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>The crawler supports multiple ways of adding content, but they are mutually exclusive, so you can only use one of them to add content (e.g. if you pass the content to the <code>Crawler<\/code> constructor, you can't call <code>addContent()<\/code> later):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$crawler = new Crawler('&lt;html&gt;&lt;body\/&gt;&lt;\/html&gt;');\n\n$crawler-&gt;addHtmlContent('&lt;html&gt;&lt;body\/&gt;&lt;\/html&gt;');\n$crawler-&gt;addXmlContent('&lt;root&gt;&lt;node\/&gt;&lt;\/root&gt;');\n\n$crawler-&gt;addContent('&lt;html&gt;&lt;body\/&gt;&lt;\/html&gt;');\n$crawler-&gt;addContent('&lt;root&gt;&lt;node\/&gt;&lt;\/root&gt;', 'text\/xml');\n\n$crawler-&gt;add('&lt;html&gt;&lt;body\/&gt;&lt;\/html&gt;');\n$crawler-&gt;add('&lt;root&gt;&lt;node\/&gt;&lt;\/root&gt;');<\/code><\/pre>\n\n\n\n<p><a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9DcmF3bGVyLnBocCM6fjp0ZXh0PWZ1bmN0aW9uJTIwYWRkSHRtbENvbnRlbnQ=\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">addHtmlContent()<\/a> and <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9DcmF3bGVyLnBocCM6fjp0ZXh0PWZ1bmN0aW9uJTIwYWRkWG1sQ29udGVudA==\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">addXmlContent()<\/a> default to UTF-8 encoding, but you can change this with their second optional argument.<\/p>\n\n\n\n<p><a 
href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9DcmF3bGVyLnBocCM6fjp0ZXh0PWZ1bmN0aW9uJTIwYWRkQ29udGVudA==\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">addContent()<\/a> guesses the best charset from the given content and defaults to <code>ISO-8859-1<\/code> when no charset can be guessed.<\/p>\n\n\n\n<p>Because the Crawler implementation is based on the DOM extension, it can also interact with native <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zZWN1cmUucGhwLm5ldC9tYW51YWwvZW4vY2xhc3MuZG9tZG9jdW1lbnQucGhw\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">DOMDocument<\/a>, <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zZWN1cmUucGhwLm5ldC9tYW51YWwvZW4vY2xhc3MuZG9tbm9kZWxpc3QucGhw\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">DOMNodeList<\/a> and <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zZWN1cmUucGhwLm5ldC9tYW51YWwvZW4vY2xhc3MuZG9tbm9kZS5waHA=\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">DOMNode<\/a> objects:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$domDocument = new DOMDocument();\n$domDocument-&gt;loadXML('&lt;root&gt;&lt;node\/&gt;&lt;node\/&gt;&lt;\/root&gt;');\n$nodeList = $domDocument-&gt;getElementsByTagName('node');\n$node = 
$domDocument-&gt;getElementsByTagName('node')-&gt;item(0);\n\n$crawler-&gt;addDocument($domDocument);\n$crawler-&gt;addNodeList($nodeList);\n$crawler-&gt;addNodes([$node]);\n$crawler-&gt;addNode($node);\n$crawler-&gt;add($domDocument);<\/code><\/pre>\n\n\n\n<p>Manipulating and dumping a <code>Crawler<\/code><\/p>\n\n\n\n<p>These methods on the <code>Crawler<\/code> are intended to initially populate your <code>Crawler<\/code>, not to further manipulate a DOM (though this is possible). However, since the <code>Crawler<\/code> is a set of <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zZWN1cmUucGhwLm5ldC9tYW51YWwvZW4vY2xhc3MuZG9tZWxlbWVudC5waHA=\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">DOMElement<\/a> objects, you can use any method or property available on <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zZWN1cmUucGhwLm5ldC9tYW51YWwvZW4vY2xhc3MuZG9tZWxlbWVudC5waHA=\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">DOMElement<\/a>, <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zZWN1cmUucGhwLm5ldC9tYW51YWwvZW4vY2xhc3MuZG9tbm9kZS5waHA=\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">DOMNode<\/a> or <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zZWN1cmUucGhwLm5ldC9tYW51YWwvZW4vY2xhc3MuZG9tZG9jdW1lbnQucGhw\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">DOMDocument<\/a>. For example, you could get the HTML of a <code>Crawler<\/code> with something like this:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$html = '';\n\nforeach ($crawler as $domElement) {\n    $html .= $domElement-&gt;ownerDocument-&gt;saveHTML($domElement);\n}<\/code><\/pre>\n\n\n\n<p>Or you can use <a 
href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9DcmF3bGVyLnBocCM6fjp0ZXh0PWZ1bmN0aW9uJTIwaHRtbA==\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">html()<\/a> to get the HTML of the first node:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ if the node does not exist, calling html() will result in an exception\n$html = $crawler-&gt;html();\n\n\/\/ avoid the exception when the node does not exist by passing html() a default value\n$html = $crawler-&gt;html('Default &lt;strong&gt;HTML&lt;\/strong&gt; content');<\/code><\/pre>\n\n\n\n<p>Or you can use <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9DcmF3bGVyLnBocCM6fjp0ZXh0PWZ1bmN0aW9uJTIwb3V0ZXJIdG1s\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">outerHtml()<\/a> to get the outer HTML of the first node:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$html = $crawler-&gt;outerHtml();<\/code><\/pre>\n<\/blockquote><h2 class=\"wp-block-heading\">VII. Evaluating Expressions<\/h2><p>Use <code>evaluate()<\/code> for more complex computations:<\/p><pre class=\"wp-block-code\"><code>$totalLinks = $crawler-&gt;evaluate('count(\/\/a)');<\/code><\/pre><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>The <code>evaluate()<\/code> method evaluates the given XPath expression. The return value depends on the XPath expression. If the expression evaluates to a scalar value (e.g. 
HTML attributes), an array of results is returned. If the expression evaluates to a DOM document, a new <code>Crawler<\/code> instance is returned.<\/p>\n\n\n\n<p>This behavior is best illustrated with examples:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\Crawler;\n\n$html = '&lt;html&gt;\n&lt;body&gt;\n    &lt;span id=\"article-100\" class=\"article\"&gt;Article 1&lt;\/span&gt;\n    &lt;span id=\"article-101\" class=\"article\"&gt;Article 2&lt;\/span&gt;\n    &lt;span id=\"article-102\" class=\"article\"&gt;Article 3&lt;\/span&gt;\n&lt;\/body&gt;\n&lt;\/html&gt;';\n\n$crawler = new Crawler();\n$crawler-&gt;addHtmlContent($html);\n\n$crawler-&gt;filterXPath('\/\/span[contains(@id, \"article-\")]')-&gt;evaluate('substring-after(@id, \"-\")');\n\/* Result:\n[\n    0 =&gt; '100',\n    1 =&gt; '101',\n    2 =&gt; '102',\n]\n*\/\n\n$crawler-&gt;evaluate('substring-after(\/\/span[contains(@id, \"article-\")]\/@id, \"-\")');\n\/* Result:\n[\n    0 =&gt; '100',\n]\n*\/\n\n$crawler-&gt;filterXPath('\/\/span[@class=\"article\"]')-&gt;evaluate('count(@id)');\n\/* Result:\n[\n    0 =&gt; 1.0,\n    1 =&gt; 1.0,\n    2 =&gt; 1.0,\n]\n*\/\n\n$crawler-&gt;evaluate('count(\/\/span[@class=\"article\"])');\n\/* Result:\n[\n    0 =&gt; 3.0,\n]\n*\/\n\n$crawler-&gt;evaluate('<a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cDovL3NwYW4=\" 
rel=\"noopener noreferrer nofollow\">\/\/span<\/a>[1]');\n\/\/ A Symfony\\Component\\DomCrawler\\Crawler instance<\/code><\/pre>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\">\n<\/blockquote><h2 class=\"wp-block-heading\">VIII. Working with Links<\/h2><p>Extract and resolve links:<\/p><pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\UriResolver;\n\n$links = $crawler-&gt;filter('a')-&gt;links();\nforeach ($links as $link) {\n    $absoluteUrl = UriResolver::resolve($link-&gt;getUri(), 'https:\/\/example.com');\n    echo $absoluteUrl;\n}<\/code><\/pre><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Use the <code>filter()<\/code> method to find links by their <code>id<\/code> or <code>class<\/code> attributes, and use the <code>selectLink()<\/code> method to find links by their content (it also finds clickable images that contain that content in their <code>alt<\/code> attribute).<\/p>\n\n\n\n<p>Both methods return a <code>Crawler<\/code> instance containing only the selected link. Use the <code>link()<\/code> method to get the <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9MaW5rLnBocA==\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Link<\/a> object that represents the link:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ first, select the link by id, class or content...\n$linkCrawler = $crawler-&gt;filter('#sign-up');\n$linkCrawler = $crawler-&gt;filter('.user-profile');\n$linkCrawler = 
$crawler-&gt;selectLink('Log in');\n\n\/\/ ...then, get the Link object:\n$link = $linkCrawler-&gt;link();\n\n\/\/ or do all of this at once:\n$link = $crawler-&gt;filter('#sign-up')-&gt;link();\n$link = $crawler-&gt;filter('.user-profile')-&gt;link();\n$link = $crawler-&gt;selectLink('Log in')-&gt;link();<\/code><\/pre>\n\n\n\n<p>The <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9MaW5rLnBocA==\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Link<\/a> object has several useful methods to get more information about the selected link itself:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ returns the proper URI that can be used to make another request\n$uri = $link-&gt;getUri();<\/code><\/pre>\n\n\n\n<p><code>getUri()<\/code> is especially useful because it cleans the <code>href<\/code> value and transforms it into how it should really be processed. For example, for a link with <code>href=\"#foo\"<\/code>, this returns the full URI of the current page suffixed with <code>#foo<\/code>. The return value of <code>getUri()<\/code> is always a full URI that you can act on.<\/p>\n<\/blockquote><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">IX. Working with Images<\/h2><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>To find an image by its <code>alt<\/code> attribute, use the <code>selectImage()<\/code> method on an existing crawler. This returns a <code>Crawler<\/code> instance containing only the selected image(s). Calling <code>image()<\/code> 
\u4f1a\u5f97\u5230\u4e00\u4e2a\u7279\u6b8a\u7684\u00a0<a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9JbWFnZS5waHA=\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Image<\/a>\u00a0\u5bf9\u8c61\uff1a<code>altselectImageCrawlerimage()<\/code><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$imagesCrawler = $crawler-&gt;selectImage('Kitten');\n$image = $imagesCrawler-&gt;image();\n\n\/\/ or do this all at once\n$image = $crawler-&gt;selectImage('Kitten')-&gt;image();<\/code><\/pre>\n\n\n\n<p><a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9JbWFnZS5waHA=\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Image<\/a>\u00a0\u5bf9\u8c61\u4e0e\u00a0<a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9MaW5rLnBocA==\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Link<\/a>\u00a0\u5177\u6709\u76f8\u540c\u7684\u65b9\u6cd5\u3002<code>getUri()<\/code><\/p>\n<\/blockquote><p class=\"has-vivid-red-color has-text-color has-link-color wp-elements-f88c89823295a12b61125df9e69bf8bf\"><strong>\u63d0\u53d6\u56fe\u50cf\u6e90<\/strong>\uff1a<\/p><pre class=\"wp-block-code\"><code>$images = $crawler-&gt;filter('img')-&gt;each(function (Crawler $node) {\n    return $node-&gt;attr('src');\n});<\/code><\/pre><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">\u5341\u3001\u8868\u5355\u5904\u7406<\/h2><h3 class=\"wp-block-heading\">1. \u83b7\u53d6\u8868\u5355\u5b57\u6bb5<\/h3><pre class=\"wp-block-code\"><code>$form = $crawler-&gt;filter('form')-&gt;form();\n$values = $form-&gt;getValues();<\/code><\/pre><h3 class=\"wp-block-heading\">2. 
Simulating a Submission<\/h3><pre class=\"wp-block-code\"><code>$form['username'] = 'admin';\n$form['password'] = 'pass123';\n$submittedData = $form-&gt;getPhpValues();<\/code><\/pre><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>Forms also receive special treatment. A <code>selectButton()<\/code> method is available on the Crawler; it returns another Crawler that matches <code>&lt;button&gt;<\/code>, <code>&lt;input type=\"submit\"&gt;<\/code> or <code>&lt;input type=\"button\"&gt;<\/code> elements (or an <code>&lt;img&gt;<\/code> element inside them). The string given as the argument is looked up in the <code>id<\/code>, <code>alt<\/code>, <code>name<\/code> and <code>value<\/code> attributes and in the text content of those elements.<\/p>\n\n\n\n<p>This method is especially useful because you can use it to return a <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9Gb3JtLnBocA==\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Form<\/a> object that represents the form the button lives in:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ button example: &lt;button id=\"my-super-button\" type=\"submit\"&gt;My super button&lt;\/button&gt;\n\n\/\/ you can get button by its label\n$form = $crawler-&gt;selectButton('My super button')-&gt;form();\n\n\/\/ or by button id (#my-super-button) if the button doesn't have a label\n$form = $crawler-&gt;selectButton('my-super-button')-&gt;form();\n\n\/\/ or you can filter the whole form, for example a form has a class attribute: &lt;form class=\"form-vertical\" method=\"POST\"&gt;\n$crawler-&gt;filter('.form-vertical')-&gt;form();\n\n\/\/ or \"fill\" the form 
fields with data\n$form = $crawler-&gt;selectButton('my-super-button')-&gt;form([\n    'name' =&gt; 'Ryan',\n]);<\/code><\/pre>\n\n\n\n<p>The <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9Gb3JtLnBocA==\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Form<\/a> object has many useful methods for working with forms:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$uri = $form-&gt;getUri();\n$method = $form-&gt;getMethod();\n$name = $form-&gt;getName();<\/code><\/pre>\n\n\n\n<p>The <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9Gb3JtLnBocCM6fjp0ZXh0PWZ1bmN0aW9uJTIwZ2V0VXJp\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">getUri()<\/a> method does more than just return the form's <code>action<\/code> attribute. If the form method is GET, it mimics the browser's behavior and returns the <code>action<\/code> attribute followed by a query string containing all of the form's values.<\/p>\n\n\n\n<p>The optional <code>formaction<\/code> and <code>formmethod<\/code> button attributes are supported. The <code>getUri()<\/code> and <code>getMethod()<\/code> methods take those attributes into account, so they always return the right action and method for the button that was used to get the form.<\/p>\n\n\n\n<p>You can virtually set and get values on the form:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ sets values on the form internally\n$form-&gt;setValues([\n    'registration[username]' 
=&gt; 'symfonyfan',\n    'registration[terms]'    =&gt; 1,\n]);\n\n\/\/ gets back an array of values - in the \"flat\" array like above\n$values = $form-&gt;getValues();\n\n\/\/ returns the values like PHP would see them,\n\/\/ where \"registration\" is its own array\n$values = $form-&gt;getPhpValues();<\/code><\/pre>\n\n\n\n<p>To work with multi-dimensional fields:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>&lt;form&gt;\n    &lt;input name=\"multi[]\"&gt;\n    &lt;input name=\"multi[]\"&gt;\n    &lt;input name=\"multi[dimensional]\"&gt;\n    &lt;input name=\"multi[dimensional][]\" value=\"1\"&gt;\n    &lt;input name=\"multi[dimensional][]\" value=\"2\"&gt;\n    &lt;input name=\"multi[dimensional][]\" value=\"3\"&gt;\n&lt;\/form&gt;<\/code><\/pre>\n\n\n\n<p>Pass an array of values:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ sets a single field\n$form-&gt;setValues(['multi' =&gt; ['value']]);\n\n\/\/ sets multiple fields at once\n$form-&gt;setValues(['multi' =&gt; [\n    1             =&gt; 'value',\n    'dimensional' =&gt; 'an other value',\n]]);\n\n\/\/ tick multiple checkboxes at once\n$form-&gt;setValues(['multi' =&gt; [\n    'dimensional' =&gt; [1, 3] \/\/ it uses the input value to determine which checkbox to tick\n]]);<\/code><\/pre>\n\n\n\n<p>This is great, but it gets better! The <code>Form<\/code> object lets you interact with the form the way a browser would: selecting radio values, ticking checkboxes, and uploading files:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$form['registration[username]']-&gt;setValue('symfonyfan');\n\n\/\/ checks or unchecks a checkbox\n$form['registration[terms]']-&gt;tick();\n$form['registration[terms]']-&gt;untick();\n\n\/\/ selects an option\n$form['registration[birthday][year]']-&gt;select(1984);\n\n\/\/ selects many options from a \"multiple\" 
select\n$form['registration[interests]']-&gt;select(['symfony', 'cookies']);\n\n\/\/ fakes a file upload\n$form['registration[photo]']-&gt;upload('\/path\/to\/lucas.jpg');<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"using-the-form-data\"><a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zeW1mb255LmNvbS9kb2MvY3VycmVudC9jb21wb25lbnRzL2RvbV9jcmF3bGVyLmh0bWwjdXNpbmctdGhlLWZvcm0tZGF0YQ==\" rel=\"noopener noreferrer nofollow\">Using the Form Data<\/a><\/h4>\n\n\n\n<p>What is the point of doing all of this? If you are testing internally, you can grab the information from the form as if it had just been submitted, by using the PHP values:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$values = $form-&gt;getPhpValues();\n$files = $form-&gt;getPhpFiles();<\/code><\/pre>\n\n\n\n<p>If you are using an external HTTP client, you can use the form to grab all of the information you need to create a POST request for the form:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$uri = $form-&gt;getUri();\n$method = $form-&gt;getMethod();\n$values = $form-&gt;getValues();\n$files = $form-&gt;getFiles();\n\n\/\/ now use some HTTP client and post using this information<\/code><\/pre>\n\n\n\n<p>A great example of an integrated system that uses all of this is the <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvQnJvd3NlcktpdC9IdHRwQnJvd3Nlci5waHA=\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">HttpBrowser<\/a> provided by the <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zeW1mb255LmNvbS9kb2MvY3VycmVudC9jb21wb25lbnRzL2Jyb3dzZXJfa2l0Lmh0bWw=\" 
rel=\"noopener noreferrer nofollow\">BrowserKit component<\/a>. It understands the Symfony Crawler object and can use it to submit forms directly:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>use Symfony\\Component\\BrowserKit\\HttpBrowser;\nuse Symfony\\Component\\HttpClient\\HttpClient;\n\n\/\/ makes a real request to an external site\n$browser = new HttpBrowser(HttpClient::create());\n$crawler = $browser-&gt;request('GET', '<a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL2xvZ2lu\" rel=\"noopener noreferrer nofollow\">https:\/\/github.com\/login<\/a>');\n\n\/\/ select the form and fill in some values\n$form = $crawler-&gt;selectButton('Sign in')-&gt;form();\n$form['login'] = 'symfonyfan';\n$form['password'] = 'anypass';\n\n\/\/ submits the given form\n$crawler = $browser-&gt;submit($form);<\/code><\/pre>\n\n\n\n<h4 class=\"wp-block-heading\" id=\"selecting-invalid-choice-values\"><a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9zeW1mb255LmNvbS9kb2MvY3VycmVudC9jb21wb25lbnRzL2RvbV9jcmF3bGVyLmh0bWwjc2VsZWN0aW5nLWludmFsaWQtY2hvaWNlLXZhbHVlcw==\" rel=\"noopener noreferrer nofollow\">Selecting Invalid Choice Values<\/a><\/h4>\n\n\n\n<p>By default, choice fields (select, radio) have internal validation enabled to prevent you from setting invalid values. If you want to be able to set invalid values, call the <code>disableValidation()<\/code> method on either the whole form or on specific fields:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/\/ disables validation for a specific field\n$form['country']-&gt;disableValidation()-&gt;select('Invalid value');\n\n\/\/ disables validation for the whole form\n$form-&gt;disableValidation();\n$form['country']-&gt;select('Invalid value');<\/code><\/pre>\n<\/blockquote><hr 
class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">XI. Resolving URIs<\/h2><p>Converting a relative path to an absolute URL<\/p><pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\UriResolver;\n\n$relativeUrl = '\/about';\n$baseUrl = '<a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9leGFtcGxlLmNvbQ==\" rel=\"noopener noreferrer nofollow\">https:\/\/example.com<\/a>';\n$absoluteUrl = UriResolver::resolve($relativeUrl, $baseUrl); \/\/ <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9leGFtcGxlLmNvbS9hYm91dA==\" rel=\"noopener noreferrer nofollow\">https:\/\/example.com\/about<\/a><\/code><\/pre><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>The <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9VcmlSZXNvbHZlci5waHA=\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">UriResolver<\/a> class takes a URI (relative, absolute, fragment, etc.) and turns it into an absolute URI resolved against another given base URI:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\UriResolver;\n\nUriResolver::resolve('\/foo', '<a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cDovL2xvY2FsaG9zdC9iYXIvZm9vLw==\" rel=\"noopener noreferrer nofollow\">http:\/\/localhost\/bar\/foo\/<\/a>'); \/\/ <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cDovL2xvY2FsaG9zdC9mb28=\" rel=\"noopener noreferrer nofollow\">http:\/\/localhost\/foo<\/a>\nUriResolver::resolve('?a=b', '<a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cDovL2xvY2FsaG9zdC9iYXI=\" rel=\"noopener noreferrer nofollow\">http:\/\/localhost\/bar<\/a>#foo'); \/\/ <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cDovL2xvY2FsaG9zdC9iYXI=\" rel=\"noopener noreferrer 
nofollow\">http:\/\/localhost\/bar<\/a>?a=b\nUriResolver::resolve('..\/..\/', '<a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cDovL2xvY2FsaG9zdC8=\" rel=\"noopener noreferrer nofollow\">http:\/\/localhost\/<\/a>'); \/\/ <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cDovL2xvY2FsaG9zdC8=\" rel=\"noopener noreferrer nofollow\">http:\/\/localhost\/<\/a><\/code><\/pre>\n<\/blockquote><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">XII. HTML5 Parser Integration<\/h2><p>Using the HTML5 parser<\/p><pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\Crawler;\n\n\/\/ parse the markup with masterminds\/html5, then hand the resulting DOM to the Crawler\n$html5 = new \\Masterminds\\HTML5();\n$dom = $html5-&gt;parse($htmlContent);\n$crawler = new Crawler($dom);<\/code><\/pre><blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>If you need the <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9naXRodWIuY29tL3N5bWZvbnkvc3ltZm9ueS9ibG9iLzcuMi9zcmMvU3ltZm9ueS9Db21wb25lbnQvRG9tQ3Jhd2xlci9DcmF3bGVyLnBocA==\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Crawler<\/a> to use an HTML5 parser, set its <code>useHtml5Parser<\/code> constructor argument to <code>true<\/code>:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\Crawler;\n\n$crawler = new Crawler(null, $uri, useHtml5Parser: true);<\/code><\/pre>\n\n\n\n<p>With this setting, the crawler uses the HTML5 parser provided by the <a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9wYWNrYWdpc3Qub3JnL3BhY2thZ2VzL21hc3Rlcm1pbmRzL2h0bWw1\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">masterminds\/html5<\/a> library to parse documents.<\/p>\n<\/blockquote><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 
class=\"wp-block-heading\">Practical Example: Scraping Page Links and Images<\/h2><pre class=\"wp-block-code\"><code>use Symfony\\Component\\DomCrawler\\Crawler;\n\n$crawler = new Crawler(file_get_contents('<a href=\"https:\/\/www.zhaozhao123.cn\/skin\/go?url=aHR0cHM6Ly9leGFtcGxlLmNvbQ==\" rel=\"noopener noreferrer nofollow\">https:\/\/example.com<\/a>'));\n\n\/\/ extract the href of every link\n$links = $crawler-&gt;filter('a')-&gt;each(function (Crawler $node) {\n    return $node-&gt;attr('href');\n});\n\n\/\/ extract the src and alt of every image\n$images = $crawler-&gt;filter('img')-&gt;each(function (Crawler $node) {\n    return [\n        'src' =&gt; $node-&gt;attr('src'),\n        'alt' =&gt; $node-&gt;attr('alt')\n    ];\n});<\/code><\/pre><hr class=\"wp-block-separator has-alpha-channel-opacity\"><h2 class=\"wp-block-heading\">Notes<\/h2><ol class=\"wp-block-list\">\n<li><strong>Encoding<\/strong>: make sure the document encoding matches what the parser expects<\/li>\n\n\n\n<li><strong>Performance<\/strong>: avoid creating Crawler objects repeatedly inside loops<\/li>\n\n\n\n<li><strong>Error handling<\/strong>: use <code>count()<\/code> to check whether a node exists:<\/li>\n<\/ol><pre class=\"wp-block-code\"><code>if ($crawler-&gt;filter('.not-exists')-&gt;count() &gt; 0) {\n    \/\/ process the node\n}<\/code><\/pre><p>This tutorial has covered the core features of Symfony DomCrawler. The component is particularly well suited to web scraping, automated testing, and content analysis.<\/p>","protected":false},"excerpt":{"rendered":"<p>Symfony DomCrawler \u662f\u4e00\u4e2a\u5f3a\u5927\u7684 PHP \u7ec4\u4ef6\uff0c\u7528\u4e8e\u89e3\u6790\u548c\u64cd\u4f5c HTML\/XML \u6587\u6863\u3002\u5b83\u652f\u6301 XPath \u548c CSS 
\u9009\u62e9\u5668\uff0c\u5e76\u96c6\u6210\u4e86 HTML5 \u89e3\u6790\u80fd\u529b\u3002\u672c\u6559\u7a0b\u5c06\u6db5\u76d6\u5b89\u88c5\u3001\u6838\u5fc3\u529f\u80fd\u53ca\u5b9e\u9645\u5e94\u7528\u573a\u666f\u3002 \u80cc\u666f \u672c\u6559\u7a0b\u4e8e2025\u5e744\u670827\u65e5\u6574\u7406\u5b9a\u7a3f\uff0c\u4ee5\u4e0b\u6559\u7a0b\u662f\u57fa\u4e8e\u8fd9\u4e2a\u65f6\u95f4\u70b9 Symfony DomCra..<\/p>\n","protected":false},"author":1,"featured_media":0,"menu_order":0,"template":"","meta":{"_acf_changed":false},"tags":[],"my1js2nav":[45],"tuisongtax":[],"class_list":["post-1261","my1js","type-my1js","status-publish","hentry","my1js2nav-symfony"],"acf":{"qian_art_seotitle":"","qian_art_seotitle_source":{"label":"SEO\u6807\u9898","type":"text","formatted_value":""},"qian_art_seokws":"","qian_art_seokws_source":{"label":"SEO\u5173\u952e\u8bcd","type":"text","formatted_value":""},"qian_art_stzhong":"","qian_art_stzhong_source":{"label":"\u4e2d | \u77ed\u6807\u9898","type":"text","formatted_value":""}},"_links":{"self":[{"href":"https:\/\/www.zhaozhao123.cn\/php\/wp-json\/wp\/v2\/my1js\/1261","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.zhaozhao123.cn\/php\/wp-json\/wp\/v2\/my1js"}],"about":[{"href":"https:\/\/www.zhaozhao123.cn\/php\/wp-json\/wp\/v2\/types\/my1js"}],"author":[{"embeddable":true,"href":"https:\/\/www.zhaozhao123.cn\/php\/wp-json\/wp\/v2\/users\/1"}],"wp:attachment":[{"href":"https:\/\/www.zhaozhao123.cn\/php\/wp-json\/wp\/v2\/media?parent=1261"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.zhaozhao123.cn\/php\/wp-json\/wp\/v2\/tags?post=1261"},{"taxonomy":"my1js2nav","embeddable":true,"href":"https:\/\/www.zhaozhao123.cn\/php\/wp-json\/wp\/v2\/my1js2nav?post=1261"},{"taxonomy":"tuisongtax","embeddable":true,"href":"https:\/\/www.zhaozhao123.cn\/php\/wp-json\/wp\/v2\/tuisongtax?post=1261"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}