Script Loader: Harden removal of script tag wrappers.

* Add `wp_remove_surrounding_empty_script_tags()` to more precisely remove script tag wrappers and warn when doing it wrong.
* Add clarifying comments for XML escaping logic in `wp_get_inline_script_tag()`.
* Leverage `WP_HTML_Tag_Processor` in `test_remove_frameless_preview_messenger_channel`.
* Reuse `assertEqualMarkup` in `test_blocking_dependent_with_delayed_dependency`.
* Normalize whitespace in `parse_markup_fragment` for `assertEqualMarkup`.

Follow-up to [56687].
Props dmsnell, westonruter, flixos90.
See #58664.


git-svn-id: https://develop.svn.wordpress.org/trunk@56748 602fd350-edb4-49c9-b593-d223f7449a82
Weston Ruter
2023-09-29 19:45:53 +00:00
parent 4baf0a1eda
commit 8c0adc93df
12 changed files with 193 additions and 20 deletions

@@ -2853,9 +2853,43 @@ function wp_get_inline_script_tag( $javascript, $attributes = array() ) {
);
}
-// Ensure markup is XHTML compatible if not HTML5.
/*
* XHTML extracts the contents of the SCRIPT element and then the XML parser
* decodes character references and other syntax elements. This can lead to
* misinterpretation of the script contents or invalid XHTML documents.
*
* Wrapping the contents in a CDATA section instructs the XML parser not to
* transform the contents of the SCRIPT element before passing them to the
* JavaScript engine.
*
* Example:
*
* <script>console.log('&hellip;');</script>
*
* In an HTML document this would print "&hellip;" to the console,
* but in an XHTML document it would print "…" to the console.
*
* <script>console.log('An image is <img> in HTML');</script>
*
* In an HTML document this would print "An image is <img> in HTML",
* but it's an invalid XHTML document because it interprets the `<img>`
* as an empty tag missing its closing `/`.
*
* @see https://www.w3.org/TR/xhtml1/#h-4.8
*/
if ( ! $is_html5 ) {
-$javascript = str_replace( ']]>', ']]]]><![CDATA[>', $javascript ); // Escape any existing CDATA section.
/*
* If the string `]]>` exists within the JavaScript it would break
* out of any wrapping CDATA section added here, so to start, it's
* necessary to escape that sequence which requires splitting the
* content into two CDATA sections wherever it's found.
*
* Note: it's only necessary to escape the closing `]]>` because
* an additional `<![CDATA[` leaves the contents unchanged.
*/
$javascript = str_replace( ']]>', ']]]]><![CDATA[>', $javascript );
// Wrap the entire escaped script inside a CDATA section.
$javascript = sprintf( "/* <![CDATA[ */\n%s\n/* ]]> */", $javascript );
}
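The escaping described in the comments above can be tried outside WordPress. The following is a minimal sketch, assuming a standalone PHP script; `sketch_wrap_in_cdata()` is a hypothetical name used only for illustration and is not part of this commit.

```php
<?php
// Hypothetical helper mirroring the two steps in the hunk above; not a WordPress function.
function sketch_wrap_in_cdata( string $javascript ): string {
	// Split every `]]>` across two CDATA sections so it cannot close the wrapper early.
	$javascript = str_replace( ']]>', ']]]]><![CDATA[>', $javascript );

	// Wrap the escaped script; the JS comment markers hide the CDATA syntax from the JavaScript engine.
	return sprintf( "/* <![CDATA[ */\n%s\n/* ]]> */", $javascript );
}

// The comparison `a[b[0]]>1` happens to contain the CDATA terminator `]]>`.
$wrapped = sketch_wrap_in_cdata( 'if ( a[b[0]]>1 ) { run(); }' );
echo $wrapped, "\n";
// Second line of output: if ( a[b[0]]]]><![CDATA[>1 ) { run(); }
```

An XML parser reading the result sees two adjacent CDATA sections whose contents concatenate back to the original `]]>`, so the script text is left unchanged.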
@@ -3299,3 +3333,51 @@ function wp_add_editor_classic_theme_styles( $editor_settings ) {
return $editor_settings;
}
/**
* Removes leading and trailing _empty_ script tags.
*
* This is a helper meant to be used for literal script tag construction
* within `wp_get_inline_script_tag()` or `wp_print_inline_script_tag()`.
* It removes the literal values of "<script>" and "</script>" from
* around an inline script after trimming whitespace. Typically this
* is used in conjunction with output buffering, where `ob_get_clean()`
* is passed as the `$contents` argument.
*
* Example:
*
* // Strips exact literal empty SCRIPT tags.
* $js = '<script>sayHello();</script>';
* 'sayHello();' === wp_remove_surrounding_empty_script_tags( $js );
*
* // Otherwise if anything is different it warns in the JS console.
* $js = '<script type="text/javascript">console.log( "hi" );</script>';
* 'console.error( ... )' === wp_remove_surrounding_empty_script_tags( $js );
*
* @private
* @since 6.4.0
*
* @see wp_print_inline_script_tag()
* @see wp_get_inline_script_tag()
*
* @param string $contents Script body with manually created SCRIPT tag literals.
* @return string Script body without surrounding script tag literals, or
* a `console.error()` call reporting the misuse if both
* exact literals aren't present.
*/
function wp_remove_surrounding_empty_script_tags( $contents ) {
	$contents = trim( $contents );
	$opener   = '<SCRIPT>';
	$closer   = '</SCRIPT>';

	if (
		strlen( $contents ) > strlen( $opener ) + strlen( $closer ) &&
		strtoupper( substr( $contents, 0, strlen( $opener ) ) ) === $opener &&
		strtoupper( substr( $contents, -strlen( $closer ) ) ) === $closer
	) {
		return substr( $contents, strlen( $opener ), -strlen( $closer ) );
	} else {
		$error_message = __( 'Expected string to start with script tag (without attributes) and end with script tag, with optional whitespace.' );
		_doing_it_wrong( __FUNCTION__, $error_message, '6.4' );
		return sprintf( 'console.error(%s)', wp_json_encode( __( 'Function wp_remove_surrounding_empty_script_tags() used incorrectly in PHP.' ) . ' ' . $error_message ) );
	}
}
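The output-buffering pattern mentioned in the docblock can be sketched without loading WordPress. This assumes a standalone PHP script; `sketch_strip_empty_script_tags()` is a hypothetical stand-in that repeats the same matching rule but returns `null` on a mismatch, where the real helper calls `_doing_it_wrong()` and returns a `console.error()` call.

```php
<?php
// Hypothetical stand-in for wp_remove_surrounding_empty_script_tags(); it needs no
// WordPress functions and signals a mismatch with null (an assumption of this sketch).
function sketch_strip_empty_script_tags( string $contents ): ?string {
	$contents = trim( $contents );
	$opener   = '<SCRIPT>';
	$closer   = '</SCRIPT>';

	// Same rule as the helper above: a non-empty body wrapped in attribute-free tags, any letter case.
	$has_wrappers =
		strlen( $contents ) > strlen( $opener ) + strlen( $closer ) &&
		strtoupper( substr( $contents, 0, strlen( $opener ) ) ) === $opener &&
		strtoupper( substr( $contents, -strlen( $closer ) ) ) === $closer;

	return $has_wrappers
		? substr( $contents, strlen( $opener ), -strlen( $closer ) )
		: null;
}

// Capture a literal script tag with output buffering, then strip the wrappers.
ob_start();
echo "<script>\nsayHello();\n</script>\n";
$body = sketch_strip_empty_script_tags( ob_get_clean() );

// A tag with attributes, or an empty script, does not match the exact literals.
$with_attributes = sketch_strip_empty_script_tags( '<script type="text/javascript">hi();</script>' );
$empty_script    = sketch_strip_empty_script_tags( '<script></script>' );
```

Here `$body` holds only the script text (with its surrounding newlines), ready to pass to a function such as `wp_get_inline_script_tag()`, while `$with_attributes` and `$empty_script` are both `null`. The empty case fails because the length check is a strict `>`.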