Don't consider textual characters to be emoji (#12582)

* Don't consider textual characters to be emoji

We were using emojibase-regex to match emoji within messages. However, the docs (https://emojibase.dev/docs/regex/) state that this regex matches both emoji and text presentation characters. This is not what we want: it produces false positives for characters like '↔', which only become emoji when paired with a variation selector. Unfortunately, none of the other regexes provided by Emojibase do what we want either (https://github.com/milesj/emojibase/issues/174). In the meantime, browser support for the RGI_Emoji property of strings (available in regex `v` mode) has made it feasible to write an emoji regex by hand, so that's what I've done.

* Add a fallback for BIGEMOJI_REGEX as well
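
To illustrate the approach described above, here is a minimal sketch of a hand-written emoji regex with a fallback. It is not the EMOJI_REGEX from HtmlUtils: the names `buildEmojiRegex`, `EMOJI_SKETCH`, `supportsVMode` and `containsEmoji` are hypothetical, and the choice to fall back to a regex that matches nothing is an assumption made for the example.

```typescript
// Hypothetical sketch; the names and the fallback choice are assumptions,
// not the project's actual code.

// True when the engine accepts the "v" (unicodeSets) regex flag.
const supportsVMode: boolean = (() => {
    try {
        new RegExp("", "v");
        return true;
    } catch {
        return false;
    }
})();

function buildEmojiRegex(): RegExp {
    try {
        // \p{RGI_Emoji} is only valid in v mode. Going through the constructor
        // turns a missing v mode into a catchable runtime error instead of a
        // syntax error at load time.
        // The (?!\uFE0E) lookahead rejects an emoji that is immediately
        // followed by the text variation selector.
        return new RegExp("\\p{RGI_Emoji}(?!\\uFE0E)", "v");
    } catch {
        // Assumed fallback: an empty negative lookahead, which never matches.
        return /(?!)/;
    }
}

const EMOJI_SKETCH: RegExp = buildEmojiRegex();

// The regex has no g or y flag, so test() carries no state between calls.
function containsEmoji(text: string): boolean {
    return EMOJI_SKETCH.test(text);
}

// "\u2194" is '↔' in its default text presentation, so it is not matched.
console.log(containsEmoji("\u2194"));
// On engines with v mode, '↔' followed by U+FE0F (emoji presentation) matches.
console.log(containsEmoji("\u2194\uFE0F"));
```

With this fallback, an engine without v mode simply detects no emoji at all rather than crashing, which trades missed matches for stability.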
Robin 2024-07-04 13:48:07 -04:00 committed by GitHub
parent 489bc32674
commit c61eca8c24
6 changed files with 98 additions and 12 deletions

@@ -78,6 +78,11 @@ module.exports = {
name: "matrix-react-sdk/",
message: "Please use matrix-react-sdk/src/index instead",
},
{
name: "emojibase-regex",
message:
"This regex doesn't actually test for emoji. See the docs at https://emojibase.dev/docs/regex/ and prefer our own EMOJI_REGEX from HtmlUtils.",
},
],
patterns: [
{
@@ -141,6 +146,11 @@ module.exports = {
],
message: "Please use matrix-js-sdk/src/matrix instead",
},
{
group: ["emojibase-regex/emoji*"],
message:
"This regex doesn't actually test for emoji. See the docs at https://emojibase.dev/docs/regex/ and prefer our own EMOJI_REGEX from HtmlUtils.",
},
],
},
],