Don't consider textual characters to be emoji (#12582)

* Don't consider textual characters to be emoji

We were using emojibase-regex to match emoji within messages. However, the docs (https://emojibase.dev/docs/regex/) state that this regex matches both emoji and text presentation characters. This is not what we want, and will result in false positives for characters like '↔' that could turn into an emoji if paired with a variation selector. Unfortunately, none of the other regexes provided by Emojibase do what we want either (https://github.com/milesj/emojibase/issues/174). In the meantime, browser support for the RGI_Emoji character sequence class has made it feasible to write an emoji regex by hand, so that's what I've done.

* Add a fallback for BIGEMOJI_REGEX as well
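
The approach described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this commit: `buildEmojiRegex`, `SKETCH_EMOJI_REGEX`, `containsEmoji`, and the choice of `\p{Emoji_Presentation}` as the fallback pattern are all assumptions made for the example.

```typescript
// Illustrative sketch only: the names and the fallback pattern below are
// assumptions, not the implementation from this commit.
function buildEmojiRegex(): RegExp {
    try {
        // The `v` (unicodeSets) flag enables properties of strings such as
        // \p{RGI_Emoji}. Going through the constructor means an engine
        // without `v` support throws here instead of failing at parse time.
        // The negative lookahead refuses a match that is immediately
        // followed by the text presentation selector U+FE0E.
        return new RegExp("\\p{RGI_Emoji}(?!\\uFE0E)", "v");
    } catch {
        // Assumed fallback for engines without `v`: match only single
        // characters that default to emoji presentation. This is coarser
        // than RGI_Emoji but still skips text-default characters like '↔'.
        return new RegExp("\\p{Emoji_Presentation}", "u");
    }
}

const SKETCH_EMOJI_REGEX = buildEmojiRegex();

// No `g` flag is used, so `test` carries no state between calls.
function containsEmoji(text: string): boolean {
    return SKETCH_EMOJI_REGEX.test(text);
}
```

With either branch, `containsEmoji("hi 😀")` should be true and `containsEmoji("a ↔ b")` should be false, because a bare '↔' has text presentation by default and is only an RGI emoji when followed by U+FE0F.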
Robin 2024-07-04 13:48:07 -04:00 committed by GitHub
parent 489bc32674
commit c61eca8c24
GPG key ID: B5690EEEBB952194
6 changed files with 98 additions and 12 deletions

@@ -15,7 +15,6 @@ limitations under the License.
*/
import React, { createRef, KeyboardEvent, SyntheticEvent } from "react";
-import EMOJI_REGEX from "emojibase-regex";
import {
    IContent,
    MatrixEvent,
@@ -70,6 +69,7 @@ import { doMaybeLocalRoomAction } from "../../../utils/local-room";
import { Caret } from "../../../editor/caret";
import { IDiff } from "../../../editor/diff";
import { getBlobSafeMimeType } from "../../../utils/blobs";
+import { EMOJI_REGEX } from "../../../HtmlUtils";
/**
 * Build the mentions information based on the editor model (and any related events):