When composing a toot with a system emoji, the emoji replaces the :text: as soon as the second colon is typed or an emoji is selected from the list. This does not happen for an instance's custom emoji. Note that the emoji appear normally after the toot is posted.
The big difference is that system emojis are characters of a font, so they can be inserted into any normal text field without problems.
Custom emojis are small PNG images, which can be shown in the emoji picker and in the timeline but not within an editable text field.
That's a limitation of the HTML standard: a textarea can only hold plain text, not images.
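
To illustrate the difference, here is a minimal sketch of the replacement step. The `SYSTEM_EMOJI` table and the `replaceSystemEmoji` function are made-up names for this example, not the actual code of the client. A shortcode is only replaced when a Unicode character exists for it; anything else (such as a custom emoji) stays as text, because the value of a textarea is just a string.

```typescript
// Hypothetical lookup table; a real client would ship a much larger one.
const SYSTEM_EMOJI = new Map<string, string>([
  ["smile", "😄"],
  ["heart", "❤️"],
]);

// Replace :shortcode: with a Unicode character when one is known.
// Unknown shortcodes (e.g. an instance's custom emoji) are left untouched,
// since there is no character that could stand in for the PNG image.
function replaceSystemEmoji(text: string): string {
  return text.replace(/:([a-z0-9_+-]+):/gi, (match: string, code: string) =>
    SYSTEM_EMOJI.get(code.toLowerCase()) ?? match
  );
}
```

Calling `replaceSystemEmoji("hi :smile: and :blobcat:")` would turn `:smile:` into the character but keep `:blobcat:` as typed, which matches the behaviour described in the report.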
A workaround could be to use a contenteditable DIV instead of a textarea, but then other formatting such as bold and italics could also be added to the text, and the Mastodon API does not support that when sending.
This change could have some unwanted side effects, so I think it would be best to keep the current behaviour.
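
For reference, this is roughly the extra work a contenteditable field would need before sending. It is only a sketch under assumptions: `EditorNode` is a simplified stand-in for the real DOM nodes and `toPlainText` is a hypothetical helper, neither exists in the client. The idea is that all formatting elements are dropped and every custom emoji image is turned back into its :shortcode: so that only plain text goes to the API.

```typescript
// Simplified stand-in for the nodes inside a contenteditable element.
type EditorNode =
  | { kind: "text"; value: string }
  | { kind: "emoji"; shortcode: string } // an <img> showing a custom emoji
  | { kind: "element"; tag: string; children: EditorNode[] }; // <b>, <i>, <br>, ...

// Flatten the editor content to plain text.
// Formatting tags are discarded, <br> becomes a newline,
// and emoji images are written back as :shortcode:.
function toPlainText(nodes: EditorNode[]): string {
  return nodes
    .map((node): string => {
      switch (node.kind) {
        case "text":
          return node.value;
        case "emoji":
          return `:${node.shortcode}:`;
        case "element":
          return node.tag === "br" ? "\n" : toPlainText(node.children);
      }
    })
    .join("");
}
```

Even with such a step, pasted content and browser differences in what a contenteditable element produces would still have to be handled, which is part of why I would rather not change it.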