JavaScript String codePointAt() - Get Unicode Code Point

Updated on November 13, 2024

Introduction

The JavaScript string method codePointAt() returns the Unicode code point of the character that starts at a specified index in a string. It matters most for characters outside the Basic Multilingual Plane (BMP), such as emojis, rarer CJK ideographs, and many specialized symbols, which JavaScript stores as two UTF-16 code units.

In this article, you will learn how to use the codePointAt() method to read Unicode characters in JavaScript. This includes retrieving the code points of common characters, handling characters stored as surrogate pairs, and looping through the characters of a string.

Understanding codePointAt()

Retrieve Unicode Code Point of a Character

  1. Choose a string that contains a variety of characters, including multilingual texts or emojis.

  2. Specify the position of the character within the string using its index.

  3. Use the codePointAt() method to get the Unicode code point of the character.

    javascript
    let text = '𠮷';
    let codePoint = text.codePointAt(0);
    console.log(codePoint);
    

    This example retrieves the code point for the character '𠮷'. Although this character lies outside the Basic Multilingual Plane and occupies two UTF-16 code units, codePointAt(0) returns the complete code point and the code logs 134071, which is U+20BB7 in decimal.
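To make the difference from code units concrete, the following minimal sketch compares codePointAt() with charCodeAt() on the same character and prints the result in the usual U+XXXX notation. The helper name toUPlus is an assumption introduced for illustration only; it is not a built-in.

```javascript
// Hypothetical helper (not a built-in): formats a code point as U+XXXX.
function toUPlus(codePoint) {
  return 'U+' + codePoint.toString(16).toUpperCase().padStart(4, '0');
}

const kanji = '𠮷';

console.log(kanji.length);         // 2, the character occupies two UTF-16 code units
console.log(kanji.charCodeAt(0));  // 55362, the high surrogate only
console.log(kanji.codePointAt(0)); // 134071, the complete code point
console.log(toUPlus(kanji.codePointAt(0))); // U+20BB7
```

charCodeAt() always reports a single 16-bit code unit, which is why it cannot describe this character on its own.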

Use codePointAt() with Common Characters

  1. Select a simple English string.

  2. Apply codePointAt() to a target character.

    javascript
    let phrase = "Hello";
    let codePoint = phrase.codePointAt(1);
    console.log(codePoint);
    

    In this snippet, codePointAt(1) reads the character at index 1, the lowercase letter 'e', and the code logs its code point, 101.
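Two edge cases are worth knowing when choosing the index: an index past the end of the string yields undefined rather than throwing, and omitting the argument reads index 0. A short sketch, with variable names chosen only for illustration:

```javascript
const word = "Hello";

const first = word.codePointAt();      // no argument: index defaults to 0
const missing = word.codePointAt(10);  // index past the end of the string

console.log(first);   // 72, the code point of 'H'
console.log(missing); // undefined

// Guard against undefined before using the result as a number.
const safe = missing === undefined ? -1 : missing;
console.log(safe);    // -1
```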

Handle Characters with Surrogate Pairs

  1. Understand surrogate pairs: in UTF-16, a code point above U+FFFF is stored as two 16-bit code units, a high surrogate followed by a low surrogate.

  2. Choose a character or string that uses surrogate pairs.

  3. Apply codePointAt() to retrieve the complete Unicode code point for the character.

    javascript
    let symbol = '😊';
    let codePoint = symbol.codePointAt(0);
    console.log(codePoint);
    

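    This snippet fetches the code point of a smiley face emoji. In UTF-16, emojis outside the Basic Multilingual Plane are stored as surrogate pairs, and codePointAt(0) combines the pair into the complete code point 128522 (U+1F60A). Calling codePointAt(1) would instead return 56842, the low surrogate on its own.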
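The arithmetic behind a surrogate pair can be reproduced by hand. The sketch below reads the two code units with charCodeAt(), combines them using the standard UTF-16 formula, and checks the result against codePointAt(). The variable names are illustrative.

```javascript
const smiley = '😊';

const high = smiley.charCodeAt(0); // 55357 (0xD83D), high surrogate
const low = smiley.charCodeAt(1);  // 56842 (0xDE0A), low surrogate

// UTF-16 decoding: 10 bits from each surrogate, offset by 0x10000.
const combined = (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000;

console.log(combined);                                  // 128522
console.log(combined === smiley.codePointAt(0));        // true
console.log(String.fromCodePoint(combined) === smiley); // true
```

String.fromCodePoint() is the inverse operation: it turns the code point back into the original character.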

Exploring Sequential Characters

Loop Through Characters of a String

  1. Use a for...of loop to traverse the string. It iterates by code point, so a surrogate pair is visited once as a whole character.

  2. Apply codePointAt(0) to each character to retrieve and print its code point.

    javascript
    let greeting = "Hello 🌍!";
    // for...of yields whole characters, never half of a surrogate pair
    for (const char of greeting) {
        console.log(`Character: ${char}, Code Point: ${char.codePointAt(0)}`);
    }
    

    This code goes through each character in the string and outputs its Unicode code point. The globe emoji is reported once, with the complete code point 127757. An index-based loop running up to greeting.length would instead visit both code units of the emoji, printing 127757 at the high surrogate and then 57101 for the lone low surrogate.
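When the code points are needed as data rather than console output, they can be collected into an array. The sketch below relies on the string iterator, which yields whole code points, and contrasts it with reading by raw index. The helper name codePointsOf is hypothetical.

```javascript
// Hypothetical helper: returns one code point per character of the string.
function codePointsOf(str) {
  // Array.from uses the string iterator, so surrogate pairs stay together.
  return Array.from(str, (ch) => ch.codePointAt(0));
}

const sample = 'Hi 🌍!';

console.log(sample.length);         // 6 code units
console.log(codePointsOf(sample));  // [ 72, 105, 32, 127757, 33 ]

// Reading by raw index can land in the middle of a surrogate pair.
console.log(sample.codePointAt(3)); // 127757, starts at the high surrogate
console.log(sample.codePointAt(4)); // 57101, the low surrogate on its own
```

The array has five entries while length reports six, because the emoji counts as one character but two code units.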

Conclusion

The JavaScript codePointAt() method is a reliable way to read Unicode code points, including those of characters that UTF-16 stores as surrogate pairs. Use it in web applications that deal with internationalization, emojis, specialized symbols, and other Unicode-related tasks, and keep in mind that its index counts UTF-16 code units rather than characters. Handling code points correctly helps your software display and process text from diverse languages consistently across platforms and devices.