Whitespace in strings is sometimes essential for formatting, but it often has to be removed before processing text, for example when preparing data for machine learning models, handling user input from web forms, or tidying strings for display. Java's standard library provides several ways to do this.
In this article, you will learn how to remove all whitespace from a string in Java. Explore different methods, including the replaceAll() method with regular expressions, Java streams, and external libraries such as Apache Commons Lang, each with an example.
Understand that the String class in Java has a replaceAll() method, which accepts a regular expression and a replacement string. The method scans the string and replaces every substring that matches the regex with the given replacement.
Use the regex \s+ (written as "\\s+" in a Java string literal), which matches one or more consecutive whitespace characters.
public class RemoveSpaces {
    public static void main(String[] args) {
        String original = "Java is fun";
        // Replace every run of whitespace with an empty string
        String noSpaces = original.replaceAll("\\s+", "");
        System.out.println(noSpaces);
    }
}
Here, the replaceAll("\\s+", "") call replaces every run of whitespace characters in the string original with an empty string, effectively removing them. The output is Javaisfun.
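String.replaceAll() compiles its regex on every call, so code that strips whitespace from many strings may prefer to compile the pattern once. The following is a minimal sketch of that idea; the class name WhitespaceStripper and the helper strip() are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

class WhitespaceStripper {
    // Compiled once and reused; Pattern instances are immutable and thread-safe
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Hypothetical helper: removes every run of characters matched by \s
    static String strip(String input) {
        return WHITESPACE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("Java is fun"));   // Javaisfun
        System.out.println(strip("a \t b \n c"));   // abc
    }
}
```

For a single call the result is identical to original.replaceAll("\\s+", ""); the precompiled form only saves the repeated compilation.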
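Consider tabs, newlines, carriage returns, and Unicode whitespace in addition to plain spaces.
Apply the same replaceAll() call to remove tabs, newlines, and carriage returns. Note that \s by default matches only ASCII whitespace (space, \t, \n, \r, form feed, and vertical tab), so Unicode whitespace such as the no-break space (U+00A0) is left in place unless the pattern enables the UNICODE_CHARACTER_CLASS flag, for example with the embedded flag (?U).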
public class RemoveAllWhitespaces {
    public static void main(String[] args) {
        // Contains spaces, a tab, a newline, and a carriage return + newline pair
        String original = "Java\t is\n fun \r\n";
        String noSpaces = original.replaceAll("\\s+", "");
        System.out.println(noSpaces);
    }
}
The code snippet above removes the tab (\t), newline (\n), and carriage return (\r) characters along with the spaces. The output is Javaisfun.
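By default, \s in java.util.regex covers only ASCII whitespace, so characters such as the no-break space (U+00A0) or the em space (U+2003) survive the call. The sketch below shows one way to include them, by enabling UNICODE_CHARACTER_CLASS through the embedded flag (?U); the class and the helper stripUnicode() are hypothetical names used only for this illustration.

```java
class UnicodeWhitespaceDemo {
    // Hypothetical helper: (?U) enables UNICODE_CHARACTER_CLASS, so \s
    // follows the Unicode White_Space property instead of the ASCII set
    static String stripUnicode(String input) {
        return input.replaceAll("(?U)\\s+", "");
    }

    public static void main(String[] args) {
        // The separators are a no-break space (U+00A0) and an em space (U+2003)
        String original = "Java\u00A0is\u2003fun";

        // The default \s matches neither separator, so the length stays at 11
        System.out.println(original.replaceAll("\\s+", "").length());

        // With (?U) both separators are removed
        System.out.println(stripUnicode(original)); // Javaisfun
    }
}
```

Whether Unicode-aware matching is wanted depends on the input: text copied from web pages or word processors often contains no-break spaces, while purely ASCII data does not need the flag.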
Leverage Java 8's Streams API to filter out whitespace characters from a string.
Convert the string into a stream of characters, filter out the whitespace characters, and collect the result back into a string.
import java.util.stream.Collectors;

public class RemoveWhitespacesStreams {
    public static void main(String[] args) {
        String original = " Stream fun ";
        String noSpaces = original.chars()
            // Keep only characters that are not whitespace
            .filter(c -> !Character.isWhitespace(c))
            // Turn each remaining char value back into a one-character String
            .mapToObj(c -> String.valueOf((char) c))
            .collect(Collectors.joining());
        System.out.println(noSpaces);
    }
}
This example drops every character that Character.isWhitespace() identifies as whitespace, which includes spaces, tabs, line terminators, and Unicode space separators other than the non-breaking spaces (U+00A0, U+2007, U+202F). The filter() method lets only non-whitespace characters pass through, and Collectors.joining() concatenates them back into a String. The output is Streamfun.
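The stream pipeline can also collect straight into a StringBuilder instead of creating a one-character String per element. The following is a sketch of that variant, not a replacement for the approach above; it iterates over codePoints() so that characters outside the Basic Multilingual Plane stay intact, and the names StreamWhitespaceRemover and removeWhitespace() are hypothetical.

```java
class StreamWhitespaceRemover {
    // Hypothetical helper: filters code points and appends them to one StringBuilder
    static String removeWhitespace(String input) {
        return input.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .collect(StringBuilder::new,              // create the container
                         StringBuilder::appendCodePoint,  // add each kept code point
                         StringBuilder::append)           // merge partial results
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(removeWhitespace(" Stream fun "));     // Streamfun
        System.out.println(removeWhitespace("tabs\tand\nlines")); // tabsandlines
    }
}
```

The three-argument collect() is the mutable-reduction form defined on IntStream; it avoids the intermediate String objects at the cost of slightly less familiar syntax.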
Add Apache Commons Lang to your Java project to access additional string manipulation utilities. With Maven, add the dependency to your pom.xml. The version shown here is an example and may not be the latest release:
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.12.0</version>
</dependency>
After adding the dependency, use the StringUtils.deleteWhitespace() method provided by Apache Commons Lang, which removes every character that Character.isWhitespace() reports as whitespace.
import org.apache.commons.lang3.StringUtils;

public class RemoveSpacesCommonsLang {
    public static void main(String[] args) {
        String original = "Commons Lang helps \t\n";
        // No regex needed: the library handles spaces, tabs, and newlines
        String noSpaces = StringUtils.deleteWhitespace(original);
        System.out.println(noSpaces);
    }
}
In this example, StringUtils.deleteWhitespace(original) removes all whitespace characters without the need to write a regular expression. The output is CommonsLanghelps.
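When adding a dependency is not an option, similar behavior can be approximated in plain Java. The sketch below is a hypothetical stand-in written for this article, not the library's source code: it assumes the documented behavior of deleteWhitespace(), namely that whitespace is defined by Character.isWhitespace() and that null and empty input are returned unchanged.

```java
class PlainDeleteWhitespace {
    // Hypothetical dependency-free helper modeled on the documented behavior
    // of StringUtils.deleteWhitespace(); not the library implementation
    static String deleteWhitespace(String input) {
        if (input == null || input.isEmpty()) {
            return input; // null stays null, "" stays ""
        }
        StringBuilder result = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!Character.isWhitespace(c)) {
                result.append(c); // keep only non-whitespace characters
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(deleteWhitespace("Commons Lang helps \t\n")); // CommonsLanghelps
        System.out.println(deleteWhitespace(null));                      // null
    }
}
```

Unlike calling replaceAll() on a null reference, which throws a NullPointerException, this helper tolerates null input.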
Removing all whitespace from a string in Java can be achieved in several ways, each suited to different use cases. replaceAll() with a regex is flexible and sufficient for most needs. Java streams offer a more functional style that composes well with other stream operations, although creating one String per character adds overhead on long inputs. For those who want an external library with more string utilities, Apache Commons Lang provides a simple, regex-free solution. Employ these approaches in your Java projects to keep strings clean and free of unnecessary whitespace.