In Java, InputStreamReader accepts a charset that it uses to decode a byte stream into a character stream. Passing StandardCharsets.UTF_8 to the InputStreamReader constructor reads the data as UTF-8.
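import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
//...
// Each resource declared here is closed automatically, in reverse order
try (FileInputStream fis = new FileInputStream(file);
     InputStreamReader isr = new InputStreamReader(fis, StandardCharsets.UTF_8);
     BufferedReader reader = new BufferedReader(isr)
) {
    String str;
    while ((str = reader.readLine()) != null) {
        System.out.println(str);
    }
} catch (IOException e) {
    e.printStackTrace();
}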
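To see why the charset argument matters, here is a minimal sketch that decodes the same bytes twice, once as UTF-8 and once as ISO-8859-1. The class name DecodeDemo and the helper firstLine are made up for illustration, and the bytes come from an in-memory array rather than a file so the example runs anywhere. The Chinese text is written with Unicode escapes so the result does not depend on the source file's encoding.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeDemo {

    // Hypothetical helper: decodes the bytes with the given charset
    // and returns the first line of the resulting text.
    public static String firstLine(byte[] bytes, Charset charset) throws IOException {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), charset))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Two Chinese characters; UTF-8 encodes each of them as 3 bytes.
        byte[] utf8Bytes = "\u4F60\u597D".getBytes(StandardCharsets.UTF_8);

        String right = firstLine(utf8Bytes, StandardCharsets.UTF_8);
        String wrong = firstLine(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println(right.equals("\u4F60\u597D")); // true
        System.out.println(right.length());               // 2
        System.out.println(wrong.length());               // 6, one char per byte
    }
}
```

With the wrong charset nothing fails; the reader simply returns six unrelated characters, which is why passing the charset explicitly is the safer habit.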
Since Java 7, many file-reading APIs accept a charset as an argument, which makes reading a UTF-8 file straightforward.
// Java 7
BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
// Java 7 (the overload without a charset, which defaults to UTF-8, came in Java 8)
List<String> list = Files.readAllLines(path, StandardCharsets.UTF_8);
// Java 8
Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8);
// Java 11
String s = Files.readString(path, StandardCharsets.UTF_8);
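The four calls above can be tried side by side with a small self-contained sketch. It assumes Java 11 or later (for Files.readString) and writes its own temporary sample file; the class name FilesUtf8Demo and the helper method names are invented for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FilesUtf8Demo {

    // Hypothetical helper: writes two sample lines to a temp file as UTF-8.
    public static Path writeSample() throws IOException {
        Path path = Files.createTempFile("utf8-demo", ".txt");
        Files.write(path, Arrays.asList("line 1", "\u4F60\u597D"), StandardCharsets.UTF_8);
        return path;
    }

    // Java 7: read line by line through a BufferedReader.
    public static List<String> viaBufferedReader(Path path) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }

    // Java 7: load every line into memory at once.
    public static List<String> viaReadAllLines(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    // Java 8: lazy stream; it keeps the file open, so close it when done.
    public static List<String> viaLines(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.collect(Collectors.toList());
        }
    }

    // Java 11: the whole file as one String, line separators included.
    public static String viaReadString(Path path) throws IOException {
        return Files.readString(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path path = writeSample();
        try {
            List<String> expected = Arrays.asList("line 1", "\u4F60\u597D");
            System.out.println(viaBufferedReader(path).equals(expected)); // true
            System.out.println(viaReadAllLines(path).equals(expected));   // true
            System.out.println(viaLines(path).equals(expected));          // true
            System.out.println(viaReadString(path).startsWith("line 1")); // true
        } finally {
            Files.delete(path);
        }
    }
}
```

As a rough guide, readAllLines and readString suit small files that fit comfortably in memory, while newBufferedReader and lines process the file incrementally.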
1. UTF-8 File
A UTF-8 encoded file, c:\\temp\\test.txt, containing Chinese characters.

2. Read UTF-8 file
This example shows a few ways to read a UTF-8 file.
UnicodeRead.java
package com.favtuts.io.howto;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Stream;

public class UnicodeRead {

    public static void main(String[] args) {
        String fileName = "/home/tvt/workspace/favtuts/unicode.txt";
        //readUnicodeClassic(fileName);
        //readUnicodeFiles(fileName);
        //readUnicodeJava11(fileName);
        readUnicodeBufferedReader(fileName);
    }

    // Java 7 - Files.newBufferedReader(path, StandardCharsets.UTF_8)
    // Java 8 - Files.newBufferedReader(path) // default UTF-8
    public static void readUnicodeBufferedReader(String fileName) {
        Path path = Paths.get(fileName);
        // Java 8, default UTF-8
        try (BufferedReader reader = Files.newBufferedReader(path)) {
            String str;
            while ((str = reader.readLine()) != null) {
                System.out.println(str);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Java 11, adds charset to FileReader
    public static void readUnicodeJava11(String fileName) {
        try (FileReader fr = new FileReader(fileName, StandardCharsets.UTF_8);
             BufferedReader reader = new BufferedReader(fr)) {
            String str;
            while ((str = reader.readLine()) != null) {
                System.out.println(str);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void readUnicodeFiles(String fileName) {
        Path path = Paths.get(fileName);
        try {
            // Java 11
            String s = Files.readString(path, StandardCharsets.UTF_8);
            System.out.println(s);

            // Java 7 (the overload without a charset came in Java 8)
            List<String> list = Files.readAllLines(path, StandardCharsets.UTF_8);
            list.forEach(System.out::println);

            // Java 8, the stream holds the file open, so close it with try-with-resources
            try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
                lines.forEach(System.out::println);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Works on any Java version: wrap the byte stream in an InputStreamReader
    public static void readUnicodeClassic(String fileName) {
        File file = new File(fileName);
        try (FileInputStream fis = new FileInputStream(file);
             InputStreamReader isr = new InputStreamReader(fis, StandardCharsets.UTF_8);
             BufferedReader reader = new BufferedReader(isr)
        ) {
            String str;
            while ((str = reader.readLine()) != null) {
                System.out.println(str);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
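The comments in the listing note that Files.newBufferedReader(path) defaults to UTF-8 when no charset is given. The sketch below checks that behaviour on a temporary file and prints the platform default charset, which is what new FileReader(fileName) without a charset argument would use instead. The class name DefaultCharsetDemo and the helper firstLineNio are made up for this illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DefaultCharsetDemo {

    // Java 8+: with no charset argument, Files decodes the file as UTF-8.
    public static String firstLineNio(Path path) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(path)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("utf8-default", ".txt");
        try {
            // Write the raw UTF-8 bytes of two Chinese characters plus a newline.
            Files.write(path, "\u4F60\u597D\n".getBytes(StandardCharsets.UTF_8));

            System.out.println(firstLineNio(path).equals("\u4F60\u597D")); // true

            // A FileReader created without a charset decodes with this charset instead.
            System.out.println("Platform default charset: " + Charset.defaultCharset());
        } finally {
            Files.delete(path);
        }
    }
}
```

If the printed default is not UTF-8, a FileReader created without a charset may garble the same file, which is the gap the Java 11 constructor with a charset argument closes.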
Output
line 1
line 2
line 3
你好,世界
Further Reading
How to write to a UTF-8 file in Java
Download Source Code
$ git clone https://github.com/favtuts/java-core-tutorials-examples
$ cd java-io/howto