In Java, InputStreamReader accepts a charset that it uses to decode a byte stream into a character stream. To read a UTF-8 file, pass StandardCharsets.UTF_8 to the InputStreamReader constructor.
import java.nio.charset.StandardCharsets;

//...

try (FileInputStream fis = new FileInputStream(file);
     InputStreamReader isr = new InputStreamReader(fis, StandardCharsets.UTF_8);
     BufferedReader reader = new BufferedReader(isr)) {

    String str;
    while ((str = reader.readLine()) != null) {
        System.out.println(str);
    }

} catch (IOException e) {
    e.printStackTrace();
}
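To see what the charset argument actually changes, here is a minimal sketch that decodes the same bytes twice, once as UTF-8 and once as ISO-8859-1. The class name CharsetDecodeDemo and the helper firstLine are hypothetical, and an in-memory ByteArrayInputStream stands in for the file so the example runs without any file on disk.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the article's source code.
class CharsetDecodeDemo {

    // Decodes the bytes with the given charset and returns the first line.
    static String firstLine(byte[] bytes, Charset charset) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The Chinese text from the sample file, written with Unicode escapes
        // so the source compiles the same way under any source encoding.
        String text = "\u4f60\u597d,\u4e16\u754c";
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);

        // Correct charset: the original text comes back.
        System.out.println(firstLine(utf8Bytes, StandardCharsets.UTF_8));

        // Wrong charset: every byte becomes its own character (mojibake).
        System.out.println(firstLine(utf8Bytes, StandardCharsets.ISO_8859_1));
    }
}
```

With the wrong charset the reader still succeeds, it just returns one character per byte, which is why an encoding mismatch tends to show up as garbled text rather than as an exception.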
Since Java 7, many file-reading APIs accept a charset as an argument, which makes reading a UTF-8 file straightforward.
// Java 7
BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);

// Java 7
List<String> list = Files.readAllLines(path, StandardCharsets.UTF_8);

// Java 8 (the returned Stream keeps the file open; close it when done)
Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8);

// Java 11
String s = Files.readString(path, StandardCharsets.UTF_8);
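The following is a small, self-contained sketch of these Files calls in use. It writes a temporary UTF-8 file, reads it back with Files.lines inside try-with-resources, and deletes the file afterwards. The class name FilesLinesDemo, the helper roundTrip, and the temporary file prefix are all made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical demo class, not part of the article's source code.
class FilesLinesDemo {

    // Writes the lines to a temporary UTF-8 file, reads them back,
    // and removes the file again.
    static List<String> roundTrip(List<String> content) {
        try {
            Path tmp = Files.createTempFile("utf8-demo", ".txt");
            try {
                Files.write(tmp, content, StandardCharsets.UTF_8);

                // Files.lines keeps the file open until the Stream is closed,
                // so try-with-resources is the safe way to consume it.
                try (Stream<String> lines = Files.lines(tmp, StandardCharsets.UTF_8)) {
                    return lines.collect(Collectors.toList());
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Unicode escapes spell out the Chinese sample text.
        List<String> content = Arrays.asList("line 1", "\u4f60\u597d,\u4e16\u754c");
        roundTrip(content).forEach(System.out::println);
    }
}
```

Files.readAllLines and Files.readString load the whole file into memory, so Files.lines or a BufferedReader is usually the better fit for large files.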
1. UTF-8 File
A UTF-8 encoded file, c:\\temp\\test.txt, with Chinese characters.

2. Read UTF-8 file
This example shows a few ways to read a UTF-8 file.
UnicodeRead.java
package com.favtuts.io.howto;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Stream;

public class UnicodeRead {

    public static void main(String[] args) {

        String fileName = "/home/tvt/workspace/favtuts/unicode.txt";

        //readUnicodeClassic(fileName);
        //readUnicodeFiles(fileName);
        //readUnicodeJava11(fileName);
        readUnicodeBufferedReader(fileName);
    }

    // Java 7 - Files.newBufferedReader(path, StandardCharsets.UTF_8)
    // Java 8 - Files.newBufferedReader(path) // default UTF-8
    public static void readUnicodeBufferedReader(String fileName) {

        Path path = Paths.get(fileName);

        // Java 8, default UTF-8
        try (BufferedReader reader = Files.newBufferedReader(path)) {

            String str;
            while ((str = reader.readLine()) != null) {
                System.out.println(str);
            }

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Java 11, adds charset to FileReader
    public static void readUnicodeJava11(String fileName) {

        try (FileReader fr = new FileReader(fileName, StandardCharsets.UTF_8);
             BufferedReader reader = new BufferedReader(fr)) {

            String str;
            while ((str = reader.readLine()) != null) {
                System.out.println(str);
            }

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void readUnicodeFiles(String fileName) {

        Path path = Paths.get(fileName);

        try {

            // Java 11
            String s = Files.readString(path, StandardCharsets.UTF_8);
            System.out.println(s);

            // Java 7
            List<String> list = Files.readAllLines(path, StandardCharsets.UTF_8);
            list.forEach(System.out::println);

            // Java 8, try-with-resources closes the stream even if forEach throws
            try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
                lines.forEach(System.out::println);
            }

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Classic approach: FileInputStream + InputStreamReader with an explicit charset
    public static void readUnicodeClassic(String fileName) {

        File file = new File(fileName);

        try (FileInputStream fis = new FileInputStream(file);
             InputStreamReader isr = new InputStreamReader(fis, StandardCharsets.UTF_8);
             BufferedReader reader = new BufferedReader(isr)) {

            String str;
            while ((str = reader.readLine()) != null) {
                System.out.println(str);
            }

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

}
Output
line 1
line 2
line 3
你好,世界
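The readers above behave the same on a valid UTF-8 file, but they differ when the file contains bytes that are not valid UTF-8. The java.nio.file.Files methods report the problem as a MalformedInputException, while an InputStreamReader built with a Charset replaces the bad bytes with U+FFFD. The sketch below illustrates this with a made-up three-byte input. The class name MalformedUtf8Demo and its helper methods are hypothetical.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the article's source code.
class MalformedUtf8Demo {

    // 'a', then 0xFF (a byte that never occurs in valid UTF-8), then 'b'.
    static final byte[] BAD_BYTES = {'a', (byte) 0xFF, 'b'};

    // Returns true if Files.readAllLines rejects the bytes as invalid UTF-8.
    static boolean strictReadRejects(byte[] bytes) {
        try {
            Path tmp = Files.createTempFile("bad-utf8", ".txt");
            try {
                Files.write(tmp, bytes);
                Files.readAllLines(tmp, StandardCharsets.UTF_8);
                return false;
            } catch (MalformedInputException e) {
                return true;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the first line through InputStreamReader, which substitutes
    // U+FFFD for malformed input instead of throwing.
    static String lenientFirstLine(byte[] bytes) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Files.readAllLines rejects input: " + strictReadRejects(BAD_BYTES));
        System.out.println("InputStreamReader result: " + lenientFirstLine(BAD_BYTES));
    }
}
```

If silently replaced characters are not acceptable, the Files methods are the safer choice because a wrongly encoded file fails fast instead of producing corrupted text.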
Further Reading
How to write to a UTF-8 file in Java
Download Source Code
$ git clone https://github.com/favtuts/java-core-tutorials-examples
$ cd java-io/howto