The Excel spreadsheet is the very popular file format created by Microsoft. Apache POI is a pure java library developed for reading and writing Microsoft files(eg. Excel spreadsheet, Word, PowerPoint, etc.). In this article, you will see how to read Excel files using the Apache POI library.
Table of Contents
Apache POI Setup in Eclipse IDE
There are two ways for installing Apache POI in Eclipse, based on your project type:
- Maven Project
- Stand alone Java Project (Non Maven)
Note: In this article, we have used the Apache POI-4.0.1 version of the library to demonstrate.
Maven Project
If you are going to create a maven project then you have to add the following maven dependency in the pom.xml file of your project:
<!-- Used for Excel 2003 or before (xls) -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>4.0.1</version>
</dependency>
<!-- Used for Excel 2007 or later (xlsx) -->
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>4.0.1</version>
</dependency>
Stand alone Java Project (Non Maven)
If you are going to create a standalone java project then you have to add the following jar files into your java build path:
- poi-4.0.1.jar
- poi-ooxml-4.0.1.jar
- poi-ooxml-schemas-4.0.1.jar
- xmlbeans-3.0.2.jar
- curvesapi-1.05.jar
- commons-codec-1.11.jar
- commons-collections4-4.2.jar
- commons-compress-1.18.jar
- commons-math3-3.6.1.jar
You can easily download all the above jars in one place: Download Apache POI Jars
If you want to know the complete environment setup for Apache POI in Eclipse IDE, follow another article here: Apache POI – Getting Started
Apache POI – API basics to Read Excel file
The Excel file generally comes in two file formats:
- Excel 2003 or before (XLS)
- Excel 2007 or later (XLSX)
For the above two Excel file formats, the Apache POI library provides two different components HSSF(denotes the API is for working with Excel 2003 and earlier) and XSSF(denotes the API is for working with Excel 2007 and later).
Now, In order to work with Apache POI API for Excel files, the basic interfaces need to understand:
Workbook
: It represents the Excel workbook. The concrete implementation classes are:HSSFWorkbook
andXSSFWorkbook
.Sheet
: It represents the Excel worksheet. The implementation classes areHSSFSheet
andXSSFSheet
.Row
: It represents a row in a spreadsheet. The implementation classes areHSSFRow
andXSSFRow
.Cell
: It represents a cell in a row. The implementation classes areHSSFCell
andXSSFCell
.
A Sample Excel file to Read
Let’s assume the sample excel file which we are going to read looks like the below screenshot:
The above excel sheet contains the information about Employees including the Employee ID, Employee Name, Age, Email ID, Salary.
A Simple Example to Read Excel 2003(XLS) in Java
Below is the java code example that read excel’s first sheet and cell value of each row one by one:
ReadXLSExcelFile.java
package com.javacodepoint.excel;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
public class ReadXLSExcelFile {
public static void main(String[] args) throws IOException {
// Creating a xls file object with specific file path to read
File xlsFile = new File("E:/Excel/employee.xls");
// Creating input stream
FileInputStream inputStream = new FileInputStream(xlsFile);
HSSFWorkbook workbook = new HSSFWorkbook(inputStream);
// Reading the first sheet of the excel file
HSSFSheet sheet = workbook.getSheetAt(0);
Iterator<Row> iterator = sheet.iterator();
// Iterating all the rows
while (iterator.hasNext()) {
Row nextRow = iterator.next();
Iterator<Cell> cellIterator = nextRow.cellIterator();
// Iterating all the columns in a row
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
switch (cell.getCellType()) {
case STRING:
System.out.print(cell.getStringCellValue());
break;
case BOOLEAN:
System.out.print(cell.getBooleanCellValue());
break;
case NUMERIC:
System.out.print(cell.getNumericCellValue());
break;
default:
break;
}
System.out.print(" | ");
}
System.out.println();
}
// Closing the workbook and input stream
workbook.close();
inputStream.close();
}
}
OUTPUT:
Employee Id | Employee Name | Age | Email ID | Salary |
101 | John william | 30 | william.john@gmail.com | 15000$ |
102 | Harsh singh | 35 | harsh.singh@gmail.com | 20000$ |
A Simple Example to Read Excel 2007(XLSX) in Java
To read Excel 2007(.xlsx) file format, it is also almost similar to the Excel 2003(.xls). The only difference here is we have to use XSSFWorkbook
and XSSFSheet
classes instead of HSSFWorkbook
and HSSFSheet
.
Below is the java code example that read excel’s first sheet only and the cell value of each row one by one:
ReadXLSXExcelFile.java
package com.javacodepoint.excel;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class ReadXLSXExcelFile {
public static void main(String[] args) throws IOException {
// Creating a xlsx file object with specific file path to read
File xlsxFile = new File("E:/Excel/employee.xlsx");
// Creating input stream
FileInputStream inputStream = new FileInputStream(xlsxFile);
XSSFWorkbook workbook = new XSSFWorkbook(inputStream);
// Reading the first sheet of the excel file
XSSFSheet sheet = workbook.getSheetAt(0);
Iterator<Row> iterator = sheet.iterator();
// Iterating all the rows
while (iterator.hasNext()) {
Row nextRow = iterator.next();
Iterator<Cell> cellIterator = nextRow.cellIterator();
// Iterating all the columns in a row
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
switch (cell.getCellType()) {
case STRING:
System.out.print(cell.getStringCellValue());
break;
case BOOLEAN:
System.out.print(cell.getBooleanCellValue());
break;
case NUMERIC:
System.out.print(cell.getNumericCellValue());
break;
default:
break;
}
System.out.print(" | ");
}
System.out.println();
}
// Closing the workbook and input stream
workbook.close();
inputStream.close();
}
}
A Generic Example to Read both XLS and XLSX in Java
To read both file formats in a single program, we have to use common interfaces Workbook
, Sheet
instead of specific classes like HSSFWorkbook
, XSSFWorkbook
, HSSFSheet
, XSSFSheet
, etc.
Below is the java code example that read the first sheet and the cell value of each row one by one:
ReadBothFormatExcelFile.java
package com.javacodepoint.excel;
import java.io.File;
import java.io.FileInputStream;
import java.util.Iterator;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class ReadBothFormatExcelFile {
public static void main(String[] args) throws Exception {
// Creating a xlsx file object with specific file path to read
File xlsxFile = new File("E:/Excel/employee.xlsx");
// Creating input stream
FileInputStream inputStream = new FileInputStream(xlsxFile);
Workbook workbook = null;
// Based on type workbook will crate
if (xlsxFile.getName().endsWith(".xlsx")) {
workbook = new XSSFWorkbook(inputStream);
} else if (xlsxFile.getName().endsWith(".xls")) {
workbook = new HSSFWorkbook(inputStream);
} else {
inputStream.close();
throw new Exception("File not supported!");
}
// Reading the first sheet of the excel file
Sheet sheet = workbook.getSheetAt(0);
Iterator<Row> iterator = sheet.iterator();
// Iterating all the rows
while (iterator.hasNext()) {
Row nextRow = iterator.next();
Iterator<Cell> cellIterator = nextRow.cellIterator();
// Iterating all the columns in a row
while (cellIterator.hasNext()) {
Cell cell = cellIterator.next();
switch (cell.getCellType()) {
case STRING:
System.out.print(cell.getStringCellValue());
break;
case BOOLEAN:
System.out.print(cell.getBooleanCellValue());
break;
case NUMERIC:
System.out.print(cell.getNumericCellValue());
break;
default:
break;
}
System.out.print(" | ");
}
System.out.println();
}
// Closing the workbook and input stream
workbook.close();
inputStream.close();
}
}
Conclusion
Here you have seen how do you read the Excel file of 2003 or before and 2007 or later file format in java using Apache POI library.
If you want to download the above example programs developed using maven and eclipse, download them from the below link.