How to read Excel files in java using Apache POI?

The Excel spreadsheet is the very popular file format created by Microsoft. Apache POI is a pure java library developed for reading and writing Microsoft files(eg. Excel spreadsheet, Word, PowerPoint, etc.). In this article, you will see how to read Excel files using the Apache POI library.

Apache POI Setup in Eclipse IDE

There are two ways for installing Apache POI in Eclipse, based on your project type:

  1. Maven Project
  2. Stand alone Java Project (Non Maven)

Note: In this article, we have used the Apache POI-4.0.1 version of the library to demonstrate.

Maven Project

If you are going to create a maven project then you have to add the following maven dependency in the pom.xml file of your project:

<!-- Used for Excel 2003 or before (xls) -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>4.0.1</version>
</dependency>

<!-- Used for Excel 2007 or later (xlsx)  -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>4.0.1</version>
</dependency>

Stand alone Java Project (Non Maven)

If you are going to create a standalone java project then you have to add the following jar files into your java build path:

  1. poi-4.0.1.jar
  2. poi-ooxml-4.0.1.jar
  3. poi-ooxml-schemas-4.0.1.jar
  4. xmlbeans-3.0.2.jar
  5. curvesapi-1.05.jar
  6. commons-codec-1.11.jar
  7. commons-collections4-4.2.jar
  8. commons-compress-1.18.jar
  9. commons-math3-3.6.1.jar

You can easily download all the above jars in one place: Download Apache POI Jars

If you want to know the complete environment setup for Apache POI in Eclipse IDE, follow another article here: Apache POI – Getting Started

Apache POI – API basics to Read Excel file

The Excel file generally comes in two file formats:

  1. Excel 2003 or before (XLS)
  2. Excel 2007 or later (XLSX)

For the above two Excel file formats, the Apache POI library provides two different components HSSF(denotes the API is for working with Excel 2003 and earlier) and XSSF(denotes the API is for working with Excel 2007 and later).

Now, In order to work with Apache POI API for Excel files, the basic interfaces need to understand:

  • Workbook: It represents the Excel workbook. The concrete implementation classes are: HSSFWorkbook and XSSFWorkbook.
  • Sheet: It represents the Excel worksheet. The implementation classes are HSSFSheet and XSSFSheet.
  • Row: It represents a row in a spreadsheet. The implementation classes are HSSFRow and XSSFRow.
  • Cell: It represents a cell in a row. The implementation classes are HSSFCell and XSSFCell.

A Sample Excel file to Read

Let’s assume the sample excel file which we are going to read looks like the below screenshot:

Sample Excel sheet

The above excel sheet contains the information about Employees including the Employee ID, Employee Name, Age, Email ID, Salary.

A Simple Example to Read Excel 2003(XLS) in Java

Below is the java code example that read excel’s first sheet and cell value of each row one by one:

ReadXLSExcelFile.java

package com.javacodepoint.excel;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;

import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;

public class ReadXLSExcelFile {

	public static void main(String[] args) throws IOException {

		// Creating a xls file object with specific file path to read
		File xlsFile = new File("E:/Excel/employee.xls");

		// Creating input stream
		FileInputStream inputStream = new FileInputStream(xlsFile);

		HSSFWorkbook workbook = new HSSFWorkbook(inputStream);
		
		// Reading the first sheet of the excel file
		HSSFSheet sheet = workbook.getSheetAt(0);
		
		Iterator<Row> iterator = sheet.iterator();

		// Iterating all the rows
		while (iterator.hasNext()) {
			Row nextRow = iterator.next();
			Iterator<Cell> cellIterator = nextRow.cellIterator();

			// Iterating all the columns in a row
			while (cellIterator.hasNext()) {
				
				Cell cell = cellIterator.next();

				switch (cell.getCellType()) {
				case STRING:
					System.out.print(cell.getStringCellValue());
					break;
				case BOOLEAN:
					System.out.print(cell.getBooleanCellValue());
					break;
				case NUMERIC:
					System.out.print(cell.getNumericCellValue());
					break;
				default:
					break;
				}
				System.out.print(" | ");
			}
			System.out.println();
		}

		// Closing the workbook and input stream
		workbook.close();
		inputStream.close();
	}

}

OUTPUT:

Employee Id | Employee Name | Age | Email ID | Salary |
101 | John william | 30 | william.john@gmail.com | 15000$ |
102 | Harsh singh | 35 | harsh.singh@gmail.com | 20000$ |

A Simple Example to Read Excel 2007(XLSX) in Java

To read Excel 2007(.xlsx) file format, it is also almost similar to the Excel 2003(.xls). The only difference here is we have to use XSSFWorkbook and XSSFSheet classes instead of HSSFWorkbook and HSSFSheet.

Below is the java code example that read excel’s first sheet only and the cell value of each row one by one:

ReadXLSXExcelFile.java

package com.javacodepoint.excel;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Iterator;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ReadXLSXExcelFile {

	public static void main(String[] args) throws IOException {

		// Creating a xlsx file object with specific file path to read
		File xlsxFile = new File("E:/Excel/employee.xlsx");

		// Creating input stream
		FileInputStream inputStream = new FileInputStream(xlsxFile);

		XSSFWorkbook workbook = new XSSFWorkbook(inputStream);
		
		// Reading the first sheet of the excel file
		XSSFSheet sheet = workbook.getSheetAt(0);
		
		Iterator<Row> iterator = sheet.iterator();

		// Iterating all the rows
		while (iterator.hasNext()) {
			Row nextRow = iterator.next();
			Iterator<Cell> cellIterator = nextRow.cellIterator();

			// Iterating all the columns in a row
			while (cellIterator.hasNext()) {
				
				Cell cell = cellIterator.next();

				switch (cell.getCellType()) {
				case STRING:
					System.out.print(cell.getStringCellValue());
					break;
				case BOOLEAN:
					System.out.print(cell.getBooleanCellValue());
					break;
				case NUMERIC:
					System.out.print(cell.getNumericCellValue());
					break;
				default:
					break;
				}
				System.out.print(" | ");
			}
			System.out.println();
		}

		// Closing the workbook and input stream
		workbook.close();
		inputStream.close();
	}

}

A Generic Example to Read both XLS and XLSX in Java

To read both file formats in a single program, we have to use common interfaces Workbook, Sheet instead of specific classes like HSSFWorkbook, XSSFWorkbook, HSSFSheet, XSSFSheet, etc.

Below is the java code example that read the first sheet and the cell value of each row one by one:

ReadBothFormatExcelFile.java

package com.javacodepoint.excel;

import java.io.File;
import java.io.FileInputStream;
import java.util.Iterator;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ReadBothFormatExcelFile {

	public static void main(String[] args) throws Exception {

		// Creating a xlsx file object with specific file path to read
		File xlsxFile = new File("E:/Excel/employee.xlsx");

		// Creating input stream
		FileInputStream inputStream = new FileInputStream(xlsxFile);

		Workbook workbook = null;

		// Based on type workbook will crate
		if (xlsxFile.getName().endsWith(".xlsx")) {
			workbook = new XSSFWorkbook(inputStream);
		} else if (xlsxFile.getName().endsWith(".xls")) {
			workbook = new HSSFWorkbook(inputStream);
		} else {
			inputStream.close();
			throw new Exception("File not supported!");
		}

		// Reading the first sheet of the excel file
		Sheet sheet = workbook.getSheetAt(0);

		Iterator<Row> iterator = sheet.iterator();

		// Iterating all the rows
		while (iterator.hasNext()) {
			Row nextRow = iterator.next();
			Iterator<Cell> cellIterator = nextRow.cellIterator();

			// Iterating all the columns in a row
			while (cellIterator.hasNext()) {

				Cell cell = cellIterator.next();

				switch (cell.getCellType()) {
				case STRING:
					System.out.print(cell.getStringCellValue());
					break;
				case BOOLEAN:
					System.out.print(cell.getBooleanCellValue());
					break;
				case NUMERIC:
					System.out.print(cell.getNumericCellValue());
					break;
				default:
					break;
				}
				System.out.print(" | ");
			}
			System.out.println();
		}

		// Closing the workbook and input stream
		workbook.close();
		inputStream.close();
	}

}

Conclusion

Here you have seen how do you read the Excel file of 2003 or before and 2007 or later file format in java using Apache POI library.

If you want to download the above example programs developed using maven and eclipse, download them from the below link.


Related Articles

Share with friends

Leave a Comment

Your email address will not be published. Required fields are marked *