Most clients send their financials as PDF files, but to perform analysis, you need to import the data into Microsoft Excel. To help save you time, we will show you two ways to import a PDF into Excel:
- Save the PDF as a Word file and then copy and paste the desired tables into Excel.
- Use the Power Query tool to import the desired tables directly from the PDF into Excel. As we will see, Power Query allows you to transform the table if needed.
In this example, our goal is to import table data from the Microsoft 2020 annual report PDF into Excel. The table we want is the financial highlights table on page 7 of the PDF (see Figure 1).
Figure 1. Microsoft financial highlights 2016-2020 PDF table.
How to import a PDF into Excel using Word
After saving our PDF as a Word document, find the desired table on page 7 of the file. To import this table into Excel, simply copy the table in Word and paste it into a worksheet. The result is shown in Figure 2 and in the Word worksheet of the workbook PDF.xlsx.
Figure 2. Microsoft financial highlights 2016-2020 Word table.
Pro tip: If the footnote references in parentheses bother you, you can remove them manually. Power Query also makes removing them a snap!
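If you prefer to script the cleanup, the idea behind the tip can be sketched in a few lines of Python. This is only an illustration, not part of the Excel workflow: the function name `strip_footnote_refs` and the sample labels are hypothetical, and we assume a footnote reference is a one- or two-digit number or a single lowercase letter in parentheses or square brackets at the end of a row label.

```python
import re

# Assumption: a footnote reference looks like "(1)", "(b)", or "[2]" at the
# very end of a text label. Apply this to label cells only, because a short
# accounting-style negative such as "(12)" would match the same pattern.
_FOOTNOTE_REF = re.compile(r"\s*[(\[]\s*(?:\d{1,2}|[a-z])\s*[)\]]\s*$")


def strip_footnote_refs(label: str) -> str:
    """Return the label without a trailing footnote reference."""
    return _FOOTNOTE_REF.sub("", label)


# Hypothetical labels, not taken from the annual report.
labels = ["Item A (1)", "Item B [2]", "Item C"]
cleaned = [strip_footnote_refs(label) for label in labels]
print(cleaned)
```

Restricting the pattern to the end of the label keeps parentheses that are part of a row name, such as "Item (total)", untouched.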
Using Power Query to import a PDF to Excel
The table we want to import is on page 7 of our PDF. Follow these steps to import data from a PDF to Excel via Power Query:
- From the Data tab, choose Get Data from the Get & Transform Group.
- From the Get Data dropdown, choose From File and then From PDF.
- Browse to the relevant PDF and wait for Excel to process it. As shown in Figure 3, you will see a list of the tables found in the PDF with their corresponding page numbers.
Figure 3. List of tables in the Microsoft 2020 annual report PDF.
- Browsing through the tables, we see that Table003 on page 8 is what we need. Select it and then choose Transform Data to open the Power Query Editor, where we can delete any unnecessary columns.
- Select Use First Row as Headers from the Home tab to ensure the years appear in the header row.
- Hold down the Control key and click the column headings of all non-numeric columns except the first column. Then right-click one of the selected headings and choose Remove Columns to remove the unneeded columns.
- From the File menu, choose Close & Load To, and then select Table in the Import Data dialog. The resulting table is shown in Figure 4.
Figure 4. Microsoft financials imported from PDF to Excel.
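To make the logic of the transformation steps concrete, here is a rough Python analogy of what we asked Power Query to do: promote the first row to headers, then drop every non-numeric column except the first. Power Query records its steps in its own formula language, so this is not what Excel runs. The helper names and the tiny sample table are hypothetical placeholders, not data from the annual report.

```python
def promote_headers(rows):
    """Treat the first row as column headers (like Use First Row as Headers)."""
    header, *body = rows
    return [str(cell).strip() for cell in header], body


def is_numeric(text):
    """Return True for values such as '1,200', '$300', or '(250)'."""
    cleaned = text.replace(",", "").replace("$", "").strip()
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = cleaned[1:-1]  # accounting-style negative number
    try:
        float(cleaned)
        return True
    except ValueError:
        return False


def drop_non_numeric_columns(header, body):
    """Keep the first (label) column plus every fully numeric column."""
    keep = [0]
    for col in range(1, len(header)):
        if all(is_numeric(row[col]) for row in body):
            keep.append(col)
    new_header = [header[i] for i in keep]
    new_body = [[row[i] for i in keep] for row in body]
    return new_header, new_body


# Hypothetical raw extraction with a stray footnote column in the middle.
raw = [
    ["Item", "2020", "", "2019"],
    ["Item A", "1,200", "(a)", "1,000"],
    ["Item B", "300", "", "(250)"],
]

header, body = promote_headers(raw)
header, body = drop_non_numeric_columns(header, body)
print(header)
for row in body:
    print(row)
```

In this sketch the stray middle column is dropped because it contains a footnote letter and a blank, while the label column is always kept, mirroring the manual selection in the steps above.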
Learn more with Excel CPE
Excel offers a wide variety of functions, formulas, and automation features that can save you hours of work while improving accuracy and outcomes.
Check out Becker's CPE courses that teach you to make the most of this powerful tool:
- Excel: Technical Analysis Trading Strategies
- Excel: Enterprise Risk Management
- Excel: Magic with Excel
- Excel: Solve Hard Problems in Corporate Finance
- Excel Metrics: Best Practices
- Python for Excel Users: A Gentle Introduction
These and many more Excel-focused, CPE credit-earning courses are included in Becker's Prime CPE subscription. Sign up now for 12 months of access to over 1,700 on-demand, webcast, and podcast CPE courses!