How to call all the excel file in a folder and run through the pythons one by one and have the results save in one excel file? - TagMerge

How to read all the Excel files in a folder, run each one through Python, and save the results in one Excel file?

Asked 1 year ago
0
3 answers

So you want to read multiple files, get a specific cell and then create a new data frame and save it as a new Excel file:

import glob
import pandas as pd

cells = []
for f in glob.glob("*.xlsx"):
    data = pd.read_excel(f, sheet_name='Sheet1')
    cells.append(data.iloc[3, 5])  # row index 3, column index 5
pd.Series(cells).to_excel('file.xlsx')

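In my particular example I took the value at row index 3, column index 5. Note that with the default settings read_excel treats the first spreadsheet row as the header, so this position is cell F5 in Excel; pass header=None if you want iloc[3, 5] to be cell F4. You can take any other cell you like, or more than one cell: save each to its own list and combine the lists at the end. You could also use more complex logic, such as checking one cell to decide which other cell to look at next.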
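As a rough sketch of the "more than one cell" and "check one cell to decide" ideas, the snippet below works on DataFrames that are already in memory. The helper name `pick_cells`, the cell positions, and the rule "read column 1 if the top-left value is positive, otherwise column 2" are all made up for illustration.

```python
import pandas as pd


def pick_cells(data):
    """Return two values from one sheet (hypothetical positions and rule)."""
    first = data.iloc[3, 5]  # same position as in the snippet above
    # Hypothetical rule: the top-left value decides which column to read next.
    second = data.iloc[1, 1] if data.iloc[0, 0] > 0 else data.iloc[1, 2]
    return first, second


# Stand-ins for sheets that would normally come from pd.read_excel(...)
sheets = [
    pd.DataFrame([[1, 10, 20, 0, 0, 0]] * 3 + [[1, 10, 20, 0, 0, 99]]),
    pd.DataFrame([[-1, 10, 20, 0, 0, 0]] * 3 + [[-1, 10, 20, 0, 0, 77]]),
]

firsts, seconds = [], []
for data in sheets:
    a, b = pick_cells(data)
    firsts.append(a)
    seconds.append(b)

# Combine the two lists into one table at the end
result = pd.DataFrame({"first": firsts, "second": seconds})
```

Keeping the per-file logic in one small function means the loop over files stays the same when the extraction rule changes.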

The key point is that you want to iterate through a bunch of files and for each of them:

  • read the file
  • extract whatever data you are interested in
  • set this data aside somewhere

Once you've gone through all the files combine all the data in any way that you like and then save it to disk in a format of your choice.
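The steps above could be sketched as a single function. This is only one possible arrangement: the function name `collect_cells`, its parameters, and the column names are assumptions for illustration. `header=None` is passed so that `iloc[3, 5]` lines up with spreadsheet cell F4, which is a choice made for this sketch.

```python
import glob
import os

import pandas as pd


def collect_cells(folder, row=3, col=5, sheet_name=0):
    """Read one cell from every .xlsx file in `folder` and return a DataFrame."""
    records = []
    for path in sorted(glob.glob(os.path.join(folder, "*.xlsx"))):
        # 1. read the file (no header row, so positions match the spreadsheet)
        data = pd.read_excel(path, sheet_name=sheet_name, header=None)
        # 2. extract the data of interest
        value = data.iloc[row, col]
        # 3. set it aside, together with the file it came from
        records.append({"file": os.path.basename(path), "value": value})
    # Combine everything once all files have been processed
    return pd.DataFrame(records, columns=["file", "value"])


# Example usage with hypothetical paths:
# summary = collect_cells("input_files")
# summary.to_excel("summary.xlsx", index=False)
```

Writing the summary outside the input folder avoids it being picked up as an input file the next time the script runs.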

Source: link

0

Let’s create some mock-up DataFrames so we have something to work with. We create two: the first has 20 rows by 10 columns of random numbers, and the second has 10 rows by 1 column.
import pandas as pd
import numpy as np


df_1 = pd.DataFrame(np.random.rand(20,10))
df_2 = pd.DataFrame(np.random.rand(10,1))
Method 1: this is the method demonstrated in the official pandas documentation, using ExcelWriter as a context manager.
with pd.ExcelWriter('mult_sheets_1.xlsx') as writer1:
    df_1.to_excel(writer1, sheet_name = 'df_1', index = False)
    df_2.to_excel(writer1, sheet_name = 'df_2', index = False)
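One way to check that both sheets really ended up in the same workbook is to read the file back. The sketch below repeats the writing step in a temporary folder (an assumption made so the example cleans up after itself) and then calls `pd.read_excel` with `sheet_name=None`, which returns a dictionary of DataFrames keyed by sheet name.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df_1 = pd.DataFrame(np.random.rand(20, 10))
df_2 = pd.DataFrame(np.random.rand(10, 1))

# Write to a temporary folder so the sketch leaves no files behind
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "mult_sheets_1.xlsx")
    with pd.ExcelWriter(path) as writer:
        df_1.to_excel(writer, sheet_name="df_1", index=False)
        df_2.to_excel(writer, sheet_name="df_2", index=False)

    # sheet_name=None asks read_excel for every sheet, keyed by sheet name
    sheets = pd.read_excel(path, sheet_name=None)

sheet_shapes = {name: frame.shape for name, frame in sheets.items()}
print(sheet_shapes)
```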
Method 2: this is my personally preferred method. Let me show you what it looks like, then tell you why I prefer it over method 1.
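writer2 = pd.ExcelWriter('mult_sheets_2.xlsx')

df_1.to_excel(writer2, sheet_name = 'df_1', index = False)
df_2.to_excel(writer2, sheet_name = 'df_2', index = False)

# ExcelWriter.save() was deprecated in pandas 1.5 and removed in 2.0; close() writes the file to disk
writer2.close()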

Source: link

0

Now that you’re aware of the benefits of a tool like openpyxl, let’s get down to it and start by installing the package. For this tutorial, you should use Python 3.7 and openpyxl 2.6.2. To install the package, you can do the following:
$ pip install openpyxl
After you install the package, you should be able to create a super simple spreadsheet with the following code:
from openpyxl import Workbook

workbook = Workbook()
sheet = workbook.active

sheet["A1"] = "hello"
sheet["B1"] = "world!"

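workbook.save(filename="hello_world.xlsx")

An existing workbook, such as the sample.xlsx file used below, can be loaded and inspected in an interactive session: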
>>> from openpyxl import load_workbook
>>> workbook = load_workbook(filename="sample.xlsx")
>>> workbook.sheetnames
['Sheet 1']

>>> sheet = workbook.active
>>> sheet
<Worksheet "Sheet 1">

>>> sheet.title
'Sheet 1'

>>> sheet["A1"]
<Cell 'Sheet 1'.A1>

>>> sheet["A1"].value
'marketplace'

>>> sheet["F10"].value
"G-Shock Men's Grey Sport Watch"

>>> sheet.cell(row=10, column=6)
<Cell 'Sheet 1'.F10>

>>> sheet.cell(row=10, column=6).value
"G-Shock Men's Grey Sport Watch"

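Applied to the original question, the same openpyxl calls could be combined into a loop over a folder. The following is a minimal sketch, not part of the quoted tutorial: the function name `collect_with_openpyxl`, the default cell "F4", and the output filename are assumptions for illustration.

```python
import glob
import os

from openpyxl import Workbook, load_workbook


def collect_with_openpyxl(folder, cell="F4", output_path="combined.xlsx"):
    """Copy one cell from each workbook in `folder` into a new workbook."""
    out_book = Workbook()
    out_sheet = out_book.active
    out_sheet.append(["file", "value"])  # header row

    for path in sorted(glob.glob(os.path.join(folder, "*.xlsx"))):
        # data_only=True returns cached values instead of formula strings
        book = load_workbook(filename=path, data_only=True)
        value = book.active[cell].value
        out_sheet.append([os.path.basename(path), value])

    out_book.save(filename=output_path)
    return output_path
```

Pointing `output_path` at a location outside `folder` keeps the combined file from being read as an input on the next run.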
Source: link
