The final example is the most involved but illustrates a powerful approach for blending the data analysis of python with the user interface of Excel.
It is possible to build complex excel with pandas but that approach can be very laborious. An alternative approach would be to build up the complex file in Excel, then do the data manipulation and copy the data tab to the final Excel output.
Here is an example of the Excel dashboard we want to create:
Yes, I know that pie charts are awful but I can almost guarantee that someone is going to ask you to put one in the dashboard at some point in time! Also, this template had a pie chart and I decided to keep it in the final output instead of trying to figure out another chart.
It might be helpful to take a step back and look at the basic process the code will follow:
Let%u2019s get started with the code.
import win32com.client as win32 import pandas as pd from pathlib import Path # Read in the remote data file df = pd.read_csv("https://github.com/chris1610/pbpython/blob/master/data/sample-sales-tax.csv?raw=True") # Define the full path for the data file file data_file = Path.cwd() / "sales_summary.xlsx" # Define the full path for the final output file save_file = Path.cwd() / "sales_dashboard.xlsx" # Define the template file template_file = Path.cwd() / "sample_dashboard_template.xlsx"
In the section we performed our imports, read in the data and defined all three files. Of note is that this process includes the step of summarizing the data with pandas and saving the data in an Excel file. We then re-open that file and copy the data into the template. It is a bit convoluted but this is the best approach I could figure out for this scenario.
Next we perform the analysis and save the temp Excel file:
# Do some summary calcs # In the real world, this would likely be much more involved df_summary = df.groupby('category')['quantity', 'ext price', 'Tax amount'].sum() # Save the file as Excel df_summary.to_excel(data_file, sheet_name="Data")
Now we use COM to merge the temp output file into our Excel dashboard tab and save a new copy:
# Use com to copy the files around excel = win32.gencache.EnsureDispatch('Excel.Application') excel.Visible = False excel.DisplayAlerts = False # Template file wb_template = excel.Workbooks.Open(template_file) # Open up the data file wb_data = excel.Workbooks.Open(data_file) # Copy from the data file (select all data in A:D columns) wb_data.Worksheets("Data").Range("A:D").Select() # Paste into the template file excel.Selection.Copy(Destination=wb_template.Worksheets("Data").Range("A1")) # Must convert the path file object to a string for the save to work wb_template.SaveAs(str(save_file))
The code opens up Excel and makes sure it is not visible. Then it opens up the dashboard template and data files. It uses the
Range("A:D").Select() to select all the data and then copies it into the template file.
The final step is to save the template as a new file.
This approach can be a very convenient shortcut when you have a situation where you want to use python for data manipulation but need a complex Excel output. You may not have an apparent need for it now but if you ever build up a complex Excel report, this approach is much simpler than trying to code the spreadsheet by hand with python.
|Python AdaBoost Mathematics Behind AdaBoost||421||1|
|Python PyCaret How to optimize the probability threshold % in binary classification||2069||0|
|Python K-means Predicting Iris Flower Species||1322||2|
|Python PyCaret How to ignore certain columns for model building||2624||0|
|Python PyCaret Experiment Logging||679||0|
|Python PyWin32 Open a File in Excel||941||0|
|Python Guppy GSL Introduction||219||2|
|Python Usage of Guppy With Example||1100||2|
|Python Naive Bayes Tutorial||552||2|
|Python Guppy Recent Memory Usage of a Program||892||2|
|Introduction to AdaBoost||289||1|
|Python AdaBoost Implementation of AdaBoost||512||1|
|Python AdaBoost Advantages and Disadvantages of AdaBoost||3713||1|
|Python K-Means Clustering Applications||332||2|
|Python Random Forest Algorithm Decision Trees||439||0|
|Python K-means Clustering PREDICTING IRIS FLOWER SPECIES||456||1|
|Python Random Forest Algorithm Bootstrap||476||0|
|Python PyCaret Util Functions||441||0|
|Python K-means Music Genre Classification||1763||1|
|Python PyWin Attach an Excel file to Outlook||1541||0|
|Python Guppy GSL Document and Test Example||248||2|
|Python Random Forest Algorithm Bagging||386||0|
|Python AdaBoost An Example of How AdaBoost Works||279||1|
|Python PyWin32 Getting Started PyWin32||602||0|
|Python Naive Bayes in Machine Learning||374||2|
|Python PyCaret How to improve results from hyperparameter tuning by increasing "n_iter"||1723||0|
|Python PyCaret Getting Started with PyCaret 2.0||356||1|
|Python PyCaret Tune Model||1325||1|
|Python PyCaret Create your own AutoML software||320||0|
|Python PyCaret Intoduction to PyCaret||296||1|
|Python PyCaret Compare Models||2696||1|
|Python PyWin Copying Data into Excel||1153||0|
|Python Guppy Error: expected function body after function declarator||413||2|
|Python Coding Random forest classifier using xgBoost||247||0|
|Python PyCaret How to tune "n parameter" in unsupervised experiments||658||0|
|Python PyCaret How to programmatically define data types in the setup function||1403||0|
|Python PyCaret Ensemble Model||805||1|
|Python Random forest algorithm Introduction||227||0|
|Python k-means Clustering Example||337||1|
|Python PyCaret Plot Model||1243||1|
|Python Hamming Distance||715||0|
|Python Understanding Random forest algorithm||311||0|
|Python PyCaret Sort a Dictionary by Keys||244||0|
|Python Coding Random forest classifier using sklearn||340||0|
|Python Guppy Introduction||368||2|
|Python How to use Guppy/Heapy for tracking down Memory Usage||1069||2|
|Python AdaBoost Summary and Conclusion||231||1|
|Python PyCaret Create Model||365||1|
|Python k -means Clusturing Introduction||325||2|
|Python k-means Clustering With Example||348||2|