如何在特定日期自动触发Python脚本实现月度DataFrame更新

阿华AIGC实验室

2026-5-8

Solution: Auto-Trigger Monthly Data Update Script

Alright, let's break this down into two key parts: creating the Python script that adds the new monthly data, and setting up the automatic trigger to run it on the 6th of every month.

Step 1: Write the Python Script

First, we need a script that loads your existing DataFrame, calculates the next month's details, appends the new row, and saves the updated data. This script handles date logic automatically, so you don't have to manually input month names or dates.

Here's the complete script (save it as update_monthly_data.py):

import pandas as pd
from datetime import datetime, timedelta
import calendar

def add_next_month_entry():
    # Load existing data (or create empty df if file doesn't exist)
    try:
        df = pd.read_csv('monthly_data.csv')
        # Convert date columns to datetime objects for calculations
        df['Start_Date'] = pd.to_datetime(df['Start_Date'], format='%m/%d/%Y')
        df['End_Date'] = pd.to_datetime(df['End_Date'], format='%m/%d/%Y')
        df['Updated_on'] = pd.to_datetime(df['Updated_on'], format='%m/%d/%Y')
    except FileNotFoundError:
        # Initialize empty df if starting fresh (matches your example structure)
        df = pd.DataFrame(columns=['Month', 'Product', 'Start_Date', 'End_Date', 'Updated_on'])
    
    if not df.empty:
        # Get the last entry's end date to find the next month's start
        last_end_date = df['End_Date'].iloc[-1]
        next_start_date = last_end_date + timedelta(days=1)
    else:
        # If no data exists yet, start with January 2020 (as per your example)
        next_start_date = datetime(2020, 1, 1)
    
    # Extract year and month for the new entry
    new_year, new_month = next_start_date.year, next_start_date.month
    # Get full month name (e.g., "June")
    month_name = next_start_date.strftime('%B')
    
    # Calculate last day of the new month
    last_day = calendar.monthrange(new_year, new_month)[1]
    next_end_date = datetime(new_year, new_month, last_day)
    
    # Calculate Updated_on: 6th day of the following month
    if new_month == 12:
        updated_year, updated_month = new_year + 1, 1
    else:
        updated_year, updated_month = new_year, new_month + 1
    updated_on = datetime(updated_year, updated_month, 6)
    
    # Create the new row
    new_row = pd.DataFrame({
        'Month': [month_name],
        'Product': ['Beverage'],  # Adjust this if your product varies
        'Start_Date': [next_start_date.strftime('%m/%d/%Y')],
        'End_Date': [next_end_date.strftime('%m/%d/%Y')],
        'Updated_on': [updated_on.strftime('%m/%d/%Y')]
    })
    
    # Append and save the updated DataFrame
    df = pd.concat([df, new_row], ignore_index=True)
    # Convert dates back to your original string format
    df['Start_Date'] = df['Start_Date'].dt.strftime('%m/%d/%Y')
    df['End_Date'] = df['End_Date'].dt.strftime('%m/%d/%Y')
    df['Updated_on'] = df['Updated_on'].dt.strftime('%m/%d/%Y')
    
    df.to_csv('monthly_data.csv', index=False)
    print(f"Successfully added entry for {month_name} {new_year}")

if __name__ == "__main__":
    add_next_month_entry()

Notes on the Script:

It loads your existing monthly_data.csv file (place this in the same folder as the script, or use an absolute path like '/home/you/data/monthly_data.csv').
If the file doesn't exist, it starts fresh with January 2020 (matching your example).
It automatically calculates the next month's start/end dates and the Updated_on value (6th of the following month).
It saves the updated data back to the CSV file, overwriting the original (keep a backup if needed!).

Step 2: Set Up Automatic Trigger

Now, we need to make this script run on the 6th of every month. The method depends on your operating system:

For Windows: Use Task Scheduler

Open Task Scheduler (search for it in the Start Menu).
Click Create Basic Task in the right-hand pane.
Name the task (e.g., "Monthly Beverage Data Update") and add a description, then click Next.
Select Monthly as the trigger, click Next.
Set it to run on the 6th day of every month, at your preferred time (e.g., 9:00 AM), then click Next.
Choose Start a program as the action, click Next.
In Program/script, browse to your Python executable (e.g., C:\Python311\python.exe). To find this, open Command Prompt and run where python.
In Add arguments, enter the full path to your script (e.g., "C:\Users\You\Scripts\update_monthly_data.py").
In Start in, enter the folder where your script and CSV file are located (e.g., "C:\Users\You\Scripts"), then click Next.
Check Open the Properties dialog for this task when I click Finish, then click Finish.
In the Properties window, go to the Settings tab and check Run whether user is logged on or not (so it runs even if you're not signed in). Click OK and enter your Windows password when prompted.

For macOS/Linux: Use Cron

Open Terminal.
Run crontab -e to edit your cron jobs (this will open a text editor, usually nano).
Add this line to the file (replace paths with your actual ones):
```
0 9 6 * * /usr/bin/python3 /home/you/scripts/update_monthly_data.py
```
- Breakdown:
  - 0: Run at the 0th minute (top of the hour)
  - 9: Run at 9 AM
  - 6: Run on the 6th day of the month
  - *: Every month
  - *: Every day of the week
  - /usr/bin/python3: Path to your Python 3 executable (find this with which python3)
  - /home/you/scripts/update_monthly_data.py: Full path to your script
Save and exit: In nano, press Ctrl+O, hit Enter, then Ctrl+X.
Verify the cron job is set up by running crontab -l.

Important Tips:

Test the script first: Run it manually once (python update_monthly_data.py) to make sure it adds the correct row to your CSV.
Use absolute paths: To avoid issues with working directories, use full paths for the CSV file in the script and for the script path in the scheduler/cron.
Check permissions: On macOS/Linux, ensure the script has execute permissions (run chmod +x /path/to/update_monthly_data.py if needed).

That's it! Your script will now automatically run on the 6th of every month and add the new month's data to your DataFrame.

内容的提问来源于stack exchange，提问作者harsh